S3 Dynamic Storage

Page last updated:

Overview

Dynamic Storage, the Cloud Storage of Swisscom, is a pay-per-use object storage optimized for storing large data sets for your applications (e.g. backups, archives, pictures etc.). You can access your stored data through a RESTful API.

Swisscom Dynamic Storage supports the S3 API:

Amazon S3 API:

  • Very well known on the market
  • State of the art, de facto standard
  • Provides a big community
  • Has hundreds of scripts and tools available

Integrating the Service With Your App

After the creation of the service and the binding of the service to an application, the environment variable VCAP_SERVICES is created. The information about the credentials is stored in this variable:

{
  "dynstrg": [
    {
      "credentials": {
        "accessHost": "ds11s3.swisscom.com",
        "namespaceHost": "<Namespace-ID>.ds11s3ns.swisscom.com",
        "accessKeyID": "<Namespace-AccountName>",
        "secretAccessKey": "40-Characters"
      },
      "label": "dynstrg",
      "name": "testecs",
      "plan": "usage",
      "tags": []
    }
  ]
}

Usage vs. Usage-encrypted Plan

For Dynamic Storage, the Cloud Storage of Swisscom, two different plans are available:

cf marketplace -s dynstrg
Getting service plan information for service dynstrg as admin...
OK

service plan      description                                                             free or paid
usage             Usage based object storage with 1 GB included storage space             paid
usage-encrypted   Usage based encrypted object storage with 1 GB included storage space   paid

Both plans, usage and usage-encrypted, from developers’ and users’ perspective are the same in terms of handling.

With usage-encrypted plan, the Service Broker sets a flag on your namespace. With that, your created bucket and its content is encrypted in the storage box layer. The encryption is done by auto-generated certificates in the storage system, and is implemented with software (every encryption causes a little performance degradation).

If someone would be able to steal a physical harddisk of the storage box (very unlikely scenario because of very strict data center security processes), your data would be readable in cleartext without the usage-encrypted plan.

Sample Application

Swisscom: Dynamic Storage How-To

Deleting a Service Instance

Before deleting a Dynamic Storage service instance, you have to remove all its buckets and objects. Otherwise, the deletion will fail.

To do so, you can use the S3cmd tool and run the following command for each bucket:

$ s3cmd del --recursive s3://bucket-name

In case you don’t have any credentials for your Dynamic Storage instance, you can get them by creating a service key.

ECS endpoints / access points

Trust Level URL Usecase
Public https://ds11s3.swisscom.com Customers which would like to connect through internet to our service

How to access Dynamic Storage with the Amazon S3 API

Amazon S3 Swisscom S3 Compatible Storage
Access Access Key ID Access Key ID
Access Key Secret Access Key Secret Access Key
Bucket Path Style http[s]://s3.amazonaws.com/<bucket> https://ds11s3.swisscom.com/\
Bucket Virtual Host Style http[s]://<bucket>.s3.amazonaws.com https://<bucket>.ds11s3.swisscom.com
Namespace-style URL - https://<namespace>.ds11s3NS.swisscom.com/<bucket>

Important: For security reasons, we only support HTTPS (443) and no HTTP (80)!

Bucket/Object naming conventions

Bucket Name

Please follow these conventions when naming S3 buckets in ECS:

  • Names must be between 3 and 63 characters long.
  • Names must be DNS compliant: lowercase alphanumeric characters ([a-z0-9]) and hyphens (-).

Object Name

The following rules apply to the naming of ECS S3 objects:

  • Cannot be null or an empty string
  • Length range is 1-1024 (unicode char)
  • No validation on characters!

Namespace-style URL

In comparison to Amazon S3, we are using namespaces allowing you to create bucket names which are unique only within your namespace. This means, that your bucket names don’t have to be globally unique. On the other hand, you need to use namespace-style URL (e.g. https://8bb3085l69c42879b4aga5ce7016ffff.ds11s3NS.swisscom.com/mybucket/myobject) to access your content for static webpages.

Using Amazon S3 compatible software on Dynamic Storage

Most of the software which has been developed to use Amazon S3 API will work with Dynamic Storage.

The unsupported operations, rarely used, are listed below.

The challenge arises often on the configuration side. Sometimes the Amazon URL and ports may be hard-coded, the software may use the path style or the virtual style.

Authentication

From the AWS S3 docs: The Amazon S3 REST API uses a custom HTTP scheme based on a keyed-HMAC (Hash Message Authentication Code) for authentication. To authenticate a request, first you concatenate selected elements of the request to form a string. You then use your AWS secret access key to calculate the HMAC of that string. Informally, we call this process “signing the request,” and the output of the HMAC algorithm “the signature” because it simulates the security properties of a real signature. Finally, you add this signature as a parameter of the request by using the syntax shown in this section. When the system receives an authenticated request, it fetches the AWS secret access key that you claim to have. Then it uses it in the same way to compute a signature for the message it received. Finally, it compares the calculated signature against the signature presented by the requester. If the two signatures match, the system concludes that the requester has access to the AWS secret access key and therefore acts with the authority of the principal to whom the key was issued. If the two signatures do not match, the request is dropped and the system responds with an error message.

When using ECS, the Access Key ID corresponds to and the Secret Access Key corresponds to the Shared Secret Access Key.

The Amazon S3 API operations

The following operations are supported

Signatures

  • Signature v2
  • Signature v4

Bucket operations

  • GET Bucket
  • GET Bucket ACL
  • GET Bucket Location
  • GET Bucket Versioning
  • GET Service
  • GET Bucket lifecycle
  • GET Bucket policy
  • List Multipart Uploads
  • PUT Bucket
  • PUT Bucket ACL
  • PUT Bucket lifecycle
  • PUT Bucket versioning
  • PUT Bucket policy
  • DELETE Bucket
  • DELETE Bucket lifecycle
  • DELETE Bucket policy

Object operations

  • DELETE Object
  • DELETE Multiple Objects
  • GET Object
  • GET Object ACL
  • GET Object versions
  • HEAD Object
  • POST Object
  • POST Object restore
  • PUT Object
  • PUT Object ACL

  • Multipart Upload: Initiate, Upload, Complete, Abort, List parts

More details

To get more information about the underlying EMC ECS product, please consult their REST API documentation.

The following operations are unsupported: (these functions are used very rarely)

Bucket operations

  • DELETE Bucket tagging
  • DELETE Bucket website
  • GET Bucket notification
  • GET Bucket logging
  • GET Bucket requestPayment
  • GET Bucket website
  • PUT Bucket logging
  • PUT Bucket notification
  • PUT Bucket requestPayment
  • PUT Bucket tagging
  • PUT Bucket website

Object operations

  • GET Object torrent

Using an S3 compatible URL, Dynamic Storage can easily be used as a backend.

Best Practices

App Design

  • Don’t require low latency
    • Don’t execute serial workflows. Process many operations in parallel.
  • Be prepared to handle failures
    • Most SDKs have built-in retry mechanisms to handle this for you
  • Avoid listing buckets during normal workflows
    • S3 API can only list serially and only returns 1000 results per page. This means 1000 API calls per 1 million objects.
    • Best suited for periodic cleanup/consistency tasks

Namespaces, Buckets and Objects

  • Bucket names must be unique within a namespace
  • For best performance, we recommend to have less than 1000 buckets in a namespace (AWS limits to 100, the standard S3 API cannot paginate)
  • Namespace and Bucket names should be DNS compatible
    • They can appear in a DNS record if using virtually-hosted buckets
    • In particular, no underscores, no capital letters & not accessible through DNS
  • The amount of objects is not a problem
    • No need to enforce limits like in a traditional FS
    • ECS is designed to handle trillions of objects
  • Ideally, the object size should be >100 KB
    • The ECS internal buffer size is 2 MB, you will see better performance if you align your object size to that
  • S3 has no support for directory structures
    • Uses a flat structure
    • Some drivers and apps emulate a directory structure (prefix+delimiter)
  • Use metadata search (i.e. date range) for listing objects
    • Be prepared to paginate bucket listings
    • Align the range to your app and request only what is needed
    • Marker, NextMarker, MaxKeys

Large Objects

  • Use multipart uploads for large objects (>100 MB)
    • Allows you to pause and resume uploads
    • Improves throughput and parallel uploads
    • For size < 1 GB, use multiple of 2 MB (e.g. 8 MB)
    • For size > 1 GB, use exactly 128 MB part size (to fill a full chunk)
  • Use Byte Range Reads for parallel downloads of large objects
  • In Java, use the TransferManager
  • In .NET, use TransferUtility

Traffic

  • You can bypass your application server with pre-signed URLs
    • Authentication encoded into the URL itself, executed by client
    • Users will download and upload object directly from/to ECS as “you”
    • Object update frequency should be low
    • Object storage platforms are not designed for transactional workloads and performance impact
    • Ideally, WORM content (e.g. sensor data, images, videos)
  • Avoid HEAD + PUT for conditional overwrite
    • Understand your reasons for doing so; it’s generally not necessary
    • If you must, use PUT with If-None-Match / If-Match instead.
    • Use If-None-Match: * to conditionally create if object doesn’t exist.
    • Use If-Match: {Etag} to conditionally overwrite an object if Etag matches.
  • Use the object copy operation
    • Instead of downloading and uploading the object again
  • Beware concurrent write requests for the same object
    • 200 OK means transaction committed
    • Order not guaranteed
    • If order needs to be guaranteed, use Conditional PUTs (ECS extension)
  • Consider using client-side compression for slow links and compressible content
    • ECS Java SDK supports ZIP and LZMA

ECS Extensions

  • Metadata Search
    • Offload some searching from your database
    • Support for “<,>,<=,>=,=,!=, AND/OR”
    • Enabled at bucket level at creation time
  • Byte Range uploads
    • Update objects partially
    • AWS objects are immutable
  • Atomic append
    • Useful for multi-client streams (e.g. syslog, sensor data)

Retention and Expiration

  • Retention means you cannot update, overwrite, or delete the object until the retention period ends
  • There are two ways to assign retention:
    • At the bucket level (compatible with generic S3)
    • Explicit retention period at an object level
  • Expiration will automatically delete the object when it expires
    • In S3 this is set using “Bucket Lifecycle”.
  • Retention overrides Expiration
    • An object will never get expired until its retention ends
    • Retention can be redefined at any time

Security

  • Grant as few permissions as possible
    • e.g. no need to grant read-write permissions if your app only needs to read data
  • Use client-side encryption for maximum protection
    • But keep your master key secure
    • If you lose your master key, you lose your data
    • Pre-signed URLs will not work unless the receiver knows your encryption keys.

Data Management

  • Use separate buckets for your environments
    • e.g. dev, test, staging, production
    • Easier for developers and DevOps to manage

Implementation

  • Use a SDK for your programming language
    • No need to reinvent the wheel
  • Use ECS Object Client if you want to take advantage of ECS features
    • Currently available for Java only
  • Use AWS SDK if you want to maintain compatibility with AWS

CORS

In order to access your bucket objects directly from an SPA (without having your backend in between), you can specify CORS policies per bucket by using the S3 API, for example https://docs.aws.amazon.com/AmazonS3/latest/dev/cors.html. There are also S3 GUI clients that support setting CORS.

Be aware that any access and secret keys that are included in the SPA end up in the user’s browser where it can be easily read by any end user.

After having configured CORS on your bucket, you need to access it using the namespaceHost endpoint found in the service binding credentials (see example above). Also, you need to use path-style in your S3 library.

Example with the AWS S3 JS SDK:

AWS.config.update({
  accessKeyId: '1bbd085e69c44879b4aea5cedeadf00d-bucketowner',
  secretAccessKey: '<redacted>',
  s3ForcePathStyle: true
});
var s3Endpoint = new AWS.Endpoint('https://1bbd085e69c44879b4aea5cedeadf00d.ds11s3ns.swisscom.com');
var s3Client = new AWS.S3({endpoint: s3Endpoint});

// Create params for S3.createBucket
var readParams = {
  Bucket : 'cors-bucket',
};

// call S3 to list the bucket contents
s3Client.listObjects(readParams, function(err, data) {
  if (err) {
    console.log("Error", err);
  } else {
    console.log("Success", data);
  }
});

Be aware that listing buckets in a browser (which needs CORS) is not supported - also not by AWS s3 - since CORS is always set on bucket level.

View the source for this page in GitHub