In this blog we will discuss Amazon Web Services Simple Storage Service, or AWS S3.
The need to move data from on-premises systems to the cloud is growing every day. Companies cannot predict how much capacity they will need in the future, and maintaining data warehouses and repositories can be challenging. Additionally, legacy applications may not be able to keep up with future requirements, which makes them difficult to maintain.
Cloud storage is durable, secure, and scalable. Encryption helps keep the data secure, while versioning, which retains multiple versions of the data, supports durability. Cloud storage is highly scalable because it automatically scales to match the requirement, and users pay only for the storage they use.
AWS provides robust data-oriented services like S3, RDS, Redshift, and EBS, and third-party platforms such as Snowflake also run on AWS. These can be combined, for example object (bucket) storage alongside file storage. Users can leverage data migration tools and combine them with schema conversion tools for data transformations using JSON if the data comes from relational sources or APIs.
- Amazon S3 is a simple key-value object store designed for the internet
- S3 provides unlimited storage space and works on the pay-as-you-use model. Service rates get cheaper as the usage volume increases
- S3 offers an extremely durable, highly available, and infinitely scalable data storage infrastructure at very low costs
- S3 is Object-level storage (not Block-level storage) and cannot be used to host an OS or dynamic websites
- S3 resources, e.g. buckets and objects, are private by default
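The key-value model described above can be illustrated with a short Python sketch. It assumes the boto3 SDK and a client created with `boto3.client("s3")`; the helper names `put_text` and `get_text`, as well as any bucket and key names you pass in, are hypothetical and chosen only for illustration.

```python
def put_text(s3_client, bucket, key, text):
    """Store a UTF-8 string as the value of `key` in `bucket`."""
    s3_client.put_object(Bucket=bucket, Key=key, Body=text.encode("utf-8"))


def get_text(s3_client, bucket, key):
    """Fetch the object stored under `key` and decode it as UTF-8."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    # "Body" is a stream-like object; read() returns the raw bytes.
    return response["Body"].read().decode("utf-8")
```

With real credentials, this could be called as `put_text(boto3.client("s3"), "my-example-bucket", "notes/hello.txt", "hello")`. Because resources are private by default, the call only succeeds for a caller that has been granted access to the bucket.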
What are Buckets and Objects in S3?
S3 Bucket:
- A Bucket is a container for Objects stored in S3
- Buckets help organize the S3 namespace
- A Bucket is owned by the AWS account that creates it and helps identify the account responsible for storage and data transfer charges
- S3 Bucket names are globally unique, regardless of the AWS region in which they were created, and the namespace is shared by all AWS accounts
- Even though S3 is a global service, Buckets are created within a region specified during the creation of the Bucket
- Every Object is contained in a Bucket
- There is no limit to the number of Objects that can be stored in a Bucket and no difference in performance whether a single Bucket or multiple Buckets are used to store all the Objects
- The S3 data model is a flat structure, i.e. there are no hierarchies or folders within the Buckets. However, a logical hierarchy can be inferred using the key name prefix, e.g. Folder1/Object1
- Restrictions
- 100 Buckets (soft limit) and a maximum of 1,000 Buckets can be created in each AWS account
- Bucket names should be globally unique and DNS compliant
- Bucket ownership is not transferable
- Buckets cannot be nested and cannot have a Bucket within another Bucket
- Once created, Bucket name and region cannot be changed
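- A Bucket must be empty before it can be deleted; all Objects, including all Object versions, have to be removed first
- A single list request returns at most 1,000 Objects; S3 provides pagination support for retrieving longer listings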
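The flat namespace and the 1,000-Object listing limit can be seen together in a small Python sketch. It assumes the boto3 SDK, whose `list_objects_v2` call accepts a `Prefix` and returns a continuation token when more results remain; the helper name `list_keys` is hypothetical.

```python
def list_keys(s3_client, bucket, prefix=""):
    """Yield every key in `bucket` that starts with `prefix`.

    Each list_objects_v2 call returns at most 1,000 keys, so the loop
    follows the continuation token until the listing is complete.
    """
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    while True:
        page = s3_client.list_objects_v2(**kwargs)
        # "Contents" is absent when no key matches the prefix.
        for entry in page.get("Contents", []):
            yield entry["Key"]
        if not page.get("IsTruncated"):
            break
        kwargs["ContinuationToken"] = page["NextContinuationToken"]
```

Calling `list_keys(boto3.client("s3"), "my-example-bucket", prefix="Folder1/")` would then behave like listing a folder, even though S3 only compares key name prefixes. boto3 also ships a built-in paginator for this call; the explicit loop is used here only to make the mechanism visible.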
Objects:
- Objects are the fundamental entities stored in an S3 Bucket
- An Object is uniquely identified within a Bucket by a key name and version ID (if S3 versioning is enabled on the Bucket)
- Objects consist of Object data, metadata, and other components
- Key is the Object name and a unique identifier for an Object
- Value is the actual content stored
- Metadata is the data about the data and is a set of name-value pairs that describe the object, e.g. content-type, size, last modified. Custom metadata can also be specified at the time the Object is stored
- Version ID for an Object helps to uniquely identify an object within a Bucket in combination with the key
- Sub-resources help provide additional information for an Object
- Access Control Information helps control access to the Objects
- S3 Objects allow two kinds of metadata:
- System metadata
- Metadata such as the Last-Modified date is controlled by the system. Only S3 can modify the value
- System metadata that the user can control, e.g. the storage class or the encryption configured for the object
- User-defined metadata
- User-defined metadata is assigned when the object is uploaded; changing it afterwards requires copying the object with the new metadata
- User-defined metadata is stored with the object and is returned when an object is downloaded
- S3 does not process user-defined metadata
- User-defined metadata names must begin with the prefix "x-amz-meta-", otherwise S3 will not set the key-value pair as you define it
- Object metadata cannot be modified in place after the object is uploaded; it can only be changed by performing a copy operation and setting the metadata on the copy
- Objects belonging to a Bucket that resides in a specific AWS region never leave that region unless they are explicitly copied to another region, for example using Cross-Region Replication
- Each Object can be up to 5 TB in size
- An Object can be retrieved partially or as a whole
- With Object Versioning enabled, current and previous versions of an object can be retrieved
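How user-defined metadata is set and later changed can be sketched in Python. The sketch assumes the boto3 SDK, which adds the "x-amz-meta-" prefix to each metadata name on the wire, so the dictionaries hold bare names; the helper names below are hypothetical.

```python
def upload_with_metadata(s3_client, bucket, key, body, metadata):
    """Upload an object together with user-defined metadata.

    `metadata` holds bare names such as {"project": "demo"}; boto3 sends
    them as "x-amz-meta-project" and so on.
    """
    s3_client.put_object(Bucket=bucket, Key=key, Body=body, Metadata=metadata)


def replace_metadata(s3_client, bucket, key, new_metadata):
    """Change an object's user-defined metadata by copying it onto itself."""
    s3_client.copy_object(
        Bucket=bucket,
        Key=key,
        CopySource={"Bucket": bucket, "Key": key},
        Metadata=new_metadata,
        # REPLACE tells S3 to use the supplied metadata instead of
        # carrying over the metadata of the source object.
        MetadataDirective="REPLACE",
    )


def read_metadata(s3_client, bucket, key):
    """Return the user-defined metadata without downloading the object."""
    return s3_client.head_object(Bucket=bucket, Key=key)["Metadata"]
```

The copy step is needed because metadata cannot be edited in place. On a Bucket with versioning enabled, a copy like this would create a new version of the Object rather than overwrite the existing one.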
To conclude, S3 provides the three key features of cloud computing: durability, availability, and scalability. Using Buckets and Objects, users can store and perform data transformation and data management very efficiently.