storage – Scale the Cloud

What is Amazon S3?

Amazon S3 stands for Amazon Simple Storage Service. It is cloud-based storage that can hold a virtually unlimited amount of data and lets you fetch that data at any time over the web. S3 is built on Amazon’s global infrastructure, making it highly scalable, reliable, fast, and economical.

What can you do with AWS S3?

With S3 you can do several things:

Use it to back up data
Host a highly available website without managing servers
Use it to version copies of stored files
Use it as a serverless key-value data store in applications
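
The key-value use case from the list above can be sketched in a few lines. This is a minimal sketch, assuming a boto3-style client is passed in: `put_object` and `get_object` are the S3 client calls, while `put_json`, `get_json`, and `InMemoryClient` are hypothetical helpers made up for this illustration. The in-memory client is only a stand-in so the example runs without AWS credentials.

```python
import io
import json


def put_json(client, bucket, key, value):
    """Store a JSON-serializable value under `key` in `bucket`."""
    body = json.dumps(value).encode("utf-8")
    client.put_object(Bucket=bucket, Key=key, Body=body)


def get_json(client, bucket, key):
    """Fetch the object stored at `key` and decode it from JSON."""
    response = client.get_object(Bucket=bucket, Key=key)
    return json.loads(response["Body"].read().decode("utf-8"))


class InMemoryClient:
    """Hypothetical stand-in that mimics the two client calls used above."""

    def __init__(self):
        self._objects = {}

    def put_object(self, Bucket, Key, Body):
        self._objects[(Bucket, Key)] = Body

    def get_object(self, Bucket, Key):
        return {"Body": io.BytesIO(self._objects[(Bucket, Key)])}


# With AWS credentials configured you would pass boto3.client("s3") instead.
client = InMemoryClient()
put_json(client, "my-bucket", "users/42.json", {"name": "Ada"})
user = get_json(client, "my-bucket", "users/42.json")
```

Passing the client in as an argument keeps the helpers easy to test, since a fake can replace the real service.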

Where is your data stored?

Your data is stored in a region
A region represents a general geographic area
Each region contains multiple availability zones, often referred to as AZs
An availability zone is an isolated facility. Your data is distributed across multiple facilities in the region you choose, so in the event of a disaster at one facility, a copy of your data still exists in another facility and remains available.
There are currently 18 geographic regions and 54 availability zones (these counts grow as AWS adds regions)
You select your region when you create a bucket. A bucket is a container for your data.
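
Choosing the region at bucket creation can be sketched as follows. This is a sketch under assumptions: `create_bucket_kwargs` is a hypothetical helper, and the bucket name and region are placeholders. It only builds the arguments for the boto3 `create_bucket` call, so it runs without credentials.

```python
def create_bucket_kwargs(bucket_name, region):
    """Build keyword arguments for a boto3 `create_bucket` call."""
    kwargs = {"Bucket": bucket_name}
    # us-east-1 is the default location and is not passed as a
    # LocationConstraint; every other region is named explicitly.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


kwargs = create_bucket_kwargs("my-example-bucket", "eu-west-1")

# With credentials configured, the actual call would look like:
#   import boto3
#   boto3.client("s3", region_name="eu-west-1").create_bucket(**kwargs)
```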

Key concepts

S3 is object-based storage (an object is like a file)
Objects are stored in buckets
An object can be up to 5 TB in size
There is no limit on bucket size or total data in your account
Permissions can be set on buckets or individual objects
There is no hierarchy – that is, you can’t create a bucket inside of another bucket.

No hierarchy, BUT…

You can represent folders logically by including a prefix in the object name. For example:

/my-bucket/my-folder/my-object.txt

In the above example, my-bucket is the bucket and my-folder/my-object.txt is the full object name (its key). Using a folder name as a prefix gives us a logical way to organize our objects into folders and sub-folders.
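
To see how a flat list of object names can still behave like folders, here is a small pure-Python sketch. It imitates the idea behind the `Prefix` and `Delimiter` parameters of the S3 list operation, but it is not the S3 API itself: `list_folder` and the sample keys are hypothetical.

```python
def list_folder(keys, prefix="", delimiter="/"):
    """Split keys under `prefix` into direct objects and sub-folder prefixes."""
    objects, folders = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter acts as a sub-folder.
            folders.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(folders)


keys = [
    "my-folder/my-object.txt",
    "my-folder/reports/summary.csv",
    "readme.txt",
]

top_objects, top_folders = list_folder(keys)
folder_objects, sub_folders = list_folder(keys, prefix="my-folder/")
```

Listing with an empty prefix shows `readme.txt` and the `my-folder/` "folder"; listing with the prefix `my-folder/` shows the object inside it and the nested `my-folder/reports/` prefix.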

What makes up an object?

An object is made up of:

Key – this is the name of the object (like a filename) and is used to retrieve an object
Value – this is any sequence of bytes you want to store (like the contents of a file)
VersionID
- The version ID is a string generated by AWS when you add an object to a bucket. If versioning is turned on, a new object version is created on every update, each with its own version ID.
- Together the key and version ID uniquely identify an object in a bucket
Metadata
- Metadata consists of name-value pairs with which you can store information about the object.
- There is system metadata automatically created by AWS, such as creation date, size, storage class, etc.
- And there is user metadata. User metadata consists of custom name-value pairs you add to an object.
Subresources
- Subresources consist of additional information such as ACL information (account owner) or torrent file information.
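
The parts listed above can be pictured with a toy in-memory model. This is only an illustration of how key, value, version ID, and metadata fit together when versioning is on. `S3Object` and `VersionedBucket` are hypothetical names, and the random hex string merely stands in for the opaque version ID that AWS generates.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class S3Object:
    key: str                 # the object's name
    value: bytes             # the stored bytes
    version_id: str          # stand-in for the AWS-generated version ID
    metadata: dict = field(default_factory=dict)  # user name-value pairs


class VersionedBucket:
    """Toy model of a versioned bucket: every put keeps earlier versions."""

    def __init__(self):
        self._versions = {}  # key -> list of S3Object, oldest first

    def put(self, key, value, metadata=None):
        obj = S3Object(key, value, uuid.uuid4().hex, dict(metadata or {}))
        self._versions.setdefault(key, []).append(obj)
        return obj

    def get(self, key, version_id=None):
        versions = self._versions[key]
        if version_id is None:
            return versions[-1]  # latest version
        return next(o for o in versions if o.version_id == version_id)


bucket = VersionedBucket()
first = bucket.put("notes.txt", b"draft", {"author": "ada"})
second = bucket.put("notes.txt", b"final", {"author": "ada"})
latest = bucket.get("notes.txt")
original = bucket.get("notes.txt", first.version_id)
```

Asking for the key alone returns the latest version, while the key plus a version ID picks out one specific version.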

For more information on objects and all available metadata, see the following links:

How secure is your data?

AWS infrastructure and services meet numerous compliance standards and regulations (PCI compliance and others)
S3 supports SSL/TLS for encryption of data in transit, and you can optionally encrypt your data at rest
Your data does not leave the region you specify unless you explicitly copy or replicate it to another region
See the following for more detailed info on AWS security and compliance:
- https://aws.amazon.com/compliance/programs/
- https://aws.amazon.com/security/

It is really important to note that even though AWS infrastructure meets many compliance and regulatory requirements, your organization must still complete its own steps to meet the requirements that apply to it. The fact that AWS meets many compliance requirements makes your road to compliance and/or certification smoother.

Storage Tiers in a Nutshell

Storage tiers are mainly used to provide the most cost-effective storage for your data based on how frequently it is accessed and how quickly you need to retrieve it.

Standard Storage – best when you need to access your data quickly and frequently
Standard Infrequent Access (Standard IA)
- Use this when you need quick access, but you don’t need frequent access
- Has a minimum billable object size of 128 KB and a minimum storage duration of 30 days
Glacier
- Use this for long-term archive and backup
- Minimum billable duration is 90 days
- Retrieval can take from minutes up to hours, depending on the options you specify (you pay more for faster retrieval)
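
The effect of the minimums above can be worked through with a short sketch. It assumes that anything below a minimum is simply billed as if it met the minimum, and it reads 128K as 128 × 1024 bytes. `StorageClass` and `billable` are hypothetical names for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageClass:
    name: str
    min_size_bytes: int  # smaller objects are billed at this size
    min_days: int        # shorter storage is billed for this many days


# Minimums taken from the tier descriptions above.
STANDARD = StorageClass("Standard", 0, 0)
STANDARD_IA = StorageClass("Standard IA", 128 * 1024, 30)
GLACIER = StorageClass("Glacier", 0, 90)


def billable(storage_class, size_bytes, days_stored):
    """Return (billable_bytes, billable_days) after applying the minimums."""
    return (
        max(size_bytes, storage_class.min_size_bytes),
        max(days_stored, storage_class.min_days),
    )


# A 10 KB object kept for 7 days in Standard IA.
small_ia = billable(STANDARD_IA, 10 * 1024, 7)
# A 5 GB archive removed from Glacier after 45 days.
early_glacier = billable(GLACIER, 5 * 1024**3, 45)
```

Under these assumptions the small Standard IA object is billed as 128 KB for 30 days, and the Glacier archive is billed for the full 90 days, which is why small or short-lived data usually belongs in Standard Storage.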

S3 Pricing in a Nutshell

Data transfer IN to S3 is free
Data transfer OUT of S3 is charged
You pay for the amount of data stored, and the per-GB cost goes down the more data you have stored
Pricing can vary across different regions
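
The "per-GB cost goes down" idea can be illustrated with a tiered calculation. The tier boundaries and rates below are made-up placeholders, not AWS prices, and the sketch assumes each slice of data is billed at the rate of the tier it falls into. `EXAMPLE_TIERS` and `monthly_storage_cost` are hypothetical names.

```python
# Hypothetical tiers: (upper bound of the tier in GB, price per GB-month in USD).
# These numbers are placeholders for illustration, not AWS prices.
EXAMPLE_TIERS = [
    (50_000, 0.030),
    (500_000, 0.025),
    (float("inf"), 0.020),
]


def monthly_storage_cost(stored_gb, tiers=EXAMPLE_TIERS):
    """Charge each slice of stored data at the rate of the tier it falls in."""
    cost = 0.0
    lower = 0
    for upper, price_per_gb in tiers:
        if stored_gb <= lower:
            break
        cost += (min(stored_gb, upper) - lower) * price_per_gb
        lower = upper
    return cost


small = monthly_storage_cost(1_000)    # entirely within the first tier
large = monthly_storage_cost(60_000)   # spills 10,000 GB into the second tier
```

With these placeholder rates the average price per GB is lower for the larger amount, because part of it is billed at the cheaper second-tier rate. Check the pricing page for the real tiers in your region.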

For all the details on pricing see: https://aws.amazon.com/s3/pricing/

What can you do with AWS S3?

You can:

Use it to back up data
Host a highly available website without managing servers
Use it to version copies of stored files
Use it as a serverless key-value data store in applications
