Pages

Monday 24 December 2018

AWS Storage Services: S3, Glacier, EBS, EFS, FSx and Storage Gateway

AWS provides following services under storage section:

1. S3
2. Glacier
3. EFS
4. FSx
5. Storage Gateway

Following are some basic and important points about AWS Storage services:

S3

1. Cloud storage service like Dropbox and Google Drive.

2. Object based storage not block level (like EBS and EFS). Data is treated as object. Single object in S3 can be uploaded up to 5TB in multi-part. You cannot install OS and software on it.

3. Buckets: Data is stored in buckets which are similar to Windows folders. Bucket name must be in lower case and alphanumeric. Bucket name must be unique globally. By default bucket is private.

4. Versioning: Versioning takes more space as each version is saved individually in same or different bucket. Versioning must be done for cross-region replication. Once versioning is enabled, it can’t be disabled, only suspended.

5. Storage Class
  • Standard (Frequently accessed data, minimum storage duration: 30 days,  99.999999999% durability (11 times 9))
  • Intelligent-Tiering (Long-lived data with changing or unknown access patterns)
  • Standard-IA (Long-lived, infrequently accessed data, minimum storage duration: 30 days, 99.999999999% durability (11 times 9))
  • One Zone-IA (Long-lived, infrequently accessed, non-critical data)
  • Glacier (Data archiving with retrieval times ranging from minutes to hours, minimum storage duration: 90 days)
  • Reduced Redundancy (Not recommended, Frequently accessed, non-critical data which even if get lost, it does not hamper you)
6. Encryption: 
  • SSE-S3 (uses AES 256 encryption methods)
  • SSE-KMS (Key Management Service)
  •  SSE-C (Client Side Encryption)
7. Bucket URL syntax: https://s3.regionname.amazonaws.com/bucketname/objectname

8. Eventual Consistency: When we upload a new file to S3, it becomes available immediately, but when we perform overwrite and delete operation, there is some delay which is known as eventual consistency. When a file upload to S3 is successful, it returns HTTP200 status.

9. Security: Data is secured using ACL (Access Control List) and Bucket Policies at Bucket or Object level. You can write custom bucket policies using JSON.

10. Data Transfer Acceleration: Enables quick upload of data to S3 bucket over long distance using CloudFront.

11. Lifecycle Management: You can manage transition of file from one storage class to another using Lifecycle rules. For example, you can move a file from Standard Storage Class to Infrequently Access Storage Class after some days (min 30 days) if it is not frequently used now. Similarly, if you want to archive this file after some days (min 30 days), you can further move it to Glacier.

12. Static Website Hosting: You can host static website and customize the URL using Route53.

Glacier 

1. Data Backup and Archive

2. Types of data: Hot Data (which we need on daily basis), Cold Data (which we don’t need on daily basis, archive this data to Glacier).

3. Delay in retrieval time and may take 3-5 hours.

4. Minimum storage duration in Glacier is 90 days. Archives deleted before 90 days incur a pro-rated charge equal to the storage charge for the remaining days.

EBS

1. Elastic Block Storage (just like Hard Disk of your laptop and can only be used by mounting on an EC2 instance unlike S3).

2. Backup of EBS volumes is called Snapshot and is done in incremental fashion. You can also take point-in-time snapshots of your EBS volumes and save it on S3.

3. To take backup of Root EBS (where OS is running), you must stop it first for data integrity.

4. Root EBS can’t be encrypted and “Delete on Termination” is checked by default.

5. To share snapshots between AWS accounts, make sure snapshots MUST NOT be encrypted.

6. Multiple Availability Zone is NOT supported.

7. Cannot attach one EBS volume to multiple EC2. Use EFS for this.

8. RAID0, RAID1 and RAID10 (combination of both) are preferred. RAID5 is discouraged.

9. EBS Volume Types
  • General Purpose (SSD) (gp2) volumes can burst to 3000 IOPS, and deliver a consistent baseline of 3 IOPS/GiB. 
  • Provisioned IOPs (SSD) (io1) volumes can deliver up to 64000 IOPS, and are best for EBS-optimized instances. 
  • Max Throughput Optimized HDD (ST1) – For frequent accessed data
  • Max Cold HDD (SC1) – For IA (in-frequent accessed data)
  • Magnetic volumes, previously called standard volumes, deliver 100 IOPS on average, and can burst to hundreds of IOPS. Lowest cost
EFS

1. Elastic File System somewhat like EBS. 

2. EFS can be mounted on several EC2 instances and on-premise servers at the same time unlike EBS.

3. EFS currently only works with Linux, not with Windows.

4. EBS has fixed amount of storage while EFS can be scaled whenever required.

5. Coming soon, the Amazon EFS Infrequent Access storage class.

6. EBS and EFS cannot be used as an origin for CDN unlike S3.

7. EBS and EFS are faster than S3 as these are directly mounted on EC2.

Storage Gateway

1. Integrates on-premise datacenter storage with cloud storage.

2. It connects to AWS storage services, such as Amazon S3, Amazon Glacier, and Amazon EBS, providing storage for files, volumes, and virtual tapes in AWS.

3. Storage Gateway is downloaded and installed at on-premise.

4. Caching and monitoring of data using Storage Gateway.

5. File Gateway: Simple file storage using NFS (Network File System) protocol.

6. Volume Gateway: Hard disk / block storage, cached mode (frequent access data is in cache of Volume Gateway and entire data is in cloud) and storage mode (entire data is in data center and asynchronously backed up to cloud).

7. Tape Gateway

No comments:

Post a Comment

About the Author

I have more than 10 years of experience in IT industry. Linkedin Profile

I am currently messing up with neural networks in deep learning. I am learning Python, TensorFlow and Keras.

Author: I am an author of a book on deep learning.

Quiz: I run an online quiz on machine learning and deep learning.