Amazon offers 3 storage options. Choose the one that fits your requirements.
1. Amazon EBS – Block storage
There are 5 different types of EBS volumes. When creating an EBS volume, select the volume type based on your performance and durability needs. Provisioned IOPS volumes are expensive, so use them only on critical systems that need high performance, e.g. databases.
In our self-managed Hadoop cluster, we use gp2 volumes only to store Hadoop cluster logs. For data storage, we use object storage. These are the standards we follow internally to keep EBS costs low:
- Create EBS volumes with only a small buffer over the required size (at most 20%).
- Keep application logs on a dedicated disk.
- Delete unattached EBS volumes regularly.
- Enable the “Delete on Termination” option on non-critical instances.
- Delete obsolete snapshots.
- Use LVM, so that storage can be extended later instead of being over-provisioned up front.
- Terminate instances that are not in use. Stopping them only removes the instance charges; the attached EBS volumes continue to be billed.
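Two of the rules above (the 20% buffer cap and the cleanup of unattached volumes) are easy to automate. The following is a minimal sketch, not our production tooling: the function names and the sample data are hypothetical, and the volume dictionaries are assumed to follow the shape of the `Volumes` list returned by boto3's `describe_volumes`, where an unattached volume has `State` equal to `available`.

```python
def provisioned_size_gib(required_gib: int, buffer_pct: int = 20) -> int:
    """Return the size to request: the required size plus a buffer, rounded up."""
    if required_gib <= 0:
        raise ValueError("required_gib must be positive")
    if not 0 <= buffer_pct <= 20:
        raise ValueError("buffer_pct must be between 0 and 20")
    # Integer ceiling division avoids floating-point rounding surprises.
    return (required_gib * (100 + buffer_pct) + 99) // 100


def unattached_volume_ids(volumes: list[dict]) -> list[str]:
    """Return the IDs of volumes that are not attached to any instance."""
    return [v["VolumeId"] for v in volumes if v.get("State") == "available"]


# Hypothetical sample data in the assumed describe_volumes shape.
SAMPLE_VOLUMES = [
    {"VolumeId": "vol-aaa", "State": "in-use", "Size": 100},
    {"VolumeId": "vol-bbb", "State": "available", "Size": 50},
    {"VolumeId": "vol-ccc", "State": "available", "Size": 200},
]

print(provisioned_size_gib(100))              # 120
print(unattached_volume_ids(SAMPLE_VOLUMES))  # ['vol-bbb', 'vol-ccc']
```

In a real account the list would come from the EC2 API rather than a constant, and the returned IDs should be reviewed before anything is deleted.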
2. Amazon S3 – Object storage
There are 6 storage classes designed to accommodate different access requirements. This guide by Amazon will help you choose the right class for your use case.
- Set up the right lifecycle policy for data archival or deletion.
- Enable a VPC endpoint for S3 to reduce data transfer costs.
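As an illustration of an archival and deletion policy, the sketch below builds a lifecycle configuration as a plain Python dictionary. The function name, the `logs/` prefix, and the day counts are assumptions chosen for the example; the dictionary is meant to match the structure accepted by boto3's `put_bucket_lifecycle_configuration` in its `LifecycleConfiguration` argument.

```python
def build_lifecycle_configuration(prefix: str,
                                  archive_after_days: int,
                                  delete_after_days: int) -> dict:
    """Build a one-rule lifecycle configuration: archive first, delete later."""
    if archive_after_days <= 0:
        raise ValueError("archive_after_days must be positive")
    if delete_after_days <= archive_after_days:
        raise ValueError("objects must be archived before they are deleted")
    rule = {
        "ID": f"archive-then-delete-{prefix.strip('/') or 'all'}",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        # Move objects to an archival storage class after the first period.
        "Transitions": [{"Days": archive_after_days, "StorageClass": "GLACIER"}],
        # Delete them once the retention period is over.
        "Expiration": {"Days": delete_after_days},
    }
    return {"Rules": [rule]}


# Hypothetical values: archive logs after 30 days, delete after a year.
config = build_lifecycle_configuration("logs/", 30, 365)
print(config["Rules"][0]["ID"])  # archive-then-delete-logs
```

The resulting dictionary could then be applied to a bucket through the S3 API. Which storage class to transition to depends on how quickly archived data must be retrievable.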
3. Amazon EFS – file storage
- There are multiple storage classes designed to accommodate frequent or infrequent data access requirements.
- Set up a lifecycle policy that moves data from the standard storage class to the infrequent access class; this can reduce costs significantly.
- Use provisioned throughput only when absolutely required.
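The move to infrequent access storage is configured as a lifecycle policy on the file system. The sketch below builds the policy list in the shape that boto3's EFS `put_lifecycle_configuration` is assumed to accept in its `LifecyclePolicies` argument. The function name is hypothetical, and the set of supported periods shown here may not be exhaustive.

```python
# Transition periods assumed to be supported; newer values may exist.
SUPPORTED_IA_DAYS = (1, 7, 14, 30, 60, 90)


def transition_to_ia_policy(days: int) -> list[dict]:
    """Build a policy that moves files not accessed for `days` days to infrequent access."""
    if days not in SUPPORTED_IA_DAYS:
        raise ValueError(f"days must be one of {SUPPORTED_IA_DAYS}")
    unit = "DAY" if days == 1 else "DAYS"
    return [{"TransitionToIA": f"AFTER_{days}_{unit}"}]


print(transition_to_ia_policy(30))  # [{'TransitionToIA': 'AFTER_30_DAYS'}]
```

A shorter period moves data to the cheaper class sooner, but files that are read again shortly after the transition incur access charges, so the period should reflect the real access pattern.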