10:22
Amazon’s New I/O Offerings
1 August 2012
Amazon has introduced a couple of new I/O-related offerings in AWS, both aimed at addressing the notoriously poor I/O performance of EBS.
The first is the EC2 High I/O Quadruple Extra Large Instance. This is a standard Quad XL instance with two 1TB SSD-backed volumes directly attached to the instance. Although Amazon does not quote I/O performance on this configuration, it should be quite speedy… under good conditions.
Before you race to deploy your database on this configuration, however, remember:
- You are sharing the physical hardware with other users. You don’t get the SSDs all to yourself. Your performance will depend heavily on the other tenants on the hardware.
- This is ephemeral storage. It does not persist if the instance is shut down, and it can disappear if Amazon reprovisions the hardware. You must set this up with (monitored) streaming replication if you are running PostgreSQL, as you have no strong guarantee as to the integrity of the storage.
- Of course, you pay for it. A High I/O instance is about 72% more than a standard Quad XL instance, based on on-demand pricing.
The next product offering is Provisioned IOPS on EBS. This allows you to guarantee a certain number of I/O operations per second, up to 1,000 IOP/s. This should go a long way towards reducing the uncertainty around EBS, but it also comes with some caveats:
- 1,000 IOP/s is based on 16KB blocks, and decreases as the block size increases. This means that 1,000 IOP/s is about 16MB/s. That’s about 1/5th the speed of a 7200 RPM SATA drive. This is not, shall we say, super-impressive I/O performance. (You can increase this by striping the EBS volumes, at the cost of losing snapshotting.)
- This costs more, of course. An “EBS-optimized” Quad XL instance is an extra $0.05 per hour. Of course, you pay for the I/O, too.
- The storage is also 25% more than a standard EBS volume.
- This offers no latency guarantees. (For a 1,000 IOP/s provisioning, the IOP/s guarantee only applies if your I/O queue length is 5 requests or more; that is to say, saturated.)
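To make the throughput arithmetic above concrete, here is a rough back-of-the-envelope sketch. The function name and the `volumes` parameter are mine, not Amazon's; it assumes decimal units (1 MB = 1,000 KB) and that striping scales linearly across volumes, which is a best case rather than anything Amazon promises.

```python
# Back-of-the-envelope throughput for Provisioned IOPS volumes.
# Assumptions (illustrative, not from Amazon): decimal units
# (1 MB = 1,000 KB) and perfectly linear scaling when striping.

def throughput_mb_per_s(iops, block_kb=16, volumes=1):
    """Best-case throughput in MB/s across `volumes` striped volumes."""
    return iops * block_kb * volumes / 1000

single = throughput_mb_per_s(1000)
striped = throughput_mb_per_s(1000, volumes=4)

print(f"single volume: {single:.1f} MB/s")
print(f"4-way stripe:  {striped:.1f} MB/s")
```

These are ceilings, reached only when the I/O queue stays saturated; real numbers will be lower for anything that does not keep the queue full.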
So, these products are far from useless, but they are incremental, not revolutionary.
There are 4 comments.
Marti Raudsepp at 14:30, 1 August 2012:
> That’s about 1/5th the speed of a 7200 RPM SATA drive
That’s somewhat misleading. It’s correct when measuring sequential throughput only (depending on density and offset of the drive). But in a random access workload, 7200RPM can only deliver ~120 IOPS, so that’s 1/8 of what Amazon offers.
Xof at 14:35, 1 August 2012:
Well, I’d argue that it is still a fair comparison. Amazon only guarantees the 1,000 IOP/s if you keep the channel to the EBS server saturated. That’s much less likely with random I/O than sequential I/O, and Amazon does not publish latency numbers, nor do they guarantee any particular latency characteristics.
And, of course, if you got 1,000 IOP/s out of a modern SSD, you’d send it back.
Alex at 19:32, 1 August 2012:
The High I/O instance disks are not that ephemeral, since they survive a reboot of the instance, and Amazon gives you some time if they force you to shut down your server. I’m not saying it’s the best solution in the world, but it’s a little better than the ephemeral storage on the other instances…
Xof at 19:35, 1 August 2012:
Well, “ephemeral” is Amazon’s term, not mine; they are *exactly* like the ephemeral storage on all instances. They survive reboots, but they do not survive beyond the particular instance, *and* they are not guaranteed to persist if Amazon migrates your instance to a new physical machine, which can happen all the time. You simply must assume they could vanish at any moment.