My presentation from PGCon 2012, PostgreSQL on AWS with Reduced Tears, is now up.
There are 8 comments.
I would add that EC2 instances are the luck of the draw. I observed that CPU performance varies by a factor of 2 across m1.large instances, and I/O by a factor of 5. Usually, performance correlates with CPU speed (your instance is running on newer-generation hardware with faster drives and more cache, and a newer version of Xen), so a simple way to get better performance is to start a bunch of instances, keep only those with higher CPU speeds, drop the runts, and repeat the process.
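The start-measure-cull loop described in this comment can be sketched roughly as follows. This is only an illustration, not the commenter's actual tooling: `launch`, `benchmark`, and `terminate` are hypothetical callables you would supply yourself (for example, thin wrappers around whatever EC2 client and CPU benchmark you use), and the batch size and keep fraction are arbitrary assumptions.

```python
import math
from typing import Callable, Dict, List, Tuple


def split_by_score(scores: Dict[str, float],
                   keep_fraction: float = 0.5) -> Tuple[List[str], List[str]]:
    """Rank instances by benchmark score (higher is better) and split
    them into (keep, drop). keep_fraction is an assumed tuning knob."""
    if not 0 < keep_fraction <= 1:
        raise ValueError("keep_fraction must be in (0, 1]")
    ranked = sorted(scores, key=lambda k: scores[k], reverse=True)
    n_keep = math.ceil(len(ranked) * keep_fraction)
    return ranked[:n_keep], ranked[n_keep:]


def cull_until(target: int,
               launch: Callable[[int], List[str]],
               benchmark: Callable[[str], float],
               terminate: Callable[[List[str]], None],
               batch_size: int = 4,
               keep_fraction: float = 0.5,
               max_rounds: int = 10) -> List[str]:
    """Launch batches, keep the faster instances in each batch, and
    terminate the runts, until `target` instances are kept or
    max_rounds is reached. All three callables are hypothetical."""
    kept: List[str] = []
    for _ in range(max_rounds):
        if len(kept) >= target:
            break
        ids = launch(batch_size)
        scores = {i: benchmark(i) for i in ids}
        keep, drop = split_by_score(scores, keep_fraction)
        if drop:
            terminate(drop)
        kept.extend(keep)
    # Release any surplus so nothing is left running unaccounted for.
    extra = kept[target:]
    if extra:
        terminate(extra)
    return kept[:target]
```

Note that this ranks instances only relative to their own batch; a fixed score threshold would be an equally reasonable choice. As the replies below point out, a score measured once says nothing about how the same instance will perform later.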
My guess (and AWS Kremlinology is a fun pastime!) is that what you are seeing is the utilization of the particular machines by other customers. Although Amazon doesn’t say so, it wouldn’t surprise me if you get better performance out of lightly used physical machines… right up until they move someone onto them.
I’ve seen that instance storage speed is much more consistent than EBS. If you RAID it, it’s pretty fast (and fast enough for me without RAID).
Even if the instance storage can go away at any time, that’s why you have replication, right?
Well, everything on Amazon is fast until it’s not.
(And, as a side note, RAIDing instance storage wouldn’t do anything useful, since they are just logical RAID stripes on the same physical volume.)
We’ve found a company called Zadara Storage that looks like it’ll solve the problems with EBS, at least up to 80MB/s of disk throughput. We’re testing them out now but they seem promising.
Interesting: iSCSI to their cloud-based servers. My first thoughts are: (a) They’re still dependent on the shared network connectivity, so that’s going to be a limitation; (b) the latency is going to depend on the network topology to their servers, and might be quite high. (iSCSI is problematic for high transaction rates even when the two machines are dedicated and next to each other in the data center.)
Are you sure that doing raid0 on instance storage won’t buy you anything?
I’ve seen a few benchmarks that suggest otherwise. http://bioteam.net/2010/07/boot-ephemeral-ebs-storage-performance-on-amazon-cc1-4xlarge-instance-types/ is one.
> Are you sure that doing raid0 on instance storage won’t buy you anything?
As a general rule, yes; at any particular time, it might. Any set of benchmarks on AWS that was done on a single instance, in a narrow timeframe, is meaningless. Amazon does not guarantee that your two ephemeral volumes will even be on different physical disks.
And even that benchmark shows striped RAID on EBS outperforming ephemeral storage.
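To make the point about single-instance benchmarks concrete, here is a minimal sketch of how throughput samples taken on several instances at several times might be summarized. The function name, the input shape (instance ID mapped to a list of measurements), and the choice of statistics are all assumptions made for illustration; nothing here comes from the benchmark linked above.

```python
import statistics
from typing import Dict, List


def summarize_throughput(samples: Dict[str, List[float]]) -> Dict[str, float]:
    """Summarize throughput samples keyed by a (hypothetical) instance ID.

    Each list should hold measurements taken at different times on the
    same instance; units are whatever the benchmark reports.
    """
    if len(samples) < 2:
        raise ValueError("need samples from at least two instances")
    # Mean per instance, then compare instances against each other.
    means = {iid: statistics.mean(vals) for iid, vals in samples.items()}
    all_vals = [v for vals in samples.values() for v in vals]
    overall = statistics.mean(all_vals)
    return {
        "instances": float(len(samples)),
        "overall_mean": overall,
        # Ratio of the fastest instance's mean to the slowest's.
        "best_to_worst": max(means.values()) / min(means.values()),
        # Relative spread across every sample, all instances pooled.
        "coeff_of_variation": statistics.pstdev(all_vals) / overall,
    }
```

A large `best_to_worst` ratio or coefficient of variation would suggest that a number measured on one instance in one afternoon should not be generalized.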
the build is christophe pettus' software development blog. it has an rss feed. christophe does application and database consulting through PostgreSQL Experts.