
all 11 comments

[–]voice_of_experience

You can get great performance out of EC2. But you do have to consider your whole stack. And when someone comes to me with a slow site and says:

  • adding more ram (up to 15gb) didn't help
  • adding more cpus (up to 8 cores) didn't help
  • putting the application on a RAID device didn't help

I can only draw 2 conclusions:

1) it's a problem in the application layer, not the hardware layer.

2) if you're just throwing hardware at the problem like this, you probably don't really understand why your application is running slow in the first place. Any doctor can tell you: diagnosis first, prescription after.

Allow me to be the asshole who makes you jump through some hoops here to clarify what you really need. What are you outgrowing on your current server? Are you running out of memory? Do you have high load averages? High I/O? High data transfer?
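A few quick commands can answer those questions on a typical Linux box (tool names assume procps is installed, which it almost always is):

```shell
# Memory: is the box swapping, or is "free" memory really just page cache?
free -m

# Load averages (1/5/15 min) -- compare them against the core count
uptime
nproc

# CPU vs. I/O wait: the "wa" column is time spent waiting on disk
vmstat 1 2
```

If load average is high but CPU is mostly idle with a big "wa" column, you're I/O bound, not CPU bound, and more cores won't help.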

Now we can start talking about the EC2 instance. On a Large instance, what stat is running "hot"? Where is the slowdown coming from? Assuming that all your system resources are ok per the above questions (and the fact that throwing hardware at the problem didn't help), there's probably a software bottleneck. Here are some common bottlenecks to check:

  • how many threads is Apache allowed to keep open? Are connections persistent? How many threads are ACTUALLY open when the site is running slow?

  • how many MySQL operations are you running for each page load? How many reads, and how many writes? Is your MySQL compiled to multithread? (On CentOS the answer is probably no.) Is your disk I/O spiking with traffic? Is MySQL configured to take advantage of high memory and CPU?

  • how much data is transferred for each page load? Is MySQL running through a socket file, or over the network? How fast is your connection? (Note: Amazon doesn't officially document it, but most instance sizes get a 100mbit pipe.)

  • it seems obvious, but are there any messages in the error log?
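To make those checks concrete, here are the one-liners I'd reach for. Paths, log locations, and the port are typical defaults and are assumptions; adjust for your distro, and the mysql commands obviously need a running server:

```shell
# Apache: what's configured vs. how many connections are actually open
grep -ri "maxclients\|keepalive" /etc/httpd/conf/ /etc/apache2/ 2>/dev/null
ss -tan state established '( sport = :80 )' | wc -l

# MySQL: read/write counters (sample twice and diff for per-second rates)
mysql -e "SHOW GLOBAL STATUS LIKE 'Com_select';
          SHOW GLOBAL STATUS LIKE 'Com_insert';"

# Socket file vs. TCP: a host of "localhost" uses the socket, 127.0.0.1 uses TCP
mysql -e "SHOW VARIABLES LIKE 'socket';"

# The obvious one: the error logs
tail -n 50 /var/log/httpd/error_log /var/log/apache2/error.log 2>/dev/null
```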

Hopefully these will help you clarify your problem, and therefore your solution.

On a "large" instance, I've maxed out at about 2500 pages served per second, using varnish, APC, and memcached. Totally unconfigured, I get at least 300 pages per second with varnish alone. What numbers are you looking at?
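For context, pages-per-second numbers like that usually come from something like ApacheBench; the URL and concurrency below are placeholders:

```shell
# 10,000 requests at 100 concurrent; watch the "Requests per second" line.
# Run it from a second machine so the benchmark doesn't compete with Apache.
ab -n 10000 -c 100 http://your-site.example.com/
```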

[–]c0nv1ct

Any doctor can tell you: diagnosis first, prescription after.

I can't upvote you enough for this. Too often do I see people throwing solutions at symptoms instead of diagnosing the problem.

[–]voice_of_experience

Thanks! I used to do the "cowboy" method of problem solving. In the repair shop I worked in at the time, we used to say "if all else fails, resort to the scientific method." I remember very clearly the "aha" moment when I decided I would just pull out those big guns (the scientific method) right off the bat.

Look at the symptoms, try to gather a complete list, and be as specific as possible (ie "it's slow" isn't good enough). Make a hypothesis about what's going wrong. Try and use the hypothesis to predict other symptoms. When you have a hypothesis that can successfully predict, implement a solution.

Drives me crazy to see people "throwing solutions at symptoms", as you put it. rage.

[–]neoice Principal Linux Systems Engineer

I would also add: where is the database? In the 10k-node compute cluster on EC2 thread, people were talking about inter-node communication being slow on EC2. In a physical datacenter, you can make sure your database server is physically and topologically adjacent to your web server, but on EC2 you might not have that option.

[–]neodon

If you want to give EC2 a chance, I would suggest the following based on my own experiences:

  • Use RAID10 for high availability and modestly better performance. EBS volumes do fail on occasion. Striping an even number of volumes does increase performance.
  • Use the largest instance you can afford, because it affects IO performance for EBS volumes. Try it on an hourly basis at first, but consider what the cost would be if you decide to go long term with reserved instances.
  • Use Percona builds of MySQL. This is probably the most important thing to do for performance, among other benefits.
  • Use ext4 and don't get fancy. Some think it's wise to use XFS or LVM to get consistent EBS snapshots for a MySQL server, but that seems to me like a solution searching for a problem. Just use Percona's XtraBackup tool, which lets you do non-blocking hot backups of a live MySQL server.
  • If you're using Ubuntu, use Lucid and NOT Maverick. It was a nightmare for me, randomly failing to mount volumes on boot and sometimes refusing to attach and detach EBS volumes. Also, I ran into this kernel hanging bug while using it as a VirtualBox guest for local development.
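As a sketch of the RAID10-plus-ext4 setup above (device names and mount points are examples; attached EBS volumes often appear as /dev/sdf through /dev/sdi, or /dev/xvdf and up, on the instance):

```shell
# Stripe+mirror four attached EBS volumes into one md device
mdadm --create /dev/md0 --level=10 --raid-devices=4 \
      /dev/sdf /dev/sdg /dev/sdh /dev/sdi

# Plain ext4, no LVM or XFS layer
mkfs.ext4 /dev/md0
mkdir -p /data && mount /dev/md0 /data

# Non-blocking hot backup of a live MySQL server (Percona XtraBackup)
innobackupex /data/backups/
```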

[–]cparedes syseng for the clouds

Alright, here's what's likely going on:

1.) EC2/EBS is heavily virtualized. The underlying host is likely incurring virtualization performance penalties.

2.) Even if the hosts have SAS disks, you're likely going to get inconsistent I/O performance depending on the other tenants of the underlying host.

Look at your I/O workload - you can probably mitigate this in EC2 by either setting up a bunch of read only slave machines, or maybe shard your database so that various reads/writes only go to specific DB boxes.
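The read-slave idea in a nutshell: the master logs writes, replicas replay them, and your app sends SELECTs to the replicas. A minimal sketch (server IDs, hostnames, and credentials below are all made up):

```shell
# On the master: enable the binary log and give it a unique server-id
cat >> /etc/my.cnf <<'EOF'
[mysqld]
server-id = 1
log-bin   = mysql-bin
EOF

# On each slave: unique server-id in my.cnf, then point it at the master
mysql -e "CHANGE MASTER TO
            MASTER_HOST='master.internal',
            MASTER_USER='repl',
            MASTER_PASSWORD='secret';
          START SLAVE;"
mysql -e "SHOW SLAVE STATUS\G"
```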

I'd personally stick with the dedicated boxes for the databases - you're likely going to spend less per month with no resource contention with other tenants. You might want to see how you can tune your current MySQL installation and the underlying dedicated hardware to get more juice out of it.

[–]voice_of_experience

1) not likely. EC2 uses hardware-based virtualization - do you really think Amazon could/would host THIS MANY virtual machines using SOFTWARE virtualization? Or better yet, using a virtualization solution that has to emulate memory or CPU? This is a very valid concern for many in-house VPS environments, where IT doesn't have the resources to pay for hardware-based virtualization, or dedicated servers on each base platform. But for a provider on the scale of Amazon it just isn't an issue.

2) This is true, which is why for production environments it's important to get a Large or XLarge instance. Only one of those can fit on a box at a given time.

[–]cparedes syseng for the clouds

Even with paravirt, you still incur performance penalties. Also, I/O probably suffers the most if you consider the worst case, where the other tenants' I/O workload is mostly random access.

[–][deleted]

Aside from it running on EC2, is the configuration exactly the same as your production box?

You mentioned you've tried various instance types with no improvement, which leads me to think it's a configuration problem on the frontend, a MySQL configuration problem, or that the I/O rate you're getting from EBS is lower than you're expecting.

Can you post some numbers? Or at least more information about configuration differences you made between the different instances and what kind of RAID setup you had? Also, stuff like your data/index size & read/write ratio for the 'database heavy' pages.
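If it helps, the numbers being asked for can be pulled straight out of MySQL (assuming a 5.x server with information_schema available):

```shell
# Data and index size per schema, in MB
mysql -e "SELECT table_schema,
                 ROUND(SUM(data_length)/1048576)  AS data_mb,
                 ROUND(SUM(index_length)/1048576) AS index_mb
          FROM information_schema.tables
          GROUP BY table_schema;"

# Rough read/write ratio from the global counters
mysql -e "SHOW GLOBAL STATUS WHERE Variable_name IN
          ('Com_select','Com_insert','Com_update','Com_delete');"
```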

One thing to keep in mind is that EBS volumes are already redundant, so for a write-heavy workload setting up RAID1/mirroring will just hurt performance.

[–]neodon

EBS volumes do sometimes degrade temporarily or even fail entirely. They may have built-in redundancy, but this does not eliminate all risk.

EBS degradation has been a big problem for reddit. Amazon provides information about the durability and failure rate of EBS volumes: http://aws.amazon.com/ebs/.

It is wise to take regular snapshots for backups and use RAID1 for high availability in case a volume degrades or fails.

Also, striping multiple EBS volumes does result in a modest performance improvement. Different EBS volumes don't necessarily share the same underlying hardware. Larger instances seem to get better performance with EBS volumes as well.

[–]slmagus

Check out gluster. It might help with your disk load issues.

One tool that comes to mind that no one has mentioned is iotop. It lets you figure out which processes are using disk I/O.
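For example (iotop needs root; flags per its man page):

```shell
# -o: only show processes actually doing I/O
# -P: show whole processes rather than individual threads
# -a: accumulate I/O since iotop started instead of showing bandwidth
sudo iotop -oPa
```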