Over the past two days I’ve been linked to by Daring Fireball and BoingBoing. I’m running WordPress on a virtual server from SliceHost with 512 megs of RAM. Because I’m an incompetent systems administrator and just run everything with the defaults, the server did not react well to the additional traffic. Here’s a list of the things I did to whip the server into shape.
The first problem was that the load average on the server was spiking and it was becoming non-responsive. Even logging into the server via SSH took minutes, and Web pages weren’t really loading at all. WordPress was not using the database efficiently and the database load was killing the server. I attacked this problem by taking advantage of caching.
I discovered that query caching was not properly enabled in MySQL. The cache was enabled, but the cache size was set to zero, so nothing was being cached. After tweaking things a bit, I wound up giving MySQL a 10 megabyte cache. (You can read about setting up the MySQL cache in this article.) Since my server often runs into RAM problems, I didn’t want to allocate too much RAM to a new feature.
I also set up WordPress to use caching as well, using the DB Cache Reloaded plugin. I like it a bit better than plugins that cache entire pages like WP Super Cache. Those plugins are probably worth it for really big sites that get millions of hits a day, but my traffic is relatively low most of the time, so my goal is just to make sure it doesn’t blow up entirely when traffic spikes. DB Cache Reloaded does the job.
That made things work, mostly. However, I also ran into problems with MySQL going away. In those cases, WordPress just generates a page saying it can’t connect to the database. I’ve seen that happen during traffic spikes before, and I’m still not sure what causes it. My guess is that it’s because of some kind of lock contention issue. WordPress uses MyISAM tables, which don’t support row level locking. I may switch them over to InnoDB over the holiday. I had to log in and restart MySQL a couple of times over the past 24 hours, but it hasn’t happened again yet today.
Once I stopped overtaxing the database, things started slowing down because Apache was spawning so many processes that it used all of the memory on the server. Basically, when Apache spawns more than 50 processes, the server starts getting low on memory, which slows things down, which causes Apache to take longer to serve requests, which causes even more processes to be spawned as incoming requests pile up until the server grinds to a stop. I looked at my Apache configuration and saw that it was allowed to spawn as many as 150 processes. Given that they consume about 25 megabytes of memory each, this did not work well with my puny server. Cranking the
MaxClients setting down to 25 did the trick here.
When I changed that setting, I also lowered the
KeepAliveTimeout setting to 5 seconds. When
KeepAlive is enabled, the server allows the browser to submit multiple requests over the same connection if it asks to do so. When a browser opens a persistent connection, it maintains its claim on the process that is serving its requests until the browser closes the connection or the timeout duration is exceeded. Because I lowered the number of processes, I lowered the timeout so that ill-behaving browsers don’t block other people from connecting if they’re not actually going to request more content.
Things are working better right now, and I’d be much happier if I knew what was causing the intermittent failures I am seeing with MySQL.
I should also probably do a better job of monitoring the server. The only diagnostic tool I used throughout the process was “top” and reloading the home page to see if the server was responsive.