Simpler, Faster, Stronger: MySQL Optimization Tips - page 2
Faster, Better, Stronger!
SQL query tuning in this case involves looking at the query plan, i.e. how the database will find your data. If there is sorting, can we remove the sorting? Can we join two tables in a different order to make it faster? Can we rewrite IN statements as EXISTS, as they often perform better? These types of rewrites are specific to your database platform and version, so if you've just upgraded your database, your queries can perform differently (faster or slower).
Rethink the problem
This is probably the most difficult to envision. It involves thinking outside the proverbial box. I had a client some number of years ago whose application synced data from the field via PDAs. Since their infancy as a company they had always done fairly instant uploads of data. Users were immediately be able to report and summarize their results. However, as their business grew, the servers required to provide this type of premium service grew astronomically.
My suggestion was to provide a gold, silver and bronze level of service where users would get instant, hourly, and daily uploads of data respectively, and would pay a premium if they needed instant access. In the end most users were fine with silver or bronze level of data uploads, and the ones who needed gold were willing to pay more, thus funding the higher level of servers required to fill that need. This arrangement also lent itself nicely to partitioning users and allocating hardware as well.
Hardware is usually the first place people look to get more bang for their buck. If I add a bigger engine to my car, or more farmers to a farm, I go faster and get more work done. This intuitive notion is taken to computing. Unfortunately, it doesn't always hold.
In the parlance of computer scientists, hardware additions will get you typically only one order of magnitude speedup, that is 2x 4x 8x but typically less that 10x. That's not to say it's not worth considering. However, we need to balance the cost versus the gain.
When reviewing your hardware, the first stop should be the I/O subsystem. Are you using RAID 5? If so, switch it out for RAID 10. How many disks are in your RAID array? Only four or six? Can you get a RAID array that can handle 10 or more disks? Such an array will be much faster because you have more spindles, more tiny little heads on more platters working together to read data for you faster. Also, verify that you don't have any network bottlenecks associated with your disk subsystem. These can hamper the throughput of your array because data has to be moved from the array to main memory and back and forth.
Also, consider how much main memory you have. More main memory can mean larger buffer caches, thus enabling your database to keep more index and data blocks in memory longer, or do larger sorts in main memory. Not only will you be able to allocate the database kernel and caches more memory, but you'll be able to give more memory to your sessions, or allow more overall sessions coming from your webservers, and thus handle more webservers overall.
What about CPUs? Typically, a 2 CPU SMP box is better than a single CPU box, and 4 CPUs are better than 2 and so on. That's because of Unix's multi-process architecture, so each CPU can handle a different process. Moreover, this all works great for multi-process architecture applications. Except for one tiny little problem. MySQL is a single process multi-threaded architecture. So adding more CPUs will help but only to a point. The MySQL development team is looking at these concurrency issues. The architecture will evolve to scale better on large SMP boxes. For the time being, however the recommended method is to run multiple MySQL instances on different ports on the same box. This can give you better concurrency; however, those separate instances cannot mount the same database (except using NDB Storage engine). Therefore, it may complicate the application. We'll get to that soon enough though.
Your application uses SQL to talk to the database, but also to frame the problem to be solved. This is the often-overlooked area to tune, often providing the largest bang for your buck. You may also look over your hardware, from disk I/O subsystem, CPUs and main memory to get some big improvements there as well.
In part two of this article, we'll talk about scaling by adding additional servers. When and where is it appropriate and how it works.
» See All Articles by Columnist Sean Hull
Article courtesy of Database Journal