Blog
Posts tagged with "benchmarking".
Don't touch that dial
Sid
02 Dec 2008 09:03
Last week, Ed Bateman and I went to a Make IT Scale seminar hosted by Sun and acardia in London. Most of the speakers were pretty good, but in particular Mike Griffiths from Proven Scaling gave an excellent whistle-stop tour around scaling MySQL.
At the moment, it’s a subject close to my heart as we’re designing a data mart that will back a Sales Decision Support System and we want our system to be future proof. As well as the tips he gave along the way, the main point that stuck with me is the amount of planning required to do the job properly. I’m already close to the time I originally allocated to the exercise and I reckon I’m around halfway through – and that’s after having done similar exercises before.
The one thing I didn’t remember him mentioning the need to resist the temptation to fix things using new features you’ve read about – i.e. before benchmarking and even knowing whether there is even a problem! It’s a real temptation, and psychologically very difficult to avoid. You read the new features, the use case in the documentation sounds just like yours and so it’s already a solution to a problem you might not have.
After the words of warning, it’s perhaps silly to recommend a book crammed with just those type of examples, but High Performance MySQL is by far the best book on MySQL that I’ve read. It takes you through the architecture of the product in fantastic detail and I reckon is a compulsory buy for people designing applications against the database. I’m reading it and enjoying it, and at the same time trying hard not to fiddle with the design and configuration until I know there’s a problem. I’ll let you know how I get on!
Performance tuning - configuring MySQL
Sid
16 Dec 2008 16:41
We’ve been benchmarking and tuning the MySQL database for a decision support data mart product we’re currently developing and have had some interesting results.
The initial configuration that ships with the MySQL server isn’t necessarily suitable for every requirement(!) and so we’ve been experimenting with our lab server to try to find the optimal set up. (We’re using INNODB tables only so if you’re using MyISAM tables or a mix then there are different configurations that should be tested.)
We’re expecting data volumes to be fairly large for the data mart (i.e. millions of rows) and want to ensure that the it performs well. Our lab server is a lower specification that we’d recommend for the product but we have got some good results by tuning the buffer pool size and the query cache.
MySQL uses the buffer pool to cache both index and row data. In a denormalised data mart this is likely to give the biggest improvement in query times (as there usually is a lot of data in both!), and our lab database was no exception.
We managed to reduce the times of our benchmark queries by an order of magnitude when we set the size of the pool to 25% of avaiable memory. This is conservative given the usual recommendations to set to 80% but our lab machine also runs a web server. The buffer pool is configured through the innodb_buffer_pool_size parameter in the my.ini / my.cnf file.
We also saw a large improvement (several orders of magnitude) by enabling the query cache. The query cache stores queries and their results. It’s important to realise that it will only use a query from the cache if it’s exactly the same. Given that we are using named queries and will be querying data for the same month a lot of the time, it made sense to enable this. The query cache is enabled by setting the query_cache_size to a non-zero value.
We found it made the first run of the query slightly longer (as it puts the query in to the cache) but then subsequent runs very fast (e.g. 1800ms to 78ms). The trick is/will be to size the query cache to minimise the number of queries that are evicted (find this from the Qcache_lowmem_prunes variable).
We also use an application cache on the web server for our reference data so it won’t make sense to cache these queries as they will be run only once each server restart. Exclude queries from the cache by adding the SQL_NO_CACHE clause to the SELECT statement (e.g. SELECT SQL_NO_CACHE name, FROM person) but you will need to make sure that the query_cache_type system variable is set correctly.
As mentioned in the previous post the High Performance MySQL book has been very useful, as has the detailed documentation
on the MySQL site.

