Performance Zone is brought to you in partnership with:

I specialise MySQL Server performance as well as in performance of application stacks using MySQL, especially LAMP. Web sites handling millions of visitors a day dealing with terabytes of data and hundreds of servers is king of applications I love the most. Peter is a DZone MVB and is not an employee of DZone and has posted 272 posts at DZone. You can read more from them at their website. View Full User Profile

Why do we care about MySQL Performance at High Concurrency?

02.27.2013
| 2160 views |
  • submit to reddit
In many MySQL Benchmarks we can see performance compared with rather high level of concurrency. In some cases reaching 4,000 or more concurrent threads which hammer databases as quickly as possible resulting in hundreds or even thousands concurrently active queries.

The question is how common is it in production ? The typical metrics to use for concurrency defined as number of queries being actually processed is “Threads_Running” which you can easily see for your production system:

 root@smt2:/mnt/data/ mysqladmin extended -i1 | grep Threads_running
| Threads_running                               | 60          |
| Threads_running                               | 61          |
| Threads_running                               | 63          |
| Threads_running                               | 62          |
| Threads_running                               | 62          |
| Threads_running                               | 46          |

In my experience most of the systems out in the field will run with concurrency no more than low tens with normal load. Many have monitoring set up to kick in if number of threads running jumps over 50-100 and stays there for any significant amount of time.

So if people do not really run MySQL with high concurrency does it really matter how MySQL performs with high concurrency or is it just marketing gimmicks to promote new software versions ?

Performance at High Concurrency is important but for other reasons. Real world systems are different from benchmarks in terms they typically do not have fixed concurrency, instead they have to serve requests as they come in what can be close to “random arrivals” but in fact can be a lot more complicated in the practice. There is the chance both for burst of queries to come and hit database server at almost the same time (often when there is some stall happens on external system, such as memcached server) or the database server itself to experience the “micro stall” which can cause the buildup of the queries. Such build ups can happen very quickly.

Imagine for example some high volume Web system. It well might have 100 of Web servers each having Apache configured to run up to 100 apache children concurrently each of which might open a connection to MySQL server… this ends up with very high number of up to 10K connections and potentially running queries. Now imagine typically we have 30K queries/sec coming in from the Web level with some 1ms average latency which requires just around 30 queries to be ran at the same time. Imagine now database stalls just for 100ms – which you most likely will not even see with naked eye. This will results with expected 3000 queries to be backed up considering our inflow rate, which are quite likely to come from 1000 or more connections.

This is when performance at High Concurrency is the difference between life and death. Take a look at these graphs The Blue line of MySQL Community Server has just 1/4th of its Peak performance of concurrency of 1000 while Red Line of MySQL Enterprise Edition with Thread Pool plugin remains very close to the peak.

In practice this will mean one server will be able to process the backlog very quickly and recover after such stall the other server will be depressed and might be unable to serve the inflow of requests getting higher and higher under water. Some systems may never recover in this case until Web server is restarted or load removed from them other way, others will recover but taking a lot more time and with a lot more user impact – after all slower responses will cause users to submit less requests to the system reducing the load.

So yes. The performance at high concurrency matters as it helps systems in distress. But performance at low and medium concurrency matters too as this is what will define system performance during its normal operation.

P.S I think it would be interested for people to see how high concurrency people are running servers in the field. If you can run the command above for your loaded production servers (but which are not in distress) I think it would be very interesting.



Published at DZone with permission of Peter Zaitsev, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)