We recently noticed some odd behavior with one of our servers during a deployment. Several machines were removed from the traffic pool to receive updates, and during this time, the other machines were shouldering the extra load. We have significant extra capacity, and all the servers were responding fine, except one. This one server was seeing increased latency, despite running on the same hardware as all the others and receiving the same level of traffic.
As we started to investigate why it was performing differently, the one thing that emerged was that it had hyper-threading disabled.
Since hyper-threading was first announced almost 10 years ago, I have been very skeptical of it. The idea that tricking the operating system into thinking it has twice as many CPUs will lead to a performance boost just feels…wrong. You don’t have two CPUs, so don’t fool the users into thinking they have twice as much horsepower as is really available.
Tests of hyper-threading performance were equivocal, to say the least. They always noted that only certain kinds of applications would benefit, and that others might see a decrease. To top it off, the success stories talked about performance boosts in the 20% range. That is good, but tricking the OS into thinking there are twice as many CPUs seems to promise a much bigger boost.
What is also bothersome about this type of data is that it is not clear how relevant it is to a web application. When measuring how fast an application can perform CPU-intensive tasks, the numbers are meaningful. However, on a web server where each request is assigned its own thread, individual requests do not execute any faster. At best, hyper-threading should increase the server’s capacity, letting it process more concurrent requests without degradation. But that is not what the benchmark data was trying to measure.
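The capacity-versus-latency distinction can be sketched with a toy queueing model. This is purely illustrative — the function names, service times, and arrival rates below are made-up numbers, not measurements from our servers:

```python
def steady_state_latency(arrival_rate, workers, service_time=0.05):
    """Rough latency estimate for a thread-per-request server.

    Uses a utilization rule of thumb: well below capacity, latency is
    close to the raw service time; as load approaches capacity, waiting
    time blows up; at or beyond capacity, the queue grows without bound.
    """
    capacity = workers / service_time          # max sustainable requests/sec
    if arrival_rate >= capacity:
        return float("inf")                    # overloaded: unbounded queue
    utilization = arrival_rate / capacity
    # Simple approximation: delay grows as utilization approaches 1.
    return service_time / (1 - utilization)

# Doubling the number of schedulable CPUs (what hyper-threading exposes)
# doubles the overload threshold, but barely changes the latency of a
# single request at low load.
low_load_8 = steady_state_latency(arrival_rate=10, workers=8)
low_load_16 = steady_state_latency(arrival_rate=10, workers=16)
```

In this model, 200 requests/sec overwhelms 8 workers but is handled fine by 16 — capacity doubles — while the low-load latencies are nearly identical, which matches the point above: requests don't get faster, the server just holds more of them.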
The official data aside, what always made me really skeptical was the strange behavior I would see on hyper-threaded systems. Looking at the CPU graphs in Task Manager, you would see that only half the CPUs were being utilized. While the OS wasn’t supposed to know there was a difference, clearly there was one.
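One way to see the distinction the scheduler is making is to compare the logical CPU count (what the OS schedules on) with the physical core count. A minimal sketch: `os.cpu_count()` is standard Python, but the `/proc/cpuinfo` parsing is Linux-specific, and the fallback behavior is my own assumption:

```python
import os

def count_logical_cpus():
    """Number of schedulable CPUs the OS reports; with hyper-threading
    enabled this is typically twice the physical core count."""
    return os.cpu_count()

def count_physical_cores(cpuinfo_path="/proc/cpuinfo"):
    """Count distinct (physical id, core id) pairs on Linux.

    Falls back to the logical count when the file is unavailable or
    lacks topology fields (e.g. non-Linux systems, some VMs).
    """
    try:
        cores = set()
        physical_id = None
        with open(cpuinfo_path) as f:
            for line in f:
                if line.startswith("physical id"):
                    physical_id = line.split(":")[1].strip()
                elif line.startswith("core id"):
                    cores.add((physical_id, line.split(":")[1].strip()))
        return len(cores) or count_logical_cpus()
    except OSError:
        return count_logical_cpus()
```

When the two numbers differ, half of the "CPUs" in the graphs are sibling threads sharing a core's execution resources, which is consistent with seeing only half of them busy under moderate load.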
We decided to run some tests. We slowly increased traffic to two servers: the one with hyper-threading disabled, and a normal one. Just as during our deployment, the server without hyper-threading showed increased latency once we doubled the load, while the other server handled the extra load without any issues.
We then re-enabled hyper-threading on our bad server and repeated the experiment. Our server started behaving just like all of its peers.
Clearly, hyper-threading can boost server capacity for a web-based application.
Nothing like data to break down stereotypes.
