Bandwidth throttling with faban

I often use faban for performance-related work. Nearly always I have used it while working on APIs which are called by other servers (as opposed to humans, who linger between mouse clicks) and where bandwidth use is not a significant factor (the processing time of the request outweighs the time spent transmitting the request and response by orders of magnitude). For these requirements it has always worked well to run the faban tests with zero think time and let it issue requests as fast as the server can handle.

Recently, however, I’ve been looking into a system where the request and/or response bodies are quite large, so the bulk of the total request time is consumed by the data transmission over the network. This creates a bit of a problem because in the lab the faban machine and the server being tested (“SUT”) are wired together via gigabit ethernet so there is a decent amount of bandwidth between them. While that sounds like a good problem to have, the reality is that in production the end users are coming in over the internet and have far lower bandwidth.

Thus, the testing is not very realistic. Faban can saturate the server with just a few users uploading at gigabit speeds, even though I know the server can handle far more users when each one is uploading at much slower speeds over the internet.
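
To put rough numbers on the mismatch, here is a back-of-the-envelope calculation. It assumes the network link is the only bottleneck, and the per-user rate in the usage note is a made-up figure, not a measurement from my system. The class and method names are just for illustration.

```java
/** Illustrative arithmetic only; assumes the link is the sole bottleneck. */
final class LinkMath {
    private LinkMath() {}

    /** Number of clients, each sending at perUserMbps, needed to fill a link of linkMbps. */
    static long usersToSaturate(double linkMbps, double perUserMbps) {
        if (linkMbps <= 0 || perUserMbps <= 0) {
            throw new IllegalArgumentException("rates must be positive");
        }
        return (long) Math.ceil(linkMbps / perUserMbps);
    }
}
```

With a 1000 Mbps lab link, one unthrottled client is enough to fill it (usersToSaturate(1000, 1000) is 1), while clients limited to a hypothetical 5 Mbps each would need 200 of them to do the same.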

It turns out faban has the capability to throttle the upload and/or download bandwidth on a given socket. As far as I could find this is not documented anywhere; I found it by accident while reading the code as I was considering various solutions.

Here’s one way (there may be other ways) to use it:

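// DriverContext here is the public API class, com.sun.faban.driver.DriverContext
DriverContext ctx = DriverContext.getContext();
// Cast to the engine's class, which is where the throttling setters live
com.sun.faban.driver.engine.DriverContext engine =
    (com.sun.faban.driver.engine.DriverContext) ctx;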

// Set desired speed in K per second, or -1 to disable throttling
engine.setUploadSpeed(uploadKBps);
engine.setDownloadSpeed(downloadKBps);

As of this writing the latest faban version is 1.0.2. In this version the upload throttling works fine but downloads (i.e. reading the response body) can hang if throttling is enabled. I filed a bug with a fix that is working reliably for me. If you try this with 1.0.2 (or earlier, probably) then you’ll need to apply that change and rebuild faban.
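
For anyone curious what throttling a stream amounts to, here is a minimal generic sketch of the idea: write in small chunks and sleep whenever the bytes sent so far are ahead of what the target rate allows. To be clear, this is not faban's implementation and I am not claiming faban works this way internally; the class name ThrottledOutputStream, the chunk size and the bytes-per-second parameter are all my own inventions for illustration.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.InterruptedIOException;
import java.io.OutputStream;

/** Generic pacing sketch (hypothetical class, NOT faban code). */
final class ThrottledOutputStream extends FilterOutputStream {
    private static final int CHUNK = 1024; // arbitrary chunk size for this sketch
    private final long bytesPerSecond;
    private final long startNanos = System.nanoTime();
    private long bytesWritten;

    ThrottledOutputStream(OutputStream out, long bytesPerSecond) {
        super(out);
        if (bytesPerSecond <= 0) {
            throw new IllegalArgumentException("bytesPerSecond must be positive");
        }
        this.bytesPerSecond = bytesPerSecond;
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
        pace(1);
    }

    @Override
    public void write(byte[] buf, int off, int len) throws IOException {
        // Send small chunks so a large buffer does not go out in one burst.
        while (len > 0) {
            int n = Math.min(len, CHUNK);
            out.write(buf, off, n);
            pace(n);
            off += n;
            len -= n;
        }
    }

    /** Sleep until the elapsed time catches up with what the target rate allows. */
    private void pace(int justWritten) throws IOException {
        bytesWritten += justWritten;
        long expectedNanos = (long) (bytesWritten * 1e9 / bytesPerSecond);
        long aheadNanos = expectedNanos - (System.nanoTime() - startNanos);
        if (aheadNanos > 0) {
            try {
                Thread.sleep(aheadNanos / 1_000_000L, (int) (aheadNanos % 1_000_000L));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new InterruptedIOException("interrupted while throttling");
            }
        }
    }
}
```

This version paces against the average rate since the stream was created, so an idle period is followed by a burst; a token bucket would be stricter. A throttled read side would follow the same pattern wrapped around an InputStream.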

 

What is your cache hit rate?

While this may sound like an obvious metric to check, I often see that developers don’t verify the cache hit rate of their code under realistic conditions. The end result can be a server which performs worse than if it had no cache at all.

We all know the benefits of keeping a local cache: it is relatively cheap to keep and it saves having to make more expensive calls to obtain the data from wherever it ultimately resides. Just don’t forget that keeping that cache, while cheap, takes non-zero CPU and memory resources. The code must get more benefit from the cache than it costs to maintain; otherwise it is a net loss.

I was recently reviewing a RESTful service which kept a cache of everything it processed. The origin retrieval was relatively expensive so this seemed like a good idea. However, given the size of the objects being processed vs. the amount of RAM allocated to this service in production, my first question was: what’s the cache hit rate?

The developers didn’t know, but felt that as long as the cache saves any back-end hit at all it must help, right?

A good rule of thumb is that anything that isn’t being measured is probably misbehaving… and this turned out to be no exception.

Testing under some load (I like faban for this) showed the server was allocating and quickly garbage collecting tens of thousands of buffer objects per second for the cache. Hit rate, you ask? Zero!

Merely commenting out the cache gave a quick 10% boost in overall throughput.
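
Measuring the hit rate does not take much. Here is a minimal sketch of a small LRU cache that counts its own hits and misses. It is not the code from the service I reviewed; the class name CountingCache, its methods and the loader callback are hypothetical, and holding the lock while loading is a simplification that a real service would probably want to avoid.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Small LRU cache that counts hits and misses (hypothetical illustration). */
final class CountingCache<K, V> {
    private final Map<K, V> entries;
    private long hits;
    private long misses;

    CountingCache(final int maxEntries) {
        // accessOrder=true keeps the least recently used entry first, ready for eviction.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Return the cached value, or call the loader (the expensive origin call) on a miss. */
    synchronized V get(K key, Function<K, V> loader) {
        V value = entries.get(key);
        if (value != null) {
            hits++;
            return value;
        }
        misses++;
        value = loader.apply(key);
        entries.put(key, value);
        return value;
    }

    /** Fraction of lookups served from the cache; 0.0 before any lookup. */
    synchronized double hitRate() {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }
}
```

Logging hitRate() during a load test is enough to spot trouble. For example, with room for two entries and requests cycling through three keys, every lookup evicts the entry that will be needed next, and the hit rate stays at exactly zero no matter how long the test runs.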

So that’s my performance tip of the day: be aware of your cache hit rates!