In a previous blog post, I described our use of the Stax Cloud to perform load tests for ESME. Daniel Koller has just finished his analysis of the results of this initial test series and has made a few interesting discoveries. For the ESME team, these initial tests focused primarily on gaining experience with Stax as an environment for such tests (including the use of clusters) and on creating a test bed for future ESME tests. The tests used both the REST API and the Web UI.
Now you may be thinking the main focus should be on measuring ESME performance. True. However, the ESME version that we used is based on an older Scala library with a known memory bug. Thus, the test results are not really representative of the current ESME code base.
If you look at other microblogging platforms (including Twitter), regardless of whether they are focused on the enterprise or not, you’ll be hard pressed to find any published load test results that are available to potential users. However, such tests are usually mandatory for IT projects, especially in larger companies with many potential users. Thus, we would like to start publishing our results so that those interested in ESME can make better sizing decisions.
We will soon be starting a new round of load tests based on the Apache code base and a new Scala library, so expect a blog post in the future with these results.
For those interested in performing the tests themselves, Daniel has also made the test scripts available.
10 Comments until now
Great slides! I especially like the use of R.
Always wanted to use R to evaluate performance test results.
Looking forward to seeing your script.
Overall it seems to me that the degradation in performance is caused by a memory limitation.
Most likely because of the leak in Scala, after some time the only thing the system does is run the garbage collector.
This would also explain why 3 nodes perform better. I would guess that it would just take longer before 3 nodes also degrade in performance.
Do you do one login, one step and then one logoff? I think it would be more realistic to do more steps between logon and logoff.
I guess you are aware that it would be better to use different users. This will of course make the load test much more complicated …
I guess you don’t use any think time (pauses between interaction steps). Typically you want to have think time (for example 10 seconds ± 5). This will be more realistic in terms of memory usage, because to get the same load you will need more users.
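The think-time point can be illustrated with a minimal sketch. It assumes think time is drawn uniformly from 10 seconds ± 5 and uses Little's Law for a closed system (users = throughput × (response time + think time)) to estimate how many simulated users a given load requires. The function names, the 0.5 second response time, and the 20 requests per second target are hypothetical and are not taken from the ESME test scripts.

```python
import random


def think_time(rng, mean_s=10.0, spread_s=5.0):
    """Draw one think time uniformly from [mean - spread, mean + spread] seconds."""
    return rng.uniform(mean_s - spread_s, mean_s + spread_s)


def users_needed(target_requests_per_s, response_time_s, mean_think_s):
    """Little's Law for a closed loop of users: N = X * (R + Z).

    X is the target throughput, R the response time and Z the mean think time.
    """
    return target_requests_per_s * (response_time_s + mean_think_s)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so a test run is repeatable
    print([round(think_time(rng), 1) for _ in range(5)])

    # Hypothetical numbers: 20 requests/s at 0.5 s response time.
    print(users_needed(20, 0.5, 0.0))   # no think time
    print(users_needed(20, 0.5, 10.0))  # 10 s mean think time
```

With these assumed numbers, 10 users without think time generate the same request rate as 210 users with a 10 second mean think time, which is why think time changes the session and memory footprint on the server.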
For better monitoring it seems to me that the only option is to use some server-based tool like http://www.glassbox.com/glassbox/Home.html, because STAX/Amazon does not allow file access, so there are no heap dumps.
Regards,
Markus
Do you think there might be some special considerations for microblogging tests, or should we just consider them to be normal Java-based web applications?
D.
Hi,
regarding R: I can provide you with the R script… got it to work with quite limited effort…
regarding steps: we currently do login, send message and then logoff: the Java API implementation (on the client side) currently provides no methods to follow/unfollow/get messages.
regarding break times: there is think time of some seconds, with a random factor.
thx for the glassbox hint, will take a look at it,
Kind regards,
Daniel
I don’t think that microblogging apps are very special in general. IMHO you always want to model your user behaviour as well as possible. What could be special for microblogging is that there might be more peaks than in a usual web app. So you probably want to have more tests for the peaks (stress testing is always a good idea). You would still want to test the average behaviour.
You also need to have an understanding of what your users would perceive as “good performance”.
For example, the users might still think performance is good if messages sent to others appear within 5 minutes, but they may not want the main page to take more than 2 seconds to load.
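Such targets can be checked mechanically against measured timings. The sketch below assumes the two example limits above (2 seconds for the main page, 5 minutes for message delivery) and, as a further assumption, judges each against the 95th percentile of the samples using the nearest-rank method. The sample values and function names are invented for illustration and are not ESME measurements.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def meets_target(samples_s, limit_s, pct=95):
    """True if the chosen percentile of the samples is within the limit."""
    return percentile(samples_s, pct) <= limit_s


if __name__ == "__main__":
    # Invented timings in seconds, for illustration only.
    page_loads = [0.4, 0.6, 0.8, 1.1, 1.9, 2.5]
    deliveries = [12, 45, 90, 240]

    print(meets_target(page_loads, 2.0))    # main page limit: 2 s
    print(meets_target(deliveries, 300.0))  # delivery limit: 5 min
```

A percentile is used rather than the mean so that a few very slow requests during a peak are not hidden by many fast ones.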
An idea would be to do an analysis based on people’s behaviour on Twitter.
Maybe there’s already some data available out there on the Internet.
BTW,
It’s maybe a good idea to first test on a local machine and not in the cloud, because on the local machine you can do heap dumps, analyze GC log files, check OS performance counters and use profilers (these might work over the Internet, but might be a pain to use because of latency).
@Markus, if you can tell us exactly what you’d like Stax to be able to enable in the JVM and export from the server for analyzing heap dumps and GC log files, we’ll work to enable it for your cloud runs too. A general write up of how you typically run and analyze perf locally would help us understand your needs.
BTW, for perf runs, you can access the local FS (I’d recommend using the Servlet API’s javax.servlet.context.tempdir context attribute); it just won’t stick around when you redeploy the app.
Hi Spike,
Garbage Collector monitoring is absolutely required.
It could be monitored through JMX, like many other performance indicators, but that would, I guess, require opening up a port. Not sure whether it would work for people behind a firewall (probably not).
An alternative would be to use GCViewer (http://www.tagtraum.com/gcviewer.html), which can read GC log files.
But that would require that this file can be retrieved from the cloud.
You say there’s an API for retrieving files, so this could work (assuming the option for GC logging can be set).
For heap dumps, file access is also necessary, because the Eclipse Memory Analyzer (http://www.eclipse.org/mat/) currently does not have a web interface (there’s only some prototype). JHAT has a web interface, but it pretty much sucks IMHO.
This problem could be solved using the same API for File access.
As I already mentioned, there are free monitoring tools such as Glassbox (http://www.glassbox.com/glassbox/Home.html) or commercial ones such as JXInsight (http://www.jinspired.com/products/jxinsight/).
Other OS performance counters such as CPU load, and IO activity are also crucial.
Regards,
Markus
[...] is interesting is the comparison of these results with the initial load tests that were based on an older version of Scala with a notorious memory bug. In the first tests, the [...]
[...] the past, we’ve blogged about our performance tests but in the chaos of our various development cycles we never really [...]