Comparison of Rails Deployment Stacks
November 25th, 2009
Why?
Why write this post in the first place?
To show off your cool stuff?
To deride the other perfectly good software?
No, the only reason I am writing this post is to back up our statement that WebROaR is generally much faster than all other comparable deployment stacks. The aim is to provide real-world data and insight into our benchmarking procedure for everyone. Honestly, there is not much fun in performing this long experiment, as we programmers do have very short attention spans. (Oh wait, let me see HN/programming reddit once again ...).
The results of this comparison should be taken only as an indicator, and should act as just one of the many data points in coming up with your own conclusion about Rails deployment stacks. If you are seriously interested in knowing the performance numbers of your application, please do try this experiment at home. :-)
Benchmarking Basics
I can't put across the basics more 'eloquently' than this Zed Shaw essay and would recommend giving it a read. There are no 'eye-opening', 'lift the darkness' kind of insights in that essay, but at times we do make obvious blind mistakes while benchmarking that can be easily avoided. I hope we haven't made many of those in our experiment.
We tried taking care of the following aspects for this experiment:
Ensure the rest of the environment is exactly the same (as much as possible) for all deployment stacks being tested i.e. same hardware, same software versions, same test application and same tester as well. :-)
Use the right benchmarking tool that provides relevant and useful statistics.
Do not make it a "Hello world" application comparison, which is unlikely to be useful in the real world.
Allow each server stack to initialize/warm-up as required before running a performance test over it.
We selected httperf for conducting this experiment for its ability to provide the relevant statistics. The data presented below should vouch for its usefulness.
I would also recommend this PeepCode screencast that provides a very good introduction to benchmarking with httperf. (Disclaimer - It's not free and we don't get any affiliate money for referring it :-) )
Test Environment
Server Machine
CPU - Intel Core 2 Duo 2.8 GHz
RAM - 1GB
OS - Ubuntu 8.10 (Intrepid) Desktop
Kernel - Linux 2.6.27-7-generic
Ruby - 1.8.7 (2008-08-11 patchlevel 72) (Ubuntu Package)
Client Machine
CPU - Intel P-IV 2.66GHz
Memory - 1 GB
OS - Debian 5.0 (lenny)
Kernel - Linux 2.6.26-1-686
Benchmarking Tool
httperf 0.9.0 compiled without DEBUG and without TIME_SYSCALLS
Network Connectivity
100 Mbps LAN (1 hop between the two machines)
Application
Name: Instiki
Source: http://github.com/parasew/instiki
Commit: 77dcb015f38d75d32df6fceb52ce4a2737845991
Rails Version: 2.3.4
Database: SQLite 3
URL: /wiki/list
We have tried to use a URL that has database interaction so as to simulate a typical scenario of a web application. This would ensure most parts of the Rails stack are involved in the test. Please note that caching is not enabled at all for this URL.
Deployment Stacks
We chose the following servers for this comparison.
Apache 2.2.9 (mpm-prefork) (Ubuntu Default) + mod-proxy-balancer + Mongrel 1.1.5 Cluster (6)
Apache 2.2.9 (mpm-prefork) (Ubuntu Default) + Passenger 2.2.7 (MaxPoolSize = 6)
Apache 2.2.9 (mpm-prefork) (Ubuntu Default) + mod-proxy-balancer + Thin 1.2.5 Cluster (6)
WebROaR v0.2.3 (Set 6 maximum workers for the deployed application)
Since the focus of our test is to compare the maximum requests/sec output by each of the stacks, and the specific request is being served by Rails from the database, replacing Apache with Nginx should not give us dramatically different results. (As I mentioned earlier, these tests are just indicative of what performance level you can generally expect from your server.)
We would be happy if someone can repeat the test with Nginx as well.
Procedure
Strategy
We would like to find out the maximum requests/sec output by each server when the selected application URL is bombarded with requests from httperf.
Consider this command:
httperf --hog --server ABC --num-conns 2000 --num-calls 10 --uri /wiki/list --rate 11
It tries to make 2000 connections to the server ABC and sends 10 requests through each, for the URL /wiki/list. The total number of requests made is 2000*10 = 20,000. The 'Demanded Request Rate' is the product of the rate and num-calls parameters i.e. 11 * 10 = 110 requests/sec.
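The arithmetic above can be sketched in a few lines of Python. This is only an illustration and not part of httperf: the helper name `httperf_plan` is made up, and the estimated duration assumes the server keeps up with the connection rate.

```python
def httperf_plan(num_conns, num_calls, rate):
    """Derive the load figures implied by --num-conns, --num-calls and --rate."""
    total_requests = num_conns * num_calls  # requests sent over the whole run
    demanded_rate = rate * num_calls        # requests/sec demanded of the server
    # Connections are opened at `rate` per second, so an unsaturated run
    # lasts roughly num_conns / rate seconds (an estimate, not a guarantee).
    approx_duration = num_conns / rate
    return total_requests, demanded_rate, approx_duration


total, demanded, duration = httperf_plan(num_conns=2000, num_calls=10, rate=11)
print(total, demanded, round(duration, 1))  # 20000 110 181.8
```

The estimate of roughly 182 seconds is in line with the test-duration of 181.970 s reported in the sample output.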
The output would look similar to the following if the server is able to handle this load:
httperf --hog --client=0/1 --server=ABC --port=80 --uri=/wiki/list --rate=11
--send-buffer=4096 --recv-buffer=16384 --num-conns=2000 --num-calls=10
Maximum connect burst length: 1
Total: connections 2000 requests 20000 replies 20000 test-duration 181.970 s
Connection rate: 11.0 conn/s (91.0 ms/conn, <=13 concurrent connections)
Connection time [ms]: min 107.4 avg 321.2 max 1724.8 median 238.5 stddev 228.1
Connection time [ms]: connect 0.2
Connection length [replies/conn]: 10.000
Request rate: 109.9 req/s (9.1 ms/req)
Request size [B]: 75.0
Reply rate [replies/s]: min 97.2 avg 109.9 max 117.4 stddev 4.0 (36 samples)
Reply time [ms]: response 31.8 transfer 0.3
Reply size [B]: header 465.0 content 4930.0 footer 0.0 (total 5395.0)
Reply status: 1xx=0 2xx=20000 3xx=0 4xx=0 5xx=0
CPU time [s]: user 29.39 system 151.45 (user 16.2% system 83.2% total 99.4%)
Net I/O: 587.1 KB/s (4.8*10^6 bps)
Errors: total 0 client-timo 0 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0
The important fields to look at in the above result are:
Reply rate [replies/s]: min 97.2 avg 109.9 max 117.4 stddev 4.0 (36 samples)
This tells us that the average reply rate was 109.9 RPS with a standard deviation of 4. This was measured over 36 samples by httperf, which samples every 5 seconds. (This specific test ran for 181.970 seconds.) For the statistically inclined, Avg RPS +- 2*Std Deviation should cover roughly 95% of the samples, assuming they are approximately normally distributed.
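As a rough illustration, the Avg +- 2*SD band can be pulled straight out of the reply rate line. The sketch below is not something httperf provides: the regular expression and the function name `reply_rate_band` are assumptions written against the line format shown above.

```python
import re

# Pattern written against the "Reply rate" line in the sample output above.
REPLY_RATE = re.compile(
    r"Reply rate \[replies/s\]: min (?P<min>[\d.]+) avg (?P<avg>[\d.]+) "
    r"max (?P<max>[\d.]+) stddev (?P<stddev>[\d.]+) \((?P<samples>\d+) samples\)"
)


def reply_rate_band(line):
    """Return (avg - 2*sd, avg + 2*sd, samples) for an httperf reply rate line."""
    match = REPLY_RATE.search(line)
    if match is None:
        raise ValueError("not a reply rate line")
    avg = float(match.group("avg"))
    sd = float(match.group("stddev"))
    return avg - 2 * sd, avg + 2 * sd, int(match.group("samples"))


line = "Reply rate [replies/s]: min 97.2 avg 109.9 max 117.4 stddev 4.0 (36 samples)"
low, high, samples = reply_rate_band(line)
print(f"{low:.1f} to {high:.1f} replies/s over {samples} samples")
```

For the sample run this gives a band of 101.9 to 117.9 replies/s. The reported minimum of 97.2 sits just outside it, which is plausible for the few samples that fall beyond two standard deviations.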
There were no errors reported.
Errors: total 0 client-timo 0 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0
From this test we can conclude that the particular server under test can easily handle the demand rate of 110 RPS without any errors, and can be subjected to more load.
Our aim is to find the maximum avg reply rate output by each server when subjected to a series of high demanded request rates. After a specific demanded request rate test has been done, the server is restarted and warmed up before running the next higher demanded rate test.
As we keep on increasing the demanded rate, the server would get saturated and its reply rate would not increase, and rather slowly start to degrade after a point. It could also start throwing errors.
We would like to find the saturation point for each server and the maximum avg reply rate output at that point.
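One way to picture this search is as a small routine over the collected runs. This is a sketch under stated assumptions rather than the exact method we followed: the function name `find_saturation` is made up, treating any run with errors as past saturation is a simplification, and every row except the first (taken from the sample output above) holds hypothetical numbers.

```python
def find_saturation(runs):
    """Return the error-free run with the highest average reply rate.

    `runs` is a list of (demanded_rate, avg_reply_rate, total_errors) tuples,
    one per httperf run against a freshly restarted and warmed-up server.
    """
    clean = [run for run in runs if run[2] == 0]  # ignore runs that reported errors
    if not clean:
        raise ValueError("every run reported errors")
    return max(clean, key=lambda run: run[1])


# Only the first row comes from the sample output above; the rest are
# hypothetical values for illustration, not measured results.
runs = [
    (110, 109.9, 0),
    (150, 149.8, 0),
    (190, 171.2, 0),
    (230, 168.5, 0),
    (270, 0.0, 412),
]
demanded, peak, _ = find_saturation(runs)
print(demanded, peak)  # 190 171.2
```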
Key Points
All possible additional applications/daemons should be stopped.
Rails environment should be set to production.
At any given point of time only one server deployment should be active on the machine, and it should run only the rails application under test.
Set Analytics to 'Disabled' for the deployed rails application on WebROaR. To ensure a fair comparison, we recommend turning off this feature to get the fastest performance for your application on WebROaR.
Ensure each server is initialized and warmed up properly for every test i.e. it has loaded all its resources, and in the case of Passenger & WebROaR they have already instantiated their maximum number of workers. We used the following command for warming up all server stacks before running each of their tests.
httperf --hog --server ABC --num-conns 200 --num-calls 10 --uri /wiki/list --rate 20
After each test, the server should be shut down, and the tmp folder, log files and sessions should be cleared. The commands that can be used are rake tmp:clear, rake db:sessions:clear and rake log:clear.
To get the memory usage numbers for each server, we used this simple technique that I first saw on the benchmarking blog of the good folks at Phusion Passenger. Essentially, just after the load test, use the free -m command to see the free memory in the system, and then check it again after shutting down the server. The difference between these 2 numbers gives us the actual amount of memory the server stack was consuming. E.g. if free -m reported 300 MB free after the test, and 550 MB when the server was shut down, the total memory usage of the server was 550 - 300 = 250 MB.
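The memory calculation can be sketched as follows. The two `free -m` snippets are hypothetical outputs built to match the 300 MB / 550 MB example, and reading the "free" column of the "Mem:" row is an assumption about which figure to compare; the function names are made up for this illustration.

```python
def free_column_mb(free_output):
    """Return the 'free' column of the Mem: row from `free -m` output."""
    for line in free_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "Mem:":
            return int(fields[3])  # row layout: Mem: total used free ...
    raise ValueError("no 'Mem:' row found")


def server_memory_mb(under_load_output, after_shutdown_output):
    """Memory attributed to the server stack: free after shutdown minus free under load."""
    return free_column_mb(after_shutdown_output) - free_column_mb(under_load_output)


# Hypothetical `free -m` outputs, chosen to match the example in the text.
under_load = """\
             total       used       free     shared    buffers     cached
Mem:          1003        703        300          0         40        350
"""
after_shutdown = """\
             total       used       free     shared    buffers     cached
Mem:          1003        453        550          0         40        350
"""
print(server_memory_mb(under_load, after_shutdown))  # 250
```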
Results


- For this particular test, WebROaR is able to handle the maximum demanded request rate among all the deployment stacks. Even after it gets saturated its performance doesn't degrade and there are no errors.
- Passenger & Thin started giving errors, so the graphing software automatically set their reply rates to zero. That is not statistically true, but it emphasizes the point in a way.
- Mongrel also behaves nicely after saturation and doesn't throw any errors.
- Please note that Thin is able to handle the load very well before it breaks down at around 178 RPS. Its green line is merged with WebROaR's till that point and is not visible in the image above.

The maximum RPS numbers for each of the deployment stacks are:
- Apache 2.2.9 + Passenger 2.2.7 (6) - 107.3
- Apache 2.2.9 + Mongrel 1.1.5 Cluster (6) - 123.8
- Apache 2.2.9 + Thin 1.2.5 Cluster (6) - 178.4
- WebROaR 0.2.3 (6 maximum workers) - 188.8
As per this test, on average WebROaR is
~76% faster than Passenger
~52% faster than Mongrel
~6% faster than Thin
Remember, this is just one particular action of one particular application. It is not wise to derive a conclusion for each and every application based on the above result.
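The percentages quoted above follow from the maximum RPS figures. A quick sketch to reproduce them (the dictionary keys and the helper name `percent_faster` are just labels chosen for this example):

```python
# Maximum RPS per stack, as listed in the results above.
max_rps = {
    "Passenger": 107.3,
    "Mongrel": 123.8,
    "Thin": 178.4,
    "WebROaR": 188.8,
}


def percent_faster(candidate, baseline):
    """How much higher `candidate` is than `baseline`, in percent."""
    return (candidate / baseline - 1.0) * 100.0


gains = {
    name: percent_faster(max_rps["WebROaR"], rps)
    for name, rps in max_rps.items()
    if name != "WebROaR"
}
for name, gain in gains.items():
    print(f"WebROaR vs {name}: {gain:.1f}% faster")
```

The values printed with one decimal place land close to the rounded figures quoted above.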
Let's look into more of the details given to us by httperf.

The above table has the detailed numbers for the test run of each deployment stack where it performed at its best. The (Avg RPS +- 2*SD) range gives us a good idea of the performance of the server for 95% of its samples. WebROaR has the smallest standard deviation, 1.5, which is a good indicator of consistency in its performance across all requests.
Also, looking at the above table, we can safely say that for this test network I/O was not a bottleneck at all.

Memory usage numbers for each of the deployment stacks when they are delivering their maximum RPS are:
- Apache 2.2.9 + Passenger 2.2.7 (6) - 133 MB
- Apache 2.2.9 + Mongrel 1.1.5 Cluster (6) - 368 MB
- Apache 2.2.9 + Thin 1.2.5 Cluster (6) - 231 MB
- WebROaR 0.2.3 (6 maximum workers) - 258 MB
For this test, Passenger consumed the least amount of memory followed by Thin.
More Performance Testing
We picked up 2 more open source Rails applications and tested them out using the above procedure.
Application
Name: Redmine
Source: http://github.com/edavis10/redmine
Commit: c9bfdc009baf9aa472b82543fbb3189ff9862b48
Rails Version: 2.3.4
Database: MySQL 5.0.67-0ubuntu6
URL: /projects
Application
Name: El Dorado
Source: http://github.com/trevorturk/eldorado
Commit: eac99d4b3cdd95782c3bc8e0642c7e3b0380017a
Rails Version: 2.3.3
Database: MySQL 5.0.67-0ubuntu6
URL: /users
Here are the results:

As per the tests, on average WebROaR is
15 to 36% faster for the tested Redmine URL
9 to 39% faster for the tested El Dorado URL
than other servers.

The above memory usage graph indicates Passenger & WebROaR seem to be less memory hungry than the other 2 servers. (At least for these applications!)
More Performance Testing (with a different tool)
We also tested out the 3 applications (Instiki, Redmine and El Dorado) with 'ab'. In our opinion, 'httperf' is more reliable and provides more useful information, but we thought it would be a good idea to cross-check if 'ab' is giving similar (if not exact) results.
The following command was used to 'warmup' the server stacks:
ab -c20 -n2000 [Application URL]
The actual test:
ab -c20 -n20000 [Application URL]

The tests with 'ab' also confirmed the trends seen with 'httperf' and WebROaR does seem to be doing better than the other servers.

The memory usage numbers again corroborated the earlier finding of Passenger and WebROaR being less memory hungry than the other 2 servers.
The complete set of raw results data can be downloaded from here.
Conclusion
How much should one really read into the numbers above? Does WebROaR beat all other deployment stacks every time by these margins?
Well, we would suggest taking these numbers as indicative of how your application might perform on WebROaR. You may or may not see these exact gains (they could be higher or lower), but you can reasonably expect a decent performance gain for your application if it runs on WebROaR.
Apart from performance, we believe WebROaR brings a whole lot of simplicity and an integrated solution for Ruby on Rails™ application deployment. Do check it out and run your application on it when you have some time!
We would be happy to receive your feedback/comments/suggestions on this article.
8 Responses to “Comparison of Rails Deployment Stacks”
November 26th, 2009 at 04:20 AM
I would love to see your Apache configuration. Would like to know if it's prefork or worker. Would also love to know the modules which are enabled, and the balancer configuration, since webroar is running w/o a r/Proxy. Is the server a VPS or metal?
I think it's only fair to benchmark with Nginx as well, and if you want to test thin, mongrel, passenger, you can't leave out Unicorn (http://unicorn.bogomips.org/) :)
I don't know if free is a good way to test memory usage under Linux. Generally I use psmem.py (http://www.pixelbeat.org/scripts/psmem.py). Particularly, when curious about a process' footprint.
The blog also suffers from some stability issues. I guess a lot of ppl are tweeting about this sucker! :)
November 26th, 2009 at 10:42 AM
@John
The server stacks were run on a developer machine. (Complete details are given above under the section 'Test Environment'.) We used Apache 2.2.9 (mpm-prefork) (Ubuntu Default) and I have uploaded the configuration details (along with the list of enabled modules) here. Since we are benchmarking URLs that have full-blown interaction with the database through the Rails stack, replacing Apache with Nginx probably would not make too much of a difference. It takes a decent amount of time to do these benchmarks carefully and hence we decided to leave out Nginx for now. Thank you for mentioning ps_mem.py. We will definitely have a look at it. Yes, we did have some teething problems with our VPS setup. It was a long night for us yesterday and we had run out of caffeine by the time some of the issues came up. :-) But it's a nice live testbed for our server and I hope things get more stable as we go along. Thanks for your comments.
November 28th, 2009 at 02:01 PM
I'd be interested to see these results using the "bybusyness" load balancing method in Apache. I've noticed good performance gains when certain requests take longer than others, which is the case with plenty of web apps.
BalancerMember http://127.0.0.1:9000
# ...
BalancerMember http://127.0.0.1:9005 lbmethod=bybusyness
December 1st, 2009 at 02:32 AM
Since Passenger is designed for optimal memory usage, I think this test penalizes its efficiency. I'd be curious to see how it performed with double the workers. Also, most people running Passenger probably get the benefits of Ruby Enterprise to use even less memory.
December 1st, 2009 at 06:36 PM
I would love to use WebROaR with JRuby. Any idea if this works?
December 1st, 2009 at 06:57 PM
Really really curious to see the Unicorn results here.
December 1st, 2009 at 09:00 PM
A user cares about response time, not server load. I'd like to suggest for your next all nighter and/or free time ;), measuring average response time, and possibly how it changes with load.
In any event, I appreciate y'all taking the time to do this benchmarking.
December 7th, 2009 at 12:39 PM
@Martin
WebROaR doesn't work with JRuby. It loads up the Ruby interpreter inside a C process, so I do not foresee JRuby support happening in the future either with this design.
p.s. Sorry for the delay in responding to your query.