Publishing queue messages from PHP using different backends

I recently looked at the state of different messaging backends and ran a small benchmark to get a rough comparison of message publishing throughput. The results were quite surprising.

UPDATE 2012.04

What I wanted was some reassurance before choosing a messaging bus for my PHP project. PHP is a bit special, as its runtime environment is different from Java or .NET. I wanted to use RabbitMQ because of its routing flexibility and its implementation of AMQP. After this simple benchmark I am no longer convinced that it is the best way to go for me right now.

This is not a comprehensive benchmark

Please do not get too stressed if my results look poor. I spent just one day playing with the backends, so I do not claim any expertise in optimising their performance. It is not a real benchmark, as the PHP code and the message bus both ran on the same machine, so network overhead was not measured.

Tools used

I used a local deployment of each backend engine with pretty much default settings. As a client I had a very simple set of scripts, each trying to insert 1000 messages per request. I then ran requests from JMeter with 1, 5, 10 and 15 concurrent threads.

Why persistent messages?

I wanted to see the throughput under heavy insert load. I also wanted to see what would happen if publishers outperform consumers, or if consumers had to be disconnected for a while (worker upgrade or crash). I cannot afford to lose messages in such a scenario, so I need them to be persistent.

Why 1000 messages per execution?

I wanted to see the rough throughput and cost of posting a message. Posting just a few messages would not work, as most of the time would be spent in PHP processing and bootstrapping the engine rather than executing the publishing code. I did not run the inserts within one transaction though (which makes a huge difference for PostgreSQL and InnoDB).
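
To illustrate the transaction remark, here is a minimal sketch (not the benchmark's actual script) of the two ways to push 1000 inserts through PDO. It assumes the pdo_sqlite extension is loaded and uses an in-memory SQLite database purely as a stand-in so that it runs without a server. The table name `queue` and its column names are made up for illustration.

```php
<?php
// Sketch only: SQLite in memory stands in for MySQL/PostgreSQL so the
// example runs without a server. Table and column names are hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE queue (id INTEGER PRIMARY KEY, body TEXT)');
$insert = $pdo->prepare('INSERT INTO queue (body) VALUES (?)');

// Autocommit mode: every insert is committed on its own,
// which is how the benchmark described here was run.
for ($i = 0; $i < 1000; $i++) {
    $insert->execute(array("message $i"));
}

// Single transaction: the whole batch is committed once.
// The benchmark did NOT do this.
$pdo->beginTransaction();
for ($i = 0; $i < 1000; $i++) {
    $insert->execute(array("message $i"));
}
$pdo->commit();

// Both loops together should have written 2000 rows.
$total = (int) $pdo->query('SELECT COUNT(*) FROM queue')->fetchColumn();
echo $total, " rows inserted\n";
```

Against MySQL or PostgreSQL only the DSN passed to PDO would need to change. On InnoDB with default settings the first loop forces a log flush for every insert, while the second flushes once at commit, which is where the big difference comes from.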

Versions used

  • PHP 5.3.6
  • MySQL Server 5.1.61. Client: PDO MySQL used directly, connecting once and pushing inserts into a simple 2 column table.
    • Variant 1: inserts into a MyISAM table, which was fast.
    • Variant 2: inserts into an InnoDB table with innodb_flush_log_at_trx_commit=2, which relaxes InnoDB's durability guarantee (the D in ACID). Inserts were much faster this way, but not as safe as with the default.
    • Variant 3: inserts into an InnoDB table with the default transaction log fsyncing, where ACID is enforced.
  • ActiveMQ 5.5.1 on Ubuntu's OpenJDK. Client: pecl stomp extension 1.0.3. Publishing as persistent messages.
  • RabbitMQ 2.7.1, compiled.
    • Client 1: pecl amqp extension 1.0.1
    • Client 2: php-amqplib taken from https://github.com/videlalvaro/php-amqplib
  • MongoDB 1.8.2, Ubuntu binaries. Client: pecl mongo 1.2.9. Publishing as a fire-and-forget insert into a collection.
  • Redis 2.2.11, Ubuntu binaries. Client: pecl extension 2.1.3.
  • APC 3.1.9, Ubuntu binaries. Not really a message queue engine, included just for comparison.
  • Memcached 1.4.7, Ubuntu binaries. Client: pecl memcached extension 1.0.2. Not really a message queue engine, included just for comparison.
  • CouchDB 1.0.1. Not really a message queue engine, included just for comparison.
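
The two InnoDB variants in the list above differ in a single server setting. As a sketch, assuming the setting is placed in the [mysqld] section of my.cnf (the post does not say how it was actually applied), Variant 2 would look like this:

```ini
[mysqld]
# 1 (the default): write and flush the log to disk at every commit (Variant 3).
# 2: write the log at every commit, flush it to disk about once per second (Variant 2).
innodb_flush_log_at_trx_commit = 2
```

With the value 2, an operating system crash or power loss can lose roughly the last second of committed transactions, which is the safety trade-off behind the faster inserts.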

Update: I used the binary (PECL) extension for RabbitMQ this time, but I kept the pure PHP client (php-amqplib) as well to get a comparison. I still think accessing RabbitMQ from PHP is not perfect; let's hope VMware supports the PHP driver and makes it super awesome!

Hardware and system

I have used my Intel Core 2 Duo laptop with 4GB of RAM, Ubuntu 11.10, Kernel 3.0, ext4 file system.

Backend selection

I used databases as they are a common backend for application-level queues. I used MongoDB and Redis because I have seen PHP-based queue implementations built on them and because their performance is just amazing.

I used memcached and APC only for comparison, as in general you cannot go much faster than them. They are non-persistent, so comparing them with the persistent backends is apples and oranges; they are there just to give more reference points.

Results UPDATE

I have extended the test by creating a better JMeter test suite and running long tests on each backend. Each thread group ran for 2 minutes non-stop, so the total test time was 8 minutes per backend with increasing concurrency.

My initial results were incorrect: the tests ran for such a short time that they did not actually force all the backends to write to disk. As you will see below, ActiveMQ was extremely fast up to the point where it started writing to the drive and throttling clients. ActiveMQ is monstrously fast as long as it writes to memory and you can consume the messages more or less as fast as they arrive. In this scenario ActiveMQ was as fast as memcached, and for the first 300 000 messages it went like thunder. Once it ran out of memory, though, its performance dropped dramatically.

Full Results

The first chart shows the throughput of each PHP client + backend combination. It is the throughput of persistent messages per second, where my messages were really small (about 250 characters each).

The second chart shows throughput measured in the same way, but for APC and memcached (the really fast backends).

The last chart shows throughput measured in the same way, but for the slowest backends.

More detailed numbers

Here is my google doc with all the numbers.

Observations

First of all, ActiveMQ is a very solid message bus and its PHP connectivity is really fast. It does not handle persistent messages as fast as I had hoped though, and MongoDB is still 5 times faster at accepting messages and writing them to disk!

RabbitMQ seemed decent when using the pecl extension, but very inefficient with the PHP-based AMQP client. I did not like the fact that it took some manual compilation of dependencies to get the C library and the pecl extension built.

Finally, Redis and MongoDB have shown just amazing performance under a long, extreme write workload.

If it were not for the PHP client issues and the fact that the AMQP protocol is stateful and so complex and verbose, I think I would go for RabbitMQ now. Considering my initial experiences, I think I will go with ActiveMQ, and I give another star to MongoDB in my book :)

Download the test files

Finally, you can download the sources of all the tests I used. Please feel free to run them yourself and tell me what I did wrong if you spot anything :)

Download the PHP inserting scripts I used

To run the PHP scripts you will need to have all the PECL extensions loaded and PHP 5.3 for some of them.

Download the jmeter test suite I used

To run the JMeter suite you will need the latest version of JMeter, 2.6. All you have to do is change the config element to use your host name and put the PHP prefix name in the backend variable.

Enjoy it!

Comments

Hi there Alvaro, It is so nice of you to post a comment, thanks for that.

I saw your library and some of your articles and it is great stuff. It actually caught my attention and I wanted to contact you on LinkedIn but did not reach you.

I am not judging RabbitMQ based on this benchmark and I am not complaining about the lib either. It is a complex and verbose protocol, so it is sort of a given that it will be slower. I just wanted to see roughly what the difference would be and get some feedback on the experiences of others.

You are 100% right that comparing to memcache is not fair, but memcache is not really a message queue. I used it as a point of reference, to see how much it costs to send stuff over the wire and then put the MQs in perspective.

I was very surprised how fast mongo and activemq were, but I will have to repeat the test over a local network and hammer the queue for longer. I was running the test for less than a minute, so I am not sure whether all the backends started writing to disk at full speed or were still able to buffer in memory. Maybe I will give it a go this weekend. The thing is, I don't have another machine at home right now, so I can't make a proper test yet.

My machine is a Core 2 Duo with 4 GB of memory. It is a laptop, so it is not comparable with any server hardware. Plus, the test was on local memory, so depending on implementation some solutions could be faster. It is not a proper benchmark, I am aware of it.

I will try to post the scripts and the JMeter test suite if anyone is interested; I will try to do it this weekend. Drop me a line, I am interested in your experience with AMQP in general :)

2012-03-30 02:08
admin

Hi,

DISCLAIMER: I'm the maintainer of php-amqplib and author of RabbitMQ in Action

First it would be nice to know more details about your benchmark, like the size of messages and also the hardware used, type of discs and so on. Also is the benchmark code posted somewhere like in github?

Now, in the particular case of php-amqplib vs the PECL extension, I have to tell you that yes, the pure PHP implementation is very slow compared to the C one. On my Mac I can easily do 8000 messages per second using the PECL extension. With Java you can easily do 12K messages. Try running "make benchmark" from the php-amqplib project root to see your results; on my machine I get this:

Publishing 4000 msgs with 1KB of content:
php benchmark/producer.php 4000
3.4347190856934
Consuming 4000:
php benchmark/consumer.php
Pid: 20804, Count: 4000, Time: 3.9529

The time there is in seconds. So 3.4 seconds to publish 4000 msgs.

Another point is that you can't really compare something like MongoDB or Memcache, which work in memory, with AMQP when you publish persistent messages. When you tell an AMQP broker that you want your messages to be persisted, the server will make sure each message is written to disc, which of course slows things down.

Then, regarding the library itself: encoding plain text, as you do with STOMP, is not the same as the way AMQP works, so encoding every single AMQP bit in PHP is expensive. Another problem with the library, which I should address at some point, is the way it writes data to the socket. At the moment that's done pretty inefficiently, so I know there's room for improvement there. So please don't judge RabbitMQ's performance based on how one library performs, considering that, for example, the MongoDB driver for PHP is implemented by 10gen, the same company that created MongoDB.

2012-03-26 12:26
Alvaro

I'd be interested to see the source of the code you were running and, to a lesser extent, the spec of the machine it was running on. The reason is that I would love to give it a try with Beanstalkd, which is my own message queue of choice.

2012-03-26 10:21

Well, why not, it's just going to be another 20 times faster as it is shared memory access :) The difference is that it is local access only, so you cannot publish messages on a front server and have them picked up by backend boxes. I may add it in once I find time to include the proper RabbitMQ client and ZeroMQ.

Thanks for the comment :)

2012-03-26 01:43
admin

Would APC not be relevant here as well, or not?

2012-03-25 21:52
Tjorriemorrie

About the author

Artur Ejsmont

Hi, my name is Artur Ejsmont,
welcome to my blog.

I am a passionate software engineer living in Sydney and working for Yahoo! Drop me a line or leave a comment.
