Since I began work on my own scalability book, I thought I should do some extra reading on the subject. I picked up a few books on scalability, and the first one I read was Software Performance and Scalability: A Quantitative Approach by Henry Liu. In summary, I think it is a solid book, offering a lot of good ideas and quite an interesting approach to scalability (through maths!). It is not an easy read, but I think the unique approach makes it really worthwhile.
Cacti is an awesome tool, but it requires quite a lot of manual work to set up measurement of new metrics and begin graphing. As I am a total monitoring freak, I love to have insight into different aspects of my applications. I love to export custom metrics and graph them in Cacti, but I found it too time consuming, so I hacked together a Cacti graph and input method generator this weekend. The code is a bit primitive, so please forgive me, but I needed it to do just one particular thing; it is not really multi-purpose code :)
I think some of you may find it useful, as learning Cacti for the first time and manually setting up custom graphs can be difficult, so this script should help. Check it out.
I have looked at the state of different messaging backends recently, and I ran a little benchmark to get a rough comparison of message publishing throughput. The results I got are quite surprising.
What I wanted was some reassurance before choosing a messaging bus for my PHP project. PHP is a bit special, as its runtime environment is different from Java / .NET. I wanted to use RabbitMQ because of its routing flexibility and implementation of AMQP. After the simple benchmark, I am no longer convinced that it is the best way to go for me right now.
I played around with CouchDB half a year ago and its performance was just horrible. I have heard a lot of good things about MongoDB recently, so I thought I would have a look at it.
I think that NoSQL can have really good use cases on the web. The problem is that you need a really performant and stable system if you want to use it in production. I ran just a few simple tests, so it's not a real benchmark or anything. It is just a simple test trying to figure out how far behind NoSQL solutions are (performance-wise).
Personally, I think it is a good book, but it lacks details, tools and practical solutions. Reading the book is quite enjoyable and it definitely contains a lot of useful tips and tricks.
What I liked the most is the fact that the book is meaty and condensed down to less than 150 pages. I really like books that are focused, so I was not disappointed here.
The thing that the author covers really well is the analysis and preparation of the testing plan and the processes around it. You will read a lot about what to consider, how to prepare yourself, what to check, etc. There are also some useful checklists.
Good news, another good book!
The Art of Capacity Planning is a really decent book with a good overview of how to measure and predict the load of web-based applications.
The book is very short (130 pages), but I love that in books. The author does not waste time or paper and just goes straight to the point.
Terracotta is an amazing piece of software and it comes with some really cool tools and features. To enable Tomcat 6 session replication via Terracotta you need to do a few things, but it's relatively simple. Let's do it.
Finally I got a book that is truly worth recommending! It is a very good book and I think every web developer should read it.
The book is a sort of continuation of another good book, High Performance Web Sites, but to be honest I think I like this one more.
The book is very condensed; there is no wasted page in it. The information is well structured and you can see that the authors prepared well for publishing. It is backed by decent research and some of the tricks are really cool.
When you build PHP applications you need cache storage to keep your calculated data in. There are quite a few options, and the use case decides which solution is better.
I knew that APC is faster than memcached, as there is much less overhead, but I wanted to see how memcached would compare to the APC user cache.
A circuit breaker is a component that supports high reliability of web sites. It helps discover, at runtime, which of the external dependencies are failing. With that knowledge, the application can avoid wasting time trying to call them until they are back online.
A current PHP application may depend on several databases, SOAP/REST web services, external cache providers or data grids, mail, FTP, etc.
It is important for the application to keep on functioning even if some of these dependencies fail. To do that, the application has to be able to track when services become unavailable and when they become active again.
If a database or web service is down, we want our application to detect it as soon as possible and react accordingly. Maybe the application has a secondary slave database that can be read from? Maybe there is a way to load cached data? If all else fails, maybe it's best just to hide some function or display a message that the service will be fully functional soon. What we want to avoid is making every user wait 30s for a database connection that is going to fail anyway.
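To make the idea a bit more concrete, here is a minimal sketch of the pattern. It is written in Python just to keep it short and self-contained, and it is not the API of the PHP library mentioned on this page. The class, the method names, the `max_failures` and `retry_timeout` settings and the `load_profile` helper are all hypothetical names made up for illustration.

```python
import time


class CircuitBreaker:
    """Tracks failures per service and blocks calls to services that look down."""

    def __init__(self, max_failures=3, retry_timeout=30.0, clock=time.monotonic):
        self.max_failures = max_failures    # failures tolerated before the circuit opens
        self.retry_timeout = retry_timeout  # seconds to wait before allowing a retry
        self.clock = clock                  # injectable so the behaviour is easy to test
        self.failures = {}                  # service name -> consecutive failure count
        self.last_failure = {}              # service name -> time of the latest failure

    def is_available(self, service):
        if self.failures.get(service, 0) < self.max_failures:
            return True  # circuit closed: call the service as usual
        # Circuit open: allow retries again only once the timeout has passed.
        return self.clock() - self.last_failure[service] >= self.retry_timeout

    def report_success(self, service):
        self.failures[service] = 0  # service is healthy again, close the circuit

    def report_failure(self, service):
        self.failures[service] = self.failures.get(service, 0) + 1
        self.last_failure[service] = self.clock()


def load_profile(breaker, fetch_from_db, fetch_from_cache):
    """Read from the database if it looks healthy, otherwise fall back to cache."""
    if breaker.is_available("database"):
        try:
            result = fetch_from_db()
            breaker.report_success("database")
            return result
        except ConnectionError:
            breaker.report_failure("database")
    # Database is marked as down (or just failed): serve cached data instead.
    return fetch_from_cache()
```

Once the database has failed `max_failures` times in a row, `load_profile` stops calling it and goes straight to the cache, so users do not wait for a connection timeout on every request. After `retry_timeout` seconds the breaker lets calls through again, and the first success closes the circuit. In a real PHP application the counters would probably have to live in some shared storage, as each request runs in its own process, but that is an assumption about the deployment and not something this sketch covers.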
UPDATE: the project was moved to GitHub a while back as an independent PHP library: php circuit breaker
About the author
Hi, my name is Artur Ejsmont,
welcome to my blog. I am a passionate software engineer living in Sydney and working for Yahoo!
If you are into technology, you can order my book Web Scalability for Startup Engineers on Amazon. I would love to hear your thoughts, so please feel free to drop me a line or leave a comment.