Cacti is an awesome tool, but it takes quite a lot of manual work to set up new metrics and begin graphing them. As I am a total monitoring freak, I love having insight into different aspects of my applications. I love to export custom metrics and graph them in Cacti, but I found it too time consuming, so this weekend I hacked together a generator for Cacti graphs and data input methods. The code is a bit primitive, so please forgive me; I needed it to do just one particular thing, so it is not really multi-purpose code :)
I think some of you may find it useful: learning Cacti for the first time and manually setting up custom graphs can be difficult, so this script should help. Check it out.
I could not find Cacti graph templates for monitoring RabbitMQ, so I decided to create them myself. Since I had worked with Cacti and created templates before, it was not that hard. The only issue I ran into was that rabbitmqctl has to be run as the rabbitmq or root user, and I could not get the stats as a non-privileged user. In the end I decided to run a cron job as root that generates a static text file on the server, and to have Cacti fetch that file instead of calling the stats-generating script itself.
I hope you find them useful.
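The cron workaround described above for RabbitMQ could look roughly like the sketch below. It is only an illustration under a few assumptions: the output path STATS_FILE is hypothetical, only queue depths are collected, and the space-separated "name:value" layout is simply one convenient format for a data input script to read back. Queue names containing spaces would need extra handling.

```python
import os
import subprocess
import tempfile

# Hypothetical output path; pick any location your web server exposes.
STATS_FILE = "/var/www/html/rabbitmq_stats.txt"


def parse_queues(output):
    """Turn `rabbitmqctl list_queues name messages` output into {queue: depth}."""
    stats = {}
    for line in output.splitlines():
        parts = line.split()
        # Keep only "<name> <integer>" rows; banner and header lines are skipped.
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def format_stats(stats):
    """Render the stats as space-separated 'name:value' pairs, sorted by name."""
    return " ".join(f"{name}:{depth}" for name, depth in sorted(stats.items()))


def write_atomically(path, text):
    """Write to a temp file first so a reader never sees a half-written file."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(text + "\n")
    os.chmod(tmp_path, 0o644)  # readable by the web server user
    os.replace(tmp_path, path)


def main():
    # Needs to run as root or the rabbitmq user, e.g. from root's crontab.
    output = subprocess.run(
        ["rabbitmqctl", "list_queues", "name", "messages"],
        check=True, capture_output=True, text=True,
    ).stdout
    write_atomically(STATS_FILE, format_stats(parse_queues(output)))
```

Calling main() from a small script in root's crontab every few minutes keeps the file fresh, and the monitoring side only ever needs to read a static file.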
Interestingly, my Cacti installation began failing today for no apparent reason. I noticed CPU usage and CPU temperature going through the roof because a PHP script run by Cacti was stuck in some infinite loop or busy-wait state. It seemed to query the database non-stop without sleeping, and it was not getting any actual work done either.
I looked in the error log and there was nothing; I looked in the access log of the monitored server ... nothing again. Finally I looked into the Cacti log and found tons of messages like this:
Good news, another good book!
The Art of Capacity Planning is a really decent book with a good overview of how to measure and predict the load of web-based applications.
The book is very short (130 pages), but I love that in books. The author wastes neither time nor paper and goes straight to the point.
Monitoring application and server health is an important task when trying to maintain high availability. Without monitoring you don't know what goes wrong, and you don't know exactly when it happens. Some time ago I realized that the graphs I used to use were not perfect. I searched the web, read a bit of documentation and decided to put together a set of simple scripts gathering key server performance metrics.
This bundle includes gathering scripts and graph templates for Memcached, APC, Apache2, Linux file system, Linux memory, CPU and network. It should cover the most important aspects of a typical web server. The graphs are designed to match my expectations and to make analysis easier.
I still don't have Postgres, Sphinx or MySQL stats included, but ... who knows ... maybe in version 4! :-)
Sometimes we have to automate page testing, content loading or whatever. cURL is an excellent library for any HTTP interfacing.
If you ever have to connect to a specific server and ask for a particular server name, you can do it with curl setopt parameters. If you want to get the file /index.php from www.mytopsite.com from a particular IP (without adding any /etc/hosts mappings), you can do it like this:
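A minimal sketch of the same idea, written with Python's standard http.client rather than the PHP curl extension the post talks about: connect to the IP address directly and send the wanted server name in the Host header. The function name fetch_from_ip and the address in the usage line are made up for illustration. With PHP's curl the equivalent is usually to point CURLOPT_URL at the IP and pass the Host header through CURLOPT_HTTPHEADER.

```python
import http.client


def fetch_from_ip(ip, hostname, path="/index.php", port=80, timeout=10):
    """GET `path` from the server at `ip`, presenting `hostname` as the Host header."""
    conn = http.client.HTTPConnection(ip, port, timeout=timeout)
    try:
        # An explicit Host header replaces the one http.client would derive from `ip`.
        conn.request("GET", path, headers={"Host": hostname})
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()
```

For example, fetch_from_ip("192.0.2.10", "www.mytopsite.com") asks the machine at that (placeholder) address for the www.mytopsite.com virtual host without touching /etc/hosts. This plain-HTTP sketch does not cover HTTPS, where the certificate name check needs extra care.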
MySQL is so common that it's a good idea to have some command-line scripts or tricks at hand. Sometimes you need to execute SQL queries on a set of tables or all of them, get stats, fix tables, etc.
It's good to be able to quickly get a list of tables and basic stats.
You can make a bash script of it or just type it on the command line:
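One way to get such a table list with basic stats, sketched here in Python instead of bash: run the mysql client in batch mode with SHOW TABLE STATUS and parse the tab-separated output. The helper names are invented for this example, and it assumes the client can log in without prompting (for instance through credentials in ~/.my.cnf).

```python
import csv
import io
import subprocess


def to_int(value):
    """Views and some engines report NULL for numeric columns; treat that as 0."""
    return int(value) if value and value.isdigit() else 0


def parse_table_status(batch_output):
    """Parse tab-separated `SHOW TABLE STATUS` output (header row first) into dicts."""
    reader = csv.DictReader(
        io.StringIO(batch_output), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    return list(reader)


def summarize(rows):
    """Return (name, engine, row_count, total_bytes) per table, largest first."""
    summary = [
        (
            row["Name"],
            row["Engine"],
            to_int(row["Rows"]),
            to_int(row["Data_length"]) + to_int(row["Index_length"]),
        )
        for row in rows
    ]
    return sorted(summary, key=lambda item: item[3], reverse=True)


def table_stats(database):
    """Run the mysql client for `database` and summarize its tables."""
    output = subprocess.run(
        ["mysql", "--batch", "-e", "SHOW TABLE STATUS", database],
        check=True, capture_output=True, text=True,
    ).stdout
    return summarize(parse_table_status(output))
```

Calling table_stats("mydb") (the database name is a placeholder) returns the tables ordered by data plus index size, which is usually the first thing worth looking at.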
After some more work over the weekend and playing around with Cacti, I have fixed up some of the previous scripts and bundled the input methods, data sources and graphs together.
I have also added a host template, so adding servers is now much easier: you just enter the host name and click create graphs ... job done.
The package includes graphs to monitor Linux system metrics, disk IO, network IO, Apache status, the APC opcode cache and memcached.
I'm still adding stuff to it, so make sure to come back for an update :-)
You can see some of the graphs in earlier versions here:
There is a lot going on on the average server, and it helps a lot to be able to see how each component performs and how much stress it is under. I prepared a few graphs that show the most important parameters in a more readable form than the default Cacti graphs.
First there is memory status showing total physical memory with swap on top:
Then there is the division of CPU time into kernel, user, IO, etc.:
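The numbers behind a CPU time graph like this can be gathered on Linux from the aggregate "cpu" line in /proc/stat. The sketch below is not the script from the bundle, just an assumed minimal version: it takes two samples and reports what percentage of the interval went to each state.

```python
import time

# Order of the first eight counters on the "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(stat_text):
    """Extract the aggregate CPU counters (in clock ticks) from /proc/stat content."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            return dict(zip(FIELDS, values))
    raise ValueError("no aggregate 'cpu' line found")


def cpu_shares(before, after):
    """Percentage of elapsed CPU time spent in each state between two samples."""
    deltas = {name: after[name] - before[name] for name in before}
    total = sum(deltas.values())
    if total <= 0:
        return {name: 0.0 for name in deltas}
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


def sample(interval=1.0, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart, and return the shares."""
    with open(path) as handle:
        first = parse_cpu_line(handle.read())
    time.sleep(interval)
    with open(path) as handle:
        second = parse_cpu_line(handle.read())
    return cpu_shares(first, second)
```

Because the counters only ever grow, a poller can also export the raw values and let the graphing side compute the rates; the delta calculation here just makes the idea visible.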
APC is one of the best PHP opcode caches. To be sure it's performing the best it can, you should monitor its status and make sure it has enough memory as well as correct ini settings.
You want to monitor the memory allocated for the opcode cache as well as for user-stored items. It's important to get it right, as wrong settings may reset your complete cache once it gets full:
You should also monitor the hit and miss ratio for the opcode file cache:
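The hit ratio itself is simple arithmetic on the hit and miss counters the cache reports. A small sketch, assuming the counters are cumulative and may drop back to zero when the cache is cleared or the web server restarts:

```python
def hit_ratio(hits, misses):
    """Percentage of lookups served from the cache; 0.0 when nothing was looked up."""
    total = hits + misses
    return 100.0 * hits / total if total else 0.0


def interval_hit_ratio(previous, current):
    """Hit ratio between two (hits, misses) samples of cumulative counters."""
    delta_hits = current[0] - previous[0]
    delta_misses = current[1] - previous[1]
    if delta_hits < 0 or delta_misses < 0:
        # Counters went backwards, so the cache was reset; use the new totals.
        return hit_ratio(*current)
    return hit_ratio(delta_hits, delta_misses)
```

Looking at the ratio per polling interval rather than since startup makes a sudden drop (for example after a cache reset) stand out on the graph instead of being averaged away.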