So, we need monitoring and alerting, and we don’t want to use Nagios (our environment is too dynamic for it).
I’ve heard good things about Sensu, so why not give that a go? Start with the “Getting Started” guide and set up a “standalone” system with all three components (client, API and server) installed on one box, with RabbitMQ for transport and Redis for data storage. Is this overkill? I don’t know, probably. But it sure as fuck better scale well.
Next up, you find that there’s no UI unless you pay or install Uchiwa. Ok, this is a PoC, so we’ll go with Uchiwa. All good. Then we hit the first issue: “out of the box”, Sensu comes with exactly 0 (zero) checks. No disk, no CPU, no anything. Ok, I suppose it’s doing the minimalism thing to keep from getting bloated. Fine. So, let’s install some checks with its handy “sensu-install” command!
Wait… wtf is this shit… why am I getting a bunch of errors from Ruby gems failing to compile native extensions because “make” and “g++” are missing?
Anyways, a few “apt-get install” commands later we’ve got some basic checks in place. However, Uchiwa’s not updating… why…
Turns out that you have to restart the Sensu API for Uchiwa to pick up the changes. I mean, that’s pretty shoddy.
Ok, so I got the HTTP plugin, added the JSON file as a check and… nothing. I had to restart the Sensu server for it to pick up the new check and publish it to the client.
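For reference, here is a rough sketch of what that check JSON looks like, generated from Python so it is easy to template. The check name, the URL, the "web" subscription and the 60 second interval are all made up for illustration, and the `check-http.rb -u` invocation assumes the sensu-plugins-http gem is what got installed. The output would be saved as a `.json` file wherever the Sensu server reads its config from (typically `/etc/sensu/conf.d/`, but check your own install).

```python
import json


def build_check(name, command, subscribers, interval=60):
    """Return a Sensu-style check definition as a plain dict."""
    return {
        "checks": {
            name: {
                # Command the client runs; exit code 0/1/2 = OK/warning/critical.
                "command": command,
                # Clients subscribed to any of these names receive the check.
                "subscribers": list(subscribers),
                # Seconds between check requests published by the server.
                "interval": interval,
            }
        }
    }


# Hypothetical values, not taken from a real setup.
definition = build_check(
    name="check_http_example",
    command="check-http.rb -u http://localhost/",
    subscribers=["web"],
)

check_json = json.dumps(definition, indent=2)
print(check_json)
```

And yes, after dropping the file in place you still have to restart the Sensu server before anything happens.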
So far, not too impressed with Sensu. It seems more complicated than Nagios, and it doesn’t really remove much of the “toil” in defining and setting up monitors and alerts.
One nice thing is the pub/sub model: you define checks for a group of servers on the Sensu server and it distributes them to every client subscribed to that group. It’ll be interesting to see if we can define an alert that fires when “there is less than 1 server in this group” or something like that.
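I haven’t tried this yet, but one way the “less than 1 server in this group” idea might work is a small check script that looks at the client list and counts who is subscribed to a given group. The sketch below assumes the client list has already been fetched and parsed (the Sensu API has a `/clients` endpoint that returns client objects with a `subscriptions` list); the sample data, the function names and the group names are invented. Note that it only counts registered clients, which is not the same thing as healthy ones.

```python
def clients_in_group(clients, subscription):
    """Return the names of clients subscribed to the given group."""
    return [
        c["name"]
        for c in clients
        if subscription in c.get("subscriptions", [])
    ]


def group_status(clients, subscription, minimum=1):
    """Return (exit_code, message) using Nagios/Sensu-style exit codes."""
    members = clients_in_group(clients, subscription)
    if len(members) < minimum:
        return 2, (
            f"CRITICAL: {len(members)} client(s) in '{subscription}', "
            f"need at least {minimum}"
        )
    return 0, f"OK: {len(members)} client(s) in '{subscription}'"


# Invented sample data shaped like a parsed client list.
sample_clients = [
    {"name": "web-01", "subscriptions": ["web", "base"]},
    {"name": "web-02", "subscriptions": ["web", "base"]},
    {"name": "cache-01", "subscriptions": ["cache", "base"]},
]

for group in ("web", "db"):
    code, message = group_status(sample_clients, group)
    print(code, message)
```

A real check would exit with the returned code so the result lands in Sensu like any other check.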
Also, haven’t even started with the alerting.