Simple metric aggregation and automated custom monitors with Monitis and StatsD

StatsD is a Node.js daemon that accepts metrics over a simple, lightweight UDP protocol, aggregates those metrics, and sends the results to one or more backend systems for long-term time series data storage, graphing, alerting, and so on. The backends included with StatsD support Graphite and, for testing, console output. There are also third-party backends for Librato, Ganglia, and AMQP.

In this article, we introduce a new StatsD backend for Monitis, statsd-monitis-backend, and use it to build a couple of examples that pipe metric data into Monitis.

Why StatsD?

Why is StatsD a good solution for collecting metrics before sending them to Monitis? Aggregation is the primary reason. Like many other systems that gather and display time series data, Monitis expects input on a regular schedule, typically no more than one update per monitor per minute. Application events, however, can occur far more or far less often than that. Consider user logins, which happen on an irregular basis. Many applications gather statistical information about these events internally and then provide aggregated data to monitoring systems, for example a metric such as logins per minute. But why add the overhead and complexity of metric aggregation to every application you want to monitor? Instead, simply have the applications notify StatsD each time an interesting event happens, and let StatsD take care of the rest. For more information on this topic, take a look at the excellent Etsy blog post, Measure Anything, Measure Everything.
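To make the notification step concrete, here is a minimal sketch of an application sending a counter increment to StatsD as a single UDP datagram. It assumes StatsD is listening on 127.0.0.1 port 8125; the helper names format_counter and send_counter and the metric name logins are hypothetical, chosen only for illustration.

```python
import socket


def format_counter(name, value=1):
    """Build a StatsD counter message such as b'logins:1|c'."""
    return "{0}:{1}|c".format(name, value).encode("ascii")


def send_counter(name, value=1, host="127.0.0.1", port=8125):
    """Send one counter increment as a fire-and-forget UDP datagram.

    The host and port defaults are assumptions; adjust them to match
    the StatsD server actually in use.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(format_counter(name, value), (host, port))
    finally:
        sock.close()
```

An application would call send_counter("logins") each time a user logs in. Because UDP needs no connection and no reply, the call adds very little overhead to the application.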

Two examples

Two examples will help to illustrate how this works. First, using a very simple script, send events to StatsD on an irregular schedule, such as users requesting a specific web page. StatsD will aggregate these and send the aggregated data to Monitis once per minute. For the second example, use Diamond, a flexible systems monitoring tool, to send raw metrics through StatsD, and from there on to Monitis. The point of the second example is not to demonstrate aggregation, but rather to show that StatsD can be used as an adapter to get metrics into Monitis from applications that might otherwise be incompatible.
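The aggregation in the first example can be pictured with a simplified model: every increment received during one flush interval is summed per metric name, and only the totals are passed on to the backend. The sketch below is not StatsD's actual implementation; aggregate_counters and the sample metric names are hypothetical, and it handles plain counter messages only.

```python
from collections import defaultdict


def aggregate_counters(messages):
    """Sum StatsD-style counter messages ('name:value|c') per metric name."""
    totals = defaultdict(int)
    for message in messages:
        name, rest = message.split(":", 1)
        value, metric_type = rest.split("|", 1)
        if metric_type == "c":  # this model only handles plain counters
            totals[name] += int(value)
    return dict(totals)


# Increments that arrived at irregular times within one flush interval
received = ["dir1:1|c", "dir2:1|c", "dir1:1|c", "dir1:1|c"]
print(aggregate_counters(received))
```

However many events arrive during the interval, the backend sees one value per metric, which matches the once-per-minute update schedule that Monitis expects.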

Example 1: Watching log files

To generate statistics from a log file, capture new lines as they are appended, identify interesting ones, and send an appropriate increment message to StatsD. Apache web server log files are a good choice for this example, as hits to a web site are typically more frequent than one per minute, and are also something that many systems administrators are interested in monitoring.

To set up the simplest possible site that can generate a couple of events to look for in the logs, install Apache and create a minimal site with a couple of links:

$ sudo apt-get install apache2
$ cd /path/to/webroot
$ mkdir dir1 dir2
$ echo '<html><body><a href="/dir1/">Dir 1</a><br/><a href="/dir2/">Dir 2</a></body></html>' \
> dir1/index.html
$ echo '<html><body><a href="/dir1/">Dir 1</a><br/><a href="/dir2/">Dir 2</a></body></html>' \
> dir2/index.html

The events for this example are web requests to either of those directories. Create a script to tail the log and send increments to StatsD when a matching line is seen. First, get StatsD running on the local host, configured to send updates to Monitis.

$ git clone https://github.com/monitisexchange/statsd-monitis-backend.git
$ git clone https://github.com/etsy/statsd.git
$ cd statsd
$ npm install ../statsd-monitis-backend

Next, create a configuration file and run StatsD. The following configuration provides plenty of debugging output.

$ cat > local.config <<EOF
{
  backends: ["statsd-monitis-backend"],
  flushInterval: 60000,
  dumpMessages: true,
  monitis: {
    apikey: 'your_monitis_apikey',
    secretkey: 'your_monitis_secretkey',
    debug: true
  }
}
EOF
$ node stats.js local.config

Create a script, such as the Python script below, to parse the logs and send events to StatsD.

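#!/usr/bin/env python
# apachelog-statsd-counter.py
# Read Apache access log lines from stdin and increment a StatsD counter
# named after the first path segment of each request.

from apachelog import parser
import statsd, sys

# Log format of the access log being parsed
p = parser(r'%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"')
counter = {}

while True:
    # readline() returns each line as soon as tail -f delivers it
    line = sys.stdin.readline()
    if line == '':
        break  # end of input
    fields = p.parse(line)
    try:
        # "GET /dir1/ HTTP/1.1" -> "dir1"; dots become underscores
        stem = fields['%r'].split()[1].split('/')[1].replace('.', '_')
    except IndexError:
        continue  # no path segment, e.g. "OPTIONS * HTTP/1.0"
    if stem not in counter:
        counter[stem] = statsd.Counter(stem)
    counter[stem] += 1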

Finally, run that script and click around the example web pages to generate some traffic. You may need to hit reload to get around browser caching.

$ sudo pip install python-statsd apachelog
$ tail -f /var/log/apache2/access.log | python apachelog-statsd-counter.py

The custom monitors named in the script are created automatically by the Monitis StatsD backend, and the results can be seen in the web portal.
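Since each distinct counter name becomes its own custom monitor, it helps to see exactly how a request line turns into a name. The sketch below isolates that step as a standalone function. The name metric_name is hypothetical, and returning None for the site root or a malformed request is an addition made here for illustration, not something the original script does.

```python
def metric_name(request_line):
    """Return the first path segment of an HTTP request line.

    Dots are replaced with underscores. Returns None when there is no
    usable path segment.
    """
    parts = request_line.split()
    if len(parts) < 2:
        return None  # malformed request line
    segments = parts[1].split("/")
    if len(segments) < 2 or not segments[1]:
        return None  # site root, or a target that is not a path
    return segments[1].replace(".", "_")


for request in ("GET /dir1/ HTTP/1.1",
                "GET /dir2/index.html HTTP/1.1",
                "GET /favicon.ico HTTP/1.1"):
    print(request, "->", metric_name(request))
```

With the example site, requests under /dir1/ and /dir2/ map to the names dir1 and dir2, while a request for /favicon.ico maps to favicon_ico.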

Example 2: Forwarding metrics from StatsD-compatible applications

In addition to aggregation, StatsD can also act as an adapter between data collectors and backend systems. Because StatsD provides a simple message format, there are a number of applications that are capable of sending metric data to StatsD. Via the Monitis backend in StatsD, all of these compatible applications are now also able to send data to Monitis custom monitors.

Diamond is a Python-based systems monitoring tool with a flexible and extensible architecture. It supports pluggable data collectors and handlers for sending data to monitoring systems. In this second example, StatsD serves as an adapter between Diamond and Monitis custom monitors.

Start by installing StatsD and Diamond. This example uses a new feature of the StatsD message format for raw metrics that is not yet incorporated in the main Etsy branch. Until this is merged in, support for raw metrics can be found in a fork of the project with a pending pull request to enable the feature. The example also uses a similar fork of python-statsd, a module required by the Diamond StatsD handler.

Uninstall the default python-statsd, and replace it with the fork supporting StatsD raw metrics.

$ sudo pip uninstall python-statsd
$ sudo pip install -e git://github.com/jeremiahshirk/python-statsd.git#egg=python-statsd

Clone the modified StatsD into a new source directory, choosing the raw_and_averages branch.

$ git clone git://github.com/jeffminard-ck/statsd.git -b raw_and_averages statsd_raw

Installation instructions for Diamond are available in the Diamond wiki at https://github.com/BrightcoveOS/Diamond/wiki/Installation. Because a custom monitor is created automatically for each individual metric sent through StatsD, take care to avoid configuring extraneous metrics. Once Diamond is installed, limit it to the cpu and loadavg collectors.

$ cd /usr/share/diamond/
$ sudo mv collectors collectors.bak
$ sudo mkdir collectors
$ sudo cp -r collectors.bak/cpu collectors.bak/loadavg collectors/
$ sudo /etc/init.d/diamond restart

Configure Diamond to send data to its StatsD handler. In /etc/diamond/diamond.conf, switch to the StatsD handler, and configure the host and port. Of course, substitute the correct host name or IP address for the StatsD server.

[server]
handlers = diamond.handler.stats_d.StatsdHandler, diamond.handler.archive.ArchiveHandler

...

[[StatsdHandler]]
host = 127.0.0.1
port = 8125

StatsD should be configured as in the first example, so that it uses the Monitis backend. Once everything is configured, start the Diamond and StatsD servers. Within a couple of minutes, the necessary custom monitors should be created and begin receiving updates once per minute.
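If no monitors appear, one way to check whether Diamond is emitting anything is to stop StatsD temporarily and listen on the same UDP port. The following is a minimal debugging sketch, assuming the host and port from the handler configuration above; open_listener and receive_datagrams are hypothetical helper names.

```python
import socket


def open_listener(host="127.0.0.1", port=8125):
    """Bind a UDP socket where StatsD would normally listen."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_datagrams(sock, count=5, timeout=70.0):
    """Collect up to `count` datagrams as text, then close the socket.

    Stops early and returns what has arrived if no datagram is seen
    within `timeout` seconds.
    """
    sock.settimeout(timeout)
    received = []
    try:
        while len(received) < count:
            data, _addr = sock.recvfrom(65535)
            received.append(data.decode("ascii", "replace"))
    except socket.timeout:
        pass  # return whatever arrived before the timeout
    finally:
        sock.close()
    return received
```

Calling receive_datagrams(open_listener()) while Diamond is running returns the raw messages it sent, or an empty list if nothing arrived. Remember to restart StatsD afterwards, since only one process can be bound to the port at a time.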

Conclusion

At this point, you have a working system that records metrics in Monitis custom monitors every minute. Even better, those custom monitors are created automatically the first time their metrics are sent through StatsD, minimizing the work needed to Monitor Everything.

About Jeremiah Shirk