There’s a rather busy Graylog installation next door which dissects the messages it receives via a few syslog inputs and pushes them into a stream to be viewed in a Graylog dashboard. In particular, it attempts to determine which user is connecting via POP3 or IMAP, and this has been working rather well.

[Image: Graylog dashboard]

I was given the task of thinking about how we could store a per-day count of unique users. I started dabbling with the stream settings, found that I could export a stream via GELF, and did just that.

The GELF output forwards each message to a small Python program (the prototype, which is also what runs in production, is below), which sees this after unpacking the GELF:

{
    "_level": 6,
    "_timestamp": "2015-02-18T09:38:59.000+01:00",
    "_mail_loginuser": "jane.doe@example.net",
    "level": 6,
    "_gl2_source_node": "88e6b126-3e28-4464-84d9-7f10d0008a70",
    "timestamp": 1424248739.0,
    "_source": "m001",
    "_message": "m001 imapd: user=jane.doe@example.net, ip=[10.0.1.12], port=[1167]\n",
    "_gl2_source_input": "54003b3d498e65130f5803b0",
    "host": "m001",
    "version": "1.1",
    "full_message": "m001 imapd: user=jane.doe@example.net, ip=[10.0.1.12], port=[1167]\n",
    "_facility": "mail",
    "_id": "95939419-b749-11e4-a19a-9c8e992bcdc0",
    "_forwarder": "org.graylog2.outputs.GelfOutput",
    "short_message": "m001 imapd: user=jane.doe@example.net, ip=[10.0.1.12], port=[1167]\n"
}
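Of all these fields, only three matter for counting: `_timestamp`, `_message` and `_mail_loginuser`. Here is a minimal sketch of pulling them out of an unpacked message. The function name `summarize` is my own invention for illustration, and the `'nop'` placeholder for a missing user is simply a convention.

```python
def summarize(msg):
    """Return (day, protocol, username) for one unpacked GELF message."""
    day = msg['_timestamp'][:10]          # first ten characters: "YYYY-MM-DD"
    # Anything that does not mention imap is treated as pop3.
    proto = 'imap' if 'imap' in msg.get('_message', '') else 'pop3'
    user = msg.get('_mail_loginuser', 'nop').strip().lower()
    return day, proto, user


sample = {
    "_timestamp": "2015-02-18T09:38:59.000+01:00",
    "_mail_loginuser": "jane.doe@example.net",
    "_message": "m001 imapd: user=jane.doe@example.net, ip=[10.0.1.12], port=[1167]\n",
}

print(summarize(sample))
```

Slicing the timestamp keeps the day as the sender saw it (here with a +01:00 offset), which is probably what you want for a per-day report.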

The small utility opens a UDP datagram port and waits for messages to flow in, unpacking the GELF from each and processing the data.

#!/usr/bin/env python
import socket
import zlib
import json
import redis

UDP_PORT = 15005

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", UDP_PORT))

r = redis.StrictRedis(host='localhost', port=6379, db=0)

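while True:
    # Each datagram is expected to carry one complete, gzip-compressed GELF message.
    data, addr = sock.recvfrom(8192)
    payload = zlib.decompress(data, 16+zlib.MAX_WBITS)  # 16+ selects gzip framing
    data = json.loads(payload.rstrip(b'\0'))  # drop the trailing NUL byte, if any

    tstamp = data['_timestamp'][0:10]       # "2015-02-18"

    # Anything that does not mention imap is counted as pop3.
    proto = 'pop3'
    if 'imap' in data['_message']:
        proto = 'imap'

    username = data.get('_mail_loginuser', 'nop').strip().lower()

    # Per-day, per-protocol counter, e.g. "day-2015-02-18-imap".
    key = 'day-' + tstamp + '-' + proto
    r.incr(key)

    # Running counter per user, e.g. "user-jane.doe@example.net".
    key = 'user-' + username
    r.incr(key)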
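The loop above assumes gzip compression and one datagram per message. As a hedged sketch, the unpacking step could be pulled into a function that accepts either gzip or zlib framing and refuses chunked GELF instead of failing inside `zlib`. The name `decode_gelf` is hypothetical, and whether your Graylog output ever sends zlib-compressed or chunked datagrams is an assumption you would need to check.

```python
import gzip
import json
import zlib


def decode_gelf(datagram):
    """Decode one unchunked, compressed GELF datagram into a dict."""
    # Chunked GELF datagrams start with the magic bytes 0x1e 0x0f.
    if datagram[:2] == b'\x1e\x0f':
        raise ValueError('chunked GELF is not handled in this sketch')
    # 32 + MAX_WBITS lets zlib detect a gzip or a zlib header by itself.
    payload = zlib.decompress(datagram, 32 + zlib.MAX_WBITS)
    # Strip a trailing NUL byte, if present, before parsing the JSON.
    return json.loads(payload.rstrip(b'\0').decode('utf-8'))


# Build a test datagram the way the loop above expects it: gzip plus NUL.
packet = gzip.compress(json.dumps({'host': 'm001', 'level': 6}).encode('utf-8') + b'\0')
print(decode_gelf(packet))
```

Keeping the decoding separate from the socket loop also makes it easy to test without a running Graylog.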

This has been running for a couple of weeks and appears to be quite reliable, even with the volume of messages we’re seeing.

[Image: Graylog]

At the end of the day we can read the counters, which are stored in Redis.

$ redis-cli
127.0.0.1:6379> get "day-2015-03-14-pop3"
"902521"
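Note that the `day-` counters go up once per message, so the number above counts logins rather than distinct users. One possible way to get at unique users per day, and this is a sketch rather than what is running here, is to add each username to a Redis set per day and ask for its cardinality with `SADD` and `SCARD`. The `users-` key prefix and the function names are made up for illustration, and `InMemorySets` is only a stand-in so the example runs without a Redis server; a `redis.StrictRedis` client offers `sadd` and `scard` methods of the same shape.

```python
class InMemorySets:
    """Stand-in for a Redis client, offering only sadd and scard."""

    def __init__(self):
        self._sets = {}

    def sadd(self, name, *values):
        members = self._sets.setdefault(name, set())
        before = len(members)
        members.update(values)
        return len(members) - before      # number of newly added members

    def scard(self, name):
        return len(self._sets.get(name, ()))


def record_login(client, day, proto, username):
    # A set ignores duplicates, so repeated logins by one user count once.
    client.sadd('users-' + day + '-' + proto, username)


def unique_users(client, day, proto):
    return client.scard('users-' + day + '-' + proto)


client = InMemorySets()
for user in ['jane.doe@example.net', 'john@example.net', 'jane.doe@example.net']:
    record_login(client, '2015-03-14', 'pop3', user)

print(unique_users(client, '2015-03-14', 'pop3'))   # 2
```

With very many users per day a set can get large. Redis also has HyperLogLog (`PFADD` and `PFCOUNT`), which gives an approximate count in a small, fixed amount of memory.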