Ansible provides hooks for running custom callbacks on the management machine (not the nodes) as it invokes modules. These callbacks allow us to trace what Ansible is doing, log operations it is starting on or has completed, and (most importantly to me) collect results from modules it has run. (Ever since Ansible stopped storing facts it collected in a setup file on the target node, I’ve been yearning to get at that data. What has been possible all along is to manually run the setup module, dump that into a file and carry on from there, but it’s a bit messy.)

Update: Hot off the mailing-list press, the following collects all facts from all machines and dumps them into the specified directory.

ansible all -m setup --tree /tmp/dump_path
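To carry on from such a dump in Python, something like the following might do. It is a minimal sketch that assumes `--tree` wrote one JSON file per host, named after the host, with the facts under the `ansible_facts` key; `load_facts` is a made-up helper name, not part of Ansible.

```python
import json
import os


def load_facts(dump_path):
    """Return {host: facts} for every JSON file found in dump_path.

    Assumes one file per host, as written by `--tree`; anything that
    is not a file holding a JSON object is skipped.
    """
    inventory = {}
    for name in sorted(os.listdir(dump_path)):
        path = os.path.join(dump_path, name)
        if not os.path.isfile(path):
            continue
        try:
            with open(path) as fh:
                result = json.load(fh)
        except ValueError:
            continue  # not JSON, ignore it
        if isinstance(result, dict):
            inventory[name] = result.get('ansible_facts', {})
    return inventory
```

With the dump from above in place, `load_facts('/tmp/dump_path')` would return one entry per host, each holding that host's facts.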

Ansible’s callback plugins are poorly documented, but after quite a bit of trial and error and error, I’ve been able to obtain the data I’m looking for, experimenting with the callbacks contained in the example noop.py.

For example:

  • playbook_on_start is invoked at, well, yes, Playbook start. :)
  • playbook_on_task_start when a task starts.
  • playbook_on_vars_prompt after a Playbook has prompted the user for variables.
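To find out when each of these hooks fires, a plugin that merely records the calls can help. The sketch below is not taken from noop.py: the argument lists are assumptions (they have varied between Ansible versions), which is why every hook accepts `*args` and `**kwargs`, and the `events` list is purely illustrative.

```python
import time


class CallbackModule(object):
    """Record which playbook hooks fire, and in what order."""

    def __init__(self):
        # Illustrative only: a list of (timestamp, hook name, args)
        self.events = []

    def _trace(self, hook, args):
        stamp = time.strftime('%Y-%m-%d %H:%M:%S', time.localtime())
        self.events.append((stamp, hook, args))
        print('%s %s %r' % (stamp, hook, args))

    # The real signatures are assumed, so each hook takes anything.
    def playbook_on_start(self, *args, **kwargs):
        self._trace('playbook_on_start', args)

    def playbook_on_task_start(self, *args, **kwargs):
        self._trace('playbook_on_task_start', args)

    def playbook_on_vars_prompt(self, *args, **kwargs):
        self._trace('playbook_on_vars_prompt', args)
```

Printing the positional arguments is a cheap way of discovering what a given Ansible version actually passes to each hook.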

In my ansible directory, I drop the following bit of Python into lib/ansible/callback_plugins/inventory.py where Ansible picks it up on its next run.

import time
import sqlite3

dbname = '/etc/ansible/setup.db'
TIME_FORMAT = '%Y-%m-%d %H:%M:%S'

try:
    con = sqlite3.connect(dbname)
    cur = con.cursor()
except sqlite3.Error:
    # Leave both unset; log() then fails quietly instead of breaking the run
    con = cur = None

def log(host, data):
    # Not every result is a dict; there is nothing to record otherwise
    if not isinstance(data, dict):
        return

    # Only results of the setup module carry the facts we want.
    # Read (rather than pop) `invocation` so the result is left untouched.
    invocation = data.get('invocation') or {}
    if invocation.get('module_name') != 'setup':
        return

    facts = data.get('ansible_facts') or {}

    now = time.strftime(TIME_FORMAT, time.localtime())

    try:
        # `host` is a unique index, so REPLACE overwrites the host's old row
        cur.execute("REPLACE INTO inventory (now, host, arch, dist, distvers, sys,kernel) VALUES(?,?,?,?,?,?,?);",
        (
            now,
            facts.get('ansible_hostname', None),
            facts.get('ansible_architecture', None),
            facts.get('ansible_distribution', None),
            facts.get('ansible_distribution_version', None),
            facts.get('ansible_system', None),
            facts.get('ansible_kernel', None)
        ))
        con.commit()
    except Exception:
        # A database problem must never abort the Ansible run
        pass

class CallbackModule(object):
    def runner_on_ok(self, host, res):
        log(host, res)

The callback I’m interested in is called runner_on_ok which is invoked upon a successful run of a module. Each and every module. That means, say, the command module will also end up in here. To obtain results from the setup module only, I inspect the name of the module and return if it isn’t "setup". Once I’ve determined that, I can grab the facts I want to record in my database table.

sqlite> SELECT * FROM inventory;
2012-09-11 12:25:20|hippo|x86_64|CentOS|6.2|Linux|2.6.32-220.17.1.el6.x86_64
2012-09-11 12:32:09|jmbp|i386|NA|NA|Darwin|10.8.0
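The plugin expects the inventory table to exist already. One way of creating it is sketched here, with the column names taken from the REPLACE statement. The column types, the index name and the `create_inventory` helper are assumptions. The unique index on `host` is what lets REPLACE overwrite a host's earlier row instead of appending a new one.

```python
import sqlite3

# Column names follow the REPLACE statement; the types are assumed.
SCHEMA = """
CREATE TABLE IF NOT EXISTS inventory (
    now      TEXT,
    host     TEXT,
    arch     TEXT,
    dist     TEXT,
    distvers TEXT,
    sys      TEXT,
    kernel   TEXT
);
CREATE UNIQUE INDEX IF NOT EXISTS inventory_host ON inventory (host);
"""


def create_inventory(dbname):
    """Create the table and its unique index if they are missing."""
    con = sqlite3.connect(dbname)
    con.executescript(SCHEMA)
    con.commit()
    return con
```

Calling `create_inventory('/etc/ansible/setup.db')` once before the first run should be enough; calling it again is harmless because of the IF NOT EXISTS clauses.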

If I wanted to obtain results from fact modules we create ourselves, I’d have to explicitly add a check for the appropriate module name: these custom facts aren’t merged in at setup time. What is automatically merged into ansible_facts, however, are “facts” obtained from facter and/or ohai if these utilities are installed on the nodes.
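Such a check could be a small membership test instead of a comparison against a single name. In this sketch `site_facts` is a hypothetical custom fact module and `wanted` a made-up helper; the layout of the result dict (an `invocation` entry carrying `module_name`) is the one the plugin above relies on.

```python
# Modules whose results should be recorded; 'site_facts' is hypothetical.
FACT_MODULES = frozenset(['setup', 'site_facts'])


def wanted(res):
    """Return True if res is the result of one of FACT_MODULES."""
    if not isinstance(res, dict):
        return False
    invocation = res.get('invocation') or {}
    return invocation.get('module_name') in FACT_MODULES
```

In runner_on_ok, the result would then only be handed on for recording when `wanted(res)` is true.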

Exporting the data we obtained to CSV and formatting it “pour le chef” is a cinch:

#!/bin/sh

sqlite3 /etc/ansible/setup.db <<EOF
.headers on
.mode csv
.output setup.csv
SELECT * FROM inventory;
EOF

[Screenshot: CSV in Numbers]

So, you see, as someone said the other day: Ansible is automation that even a manager can understand. :-)