DSC is probably the de facto standard for monitoring queries to DNS servers, certainly in larger environments. It consists of so-called collectors which sniff DNS packets and periodically dump them as XML files into the file system. On the other side, the DSC presenter creates graphs for long-term viewing. While DSC is easy to set up, it’s not everybody’s cup of tea; some find it too heavyweight, others not dynamic enough. Indeed, DSC is primarily useful as a “hindsight” utility.

Unfortunately, there’s no standard for query logging. BIND9 has optional query logging which can be switched on or off on demand, while NSD leaves us pretty much out in the rain, as do most of the other server brands.

I’ve been working quite intensively with Logstash and ElasticSearch, and it occurred to me that these two utilities could be a bridge into logging DNS queries. Kibana, the front-end utility to logs stored by Logstash into ElasticSearch, is certainly good-looking enough to serve as a monitor.

Overview

I’m pretty familiar with the format of zone master files, but I’m no whiz at decoding DNS packets, so I went shopping for a utility I could mess about with. Passivedns, by Edward Bjarte Fjellskål, does what I want. I’ve forked the tool and have added bits and pieces of random code for this proof of concept. The credit for the hard work goes to its original author, of course. My version, which I’ve dubbed stash53, is probably broken beyond repair, and that is purely my fault.

My original intention was to grab DNS queries, encode them into something more or less printable, and publish them via MQTT (read more). This would allow any number of subscribers to grab the data on the fly and do something useful with it. I thought I’d simply pass some of that into my mqtt2graphite to visualize things. In order to keep things simple, and to be able to debug easily, I chose JSON as the transport format. (MessagePack et al. would probably be better, as they reduce the size of the data.)

The more familiar I got with Logstash, the more the idea appealed to me of having Logstash collect the data itself. However, Logstash cannot subscribe to MQTT directly, but it does have inputs for ZeroMQ and for Redis, so I added what I call “emitters” for those as well.

If you’re still following, and I hope you are, what we now have is the following:

Stash53 architecture

stash53 sniffs packets, wraps them in JSON, and, at your choice, pushes them into a Redis list or publishes them via ZeroMQ or MQTT. An example stash53 JSON-wrapped packet:

{
    "d_addr": "192.168.1.10", 
    "error": true, 
    "n": 1, 
    "nsid": "hippo-B9", 
    "qname": "www.example.com.", 
    "qtype": "A", 
    "rcode": "NXDOMAIN", 
    "s_addr": "192.168.1.130"
}
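Since the payload is plain JSON, any subscriber can pick these events apart in a few lines. As a hypothetical illustration (not part of stash53 itself), this Python snippet decodes the example event above and reports failed lookups; the field names are exactly those stash53 emits:

```python
import json

# The stash53 event from the example above, as a subscriber would receive it
payload = '''{
    "d_addr": "192.168.1.10",
    "error": true,
    "n": 1,
    "nsid": "hippo-B9",
    "qname": "www.example.com.",
    "qtype": "A",
    "rcode": "NXDOMAIN",
    "s_addr": "192.168.1.130"
}'''

event = json.loads(payload)

# Report queries which didn't resolve cleanly
if event["error"]:
    print("%(s_addr)s asked %(nsid)s for %(qname)s (%(qtype)s): %(rcode)s" % event)
```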

Logstash is configured to read from the Redis list. It pulls in the individual JSON-formatted events, chases them through the GeoIP filter, and passes them to ElasticSearch for storage.

input {
    redis {
        type              => "dns"
        host              => 'localhost'
        port              => 6379
        db                => 0
        data_type         => 'list'
        key               => 'dns:hippo'
        format            => 'json'
        message_format    => "%{s_addr} %{qname} (%{qtype})"
    }
}

filter {
    date {
        match       => [ "timestamp", "UNIX" ]
        add_tag     => 'dated'
    }

    geoip {
        type        => 'dns'
        field       => 's_addr'
        add_tag     => [ 'geo' ]
        database    => '/home/jpm/logstash/geoip/GeoIP.dat'
    }
}

output { 
    elasticsearch {
        cluster     => 'logstash'
        host        => 'localhost'
    }
}
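The Redis side of this is pleasantly simple: the emitter effectively does an RPUSH of each JSON string onto the `dns:hippo` list, and Logstash pops events off the other end. Purely for illustration (this is not the emitter’s actual code), here’s a sketch of what such an RPUSH looks like on the wire in Redis’ RESP protocol, constructed without an actual connection:

```python
def encode_rpush(key, value):
    """Encode an RPUSH command as a RESP multi-bulk message,
    i.e. what a Redis client would write to the socket."""
    parts = [b"RPUSH", key.encode(), value.encode()]
    cmd = b"*%d\r\n" % len(parts)          # number of arguments
    for p in parts:
        cmd += b"$%d\r\n%s\r\n" % (len(p), p)  # length-prefixed bulk string
    return cmd

# A trimmed-down stash53 event pushed onto the list Logstash reads from
wire = encode_rpush("dns:hippo", '{"qname": "www.example.com.", "qtype": "A"}')
```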

You can see what the result looks like in the image above.

Kibana lets me click on a single event, which shows all the details of the capture:

The JSON for each of these events is stored in ElasticSearch and looks like this. (The additional fields are added by Logstash when the event passes through it. In particular, the GeoIP Logstash filter will add geographical information if it exists for the source address.)

{
    "@fields": {
        "answer": "82.165.102.119", 
        "d_addr": "192.168.1.10", 
        "error": false, 
        "geoip": {
            "continent_code": "--", 
            "country_code": 0, 
            "country_code2": "--", 
            "country_code3": "--", 
            "country_name": "N/A", 
            "ip": "192.168.1.10"
        }, 
        "ipv6": false, 
        "n": 85, 
        "nsid": "hippo-B9", 
        "qclass": "IN", 
        "qname": "www.ww.mens.de.", 
        "qtype": "A", 
        "rrprint": "www.ww.mens.de.\t86400\tIN\tA\t82.165.102.119\n", 
        "s_addr": "192.168.1.130", 
        "tld": "de", 
        "ttl": 86400
    }, 
    "@message": "192.168.1.130 www.ww.mens.de. (A)", 
    "@source": "default", 
    "@source_host": null, 
    "@source_path": "default", 
    "@tags": [
        "geo"
    ], 
    "@timestamp": "2013-05-27T09:45:31.916Z", 
    "@type": "dns"
}
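Note the `"--"` country codes above: those queries came from an RFC 1918 address, for which GeoIP has nothing to say. A consumer of the stored events could tally countries itself; this hypothetical example uses made-up sample events with the same field layout as the document above:

```python
from collections import Counter

# Made-up sample events shaped like the ElasticSearch documents above
events = [
    {"@fields": {"s_addr": "192.168.1.130", "geoip": {"country_code2": "--"}}},
    {"@fields": {"s_addr": "192.0.2.1",     "geoip": {"country_code2": "DE"}}},
    {"@fields": {"s_addr": "192.0.2.2",     "geoip": {"country_code2": "DE"}}},
]

# Count queries per country code, just as Kibana's pie chart does
countries = Counter(e["@fields"]["geoip"]["country_code2"] for e in events)
```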

The map you see in the screen shot is built directly by Kibana from the GeoIP data we have in the events.

Closeup of the map

The same goes for the pie chart with a list of countries that have queried my servers. Both the map and the pie chart are updated automatically as new events arrive!

Closeup of the CC list

stash53 also attempts to log response codes, so we can see what is going on. The following screen shot depicts Kibana’s events table showing the destination address (d_addr), query type (qtype) and response code (rcode) fields. I simply instructed Kibana to display these with a click on the corresponding field.

Showing RCODEs
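Tallying these fields outside of Kibana is equally trivial. A hypothetical example, again over made-up sample events, counting response codes per destination address:

```python
from collections import defaultdict

# Made-up sample events carrying the fields shown in Kibana's table
events = [
    {"d_addr": "192.168.1.10", "qtype": "A",    "rcode": "NOERROR"},
    {"d_addr": "192.168.1.10", "qtype": "AAAA", "rcode": "NXDOMAIN"},
    {"d_addr": "192.168.1.10", "qtype": "A",    "rcode": "NXDOMAIN"},
]

# Count (destination, rcode) pairs
rcodes = defaultdict(int)
for e in events:
    rcodes[(e["d_addr"], e["rcode"])] += 1
```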

The Logstash/ElasticSearch combo ought to sustain around 2000 events/sec on commodity hardware; at least, that’s the kind of throughput I can store on the box it’s running on at the moment, an HP MicroServer.

I have no idea what stash53 is able to sustain, as I don’t have access to a server that receives a tremendous volume of queries. I was able to run stash53 on one of the NixSPAM mirrors for a day, and it survived the 1000 qps pretty bravely. Be that as it may, this combo will certainly not be able to cater to a TLD. :-)

I’ve been running stash53 on different DNS server brands for a week now (BIND9, PowerDNS authoritative, PowerDNS Recursor, Knot, Unbound, and NSD), and it’s holding out pretty nicely. The program still needs a lot of love, particularly in terms of error-handling, and it can probably be greatly improved upon by somebody who has the time and the inclination.

Make it happen. :-)

Further reading:

DNS, Logstash, ElasticSearch, and pcap :: 27 May 2013