.@ Tony Finch – blog

(Tuesday’s notes) (Wednesday’s notes) (Thursday’s notes) (Epilogue)

This week I have taken time off work to enjoy the beer festival, like I did last year, and again I am planning to do some recreational hacking on my qp tries.

Last year I fiddled with some speculative ideas that turned out to be more complicated and less clearly beneficial than I hoped. I didn’t manage to get it working within one week, and since it wasn’t obviously going to win I have not spent any more time on it. (I’m not a scientist!)

During this year’s beer festival I’m seeing if I can help the Knot DNS developers improve Knot’s qp trie.

Monday morning

My start was delayed a bit because I needed to deploy the BIND security release that fixed a crash bug I reported in March.


My general plan is to work on reducing Knot’s memory use for UPDATE and IXFR, both of which involve incremental changes to a DNS zone. At the moment, Knot makes a clone of the zone, modifies the clone, then discards the old version. The clone doubles the zone’s memory usage, which can be painful if the zone is big.

My aim is to add copy-on-write (COW) updates, so that the memory usage is proportional to the size of the update rather than the size of the zone. Operators will still have to size a server to allow for double memory during whole-zone updates; the aim is to make frequent small updates cheaper.

Pre-flight checks

On Friday last week I discussed my ideas with the Knot developers to confirm that my plan is definitely something they are also interested in, and Vladimír Čunát (who adapted qp tries into Knot) had some very useful suggestions.

My first step today was to build Knot and make sure I could run its test suite. This was pleasingly easy :-)

Reading and thinking

Most of the rest of the afternoon I spent reading the code, understanding the differences between Knot’s qp code and mine, and thinking about how to add COW. It’s difficult to measure progress at this stage, since it’s mostly throwing away half-formed ideas that turn out to be wrong when I understand the problem better.

The importance of context

The most obvious difference between Knot’s qp code and mine is the API. My API was bodged without any particular context to shape it, other than to provide a proof of concept for the data structure and something that a test suite could use.

This COW work requires extending Knot’s trie API, and Knot has an existing infrastructure for zone changes that will use this new API.

This context is really helpful for clarifying the design.

A half-formed bad idea

In the run-up to this week I had been aiming for a reasonably self-contained project that I could make decent progress on in a few beery days. But it looks like it will be more complicated than that!

Today, when I was thinking about the edge cases and fenceposts and invariants for COW qp tries, I started off thinking of it as a purely internal problem - internal to the data structure.

In a qp trie, branches are complicated (and have space for extra COW bits), whereas leaf nodes are dead simple. The overall layout of the trie, and its memory use, are mostly constrained by the layout of a simple leaf node: a pair of words containing a key pointer and a value pointer. Making leaves bigger makes the whole trie bigger.

When I was thinking of COW as an internal problem I was scavenging space from branch nodes only. But, for COW to work, we also have to COW the key / value data structures hanging off each leaf, and the application using the data structure has to co-operate with the COW machinery so that it finds out about key / value pairs that got altered or deleted.

So the COW metadata can’t be purely internal, it has to extend into the leaf nodes and into the application-specific data.

How this relates to Knot

In Knot, the trie is a map from DNS names to the list of RRsets owned by that name. It’s possible to push COW a few levels beyond the trie into these RRsets, but that’s something to leave for later.

What’s unavoidable is keeping track of which leaf nodes were modified or deleted during a COW update - we have to know when to copy or free these RRsets.

Keeping that information inside the trie almost certainly requires making leaf nodes bigger, which defeats the goal of reducing memory use.


So I have gone back to an earlier half-formed idea, that I should use an auxiliary list of modified or deleted nodes - interior branches or application leaves - that only exists during updates, and so does not bloat the trie in normal read-only use.

Maybe this will allow me to draw a line to keep the scope of this project inside the qp trie data structure (and maybe inside a week), without getting lost in the weeds of a large application.