In a previous entry, I wrote about making DNS name decompression faster by moving work left on this diagram so that we do less of it:
names < pointers < labels < bytes
Last week I had a bright idea about that leftmost step, moving per-pointer work to per-name, using some dirty tricks. Sadly the experiment was not successful, because it also increased the per-label work. Nevertheless I think it’s interesting enough to be worth writing about.
This article is part of a series on optimizing DNS names:
- tolower() in bulk at speed
- a DNS name compression algorithm
- faster DNS name decompression
- BIND zone transfer performance
- slower DNS name decompression
- tolower() with AVX-512
- tolower() small string performance
bright idea
Remember that a DNS name compression pointer refers to a “prior occurance” of a name. We have already parsed prior names, so can we re-use the work we already did, instead of re-doing it?
potential wins
The main effect of this idea is that we can stop parsing the name as soon as we get to the first pointer. We no longer need to chase pointers, so the DNS name parsing loop can be simpler and faster.
It’s relatively common for a DNS name to be nothing but a pointer, i.e. an exact match for a prior name, not a suffix. For instance, this happens when there are multiple records in an RRset: each record has the same owner name. The DNS message parser needs to collect these records into an RRset, so it needs to match owner names. That becomes easier if the name parser has already found a match.
I also thought I might be able to harden the DNS name parser against maliciously long chains of pointers, by a very strict reading of RFC 1035: a compression pointer must point to a prior name that we have already parsed.
unknown RRtypes
Sadly, hardening a DNS name parser is not that easy. A message can contain DNS records of a type that we don’t understand, and we have to treat those records as opaque blobs. But those records can contain names, which we will not have parsed. And the sender can legitimately use those names as compression pointer targets. (Thanks to Peter van Dijk for pointing this out.)
This also means that we still need a full-fat name parser that chases pointers to unknown targets. I moved it out of the hot path so it is only used in awkward cases, so the common cases can still benefit from simplification.
matching pointers
We need a data structure for looking up compression pointer targets. It needs to be very fast, ideally faster than chasing pointers, or at least negligibly slower than chasing one pointer. It also needs to avoid using lots of memory. It is difficult to satisfy both of these requirements using a data structure that is indexed with the compression pointer’s value.
Instead we use a dirty and dangerous trick.
After we have parsed a name, we add the position and name of each possible compression pointer target (the start of each label) to an array. The label length octet in the message is overwritten by its array index.
To look up a compression pointer, we get the octet in the message at the pointer’s target, and use that as the array index. We cross-check by verifying that the position stored in that array slot matches the pointer target.
As I was starting this experiment, I wrote to myself in a comment, “This is probably unwise.” Yes. Yes, it is.
roll-back
There are a couple of situations where we need to undo our message mangling.
-
After parsing a message, we may need to verify a
TSIG
orSIG0
message signature, so we need to completely clean up when parsing is done. -
BIND parses records into a target buffer; when this runs out of space, BIND allocates some more space and re-parses the record. In the out-of-space recovery path, we need to roll back to before the record so it can be re-parsed.
Our “prior occurance” array contains pointers to parsed names; the octet value that we overwrote with the array index was originally the first octet of the name we found at that point. So it isn’t too difficult to clean up the mess.
A partial roll-back is also not too hard: we can work backwards from the end of the array, because slots are added in the order they appear in the message.
fall-back
I mentioned earlier that we still need a full-fat DNS name parser to cope with awkward pointers that do not refer to a name we saw before.
This fall-back parser needs to be slightly more enriched than a normal name parser, because it needs to check the array for known pointer targets, otherwise it will be derailed by the mess we made of the message.
fatal flaws
So I got all this working, and the performance was terrible.
There were a couple of things that made re-using work more costly than re-doing it:
-
To add entries to the “prior occurance” array, we need to loop over the labels of the name a second time after parsing them. The array slots need to know the length of the name, which we don’t know until after we have parsed it.
If, instead, we fill the array slots eagerly, we we would still need to go over them again to update the slots with the name’s length. And we would need to roll back when there is a parsing error — I don’t know if this would need another mechanism in addition to the record-level and message-level roll-backs.
-
The size of the “prior occurance” array is limited to 256 slots (one for each possible octet value). This is not large enough to hold every suffix of every name in a large message in the zone transfers I was testing with.
This means the more complicated fall-back parser was used a lot. We were not re-using as much work as we could have.
And the potential win from quickly matching owner names was not in fact very big, because if (like BIND) you scan backwards for matching names, you’ll either get a hit on the first one or not at all — and the first one was the case I was optimizing.
maybe search?
I considered (but I didn’t implement) another strategy: instead of mangling the message to get fast pointer matches, search the “prior occurance” array for a matching pointer.
On the up side, that can require less work adding entries to the array: we can have a slot per name, instead of a slot per label.
On the down side, it isn’t clear that searching the array will be faster than chasing compression pointers. Even after reducing it from per-label to per-name, the array is likely to grow to 100 or 200 entries, and take 7 or 8 steps of a binary search, which is more than the typical number of pointers in a name.
On the up side, we can ensure we don’t run out of slots in the array, so we can avoid using the fall-back parser as much as possible.
On the down side, once we have found a name, we need to search inside it when the pointer refers to a suffix of the name.
I am not going to try this out, because there are easier performance wins to be found elsewhere in BIND’s message parser.
interned names?
At IETF 115 in London I chatted with Perry Lorier. His DNS code interns labels, which is not a million miles away from my bright idea.
I remain mildly skeptical that interning DNS names (or parts of names)
is a win overall, but I have not thought through the effects it might
have on a DNS server as a whole. Choices like this are what people at
Microsoft call “peanut butter”: once you have chosen to put peanut
butter in your software sandwich, it pervades the whole thing, and
any other optimizations fillings will have much less effect on the
performance flavour.
anyway
I enjoyed using a different take on the trick of making the DNS message part of the data structure, like my DNS name compression algorithm.
And I like the trick of using cross-checks between two semi-co-operative arrays when you aren’t sure an index is valid. (I wish I could remember some other examples of this trick — some kind of hash map, I think?)
So it was a fun hack, even though it wasn’t useful.
p.s.
Spelling pedants should check RFC 1035 for the way Paul Mockapetris spells “occurance” :-)