Earlier this year, I rewrote BIND’s DNS name compression algorithm. That work is in the November release, BIND 9.19.7.
Last week I rewrote the reverse algorithm, DNS name decompression, and made it about 2x faster. I enjoyed deleting this comment that dates back to January 1999:
```c
/*
 * Note: The following code is not optimized for speed, but
 * rather for correctness.  Speed will be addressed in the future.
 */
```
So here is a collection of folklore and tricks related to DNS name decompression.
How are DNS names represented in memory?
Almost all the different implementations that I have looked at use the (uncompressed) DNS wire format. It might seem like a trivial or simplistic answer, but DNS names are so small that it’s hard to do better. In particular, most DNS labels are smaller than a 64-bit pointer so adding a layer of indirection is likely to be bad for size as well as locality.
Actually, that’s an over-simplification.
Alongside the wire format name, there is usually a little vector holding the offsets in the name of each label. Each offset only needs 1 byte (because names are at most 255 bytes long) and there are at most 128 of them.
(Implementations vary in whether they store offsets in increasing or decreasing order.)
So one of the jobs of the DNS name decompression code is to set up this array of label offsets.
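For illustration, here is a minimal sketch of such a representation. The struct and function names (`dns_name_sketch`, `name_index_labels`) are my own inventions, not BIND's actual types, which differ in detail; this just shows the shape of the offsets vector and how it gets filled in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical layout (not BIND's actual struct): the uncompressed
 * wire-format name plus a vector of label offsets. A name is at most
 * 255 bytes; with minimal 1-byte labels, 127 labels plus the root
 * label fit in 255 bytes, so 128 offsets are always enough.
 */
typedef struct {
    uint8_t wire[255];    /* uncompressed wire format */
    uint8_t length;       /* bytes used in wire[] */
    uint8_t offsets[128]; /* offset of each label within wire[] */
    uint8_t labels;       /* number of labels, including the root */
} dns_name_sketch;

/* Fill in the offsets vector by walking the wire-format labels. */
static bool
name_index_labels(dns_name_sketch *name)
{
    unsigned pos = 0;
    name->labels = 0;
    while (pos < name->length) {
        uint8_t len = name->wire[pos];
        if (len > 63)
            return false; /* compression pointer: not allowed here */
        name->offsets[name->labels++] = (uint8_t)pos;
        if (len == 0)
            return true; /* the root label ends the name */
        pos += len + 1;
    }
    return false; /* ran off the end without seeing the root */
}
```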
The DNS is case-insensitive for ASCII letters, and traditionally it makes some attempt to preserve case (though that is often futile).
However, DNSSEC requires canonicalizing to lower case in certain situations, so some DNS implementations smash case before doing anything else with a name.
BIND is in the case-preserving camp.
One of the most common traps that DNS implementations fall into is chasing compression pointers without limit. When they receive a packet where the pointers form a loop, they sit chewing CPU and doing nothing else.
One way to avoid the bug is to set a limit on the number of pointers allowed in a name. However, I am not sure there is any agreement on what that limit should be.
Close reading of RFC 1035 section 4.1.4 reveals that DNS name compression pointers must point earlier in the message. So another solution is to require that each pointer takes us strictly backwards: every chase then makes progress towards the start of the message, so loops are impossible.
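Here is a sketch of that rule in action. This is my own illustration, not BIND's code, and it is slightly stricter than the RFC's bare wording: each pointer must target an offset strictly before the start of the run of labels it was found in, which is the invariant that actually guarantees termination.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Sketch (not BIND's code): walk a compressed name inside a DNS
 * message, requiring every compression pointer to refer strictly
 * before the start of the current run of labels. Each hop therefore
 * strictly decreases the offset we read from, so pointer loops are
 * impossible and no hop counter is needed.
 */
static bool
name_walk(const uint8_t *msg, size_t msglen, size_t start)
{
    size_t pos = start;
    size_t seq_start = start; /* where the current run of labels began */
    while (pos < msglen) {
        uint8_t len = msg[pos];
        if (len == 0) {
            return true; /* root label: end of name */
        } else if (len < 64) {
            pos += len + 1; /* ordinary label */
        } else if (len >= 192) {
            if (pos + 1 >= msglen)
                return false; /* truncated pointer */
            size_t target = (size_t)(len & 63) << 8 | msg[pos + 1];
            if (target >= seq_start)
                return false; /* must point strictly backwards */
            pos = seq_start = target;
        } else {
            return false; /* 0x40 and 0x80 label types are reserved */
        }
    }
    return false; /* ran past the end of the message */
}
```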
BIND’s old decompression code was based on a loop that examined the name one byte at a time.
BIND’s new decompression code has a loop that examines the name one label at a time — and tries to do as much work as possible one pointer at a time. When a name is uncompressed, much of the work can be done once at the end of the name.
The straightforward way to smash a DNS name to lower case is to iterate over the name one label at a time and tolower() each byte inside each label.
Another example of doing less is to observe that label lengths are at most 63, and ASCII upper case letters are at least 65, so they cannot be confused. This means your code can run a fast tolower loop over the whole name, and make better use of memory bandwidth.
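A sketch of that whole-name loop (my own code, not BIND's): because length bytes are at most 63 and 'A' is 65, the loop can treat length bytes and label bytes identically.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Sketch: lower-case an entire uncompressed wire-format name in one
 * pass, without splitting it into labels first. Length bytes are at
 * most 63 and ASCII 'A'..'Z' is 65..90, so a length byte can never
 * be mistaken for an upper-case letter.
 */
static void
name_tolower(uint8_t *wire, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (wire[i] >= 'A' && wire[i] <= 'Z')
            wire[i] += 'a' - 'A';
    }
}
```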
Let’s use the same trick again!
Most of the DNS name decompression code I have seen copies each label one at a time from the DNS message to the internal name buffer.
But take another look at 2b: we have a handy marker at the start of the sequence of labels that we just scanned. So instead of copying one label at a time, we can do the copying one pointer at a time, using memmove() for everything between the marker and the position of the pointer.
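Here is a sketch of how per-pointer copying might look. This is my own illustration with an invented interface (`name_decompress`), not BIND's code: whenever a pointer (or the root label) is reached, everything scanned since the marker is flushed to the output buffer with a single memmove().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Sketch of copying per pointer instead of per label: "marker" is
 * where the current run of labels started in the message; when we
 * hit a pointer or the root, flush the whole run with one memmove.
 * Pointers must go strictly backwards, so loops are impossible.
 */
static bool
name_decompress(const uint8_t *msg, size_t msglen, size_t start,
                uint8_t *name, size_t *namelen)
{
    size_t pos = start;
    size_t marker = start; /* start of the current run of labels */
    size_t out = 0;

    while (pos < msglen) {
        uint8_t len = msg[pos];
        if (len < 64) {
            pos += len + 1; /* plain label, or root when len == 0 */
            if (out + (pos - marker) > 255)
                return false; /* name too long */
            if (len == 0) {
                memmove(name + out, msg + marker, pos - marker);
                *namelen = out + (pos - marker);
                return true;
            }
        } else if (len >= 192 && pos + 1 < msglen) {
            /* flush the labels scanned since the marker */
            memmove(name + out, msg + marker, pos - marker);
            out += pos - marker;
            size_t target = (size_t)(len & 63) << 8 | msg[pos + 1];
            if (target >= marker)
                return false; /* pointers must go strictly backwards */
            pos = marker = target;
        } else {
            return false; /* reserved label type or truncated pointer */
        }
    }
    return false; /* ran past the end of the message */
}
```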
And if your DNS server prefers to smash case, it can use trick 4 when copying multiple labels in one go.
I would be interested to hear which other DNS implementations decompress names by copying all labels between each pointer in one shot, instead of one label at a time.
It’s pleasing that the marker does two jobs for us: a two-for-one!
For each label, the new code records the label’s offset in the offsets array and checks the name against its maximum length.
There’s no need for a bounds check on the offsets array, because it is big enough to take all the labels that can fit in a maximum-length name. So the name bounds check also acts as the offsets bounds check.
If (unlike BIND) your DNS name decompressor does not need to collect label offsets, then you can move the name length check from once-per-label to once-per-pointer.
This is more like a one-for-two, because it eliminated a variable.
DNS names are self-delimiting, which means you have to parse a name to find out where the next thing in the message is. The spec for compressed names says (somewhat obliquely) that the end of a compressed name is immediately after the first pointer.
The old code had two variables, one for the end of the name (incremented one byte at a time) and one for its maximum extent (checked for every byte). The new code does less! It has a single variable recording how much of the message was consumed, and it is set once, when the first pointer is seen.
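The idea can be sketched like this (my own code with an invented interface, not BIND's; zero stands in for “not yet set”): the consumed count is assigned at most once, at the first pointer, or at the root label if the name turns out to be uncompressed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Sketch: find how many message bytes a (possibly compressed) name
 * consumes. The end of a compressed name is immediately after its
 * first pointer, so "consumed" is set at most once, when the first
 * pointer is seen, instead of being tracked for every byte.
 */
static bool
name_consumed(const uint8_t *msg, size_t msglen, size_t start,
              size_t *consumed)
{
    size_t pos = start;
    size_t seq_start = start;
    *consumed = 0; /* zero means "not set yet" */

    while (pos < msglen) {
        uint8_t len = msg[pos];
        if (len == 0) {
            if (*consumed == 0)
                *consumed = pos + 1 - start; /* uncompressed name */
            return true;
        } else if (len < 64) {
            pos += len + 1; /* ordinary label */
        } else if (len >= 192 && pos + 1 < msglen) {
            if (*consumed == 0)
                *consumed = pos + 2 - start; /* ends after first pointer */
            size_t target = (size_t)(len & 63) << 8 | msg[pos + 1];
            if (target >= seq_start)
                return false; /* pointers must go strictly backwards */
            pos = seq_start = target;
        } else {
            return false; /* reserved label type or truncated pointer */
        }
    }
    return false; /* ran past the end of the message */
}
```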
I also wrote a long comment with evil ASCII art to explain why, when the message comes from a twisted miscreant, the end of a name is not necessarily its maximum extent in the message.
If you have been reading the code alongside these notes, you will have seen that it mostly uses pointers. (And for trick 7, I used a null pointer to indicate that consumed is still unset.)
It would perhaps be nicer to use indexes into the buffer instead of pointers, but the compiler produced better code from pointers: I think because a pointer requires fewer registers than a base plus an index, and there are a lot of variables to juggle in a relatively small loop.
The new code is typically 2x faster than the old code; for names with many small labels it is 1.4x faster, and for names with few long labels it can be 15x faster. And the new code is 2x smaller: I deleted 173 lines and added 72 lines of code and 90 lines of comments (including a couple of ASCII art diagrams and a reference to the Rocky Horror Show).
I’m pleased with the result, and grateful to my colleague Petr Špaček for helping to make it better.
I have said (publicly, several times, in several places) that the first thing I look at in new DNS code is the name decompression algorithm, to see if it properly protects against compression pointer loops.
So it would be deeply embarrassing for me to screw it up!
```c
/*
 * *** WARNING ***
 *
 * dns_name_fromwire() deals with raw network data. An error in this
 * routine could result in the failure or hijacking of the server.
 */
```
As well as passing the existing tests, I added a
fuzz test that ensures the new code produces identical results
to the old code (including side-effects on buffers), and ran it with a
good variety of sanitizers.
How about the deleted comment I quoted at the top of this post?
The replacement code now (nearly 25 years later!) has speed as well as correctness. And I think it’s simpler too — except, perhaps, for the way I pulled trick #5.