Migrating to BIND9 dnssec-policy

Here are some notes on migrating a signed zone from BIND’s old auto-dnssec to its new dnssec-policy.

I have been procrastinating this migration for years, and I avoided learning anything much about dnssec-policy until this month. I’m writing this from the perspective of a DNS operator rather than a BIND hacker.

migrating from auto-dnssec
why I need a custom policy
matching an existing policy
more policy details
preparing the key files
the big config change
DONE

This is the second article in a three-part series:

Introducing BIND9 dnssec-policy
Migrating to BIND9 dnssec-policy (this post)
BIND9 dnssec-policy appendices

migrating from auto-dnssec

My aim is to move my zones from old-style auto-dnssec to new-style dnssec-policy with minimal disruption.

Specifically, I want to continue treating my DNSSEC keys as static configuration. I will port my existing keys over to dnssec-policy without any rollovers, and give them an unlimited lifetime so that named does not try to replace them.

One change at a time! Maybe later on I will implement a more dynamic dnssec-policy.

risks to avoid

My fear with moving to dnssec-policy is that it can trigger an accidental key rollover or even an algorithm rollover. There are two possible causes:

The configured dnssec-policy does not match the existing keys.
The dnssec-policy machinery inside named misunderstands the state of the existing keys.

I’ll explain how to deal with them in turn, after a few preliminaries.

things to know

My previous blog post introducing dnssec-policy covers some basics, including:

the rndc dnssec command
debug logging
dnssec-policy state names

prior preparations

My zones are (mostly) using algorithm 13 (ECDSA P256 SHA256) since I did an algorithm rollover a few years ago. If you are following along at home, and you are still using RSA keys, you can upgrade them using my DNSSEC algorithm rollover HOWTO before upgrading to dnssec-policy. I’m not going to investigate algorithm rollovers with dnssec-policy right now.

which version

I upgraded my primary DNS server to the latest Debian Stable (bookworm, 12.5) before this process, so I’m using BIND 9.18.24. Although I have not tried it, I think the procedure described below should work with BIND 9.16 as well – but 9.18 has the advantage of being an LTS release with more bug fixes.

test jig

I adapted my Ansible setup to make an isolated copy of my primary DNS server on my dev box. I can easily wipe the copy and rebuild it from scratch. I used it to experiment with dnssec-policy and work on the Ansible changes and migration plan in safety.

These notes are based on what I learned from repeatedly breaking and fixing this scratch server.

why I need a custom policy

As we saw with my previous experiments with a scratch zone, BIND’s default dnssec-policy wants a single CSK combined signing key per zone, using algorithm 13 (ECDSA P256 SHA256), with an unlimited lifetime.

It is a sensible default for new setups, but it almost certainly does not match the (implied) policy for a zone using auto-dnssec. BIND’s older tooling preferred zones to be set up with two keys, a ZSK zone signing key and a KSK key signing key.

Although most of my zones match the default algorithm, they don’t match the default CSK keying style, so I still need a custom policy. Oh, and I have other settings that need to be updated to the new style as well.

matching an existing policy

When I examine my key directory, I see two algorithm 13 keys for each zone, as follows. (I’ve abbreviated my shell prompt to :;)

    :; ls -1 Kdotat.at*.key
    Kdotat.at.+013+30212.key
    Kdotat.at.+013+53798.key

So my matching policy looks like:

    dnssec-policy fanf {
        keys {
            ksk lifetime unlimited algorithm 13;
            zsk lifetime unlimited algorithm 13;
        };
        # ... more here ...
    };

it has two keys, one ZSK and one KSK
algorithm 13 matches the existing keys
lifetime unlimited means no rollovers

By itself the dnssec-policy block does not alter the running of any zones, so I can add it to named.conf right away.

more policy details

The following extra settings fill the # ... more here ... space in my dnssec-policy definition.

The documentation for these settings is in the dnssec-policy block in the BIND ARM.

max zone TTL

The dnssec-policy machinery needs to ensure that its state transitions are slower than the relevant TTLs. It has a max-zone-ttl setting that enforces a limit on the TTL of records in the zone.

By default this is 24h, which is fine for my purposes.

But be warned! If your zone has longer TTLs, then named will reject it: the zone will not load and queries will fail.

This error causes log messages that look like:

    zoneload: error: zone fanf2.ucam.org/IN:
            loading from master file fanf2.ucam.org failed: out of range
    zoneload: error: zone fanf2.ucam.org/IN: not loaded due to errors.
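
One way to spot over-long TTLs before named does is to scan a dump of the zone that has an explicit TTL on every record. The following is a minimal sketch, not a tested procedure: the function name ttl_over_limit is made up, and it assumes one record per line with the TTL in the second column, which is what named-checkzone -D and dig axfr normally print.

```shell
# Print any records whose TTL exceeds a limit (default 86400 s = 24h).
# Reads zone data on stdin, one record per line, in the form
#   name TTL class type rdata...
# Comment lines and lines without a numeric TTL are ignored.
ttl_over_limit() {
    awk -v limit="${1:-86400}" '
        /^;/ || NF < 4 { next }
        $2 ~ /^[0-9]+$/ && $2 + 0 > limit + 0 { print }
    '
}

# Usage (assumes the zone file is in the current directory):
#   named-checkzone -D -o - fanf2.ucam.org fanf2.ucam.org | ttl_over_limit
```

No output means every record is within the limit. Pass a different limit in seconds as the first argument if the policy sets its own max-zone-ttl.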

DNSKEY TTL

I normally use a 1 hour TTL, except for “infrastructure” records which I give a 24 hour TTL. Infrastructure records are those that are used for resolution and validation but mostly not queried directly, i.e. NS records, addresses of nameservers, and DNSKEY and DS records.

Previously I used the dnssec-settime -L 24h command on the key files to set the TTL on DNSKEY records. With dnssec-policy that becomes a configuration statement:

    dnskey-ttl 24h;

signature lifetimes

I prefer shorter RRSIG lifetimes than is traditional. With auto-dnssec I adjusted them by putting the following in my named.conf options block:

    sig-validity-interval 10 8; # days

This means that signatures last 10 days, and are regenerated 8 days before they expire, i.e. the zone is re-signed every 2 days.

In dnssec-policy this becomes:

    signatures-refresh 8d;
    signatures-validity 10d;
    signatures-validity-dnskey 10d;

The default signatures-refresh is 5 days. It must be at least the zone’s SOA expire timer plus the max zone TTL, which in my zones is 7 days plus 1 day.

If there is a problem such that a secondary server is unable to refresh its copy of a zone, we want to ensure that the zone expires before its signatures become invalid, so that the secondary server does not serve bogus data.
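
The timing constraint is simple enough to check with shell arithmetic. This sketch uses the values quoted above, converted to seconds; the function name check_sig_timing is hypothetical.

```shell
day=86400

# Succeeds if signatures-refresh is at least SOA expire + max zone TTL.
# Also prints how often the zone gets re-signed (validity - refresh).
# All four arguments are in seconds.
check_sig_timing() {
    validity=$1 refresh=$2 soa_expire=$3 max_ttl=$4
    echo "re-sign interval: $(( (validity - refresh) / day )) days"
    [ "$refresh" -ge $(( soa_expire + max_ttl )) ]
}

# 10 day validity, 8 day refresh, 7 day SOA expire, 1 day max zone TTL
check_sig_timing $((10*day)) $((8*day)) $((7*day)) $((1*day)) &&
    echo "signatures-refresh is long enough"
```

With the default 5 day refresh and the same 7 day SOA expire, the check fails, which is why the refresh is raised to 8 days here.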

The signatures-validity and signatures-validity-dnskey settings control signatures generated by the ZSK and KSK respectively.

other settings

There are several other dnssec-policy settings which mostly relate to rollover timing. Since I have given my keys lifetime unlimited to avoid rollovers, I can leave all the other settings at their defaults.
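
Putting the pieces from the sections above together, the whole policy can be written to a scratch file and syntax-checked on its own. This is a sketch under assumptions: the scratch file name is arbitrary, and it assumes named-checkconf will accept a file that contains nothing but a dnssec-policy block.

```shell
# Write the complete policy to a scratch file (the name is arbitrary).
policy_file=${TMPDIR:-/tmp}/dnssec-policy-fanf.conf

cat > "$policy_file" <<'EOF'
dnssec-policy fanf {
    keys {
        ksk lifetime unlimited algorithm 13;
        zsk lifetime unlimited algorithm 13;
    };
    dnskey-ttl 24h;
    signatures-refresh 8d;
    signatures-validity 10d;
    signatures-validity-dnskey 10d;
};
EOF

# named-checkconf ships with BIND; skip the check if it is not installed.
if command -v named-checkconf >/dev/null 2>&1; then
    named-checkconf "$policy_file"
fi
```

max-zone-ttl is left out because the 24h default is what is wanted here.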

preparing the key files

The plan is to make sure that a zone’s DNSSEC key files contain a complete description of the current state of the zone before enabling dnssec-policy. This should ensure that when dnssec-policy is activated it believes everything is already “omnipresent”, so it will not think that the zone needs to go through any unplanned state transitions.

This is the part that took the most experimentation to work out…

I’ll edit the key files using the dnssec-settime command.

which key is which

I usually check the comments in the .key files to identify which one is the KSK and which is the ZSK:

    :; grep -h signing Kdotat.at.+013+*
    ; This is a key-signing key, keyid 30212, for dotat.at.
    ; This is a zone-signing key, keyid 53798, for dotat.at.
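
The comments are a convenience. The same information is in the flags field of the DNSKEY record itself: 257 for a KSK and 256 for a ZSK. Here is a sketch that reads the flags instead; the function name key_role is made up for illustration.

```shell
# Print "ksk", "zsk", or "unknown" for a K*.key file, based on the
# flags field that follows the DNSKEY type in the record.
key_role() {
    awk '
        /^;/ { next }                      # skip comment lines
        {
            for (i = 1; i < NF; i++)
                if ($i == "DNSKEY") { flags = $(i + 1); exit }
        }
        END {
            if (flags == 257) print "ksk"
            else if (flags == 256) print "zsk"
            else print "unknown"
        }
    ' "$1"
}

# Usage:
#   for k in Kdotat.at.+013+*.key; do echo "$k $(key_role "$k")"; done
```

Any other flags value, such as a revoked key, is reported as "unknown" rather than guessed at.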

zone signing key

My ZSKs did not need any changes, since dnssec-keygen created them with the necessary timing metadata. To verify, I can inspect each ZSK and make sure that I see old times in the first three lines and all other times UNSET, like this:

    :; dnssec-settime -p all Kdotat.at.+013+53798
    Created: Mon Jan 13 14:42:55 2020
    Publish: Mon Jan 13 14:42:55 2020
    Activate: Mon Jan 13 14:42:55 2020
    Revoke: UNSET
    Inactive: UNSET
    Delete: UNSET
    SYNC Publish: UNSET
    SYNC Delete: UNSET
    DS Publish: UNSET
    DS Delete: UNSET

key signing key

Several of my KSKs had missing times. To fix them, I got the key’s creation time using:

    :; dnssec-settime -p all Kdotat.at.+013+30212

Then I set the “sync” (i.e. CDS) and DS publication times to the same as the creation time. This is not historically accurate; it just needs to be sufficiently far in the past that dnssec-policy believes everything is already “omnipresent”.

    :; time='Mon Jan 13 14:42:53 2020'
    :; dnssec-settime -Pds   "$time" Kdotat.at.+013+30212
    :; dnssec-settime -Psync "$time" Kdotat.at.+013+30212
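
With many zones it is tedious to copy the creation time by hand. A small helper can pull it out of the dnssec-settime -p all output so that it can be fed straight back in, as in the commands above. This is a sketch and the function name created_time is hypothetical.

```shell
# Extract the creation time from `dnssec-settime -p all` output on stdin.
created_time() {
    sed -n 's/^Created: //p'
}

# Usage for one KSK (put "echo" in front of the last two commands
# for a dry run):
#   key=Kdotat.at.+013+30212
#   time=$(dnssec-settime -p all "$key" | created_time)
#   dnssec-settime -Pds   "$time" "$key"
#   dnssec-settime -Psync "$time" "$key"
```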

After running these commands, I double check to be sure the output has the same old times in the first three lines, and in the “SYNC Publish” and “DS Publish” lines, and all the others are UNSET.

    :; dnssec-settime -p all Kdotat.at.+013+30212
    Created: Mon Jan 13 14:42:53 2020
    Publish: Mon Jan 13 14:42:53 2020
    Activate: Mon Jan 13 14:42:53 2020
    Revoke: UNSET
    Inactive: UNSET
    Delete: UNSET
    SYNC Publish: Mon Jan 13 14:42:53 2020
    SYNC Delete: UNSET
    DS Publish: Mon Jan 13 14:42:53 2020
    DS Delete: UNSET
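
The same visual check can be automated for both kinds of key. The sketch below reads dnssec-settime -p all output and complains about any line that is set when it should be UNSET, or the other way round. The function name check_key_times and its "zsk"/"ksk" argument are made up for this example.

```shell
# Check `dnssec-settime -p all` output (on stdin) against the expected
# pattern. Role is "zsk" or "ksk".
# Created, Publish and Activate should be set; for a KSK, SYNC Publish
# and DS Publish should be set too; every other line should be UNSET.
# Prints each line that does not fit and exits non-zero if there were any.
check_key_times() {
    awk -F': ' -v role="$1" '
        BEGIN {
            bad = 0
            want["Created"] = want["Publish"] = want["Activate"] = 1
            if (role == "ksk")
                want["SYNC Publish"] = want["DS Publish"] = 1
        }
        NF == 0 { next }
        {
            isset = ($2 != "UNSET") ? 1 : 0
            expected = ($1 in want) ? 1 : 0
            if (isset != expected) {
                print "unexpected: " $0
                bad = 1
            }
        }
        END { exit bad }
    '
}

# Usage:
#   dnssec-settime -p all Kdotat.at.+013+53798 | check_key_times zsk
#   dnssec-settime -p all Kdotat.at.+013+30212 | check_key_times ksk
```

It only inspects the lines it is given, so it will not notice a line that is missing from the output altogether.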

deploy updated keys

After I updated all the KSK “SYNC Publish” and “DS Publish” times in my Ansible repository, I updated the copies on my live primary server.

This caused some zones to get CDS and CDNSKEY records where they were previously missing, but otherwise everything continued as before.

permissions change

In the past I have set up the key directory on my primary servers to be read-only for named.

To prepare for dnssec-policy I had to make the key directory writable. In particular, named will need to create a .state file for each key when I switch a zone to dnssec-policy.

If you have not explicitly set a key-directory, you don’t need to worry about this. The default is to keep keys in named’s working directory, which must be writable.

the big config change

When I am ready to put my new dnssec-policy into effect, it will be a one line change for each zone:

    zone dotat.at {
        type primary;
        file "dotat.at";
        update-policy local;
        # remove this line
        #auto-dnssec maintain;
        # insert this line
        dnssec-policy fanf;
    };

activate the change

When it goes smoothly, named normally logs almost nothing about this change, so for reassurance I want to make sure that debug logging is on before I put the change into effect:

    :; rndc trace 3
    :; rndc reconfig

log messages

I’m not going to quote the log messages verbatim because they are long and repetitive, with variations for each key and each state machine.

There are several messages like:

    DNSKEY dotat.at/ECDSAP256SHA256/30212 (KSK)
             initialize DNSKEY state to OMNIPRESENT (policy fanf)

And several more like:

    KSK dotat.at/ECDSAP256SHA256/30212
            type DNSKEY in stable state OMNIPRESENT

What I want to see here is all the state machines for all the keys going straight to “OMNIPRESENT”.
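
Because the debug output is long, a quick tally of the state names it mentions can be easier to read than the raw log. This is a rough sketch: it assumes each message is a single line in the real log (they are wrapped above for display), that states appear as "state NAME" or "state to NAME" as in the samples, and the function name state_summary is hypothetical.

```shell
# Summarise key state messages on stdin: count how many mention each
# state name, so anything other than OMNIPRESENT stands out.
state_summary() {
    grep -E -o 'state (to )?[A-Z]+' | awk '{ print $NF }' | sort | uniq -c
}

# Usage: select the zone's lines from wherever named logs its debug
# output, then pipe them through state_summary.
```

If the migration went cleanly, the only name in the tally should be OMNIPRESENT.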

In normal operation, with both auto-dnssec and dnssec-policy, you’ll see hourly info log messages for each zone like:

    zone tz.dotat.at/IN: reconfiguring zone keys
    zone tz.dotat.at/IN: next key event: 11-May-2024 13:42:01.407

When dnssec-policy is active, the debug log messages repeat that everything is “in stable state OMNIPRESENT” between each zone’s info messages.

query status

The rndc dnssec -status output confirms everything is everywhere all at once:

    :; rndc dnssec -status dotat.at
    dnssec-policy: fanf
    current time:  Sat May 11 12:51:01 2024

    key: 53798 (ECDSAP256SHA256), ZSK
      published:      yes - since Mon Jan 13 14:42:55 2020
      zone signing:   yes - since Mon Jan 13 14:42:55 2020

      No rollover scheduled
      - goal:           omnipresent
      - dnskey:         omnipresent
      - zone rrsig:     omnipresent

    key: 30212 (ECDSAP256SHA256), KSK
      published:      yes - since Mon Jan 13 14:42:53 2020
      key signing:    yes - since Mon Jan 13 14:42:53 2020

      No rollover scheduled
      - goal:           omnipresent
      - dnskey:         omnipresent
      - ds:             omnipresent
      - key rrsig:      omnipresent
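
When there are many zones, reading every status report by eye gets tedious. The following sketch filters the report down to any state line that is not "omnipresent". It relies only on the layout shown above, and the function name not_omnipresent is made up.

```shell
# Read `rndc dnssec -status` output on stdin and print any state line
# ("- name: value") whose value is not "omnipresent".
# Exit status is non-zero if any such line was found.
not_omnipresent() {
    awk '
        BEGIN { bad = 0 }
        $1 == "-" && $NF != "omnipresent" { print; bad = 1 }
        END { exit bad }
    '
}

# Usage:
#   rndc dnssec -status dotat.at | not_omnipresent && echo "all omnipresent"
```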

key files

After dnssec-policy is enabled, another sign that all is well is that the zone’s .key and .private files are the same as before: there are no changes to the timing metadata.
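
Checksums make that comparison mechanical. Here is a sketch with two hypothetical helper functions: one records checksums of the key files before the change and the other compares against that record afterwards. It needs to run as a user that can read the .private files.

```shell
# Record checksums of the key files in a directory, then compare later.
# Both functions take the key directory and a snapshot file name.
snapshot_keys() {
    ( cd "$1" && cksum K*.key K*.private ) > "$2"
}
compare_keys() {
    ( cd "$1" && cksum K*.key K*.private ) | diff "$2" -
}

# Usage (the paths are placeholders):
#   snapshot_keys /path/to/keys /tmp/keys.before
#   ... switch the zone to dnssec-policy and run rndc reconfig ...
#   compare_keys /path/to/keys /tmp/keys.before && echo "key files unchanged"
```

The new .state files are deliberately left out of the glob, since they are expected to appear.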

There is a new .state file for each key, containing another copy of the old timing metadata, and notes that all states are “omnipresent”.

I imported the .state files into my Ansible repository and adjusted things so that they would get (re)deployed the same way as the other key files.

DONE

I have gone through this process for all my personal zones, including dotat.at, so they are now all running with dnssec-policy in production.

I have a few more miscellaneous notes, but I’ll put them in another post.