Here are some notes on migrating a signed zone from BIND’s old
auto-dnssec to its new dnssec-policy.
I have been procrastinating this migration for years, and I avoided
learning anything much about dnssec-policy until this month. I’m
writing this from the perspective of a DNS operator rather than a BIND
hacker.
- migrating from auto-dnssec
- why I need a custom policy
- matching an existing policy
- more policy details
- preparing the key files
- the big config change
- DONE
This is the second article in a three-part series:
- Introducing BIND9 dnssec-policy
- Migrating to BIND9 dnssec-policy (this post)
- BIND9 dnssec-policy appendices
migrating from auto-dnssec
My aim is to move my zones from old-style auto-dnssec to new-style
dnssec-policy with minimal disruption.
Specifically, I want to continue treating my DNSSEC keys as static
configuration. I will port my existing keys over to dnssec-policy
without any rollovers, and give them an unlimited lifetime so that
named does not try to replace them.
One change at a time! Maybe later on I will implement a more dynamic
dnssec-policy.
risks to avoid
My fear with moving to dnssec-policy is that it can trigger an
accidental key rollover or even an algorithm rollover. There are two
possible causes:
- The configured dnssec-policy does not match the existing keys.
- The dnssec-policy machinery inside named misunderstands the state of the existing keys.
I’ll explain how to deal with them in turn, after a few preliminaries.
things to know
My previous blog post introducing dnssec-policy covers some
basics, including:
- the rndc dnssec command
- debug logging
- dnssec-policy state names
prior preparations
My zones are (mostly) using algorithm 13 (ECDSA P256 SHA256) since I
did an algorithm rollover a few years ago. If you are following along
at home, and you are still using RSA keys, you can upgrade them using
my DNSSEC algorithm rollover HOWTO before upgrading to
dnssec-policy. I’m not going to investigate algorithm rollovers with
dnssec-policy right now.
which version
I upgraded my primary DNS server to the latest Debian Stable (bookworm, 12.5) before this process, so I’m using BIND 9.18.24. Although I have not tried it, I think the procedure described below should work with BIND 9.16 as well, but 9.18 has the advantage of being an LTS release with more bug fixes.
test jig
I adapted my Ansible setup to make an isolated copy of my primary DNS
server on my dev box. I can easily wipe the copy and rebuild it from
scratch. I used it to experiment with dnssec-policy and work on the
Ansible changes and migration plan in safety.
These notes are based on what I learned from repeatedly breaking and fixing this scratch server.
why I need a custom policy
As we saw with my previous experiments with a scratch zone,
BIND’s default dnssec-policy wants a single CSK (combined signing key)
per zone, using algorithm 13 (ECDSA P256 SHA256), with an unlimited
lifetime.
It is a sensible default for new setups, but it almost certainly
does not match the (implied) policy for a zone using auto-dnssec.
BIND’s older tooling preferred zones to be set up with two keys, a ZSK
(zone signing key) and a KSK (key signing key).
Although most of my zones match the default algorithm, they don’t match the default CSK keying style, so I still need a custom policy. Oh, and I have other settings that need to be updated to the new style as well.
matching an existing policy
When I examine my key directory, I see two algorithm 13 keys for each
zone, as follows. (I’ve abbreviated my shell prompt to :;)
:; ls -1 Kdotat.at*.key
Kdotat.at.+013+30212.key
Kdotat.at.+013+53798.key
So my matching policy looks like:
dnssec-policy fanf {
    keys {
        ksk lifetime unlimited algorithm 13;
        zsk lifetime unlimited algorithm 13;
    };
    # ... more here ...
};
- it has two keys, one ZSK and one KSK
- algorithm 13 matches the existing keys
- lifetime unlimited means no rollovers
By itself the dnssec-policy block does not alter the running of any
zones, so I can add it to named.conf right away.
more policy details
The following extra settings fill the # ... more here ... space in
my dnssec-policy definition.
The documentation for these settings is in the dnssec-policy block in the BIND ARM.
max zone TTL
The dnssec-policy machinery needs to ensure that its state
transitions are slower than the relevant TTLs. It has a max-zone-ttl
setting that enforces a limit on the TTL of records in the zone.
By default this is 24h, which is fine for my purposes.
But be warned! If your zone has longer TTLs, then named will
reject it: the zone will not load and queries will fail.
This error causes log messages that look like:
zoneload: error: zone fanf2.ucam.org/IN:
loading from master file fanf2.ucam.org failed: out of range
zoneload: error: zone fanf2.ucam.org/IN: not loaded due to errors.
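To avoid that failure, the largest TTL in a zone can be checked before switching it over. The following is a minimal sketch, assuming the zone is available in canonical text form with one record per line (owner, TTL, class, type, rdata), which is what named-compilezone writes by default. The helper names max_ttl and ttl_ok are made up for this example.

```shell
# max_ttl: read zone records in canonical text form on stdin
# (owner TTL class type rdata) and print the largest TTL seen.
max_ttl() {
    awk '$2 ~ /^[0-9]+$/ && $2 + 0 > max { max = $2 + 0 }
         END { print max + 0 }'
}

# ttl_ok [LIMIT]: succeed if the largest TTL on stdin is within LIMIT
# seconds (default 86400, i.e. the 24h default described above).
ttl_ok() {
    limit=${1:-86400}
    [ "$(max_ttl)" -le "$limit" ]
}

# Typical use (needs BIND's tools and a text-format zone file; the zone
# and file names are the ones from the log excerpt above):
#
#   named-compilezone -o - fanf2.ucam.org fanf2.ucam.org | ttl_ok ||
#       echo "TTLs exceed max-zone-ttl"
```

Lines whose second field is not a number are ignored, so stray status output does not affect the result.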
DNSKEY TTL
I normally use a 1 hour TTL, except for “infrastructure” records which I give a 24 hour TTL. Infrastructure records are those that are used for resolution and validation but mostly not queried directly, i.e. NS records, addresses of nameservers, and DNSKEY and DS records.
Previously I used the dnssec-settime -L 24h command on the key files
to set the TTL on DNSKEY records. With dnssec-policy that becomes a
configuration statement:
dnskey-ttl 24h;
signature lifetimes
I prefer shorter RRSIG lifetimes than is traditional. With
auto-dnssec I adjusted them by putting the following in my
named.conf options block:
sig-validity-interval 10 8; # days
This means that signatures last 10 days, and are regenerated 8 days before they expire, i.e. the zone is re-signed every 2 days.
In dnssec-policy this becomes:
signatures-refresh 8d;
signatures-validity 10d;
signatures-validity-dnskey 10d;
The default signatures-refresh is 5 days. It must be at least the
zone’s SOA expire timer plus the max zone TTL, which in my zones is 7
days plus 1 day, i.e. 8 days.
If there is a problem such that a secondary server is unable to refresh its copy of a zone, we want to ensure that the zone expires before its signatures become invalid, so that the secondary server does not serve bogus data.
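The arithmetic behind these numbers can be spelled out as a short calculation. This is only a sketch using the values quoted above (7 day SOA expire, 1 day max zone TTL, 10 day validity, 8 day refresh); the variable names are made up for illustration.

```shell
day=86400

# Values for the zones described above, in seconds.
soa_expire=$((7 * day))
max_zone_ttl=$((1 * day))
sig_validity=$((10 * day))
sig_refresh=$((8 * day))

# Smallest safe signatures-refresh: SOA expire plus max zone TTL.
min_refresh=$((soa_expire + max_zone_ttl))

# How often the zone ends up being re-signed: validity minus refresh.
resign_every=$((sig_validity - sig_refresh))

echo "minimum signatures-refresh: $((min_refresh / day))d"   # prints 8d
echo "re-sign interval: $((resign_every / day))d"            # prints 2d

if [ "$sig_refresh" -lt "$min_refresh" ]; then
    echo "signatures-refresh is too short" >&2
fi
```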
The signatures-validity and signatures-validity-dnskey settings
control signatures generated by the ZSK and KSK respectively.
other settings
There are several other dnssec-policy settings which mostly relate
to rollover timing. Since I have given my keys lifetime unlimited to
avoid rollovers, I can leave all the other settings at their defaults.
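Putting the pieces from this section together, the whole policy can be written out and syntax-checked before it goes anywhere near named.conf. This is a sketch: the print_policy helper and the temporary file are assumptions of the example, and the check only runs if named-checkconf is installed.

```shell
# print_policy: emit the dnssec-policy block assembled from the settings
# discussed above (keys, DNSKEY TTL, signature lifetimes).
print_policy() {
    cat <<'EOF'
dnssec-policy fanf {
    keys {
        ksk lifetime unlimited algorithm 13;
        zsk lifetime unlimited algorithm 13;
    };
    dnskey-ttl 24h;
    signatures-refresh 8d;
    signatures-validity 10d;
    signatures-validity-dnskey 10d;
};
EOF
}

# Optional syntax check, skipped when BIND's tools are not available.
if command -v named-checkconf >/dev/null 2>&1; then
    tmp=$(mktemp) && print_policy > "$tmp" &&
        named-checkconf "$tmp" ||
        echo "named-checkconf reported a problem" >&2
    rm -f "$tmp"
fi
```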
preparing the key files
The plan is to make sure that a zone’s DNSSEC key files contain a
complete description of the current state of the zone before
enabling dnssec-policy. This should ensure that when dnssec-policy
is activated it believes everything is already “omnipresent”, so it will
not think that the zone needs to go through any unplanned state
transitions.
This is the part that took the most experimentation to work out…
I’ll edit the key files using the dnssec-settime command.
which key is which
I usually check the comments in the .key files to identify which one
is the KSK and which is the ZSK:
:; grep -h signing Kdotat.at.+013+*
; This is a key-signing key, keyid 30212, for dotat.at.
; This is a zone-signing key, keyid 53798, for dotat.at.
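If the comments are missing, the DNSKEY flags field in the .key file carries the same information: 257 means the SEP bit is set (a KSK, or a CSK), and 256 means it is not (a ZSK). A minimal sketch, with key_role as a made-up helper name:

```shell
# key_role: read the contents of a .key file on stdin and print KSK, ZSK,
# or the raw flags value, based on the field after the DNSKEY type.
key_role() {
    awk '
        /^;/ { next }                    # skip comment lines
        {
            for (i = 1; i < NF; i++) {
                if ($i == "DNSKEY") {
                    flags = $(i + 1)
                    if (flags == 257) print "KSK"
                    else if (flags == 256) print "ZSK"
                    else print "flags=" flags
                    exit
                }
            }
        }'
}

# Example:
#   key_role < Kdotat.at.+013+30212.key
```

Searching for the DNSKEY token rather than a fixed column means the helper works whether or not the record includes a TTL.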
zone signing key
My ZSKs did not need any changes, since dnssec-keygen created them
with the necessary timing metadata. To verify, I can inspect each ZSK
and make sure that I see old times in the first three lines and all
other times UNSET, like this:
:; dnssec-settime -p all Kdotat.at.+013+53798
Created: Mon Jan 13 14:42:55 2020
Publish: Mon Jan 13 14:42:55 2020
Activate: Mon Jan 13 14:42:55 2020
Revoke: UNSET
Inactive: UNSET
Delete: UNSET
SYNC Publish: UNSET
SYNC Delete: UNSET
DS Publish: UNSET
DS Delete: UNSET
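With many keys, this inspection can be automated. The sketch below assumes the dnssec-settime -p all output has exactly the "Name: value" shape shown above; times_ok is a made-up helper that takes a comma-separated list of the fields expected to hold a time and requires every other field to be UNSET.

```shell
# times_ok WANT: read `dnssec-settime -p all` output on stdin. WANT is a
# comma-separated list of field names that should hold a time; all other
# fields must be UNSET. Succeeds only if the output matches exactly.
times_ok() {
    awk -v want="$1" '
        BEGIN {
            n = split(want, names, ",")
            for (i = 1; i <= n; i++) expect[names[i]] = 1
        }
        !/:/ { next }                               # ignore other lines
        {
            name = $0;  sub(/:.*/, "", name)        # before the first colon
            value = $0; sub(/^[^:]*: */, "", value) # after the first colon
            sub(/[ \t]+$/, "", value)
            is_set = (value != "UNSET")
            if (is_set && (name in expect)) found++
            else if (is_set || (name in expect)) bad = 1
        }
        END { exit (bad || found != n) }'
}

# Example for a ZSK:
#   dnssec-settime -p all Kdotat.at.+013+53798 |
#       times_ok "Created,Publish,Activate" || echo "needs attention"
```

The same helper can check a key that should also have its SYNC Publish and DS Publish times set, by adding those two field names to the list.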
key signing key
Several of my KSKs had missing times. To fix them, I got the key’s creation time using:
:; dnssec-settime -p all Kdotat.at.+013+30212
Then I set the “sync” (i.e. CDS) and DS publication times to the same
as the creation time. This is not historically accurate; it just needs
to be sufficiently far in the past that dnssec-policy believes
everything is already “omnipresent”.
:; time='Mon Jan 13 14:42:53 2020'
:; dnssec-settime -Pds "$time" Kdotat.at.+013+30212
:; dnssec-settime -Psync "$time" Kdotat.at.+013+30212
After running these commands, I double check to be sure the output has the same old times in the first three lines, and in the “SYNC Publish” and “DS Publish” lines, and all the others are UNSET.
:; dnssec-settime -p all Kdotat.at.+013+30212
Created: Mon Jan 13 14:42:53 2020
Publish: Mon Jan 13 14:42:53 2020
Activate: Mon Jan 13 14:42:53 2020
Revoke: UNSET
Inactive: UNSET
Delete: UNSET
SYNC Publish: Mon Jan 13 14:42:53 2020
SYNC Delete: UNSET
DS Publish: Mon Jan 13 14:42:53 2020
DS Delete: UNSET
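For more than a handful of KSKs, the fix-up above can be scripted. The following sketch only prints the two dnssec-settime commands for a key instead of running them, so they can be reviewed first. ksk_fixup_commands is a made-up helper, and it assumes the "Created:" line has the format shown above.

```shell
# ksk_fixup_commands KEY: read `dnssec-settime -p all` output for KEY on
# stdin and print (not run) the commands that copy its creation time into
# the DS and SYNC publication times.
ksk_fixup_commands() {
    key=$1
    created=$(sed -n 's/^Created: *//p')
    [ -n "$created" ] || return 1
    printf "dnssec-settime -Pds '%s' %s\n" "$created" "$key"
    printf "dnssec-settime -Psync '%s' %s\n" "$created" "$key"
}

# Example:
#   k=Kdotat.at.+013+30212
#   dnssec-settime -p all "$k" | ksk_fixup_commands "$k"
```

Once the printed commands look right, they can be pasted into a shell or piped to sh.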
deploy updated keys
After I updated all the KSK “SYNC Publish” and “DS Publish” times in my Ansible repository, I updated the copies on my live primary server.
This caused some zones to get CDS and CDNSKEY records where they were previously missing, but otherwise everything continued as before.
permissions change
In the past I have set up the key directory on my primary servers to
be read-only for named.
To prepare for dnssec-policy I had to make the key directory
writable. In particular, named will need to create a .state file
for each key when I switch a zone to dnssec-policy.
If you have not explicitly set a key-directory, you don’t need to
worry about this. The default is to keep keys in named’s working
directory which must be writable.
the big config change
When I am ready to put my new dnssec-policy into effect, it will be
a one-line change for each zone:
zone dotat.at {
    type primary;
    file "dotat.at";
    update-policy local;
    # remove this line
    #auto-dnssec maintain;
    # insert this line
    dnssec-policy fanf;
};
activate the change
When it goes smoothly, named normally logs almost nothing about this
change, so for reassurance I want to make sure that debug logging is
on before I put the change into effect:
:; rndc trace 3
:; rndc reconfig
log messages
I’m not going to quote the log messages verbatim because they are long and repetitive, with variations for each key and each state machine.
There are several messages like:
DNSKEY dotat.at/ECDSAP256SHA256/30212 (KSK)
initialize DNSKEY state to OMNIPRESENT (policy fanf)
And several more like:
KSK dotat.at/ECDSAP256SHA256/30212
type DNSKEY in stable state OMNIPRESENT
What I want to see here is all the state machines for all the keys going straight to “OMNIPRESENT”.
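Since the messages are so repetitive, it is easier to search for the exceptions. The sketch below assumes the debug log spells the other dnssec-policy state names in upper case, in the same way as OMNIPRESENT in the excerpts above. all_omnipresent is a made-up helper and the log path in the example is hypothetical.

```shell
# all_omnipresent: read named's debug log on stdin, print any line that
# mentions a state other than OMNIPRESENT, and fail if there were any.
all_omnipresent() {
    ! grep -Ew 'HIDDEN|RUMOURED|UNRETENTIVE'
}

# Example (hypothetical log location):
#   all_omnipresent < /path/to/named-debug.log && echo "all stable"
```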
In normal operation, with both auto-dnssec and dnssec-policy,
you’ll see hourly info log messages for each zone like:
zone tz.dotat.at/IN: reconfiguring zone keys
zone tz.dotat.at/IN: next key event: 11-May-2024 13:42:01.407
When dnssec-policy is active, the debug log messages repeat that
everything is “in stable state OMNIPRESENT” between each zone’s info
messages.
query status
The rndc dnssec -status output confirms everything is everywhere
all at once:
:; rndc dnssec -status dotat.at
dnssec-policy: fanf
current time: Sat May 11 12:51:01 2024
key: 53798 (ECDSAP256SHA256), ZSK
published: yes - since Mon Jan 13 14:42:55 2020
zone signing: yes - since Mon Jan 13 14:42:55 2020
No rollover scheduled
- goal: omnipresent
- dnskey: omnipresent
- zone rrsig: omnipresent
key: 30212 (ECDSAP256SHA256), KSK
published: yes - since Mon Jan 13 14:42:53 2020
key signing: yes - since Mon Jan 13 14:42:53 2020
No rollover scheduled
- goal: omnipresent
- dnskey: omnipresent
- ds: omnipresent
- key rrsig: omnipresent
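The same check can be made mechanical across many zones. This sketch assumes each state appears on its own "- name: state" line as in the output above (leading indentation is allowed); status_all_omnipresent is a made-up helper name.

```shell
# status_all_omnipresent: read `rndc dnssec -status` output on stdin and
# fail if any "- name: state" line reports something other than
# omnipresent, or if no state lines were found at all.
status_all_omnipresent() {
    awk '
        /^[ \t]*- / {
            states++
            if ($NF != "omnipresent") { print "unexpected: " $0; bad = 1 }
        }
        END { exit (bad || states == 0) }'
}

# Example:
#   rndc dnssec -status dotat.at | status_all_omnipresent || echo "check dotat.at"
```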
key files
After dnssec-policy is enabled, another sign that all is well is
that the zone’s .key and .private files are the same as before:
there are no changes to the timing metadata.
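One way to confirm that nothing changed is to record checksums before the switch and verify them afterwards. A minimal sketch, assuming GNU sha256sum (as on Debian); the helper names and the paths in the example are made up.

```shell
# snapshot_keys DIR FILE: record checksums of every .key and .private
# file in DIR into FILE.
snapshot_keys() {
    ( cd "$1" && sha256sum K*.key K*.private ) > "$2"
}

# verify_keys DIR FILE: succeed only if none of those files has changed
# since the snapshot was taken.
verify_keys() {
    ( cd "$1" && sha256sum --quiet -c - ) < "$2"
}

# Example (hypothetical paths):
#   snapshot_keys /path/to/keys /tmp/keys.before   # before the switch
#   verify_keys /path/to/keys /tmp/keys.before     # after rndc reconfig
```

Keeping the snapshot file outside the key directory avoids mixing it up with the key files themselves.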
There is a new .state file for each key, containing another copy of
the old timing metadata, and notes that all states are “omnipresent”.
I imported the .state files into my Ansible repository and adjusted
things so that they would get (re)deployed the same way as the other
key files.
DONE
I have gone through this process for all my personal zones, including
dotat.at, so they are now all running with dnssec-policy in
production.
I have a few more miscellaneous notes, but I’ll put them in another post.