Understanding the .io TLD's DNS configuration vulnerability
First there was Matthew Bryant's "The .io Error - Taking Control of All .io Domains With a Targeted Registration", about a configuration error that allegedly allowed you to take over control of some .io nameservers. Then there was a response to it, Matt Pounsett's "The .io Error: A Problem With Bad Optics, But Little Substance", which argued that this was much ado about nothing much. While I agree that the consequences are less severe than Bryant thought, I think that Pounsett's article itself understates the risks (and I believe it doesn't correctly explain what's going on in the DNS here). In any case, the whole thing confused me and other people, so I'm going to write up my understanding of things here.
Let's start with the basics of compromising a domain through dangling
nameserver delegation. Suppose you find a domain barney.io that lists
ns1.fred.ly as one of its two nameservers, and fred.ly is not
registered (worse nameserver mistakes happen).
To attack barney.io you register fred.ly and create an ns1.fred.ly A
record that points to a nameserver that you're running. Some portion
of the people looking up information in barney.io will wind up
querying your nameserver, and at that point you can give them whatever
answers you want. If they're asking their original question, you can
directly lie to them (telling people that all MX entries in barney.io
point to harvestmail.fred.ly, for example). If they're making NS
queries to check for zone delegation, you can just give them NS
records that point to you and start lying some more when they follow
those NS records.
(You can then increase how many people will talk to ns1.fred.ly by
DOSing the other barney.io DNS server off the Internet.)
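
As a rough illustration of the dangling delegation case above, here is a minimal Python sketch that flags nameserver names whose enclosing domain is not registered. Everything in it is an assumption made for illustration: the second nameserver name (ns2.barney.io), the hard-coded set of registered domains standing in for a real registry lookup, and the naive rule that the registrable domain is the last two labels (which only works for single-label suffixes like .io and .ly).

```python
def registrable_domain(hostname: str) -> str:
    """Naive guess at the registrable domain: the last two labels.

    Assumption: the public suffix is a single label (.io, .ly).
    """
    labels = hostname.rstrip(".").lower().split(".")
    return ".".join(labels[-2:])


def dangling_nameservers(ns_names, registered_domains):
    """Return the NS host names whose enclosing domain is unregistered."""
    registered = {d.rstrip(".").lower() for d in registered_domains}
    return [ns for ns in ns_names if registrable_domain(ns) not in registered]


# Hypothetical data: ns2.barney.io and the 'registered' set are made up.
barney_ns = ["ns1.fred.ly", "ns2.barney.io"]
registered = {"barney.io"}  # fred.ly is deliberately absent

print(dangling_nameservers(barney_ns, registered))
```

Any name this reports is one that an attacker could bring to life by registering the missing domain.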
This is more or less what the setup was for .io. Among .io's
nameservers were ns-a1.io through ns-a4.io, and all of those names
could be registered as domains in .io and then given A records in
your DNS data for your new domain(s) (and Matthew Bryant did just this
with ns-a1.io). However, there was an important difference that made
this less severe than my example, and that's that .io had active glue
records in the root zone for those names that pointed people to the IP
addresses of the real nameservers. With these glue records present, a
client didn't talk to Matthew Bryant's DNS server just because it
decided to use ns-a1.io as part of resolving a .io name; if it
believed and used the glue records, it would wind up talking to the
real nameserver. You only had your query diverted to Bryant's DNS
server if you decided to send a query to ns-a1.io but not use the IP
from the glue record and instead look it up directly.
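
The difference between the two kinds of clients can be modelled with a toy sketch. This is not how any real resolver is implemented; the two dictionaries and the trust_glue flag are hypothetical stand-ins, and the addresses are placeholders from the documentation ranges, not the real .io or Bryant server addresses.

```python
# Address for ns-a1.io as carried in the root zone glue (placeholder IP).
ROOT_GLUE = {"ns-a1.io": "192.0.2.1"}

# Address you get by looking up ns-a1.io yourself once someone has
# registered it as a domain and published their own A record (placeholder IP).
DIRECT_LOOKUP = {"ns-a1.io": "198.51.100.7"}


def nameserver_address(ns_name: str, trust_glue: bool) -> str:
    """Pick the IP a client would query for ns_name.

    trust_glue=True models a client that believes the root glue;
    trust_glue=False models one that resolves the name independently.
    """
    if trust_glue and ns_name in ROOT_GLUE:
        return ROOT_GLUE[ns_name]
    return DIRECT_LOOKUP[ns_name]


print(nameserver_address("ns-a1.io", trust_glue=True))   # real nameserver
print(nameserver_address("ns-a1.io", trust_glue=False))  # registrant's server
```

Only the second kind of client ends up at the server controlled by whoever registered the name.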
Using data from glue records instead of looking things up yourself
is common but not mandatory, and there are various reasons why a
resolver would not do so. Some recursive DNS servers will deliberately
try to check glue record information as a security measure; for
example, Unbound has the harden-referral-path option (via Tony Finch).
Since the original article reported seeing real .io DNS queries being
directed to Bryant's DNS server, we know that a decent number of
clients were not using the root zone glue records. Probably a lot more
clients were still using the glue records, though.
(There are a bunch of uncertainties about just what DNS data was
being returned by who during the incident. The original article
shows a reply from a root server and that probably didn't change,
but we don't know what the official .io servers themselves started
returning as glue records for .io during the time that ns-a1.io
was active as a domain registration. I will decline to speculate on
what was the likely result here.)
Given my history with glue record hell,
it amuses me that this is a case where dangling glue records helped
instead of hurt, making a problem less severe than it would otherwise
have been. Had there been no glue records or incomplete glue records
for the .io zone, there would have been more danger (or at least
the danger would have been clearer).
(In this case the presence of the glue records was mandatory, since
these were NS names inside the zone itself. Without glue records
in the root zone, you would have a chicken and egg problem in getting
the IP address of, say, a0.nic.io.)
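
The chicken and egg condition is easy to state as code: a nameserver name needs glue in the parent zone when the name sits at or below the zone it serves. The following is a minimal sketch of that test; the function name needs_glue is mine, and it only does the label comparison, not any actual DNS lookups.

```python
def needs_glue(ns_name: str, zone: str) -> bool:
    """True if ns_name is inside zone, so the parent must supply its address."""
    ns = ns_name.rstrip(".").lower()
    z = zone.rstrip(".").lower()
    return ns == z or ns.endswith("." + z)


# The .io nameserver names are all inside .io, so the root must carry glue.
for name in ["a0.nic.io", "ns-a1.io", "ns1.fred.ly"]:
    print(name, needs_glue(name, "io"))
```

By this test ns1.fred.ly does not need glue in a .io delegation, since its address can be found by resolving it through .ly.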
PS: As far as I can see from Bryant's article, he didn't realize
that the root zone glue records would cause many clients to not
query his DNS servers, significantly reducing the severity of someone
having control over the names of four of the seven .io DNS servers.
As far as Pounsett's article goes, he appears to more or less spot
the issue with root glue but doesn't explain it and appears to
expect all clients to use the glue all of the time (which is
demonstrably not the case). I think he may also be confusing the
data in the .io zone with the root zone glue for .io. Note that
it's not necessary to get your IP address for ns-a1.io included
in the .io zone; to make some clients start talking to you, it's
sufficient for NS records for ns-a1.io to show up and ideally
to occlude the A and AAAA records.
(We know that Bryant's NS records showed up in the .io zone.
We don't know if they occluded the A record for ns-a1.io that
was there, but it seems likely that they did.)
Sidebar: What I suspect went wrong in .io's procedures
It seems quite likely that ns-a1.io through ns-a4.io were
intended to be purely host names of DNS servers, not domain names,
much like my example of ns1.fred.ly. However, they were placed
directly in the apex of a zone (.io) that allows people to register
domains, and I suspect that the people running the IO zone forgot
to tell the people running the IO registry that these names existed
in the zone as host names and should be locked out from domain
registration. That's been fixed now, obviously, and WHOIS tells
me they're 'Reserved by Registry'.
(This is thus a different failure mode than having NS records for
your domain or TLD that point to hosts in entirely unregistered
domains. That's a pure failure, since the names don't exist at
all except perhaps through lingering glue records.
Here the names existed entirely properly; it's just that the IO
registry was allowed to override them with new data.)
The problem doesn't come up for the other .io nameservers, which
are all under nic.io, since nic.io is already a registered
domain in .io.
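
To make the distinction concrete, here is a small sketch of the kind of check a registry could run over its own nameserver names: anything that sits exactly one label below the TLD collides with the space of registrable domains and would need to be reserved. This is my guess at a sensible check, not a description of what the IO registry actually does, and the function name is hypothetical.

```python
def collides_with_registrations(host: str, tld: str) -> bool:
    """True if host is exactly one label below tld, i.e. it looks like
    a registrable domain name rather than a host inside some domain."""
    labels = host.rstrip(".").lower().split(".")
    tld_labels = tld.strip(".").lower().split(".")
    return (
        len(labels) == len(tld_labels) + 1
        and labels[-len(tld_labels):] == tld_labels
    )


# Nameserver names mentioned in this entry.
io_nameservers = ["ns-a1.io", "ns-a2.io", "ns-a3.io", "ns-a4.io", "a0.nic.io"]

to_reserve = [n for n in io_nameservers if collides_with_registrations(n, "io")]
print(to_reserve)
```

Here a0.nic.io is not flagged, because it lives inside the already registered nic.io.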