
Kube-dns add-on should accept option ndots for SkyDNS or document ConfigMap alternative subPath #33554

Closed
bogdando opened this issue Sep 27, 2016 · 36 comments
Assignees
Labels
area/dns sig/network

Comments

@bogdando

bogdando commented Sep 27, 2016

BUG REPORT:

Kubernetes version (use kubectl version):
Kubernetes v1.3.5+coreos.0,
Kube-dns add-on v19 from gcr.io/google_containers/kubedns-amd64:1.7

Environment:

  • Cloud provider or hardware configuration: on-premise VMs
  • OS (e.g. from /etc/os-release): Ubuntu Xenial
  • Kernel (e.g. uname -a): 4.4.0-38-generic
  • Install tools: Kube-dns add-on v19 from gcr.io/google_containers/kubedns-amd64:1.7
  • Others:

What happened:
ndots:5 is hardcoded into the containers' base /etc/resolv.conf by kubelet when it runs with the --cluster_dns, --cluster_domain, and --resolv-conf=/etc/resolv.conf flags.

What you expected to happen:
ndots should be configurable via the kubedns app definition or a configmap, to allow users to choose whether skydns should attempt absolute domains or utilize the search domains, e.g.:

        args:
        # command = "/kube-dns"
        - --domain=cluster.local.
        - --ndots=5
        - --dns-port=10053

AFAICT, DNS SRV records expect ndots:7 and thus will fail to resolve via skydns (or maybe not! #33554 (comment))
Also, this might affect DNS performance by generating undesired additional resolve queries for the suggested search subdomains before actually trying the absolute domain, whenever the number of dots in the initial query is below the given ndots threshold.

How to reproduce it (as minimally and precisely as possible):
Deploy the kube-dns cluster add-on and check the containers' /etc/resolv.conf within pods.

Anything else we need to know:
This is rather a docs issue; see #35525 (comment) for details, and please address that in the docs.

@MrHohn
Member

MrHohn commented Oct 3, 2016

Why would resolving SRV records fail?

Suppose a hostname with X dots is queried and X < the ndots threshold; it will have the search paths appended before it is sent to the name server, right?

For the SRV record _my-port-name._my-port-protocol.my-svc.my-namespace.svc.cluster.local, I think the inputs below will resolve properly:

  • _my-port-name._my-port-protocol.my-svc.my-namespace
  • _my-port-name._my-port-protocol.my-svc.my-namespace.svc
  • _my-port-name._my-port-protocol.my-svc.my-namespace.svc.cluster.local

Given that the search paths are:

  • default.svc.cluster.local
  • svc.cluster.local
  • cluster.local

Please correct me if I'm missing something.
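
One way to sanity-check this from inside a pod is a query like the one below (a sketch using the placeholder names above; note that dig only applies the resolv.conf search list and ndots when given +search):

dig +search -t SRV _my-port-name._my-port-protocol.my-svc.my-namespace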

@macb

macb commented Oct 5, 2016

Though SRV records may not fail, the ndots configuration should be exposed somewhere. The current 5 is a pretty rough default to expect everyone to adhere to. It results in something like 10+ DNS queries per non-cluster domain attempted in our clusters.

#14051 (comment) describes some scenarios where lowering it "won't work". That doesn't actually seem to be the case; it just changes the behavior from first resolving all search domains and then trying the absolute domain, to instead attempting the absolute domain and then trying the search domains.

cc @thockin (linked your comment above)

Our clusters use cluster domains along the lines of: .int.clustername.region.internal.my.domain
This allows us to manage PKI off of our .internal.my.domain domain. It also makes search-domain expansion quite pricey when combined with the host's /etc/resolv.conf search domains (6 configured search domains in total).

We have an existing application that lives outside our clusters at:
application.internal.my.domain

Having only 3 dots means we'll attempt the 6 cluster search domains before actually trying the absolute domain (which ultimately is resolved by the regional DNS servers outside the k8s cluster). That's 12 DNS queries (A and AAAA for each search domain) which ultimately fail, since application.internal.my.domain doesn't exist within our cluster.

If instead we could configure ndots to, say, ndots:3, the absolute domain would be attempted first. If it NXDOMAINs, all of the search domains would be applied just as before. 3 in this case would also keep the previous behavior of application, application.default, and application.default.svc utilizing the search domains first.

In any case, it would give the user the option to decide which is more important for their usage: attempting absolute domains first or utilizing the search domains.
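
For reference, the only change we're asking for is the options line of the pod's resolv.conf, roughly like this (a sketch; the search list and nameserver below are illustrative, not our exact values):

search default.svc.int.clustername.region.internal.my.domain svc.int.clustername.region.internal.my.domain int.clustername.region.internal.my.domain
nameserver 172.17.240.2
options ndots:3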

@thockin
Member

thockin commented Oct 6, 2016

I don't think that behavior is consistent. The resolvers I have seen never try search if the query has >= ndots dots.

$ k run thdbg --rm --restart=Never -ti --image=ubuntu
Waiting for pod default/thdbg to be running, status is Pending, pod ready: false
Waiting for pod default/thdbg to be running, status is Pending, pod ready: false
Waiting for pod default/thdbg to be running, status is Pending, pod ready: false
Waiting for pod default/thdbg to be running, status is Pending, pod ready: false
If you don't see a command prompt, try pressing enter.

root@thdbg:/# apt-get update >/dev/null 2>&1

root@thdbg:/# apt-get install -y dnsutils >/dev/null 2>&1

root@thdbg:/# cat /etc/resolv.conf 
search default.svc.cluster.local svc.cluster.local cluster.local google.internal c.thockin-dev.internal
nameserver 10.0.0.10
options ndots:5

root@thdbg:/# cat > /etc/resolv.conf << EOF
> search default.svc.cluster.local svc.cluster.local cluster.local google.internal c.thockin-dev.internal
> nameserver 10.0.0.10
> options ndots:1
> EOF

root@thdbg:/# cat /etc/resolv.conf 
search default.svc.cluster.local svc.cluster.local cluster.local google.internal c.thockin-dev.internal
nameserver 10.0.0.10
options ndots:1

root@thdbg:/# nslookup kubernetes
Server:     10.0.0.10
Address:    10.0.0.10#53

Non-authoritative answer:
Name:   kubernetes.default.svc.cluster.local
Address: 10.0.0.1

root@thdbg:/# nslookup kubernetes.default
Server:     10.0.0.10
Address:    10.0.0.10#53

** server can't find kubernetes.default: NXDOMAIN

@macb

macb commented Oct 6, 2016

From the resolv.conf man page:

The default for n is 1, meaning that if there are any dots in a name, the name will be tried first as an absolute name before any search list elements are appended to it.

This implies search domains should still be respected even if the absolute domain is attempted first. It does seem that not everything respects that, though. Seems like something still best left up to the user to determine for themselves?

@thockin
Member

thockin commented Oct 6, 2016

We had a rough proposal from someone to add a DNS policy for "ClusterNoSearch" or something that was the same as before, but only set ndots:1 for the requesting pod.

I would accept such a PR...


@macb

macb commented Oct 6, 2016

tl;dr:

It seems like libbind/netresolv and musl are consistent in ignoring search domains when the ndots threshold is met. However, glibc (and other similar implementations such as golang's DNS impl) would continue to respect search domains with a lower ndots.

I wouldn't want to just go from ndots:5 to ndots:1, since the vast majority of search signaling could still be preserved with ndots:3 in our case. Though given the choice, ndots:1 is likely more useful for users that have a lot of services outside their clusters, since internal requests would see 1 failed absolute-domain query before falling back to search, instead of external requests seeing 10+ failed search queries before falling back to the absolute domain.

Do you have the ClusterNoSearch proposal handy? A search didn't turn anything up in issues.


With ndots:1 and an otherwise normal k8s resolv.conf:

nslookup

All of the bind utilities (dig, nslookup, etc.) use libbind (or whatever it's called).

root@macb-debug:/# nslookup kubernetes.default
Server:         172.17.240.2
Address:        172.17.240.2#53

** server can't find kubernetes.default: NXDOMAIN

And tcpdump from that request where it just tries the absolute:

17:35:56.959854 IP 172.17.41.7.44089 > 172.17.240.2.53: 56391+ A? kubernetes.default. (36)
17:35:56.963720 IP 172.17.240.2.53 > 172.17.41.7.44089: 56391 NXDomain 0/0/0 (36)

This makes sense given bind documents ndots as:

+ndots=D
Set the number of dots that have to appear in name to D for it to be considered absolute. The default value is that defined using the ndots statement in /etc/resolv.conf, or 1 if no ndots statement is present. Names with fewer dots are interpreted as relative names and will be searched for in the domains listed in the search or domain directive in /etc/resolv.conf.

curl

However, if I use curl, it will resolve:

root@macb-debug:/# curl http://kubernetes.default -v
* Rebuilt URL to: http://kubernetes.default/
* Hostname was NOT found in DNS cache
*   Trying 172.17.240.1...

And tcpdump from that request where we can see it trying the absolute then falling back to search domains:

17:34:14.753159 IP 172.17.41.7.42383 > 172.17.240.2.53: 5843+ A? kubernetes.default. (36)
17:34:14.753246 IP 172.17.41.7.42383 > 172.17.240.2.53: 5736+ AAAA? kubernetes.default. (36)
17:34:14.753510 IP 172.17.240.2.53 > 172.17.41.7.42383: 5843 NXDomain 0/0/0 (36)
17:34:14.753604 IP 172.17.240.2.53 > 172.17.41.7.42383: 5736 NXDomain 0/0/0 (36)
17:34:14.753727 IP 172.17.41.7.60146 > 172.17.240.2.53: 48770+ A? kubernetes.default.default.svc.int.frog.nyc3.internal.my.domain. (88)
17:34:14.753798 IP 172.17.41.7.60146 > 172.17.240.2.53: 46767+ AAAA? kubernetes.default.default.svc.int.frog.nyc3.internal.my.domain. (88)
17:34:14.754942 IP 172.17.240.2.53 > 172.17.41.7.60146: 46767 NXDomain 0/1/0 (259)
17:34:14.755472 IP 172.17.240.2.53 > 172.17.41.7.60146: 48770 NXDomain 0/1/0 (259)
17:34:14.755721 IP 172.17.41.7.34184 > 172.17.240.2.53: 36544+ A? kubernetes.default.svc.int.frog.nyc3.internal.my.domain. (80)
17:34:14.755860 IP 172.17.41.7.34184 > 172.17.240.2.53: 26345+ AAAA? kubernetes.default.svc.int.frog.nyc3.internal.my.domain. (80)
17:34:14.756634 IP 172.17.240.2.53 > 172.17.41.7.34184: 36544 1/0/0 A 172.17.240.1 (96)
17:34:14.756660 IP 172.17.240.2.53 > 172.17.41.7.34184: 26345 0/0/0 (80)

golang (1.7)

Tried out a simple Go program (go1.7):

package main

import "net/http"

func main() {
        http.Get("http://kubernetes.default")
}
root@macb-debug:~# ./main

And the tcpdump:

17:42:55.799352 IP 172.17.41.7.54780 > 172.17.240.2.53: 49389+ AAAA? kubernetes.default. (36)
17:42:55.801342 IP 172.17.41.7.52301 > 172.17.240.2.53: 61981+ A? kubernetes.default. (36)
17:42:55.805505 IP 172.17.240.2.53 > 172.17.41.7.54780: 49389 NXDomain 0/0/0 (36)
17:42:55.805636 IP 172.17.240.2.53 > 172.17.41.7.52301: 61981 NXDomain 0/0/0 (36)
17:42:55.806171 IP 172.17.41.7.45104 > 172.17.240.2.53: 32960+ AAAA? kubernetes.default.default.svc.int.frog.nyc3.internal.my.domain. (88)
17:42:55.806411 IP 172.17.41.7.47877 > 172.17.240.2.53: 41316+ A? kubernetes.default.default.svc.int.frog.nyc3.internal.my.domain. (88)
17:42:55.807575 IP 172.17.240.2.53 > 172.17.41.7.47877: 41316 NXDomain 0/1/0 (259)
17:42:55.807816 IP 172.17.240.2.53 > 172.17.41.7.45104: 32960 NXDomain 0/1/0 (259)
17:42:55.808179 IP 172.17.41.7.54148 > 172.17.240.2.53: 11120+ AAAA? kubernetes.default.svc.int.frog.nyc3.internal.my.domain. (80)
17:42:55.808339 IP 172.17.240.2.53 > 172.17.41.7.54148: 11120 0/0/0 (80)
17:42:55.808365 IP 172.17.41.7.40203 > 172.17.240.2.53: 65043+ A? kubernetes.default.svc.int.frog.nyc3.internal.my.domain. (80)
17:42:55.808502 IP 172.17.240.2.53 > 172.17.41.7.40203: 65043 1/0/0 A 172.17.240.1 (96)

The golang stdlib spells out the case nicely in the unix dnsclient (https://golang.org/src/net/dnsclient_unix.go).

musl

musl documents its difference at least:

queries with fewer dots than the ndots configuration variable are processed with search first then tried literally (just like glibc), but those with at least as many dots as ndots are only tried in the global namespace (never falling back to search, which glibc would do if the name is not found in the global DNS namespace)

@thockin
Member

thockin commented Oct 7, 2016

So we can not depend on this behavior. Unfortunately, resolv.conf behavior is under-specified here and in other ways.

The "proposal" was an offhand remark in a bug or email or something. If this is really a pain point, I would be happy for someone to open a new proposal (small) on this.


@bogdando
Author

This conversation definitely filled gaps I had in understanding ndots in action, thank you for the details - I've updated the bug description. This doesn't change the relevance of the bug, though.

@bogdando
Author

@thockin the hardcoded ndots:5 is still a pain point, regardless of the implementation details of the DNS stack. This keeps the issue relevant.

@bogdando
Author

bogdando commented Nov 3, 2016

NOTE: this is rather a docs issue, see #35525 (comment).
I suggest fixing this in the docs, as described by @fluxrad.

@bogdando changed the title from "Kube-dns add-on should accept option ndots for SkyDNS in order to resolve SRV DNS records" to "Kube-dns add-on should accept option ndots for SkyDNS or document ConfigMap alternative subPath" Nov 3, 2016
@jkemp101

jkemp101 commented Dec 9, 2016

This issue snuck up and bit me. I was surprised two DNS pods couldn’t handle the load of a 7 node/81 pod cluster when the node running the DNS containers did a GC to delete old docker images. I’m finding it hard to accept a design that causes a single connection to api.example.com to result in 8 failed NXDOMAIN responses before I get the correct address. We can scale kube-dns, dnsmasq, etcd instances as much as we want but it just seems wrong.

I’ve started testing the configmap workaround for my cluster. There is a chance the dnsmasq --neg-ttl option might help with caching the NXDOMAIN responses but I prefer the configmap option for now so I haven’t played with it.

I’m curious what the real-world use cases are for having ndots set to 5. I’m a little new to k8s, but I have traditionally always configured critical software to use FQDNs when possible. The only use case I can think of is if you are trying to find a service that is in your namespace but you don’t know which namespace you are in. What am I missing; when do you have to rely on the search list?

And more importantly, am I going to break something if I run the majority of my services with ndots set to 1? Can we make sure no Kubernetes components are built requiring ndots = 5, which would then potentially restrict users running with it set to 1?

@MrHohn
Member

MrHohn commented Dec 9, 2016

cc @bowei

@thockin
Member

thockin commented Dec 10, 2016

I apologize for the girth of this, but I have a lot to say :)

This is a tradeoff between automagic and performance. There are a number of considerations that went into this design. I can explain them, but of course reasonable people can disagree.

  1. Same-namespace lookups are the vast majority of lookups, so we need "my local namespace" in the search path.
  2. There are multiple "classes" of things that exist in DNS so the class must be part of the DNS name.
  3. Services are the vast majority of lookups so class "svc" must be in the search path and ndots must be >= 1 (e.g. name must resolve)
  4. Reasonable people want to configure the cluster zone suffix (e.g. for corp names, multi-cluster, etc)

= ergo the name of a Service is $service.$namespace.svc.$zone, and $namespace.svc.$zone is the first search path.

  5. The second most common lookup is cross-namespace services, e.g. the kubernetes master in the default ns, so cross-namespace lookups should be easy

  6. Because of (4), we don't really want apps to hardcode the FQDN (bad for portability), so svc.$zone must be in the search path and ndots must be >= 2 (e.g. kubernetes.default must resolve)

  7. Because of (2) and (4), same-namespace and cross-namespace lookups of non-service names should be easy. Therefore $zone must be in the search path and ndots must be >= 3 (e.g. name.namespace.svc must resolve).

  8. Because of (1) and (4), and the fact that petsets have per-endpoint names, local and cross-namespace petnames should be easy. Given the previous search paths and ndots >= 4, we can ensure that petname.service.namespace.svc resolves.

  9. We also support SRV records of the form _$port._$proto.$service.$namespace.svc.$zone. Given (6) and (2), we must enable _$port._$proto.$service.$namespace.svc to resolve. That requires ndots = 5.

This explains how we got to ndots = 5.

  10. Given (9) and (8), we pathologically get _$port._$proto.$petname.$service.$namespace.svc.$zone. Therefore we must enable _$port._$proto.$petname.$service.$namespace.svc to resolve. That actually requires ndots = 6, unless I can't count. In truth, SRV doesn't make much sense except for Services, but now we have federated services...

We did not change ndots to 6 because this is getting out of hand. I'd very much like to revisit some of the assumptions and the schema. The problem, of course, is how to make a transition once we have a better schema.

Consider an alternative:

  • The canonical name for a service becomes $service.s.$ns.$zone
  • The pathological case for SRV becomes _$port._$proto.$petname.$service.s.$ns.$zone
  • Most common lookup being same-namespace services, search path = s.$ns.$zone (nslookup myservice)
  • Second most common lookup being cross-namespace services, search path += $zone (nslookup kubernetes.s.default)

That is a better, safer, more appropriate schema that only requires 2 search paths. And in fact, you could argue that it only REQUIRES $zone, while the other is sugar. People love sugar. This still leaves the pathological ndots = 6.

If we exposed $zone through downward API (as we do $namespace), then maybe we don't need so much magic. I'm reticent to require $zone to access the kube-master, but maybe we can get away with ndots = 3 (kubernetes.s.default) or ndots = 4 (petname.kubernetes.s.default). That's not much better.

We could mitigate some of the perf penalties by always trying names as upstream FQDNs first, but that means that all intra-cluster lookups get slower. Which do we expect more frequently? I'll argue intra-cluster names, if only because the TTL is so low. So that's the wrong tradeoff. Also, that is a client-side thing, so we'd have to implement server-side logic to do search expansion. But namespace is really variable, so it's some hybrid. Blech.

OTOH, good caching means that external lookups are slow the first time but fast subsequently. So that's where we've been focused. The schema change would be nice (it uses fewer search domains, but is a little more verbose), but requires some serious ballet to transition, and we have not figured that out.

Now, we could make a case for a new DNSPolicy value that cuts down on search paths and ndots. We could even make a case for per-namespace defaults that override the global API defaults. We can't make a global change, and I doubt we can make a per-cluster change, because ~every example out there will break.

@johnbelamaric (spec)
@matchstick (fyi)
@madhusudancs (federation)
@kubernetes/sig-network (discuss)
@smarterclayton (smart guy)
@jbeda (smart guy)

@jbeda
Contributor

jbeda commented Dec 11, 2016

First off -- I'm totally in favor of moving to a schema where the "class" is under namespace, i.e. $service.s.$ns.$zone. (Note that this is the schema we picked for GCE: $host.c.$project.internal)

Second -- SRV records are uncommon enough that it is probably okay to make that use case less smooth. Something similar can be set for petsets. Anything using petsets like that will require some elbow grease already. We could expose the zone and namespace as env variables to smooth this over.

That means we have 2 cases we care about:

  • same-namespace service: $otherservice -> $otherservice.s.$namespace.$zone
  • cross-namespace service: $otherservice.s.$othernamespace -> $otherservice.s.$othernamespace.$zone

That means ndots needs to be 3, right?

Can we have it both ways here? Can we have a local DNS cache that can cache and answer these queries super fast? If we make this be per-node (in the proxy or kubelet or a new binary {ug}) then it is faster, cheaper and config scales with cluster size.
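
To make that concrete, a per-node cache could in principle be as simple as a dnsmasq instance on each node that caches (including negative answers) and forwards the cluster suffix to kube-dns, something like the sketch below (the 10.0.0.10 kube-dns address, the 8.8.8.8 upstream, and the cluster.local suffix are assumptions; the flags are standard dnsmasq options):

dnsmasq --keep-in-foreground --no-resolv \
  --cache-size=1000 --neg-ttl=30 \
  --server=/cluster.local/10.0.0.10 \
  --server=/in-addr.arpa/10.0.0.10 \
  --server=8.8.8.8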

@thockin
Member

thockin commented Dec 12, 2016 via email

@caseydavenport
Member

SRV records are uncommon enough that it is probably okay to make that use case less smooth. Something similar can be set for petsets

Yes, reading through this thread I found myself reaching the same place in my head. ndots=3 seems like the right value assuming we also switch service.namespace to service.s.namespace; otherwise I think we could get away with ndots=2 using the current schema, yeah?

OTOH, good caching means that external lookups are slow the first time but fast subsequently. So that's where we've been focused.

I think that's probably the right place to focus.

a smallish per-node cache ... This does mean that DNS will not get original client IPs, but I think we can live with that.

You mean kube-dns won't see the client IPs? Yeah, that could be interesting in the context of the multi-tenant DNS discussions that have popped up before. I guess so long as each tenant gets its own cache and that cache forwards to the right tenant kube-dns service...

@bogdando
Author

bogdando commented Dec 12, 2016

@thockin it's interesting you've mentioned the per-node cache in front of the KubeDNS app. That's how Kargo currently configures DNS (see the Dnsmasq svc/dset in the drawing). Although I'm failing to see how that cache would fix or improve the situation for hostnet pods, which rely on the hosts' /etc/resolv.conf files, and those hosts start behaving badly sometimes if given options ndots:5 there.

PS. The transition is always the hard part, but that is not a problem for properly organized change management (deprecation rules) and docs, right?

@johnbelamaric
Member

For the transition, as long as we don't have a namespace that is "svc" or "pod", the server can differentiate between the new schema and the old one, so both can be active at the same time. We could use a different DnsPolicy on the client side with the new search and ndots and, eventually, with good lead time, make it the default.

@jkemp101

Thanks everyone for continuing this discussion so we can figure out our best options. Here are two thoughts:

  • I would very much like to see a new DNS Policy that has ndots=1. That would let us configure pods that make mostly external DNS requests so they don't strain the cluster DNS or suffer other performance issues. For instance, I will have many pods that will be doing a majority of non-cluster DNS lookups and would regularly use this DNS Policy. Making it a new DNS Policy will ensure this is a tested/supported configuration in the future, versus the config map workaround technique.
  • With DNS caching we need to factor in that a lot (maybe most) of these responses are NXDOMAIN, so the caching needs to handle that. I think NXDOMAIN responses are often not cached in a “typical” configuration. And if they are cached they have a very short TTL, causing them to not be as beneficial as normal positive responses.

@johnbelamaric
Member

@thockin @caseydavenport kube-dns not seeing the client IPs can be mitigated by having the local cache append it (and/or other data) as an EDNS0 option.

@johnbelamaric
Member

If and when that becomes necessary.

@jkemp101

Just as a point of reference. My relatively small cluster was running 1,246 packets per second for all DNS related traffic in the cluster with the default settings. After I implemented the config map workaround for most of the pods to set ndots to 1 the same cluster is now running at 109 pps for DNS traffic.

@thockin
Member

thockin commented Dec 13, 2016 via email

@bowei
Member

bowei commented Dec 13, 2016

@thockin: I'll take a crack at the proposal
@jkemp101: 10x QPS, yikes

@johnbelamaric
Member

@rsmitty I don't think so, but the place to pick this up is probably here: kubernetes/dns#29 or maybe this issue should be re-opened in that repo.

The solutions discussed so far involve altering the schema, which can be done but should also take the federation use cases into account.

@zihaoyu

zihaoyu commented Mar 14, 2017

@tonylambiris Could you elaborate why --no-negcache is a good idea? From what I learned negative caching is good but that flag disables it. Maybe I misunderstood something.

@BrianGallew

Frankly, I don't see the reluctance to allowing this to be configurable per site (or better yet, per container). "We're smart enough to solve this for everyone" is not realistic. There are a number of assumptions in the SkyDNS design which, while undoubtedly true for the given developer's environment, are clearly not true for many of us who are trying to get work done, only to run into this issue.

@thockin added sig/network and removed sig/network labels May 16, 2017
@bowei
Member

bowei commented May 25, 2017

/assign

@shivangipatwardhan2018

I am having this exact problem and wish to change ndots to 1. @jkemp101 mentioned a configmap workaround. What is that? Where can I find it? Or is there another workaround to set ndots = 1?
Thank you!

@jkemp101

jkemp101 commented Aug 3, 2017

@shivangipatwardhan2018 Just create a resolv.conf file without the options ndots:5 line (look at the file in an existing pod). Create a config map with that new resolv.conf. Then add something like this to your deployments to override the k8s-provided resolv.conf with the file in your config map (a sketch of creating the ConfigMap itself follows the snippet below).

        volumeMounts:
        - name: resolv-conf
          mountPath: /etc/resolv.conf
          subPath: resolv.conf
...
      volumes:
        - name: resolv-conf
          configMap:
            name: resolv.conf
            items:
            - key: resolv.conf
              path: resolv.conf
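
For completeness, the ConfigMap itself can be created from the file with something like this (a sketch; the names are chosen to match the volume definition above):

kubectl create configmap resolv.conf --from-file=resolv.conf=./resolv.conf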

@bowei
Member

bowei commented Dec 12, 2017

https://github.com/kubernetes/community/blob/master/contributors/design-proposals/network/pod-resolv-conf.md

This feature is in 1.9 as alpha
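
For anyone finding this later, usage should look roughly like the sketch below (assumes the relevant alpha feature gate is enabled in 1.9; options set in dnsConfig are merged into the pod's generated resolv.conf):

apiVersion: v1
kind: Pod
metadata:
  name: dns-example
spec:
  dnsPolicy: ClusterFirst
  dnsConfig:
    options:
    - name: ndots
      value: "1"
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]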

@bowei
Member

bowei commented Dec 20, 2017

I am going to close this as we now have dnsConfig
