A quick intro to the DNS
This is just a quick and dirty presentation of the DNS, with some suggestions on how to improve the security of a normal system. It is neither complete nor exhaustive of all the things that can and should be done on a production system. Unfortunately, the DNS is a really huge and complex subject, and full documentation is way beyond the scope of this site.
If you're interested in the whole story, and if you run DNS for work, I suggest you read the related RFCs and maybe get the wonderful book "DNS and BIND" by Paul Albitz & Cricket Liu, from O'Reilly.
In the long-gone '70s, DARPANet (granddaddy of today's Internet) was a fairly small bunch of about 200 servers, mostly used by researchers to exchange data and e-mail, so not so different from today (besides the number of servers involved).
And back then, just like now, the problem was: how to figure out which server is which in a network that is based on IP addresses rather than names. An IP address is not easy to remember, so what was needed was a way to connect an IP address with a name that was. The job was performed by a simple text file called 'HOSTS.TXT'. The file was maintained by the Stanford Research Institute Network Information Center (SRI-NIC).
Whenever a network administrator wanted to change a name or an IP address, or had a new machine to put on the network, he would send a mail to the SRI-NIC admins with the details of the changes, and they would add that information to the file. Every now and then, everybody would download the new version from the NIC's FTP site and the changes were done. Now, this system is fairly simple, but it has a big problem: it doesn't scale very well. When the old DARPANet became the Internet, with several million servers, a plain old file didn't cut it anymore.
The number of modifications and updates grew far too large to be handled by hand, handling name conflicts (that is, two or more hosts with the same name) became impossible, and getting the updates from an FTP site was not handy either. Something better was needed.
The new system had to be decentralized, allowing different admins to handle little parts of the whole and delegating some of the work down the chain. It had to be robust, so that the whole network could keep functioning. And it had to be distributed, so as to spread the load over multiple machines without overloading any single part of the system.
With RFC 882 and RFC 883, Paul Mockapetris proposed a system that in the end became the modern DNS.
Ok, let's suppose that you (or one of your users) punch the address "www.mynicesite.it" into the address bar of your browser. How the heck does your browser know which host to ask for the data?
On every computer connected to a TCP/IP network (so the Internet too), there is a piece of software called a resolver. This is a normal piece of software that is part of the TCP/IP protocol stack; it has to be part of the OS of the machine, installed and configured by the SysAdmin (or whoever installed the machine). The resolver is the client part of the distributed client/server system that is the DNS.
The resolver has a number of resources it can use to find out which IP address is the right one. First of all there is a flat file, named hosts, that contains simple IP address/hostname pairs. This file is on the client machine and is usually used only for static servers. Second, there are one or more DNS servers that the resolver will query for any hostname/IP that is not in the hosts file.
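The lookup order just described can be sketched in a few lines of Python. This is only an illustration under some assumptions: parse_hosts, resolve and the dns_lookup callback are made-up names, the addresses come from the documentation range 192.0.2.0/24, and on a real system the order of the sources is configurable.

```python
from typing import Callable, Dict, Optional

def parse_hosts(text: str) -> Dict[str, str]:
    """Turn hosts-file text into a {hostname: ip} mapping."""
    table: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), ip)  # first entry wins
    return table

def resolve(name: str, hosts: Dict[str, str],
            dns_lookup: Callable[[str], Optional[str]]) -> Optional[str]:
    """Check the hosts table first, then fall back to the DNS callback."""
    key = name.lower()
    if key in hosts:
        return hosts[key]
    return dns_lookup(key)

HOSTS_TEXT = """
127.0.0.1   localhost
192.0.2.10  gort gort.example.org   # a static server
"""
hosts = parse_hosts(HOSTS_TEXT)
# Stand-in for a real DNS query (hypothetical data).
fake_dns = {"www.example.org": "192.0.2.20"}.get

print(resolve("gort", hosts, fake_dns))             # found in the hosts table
print(resolve("www.example.org", hosts, fake_dns))  # answered by the "DNS"
```

A name that is in neither source simply comes back as None here; a real resolver would report a lookup failure instead.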
The resolver will send a query to the first of the DNS servers it knows about, with a question that, more or less, could be translated as "which server is 'www.mynicesite.it'?". Now let's forget for the moment about recursion, iteration and the like. If the DNS knows about that host, it will return the answer immediately. If it doesn't, it will start by breaking the hostname into its components: 'www', 'mynicesite' and 'it'. Then it will begin from the right, going to one of the Root Servers and asking "who's the server managing the .it domain?". It will get the answer, trot to that server and ask "who's the server managing the mynicesite domain?", then it will go to that server and ask (finally) "which server is 'www.mynicesite.it'?". And then it will return the answer to the client.
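To make the right-to-left walk concrete, here is a small Python sketch. It does no networking at all: the DELEGATIONS table, the server names in it and the final address are invented for the example, and a real server would of course send actual DNS queries at each step.

```python
from typing import Dict, List, Tuple

# Hypothetical delegation data: each zone knows who handles the next level,
# and the last zone holds the address itself.
DELEGATIONS: Dict[str, Dict[str, str]] = {
    ".": {"it.": "ns.nic.example"},
    "it.": {"mynicesite.it.": "ns.mynicesite.example"},
    "mynicesite.it.": {"www.mynicesite.it.": "192.0.2.80"},
}

def zones_right_to_left(hostname: str) -> List[str]:
    """Split a hostname into the names asked about, starting from the right."""
    labels = hostname.rstrip(".").split(".")
    return [".".join(labels[i:]) + "." for i in range(len(labels) - 1, -1, -1)]

def walk(hostname: str) -> Tuple[str, List[str]]:
    """Follow the delegations from the root and record each question asked."""
    asked: List[str] = []
    server = "."  # start at the Root Servers
    answer = ""
    for name in zones_right_to_left(hostname):
        asked.append(f"ask {server!r} about {name!r}")
        answer = DELEGATIONS[server][name]
        server = name  # the next question goes one level down
    return answer, asked

answer, asked = walk("www.mynicesite.it")
for step in asked:
    print(step)
print("answer:", answer)
```

Three questions are asked, one per component of the name, which is exactly the trot from server to server described above.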
To keep the requests to a minimum, the DNS will put the answer in memory (caching) and keep it there for a while. If many clients ask the same question, the DNS won't need to repeat the whole gig. Of course the information is not kept in memory forever: after some time it is considered expired and the DNS will have to repeat the search from scratch to get a new answer.
This simple (well...) story has probably raised more questions on your side. First of all: how does the DNS know about the Root Servers, and what the heck are they? The Root Servers are the machines that maintain the list of the first-level domains, that is, the servers of all the national domains, which handle domains at the national level, and the servers for domains like .com, .org, .net and the like. There are only thirteen named Root Servers and the list rarely changes. Their names and IP addresses are written in the configuration of every well-configured and working domain server, and it is the duty of the admin of such a server to keep the list up to date.
Now the next question: how do the Root Servers know about the .it domain? Well, there are agreements between countries to allow this kind of thing to work. Just like there are agreements to allow trains and phone calls to move between countries, there are also some to allow the internet to work. Some of those agreements are about whose servers manage the national domains for each nation. Usually they are managed by national or governmental organisations.
And how does that server know about 'mynicesite'? Well, it is part of the process of registering (or requesting) a new domain that you (or whoever is the owner of the domain) have to tell the administrators of your national organization which DNS server manages your domain.
None of this is exactly technical; it is mostly part of the organization that allows the internet to work.
As you can see, what looks like a simple system turns out to be a really complex one, requiring multiple governmental and international agreements to work. And we haven't touched the technical part yet. But we've touched one important bit: delegation.
The Root Servers don't handle the national domains; what they do is delegate that responsibility down the chain to other servers and organizations. In the same way, the national servers do not handle all the domains on their own; they in turn delegate the job to other servers.
This mechanism allows each server to handle only the minimum amount of data needed to let the system work, with no duplication. Changes to a single domain are not replicated all along the chain, but only to the first "head" server; the rest don't need to know the details.
When a company (or a private individual) registers (creates) a new domain, it is usually required to provide at least one DNS server to handle the new domain. Some registrars (the organizations or companies that handle the domains) require two or more DNS servers per domain for security.
And here we get a new important concept: Zones.
The management of a single domain can be split between multiple servers, each one handling a slice of the domain. Each slice is called a zone. A domain is made up of at least one zone, but it can have more.
Usually, the DNS for one domain has two zones: the first is the "normal" zone that connects each hostname with an IP address, the second is the "reverse" zone that does the opposite and connects an IP address with a hostname. Most of the time, the reverse zone is not handled by the DNS that handles the domain, especially if the domain is "hosted" by an ISP. In that case the ISP handles the reverse zone.
What is the purpose of the reverse zone? Some services try to do a reverse check of the address for security reasons. The advantages in security are negligible.
Let's go back to our story above. Our DNS server in that story (the one contacted by our client) did the whole job running from server to server and putting together the answer. That is an example of a recursive request.
But that is not the only way of working. A server could accept only iterative requests. If a server receives an iterative request, the only thing it has to do is provide the best answer it already knows, and that's about it.
In our example, our DNS server got a recursive request and translated it into a series of iterative requests to other DNSes. It didn't ask anyone to do its job; it only asked questions that each of the other DNSes should already know the answer to. Then it put together all the answers and got the whole picture.
An iterative request requires a lot fewer resources: if the DNS doesn't know the answer, it simply won't go looking for it.
An alternative to both is forwarding. In this case the server provides the answer if it knows it; otherwise it simply sends the question to another specified "parent" DNS server. More than one parent can be specified.
Forwarding is usually used to split the load between multiple servers that belong to a single company or ISP and handle the requests coming from specific clients. As usual, each answer is stored in the cache to avoid repeated queries. If there are many clients, chances are high that the same query will be repeated multiple times; this reduces the load on the whole system because the answer comes from memory and not from more queries.
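A forwarding server can be sketched roughly like this in Python. The names make_forwarder and parent_dns, and all the data, are assumptions made up for the example; to keep it short the cache never expires, which a real server would not do.

```python
from typing import Callable, Dict, List, Optional

Lookup = Callable[[str], Optional[str]]

def make_forwarder(local: Dict[str, str], parents: List[Lookup]) -> Lookup:
    """Answer from local data or cache; otherwise pass the question to the parents."""
    cache: Dict[str, str] = {}

    def lookup(name: str) -> Optional[str]:
        if name in local:
            return local[name]
        if name in cache:
            return cache[name]
        for parent in parents:          # try each parent in the order given
            answer = parent(name)
            if answer is not None:
                cache[name] = answer    # remember it for the next client
                return answer
        return None

    return lookup

calls: List[str] = []

def parent_dns(name: str) -> Optional[str]:
    """Hypothetical parent server; records every question it receives."""
    calls.append(name)
    return {"www.example.org": "192.0.2.20"}.get(name)

forwarder = make_forwarder({"intranet.example.org": "192.0.2.5"}, [parent_dns])
forwarder("www.example.org")
forwarder("www.example.org")   # second time served from the cache
print(len(calls))              # the parent was asked only once
```

The counter shows the point made above: with many clients asking the same thing, only the first question actually leaves the forwarder.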
Let's talk about the cache for a moment.
The Cache is used by all the DNSes to speed up repeated searches, but obviously the answers can't stay forever in memory. So, how to decide when they have to be discarded?
Each piece of information that the DNS stores has a sort of "due date", called the Time To Live (TTL). The TTL specifies the amount of time (in seconds) for which the answer is considered valid. After that time, the answer is considered expired and needs to be refreshed with a new query.
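As a rough illustration of how a TTL works, here is a tiny cache in Python. TTLCache is a made-up class for this example, not something taken from any DNS software; the clock is passed in so that the example can "fast forward" time instead of waiting a day.

```python
import time
from typing import Callable, Dict, Optional, Tuple

class TTLCache:
    """Keep answers until their TTL (in seconds) runs out."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: Dict[str, Tuple[str, float]] = {}  # name -> (answer, expiry)

    def put(self, name: str, answer: str, ttl: int) -> None:
        self._store[name] = (answer, self._clock() + ttl)

    def get(self, name: str) -> Optional[str]:
        entry = self._store.get(name)
        if entry is None:
            return None
        answer, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[name]  # expired: a fresh query is needed
            return None
        return answer

# A fake clock makes the example deterministic.
now = [0.0]
cache = TTLCache(clock=lambda: now[0])
cache.put("www.example.org", "192.0.2.20", ttl=86400)

now[0] = 3600.0        # one hour later: still valid
print(cache.get("www.example.org"))
now[0] = 86400.0       # one day later: expired
print(cache.get("www.example.org"))
```

The first lookup returns the stored address, the second returns None, which is the signal to go and ask again.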
How long should the TTL be? It depends. In most cases the TTL is 86400 seconds (that is, one day). If your domain / zone is quite "static" and nothing much happens, then you can consider a longer TTL. On the other hand, if your domain is going through a lot of changes, with servers appearing and disappearing, then a shorter TTL is better.
A server that handles one or more zones is said to be Authoritative for those zones. It is good practice to have more than one server handling a domain or a zone; in fact, a minimum of two is always recommended.
What is usually done is to configure one server as the master and one or more as slaves. The slaves can answer questions about their zones, but all the changes to a zone happen on the master; the slaves have read-only data.
The most used DNS server software is ISC BIND (https://www.isc.org/software/bind).
BIND is distributed and used with basically all Unix and Linux versions. It is based on a simple configuration file and a number of zone files that define each zone and the Root Servers.
BIND can be used in various ways. The standard mode is as a forwarding/caching server, where the server simply forwards the queries to one or more parent DNSes and keeps a local cache.
Another way to use BIND is as a fully-fledged DNS, directly querying the Root Servers when requested. Each distribution of BIND comes with a list of the Root Servers, and you'll have to keep that list up to date if you want to use it (not that they change a lot; at the time of writing the last update was in December 2008).
Last, but not least, BIND can manage multiple zones, so it works as a real authoritative DNS.
Warning: what follows is a really quick and dirty explanation of the various types of records that go into the configuration of a DNS. Yet again, I do not claim this to be perfect.
A zone (or domain) is made up of multiple hosts providing multiple services. To define a zone you usually create a file (called, with a severe lack of imagination, a zone file) that for historical reasons is referred to as a database. This file begins with a series of options that are valid for the whole zone (the domain name, various TTLs, who is the owner/admin of the zone and so on). Then come a number of different records specifying the various pieces of information.
One of the first services (hence records) to define is obviously which are the DNSes of the domain. To do so, we will use the NS records (doh!).
Multiple DNSes can be specified for each zone. Which DNS is going to be used by a client depends on the implementation of the resolver in the client and cannot be predicted.
One of the most common records in a DNS zone file is the one that connects a hostname with its IP address: the A record (Address).
gort IN A 126.96.36.199
Let's remember that a zone refers to a domain, so if the zone is for "soft-land.org", the complete name of the host will be "gort.soft-land.org", but we don't have to specify that because it is in the headers. It is also possible to assign more than one IP address to the same hostname using multiple A records. Once again, which IP will be used by the client depends on the implementation of the client's resolver. It is however a cheap way to achieve some kind of load balancing with multiple servers.
Sometimes one host runs multiple services, and in that case it is not unusual for it to have different names depending on the job. The same server could act as a web server and respond to the name 'www', and as an FTP server with the name 'ftp'. It is, of course, possible to add multiple A records with the same IP address and different hostnames, but a better solution is to define the hostname/IP relation once and then add "aliases" for each service. The aliases are added using the CNAME record type.
gort IN A 188.8.131.52
www IN CNAME gort.soft-land.org.
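This way we define one server named gort and an alias named www that "points" to the same server. Note that we used the full hostname for the alias, with a dot at the end. That is known as a fully qualified domain name (FQDN). Without the final dot, the name would be taken as relative and the zone name would be appended to it.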
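The rule about the final dot is easy to get wrong, so here is a small Python sketch of how names in a zone file are expanded. The function name qualify is made up for the example; the behaviour it mimics (relative names get the zone name appended, '@' stands for the zone itself) is the standard zone-file convention.

```python
def qualify(name: str, origin: str) -> str:
    """Expand a zone-file name relative to the zone's origin."""
    if not origin.endswith("."):
        origin += "."
    if name == "@":
        return origin              # '@' stands for the origin itself
    if name.endswith("."):
        return name                # already fully qualified
    return f"{name}.{origin}"      # relative: append the origin

print(qualify("gort", "soft-land.org."))
print(qualify("gort.soft-land.org.", "soft-land.org."))
print(qualify("gort.soft-land.org", "soft-land.org."))
```

The last line shows the classic mistake: forget the dot and the alias ends up pointing to "gort.soft-land.org.soft-land.org.".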
E-mail works using multiple servers in a fallback fashion. When we send a mail to a domain, our mail server tries to find out which server handles the mail for that domain; if the first server is not available it will try the second one, and so on. So mail servers also need a priority. That's why the MX record, used to specify the mail server (or mail eXchanger), has a priority field.
MX 10 gort.soft-land.org.
MX 20 backup.soft-land.org.
The number is the priority, lower number means higher priority.
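In Python, the fallback logic a sending mail server applies to the two MX records above could look more or less like this. pick_mail_server and the is_up callback are hypothetical names; a real mail server would actually try to connect, and would also shuffle servers that share the same priority.

```python
from typing import Callable, List, Optional, Tuple

def pick_mail_server(mx_records: List[Tuple[int, str]],
                     is_up: Callable[[str], bool]) -> Optional[str]:
    """Return the reachable exchanger with the lowest priority number."""
    for _priority, host in sorted(mx_records):
        if is_up(host):
            return host
    return None

MX = [(20, "backup.soft-land.org."), (10, "gort.soft-land.org.")]

# Everything is up: the priority 10 server wins.
print(pick_mail_server(MX, lambda host: True))
# gort is down: fall back to the priority 20 server.
print(pick_mail_server(MX, lambda host: host != "gort.soft-land.org."))
```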
Last but not least, we have records that specify the opposite of the A records, that is, a relation between an IP address and a hostname. These records, the PTR records, are not added to the same zone but to the reverse zone, named the in-addr.arpa zone.
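The name under which a PTR record lives is just the IP address written backwards with in-addr.arpa stuck at the end. Python's standard ipaddress module can build it; ptr_name is a made-up helper and the address is from the documentation range, not one of the servers in this article.

```python
import ipaddress

def ptr_name(ip: str) -> str:
    """Build the name under which the PTR record for an address is stored."""
    return ipaddress.ip_address(ip).reverse_pointer + "."

print(ptr_name("192.0.2.10"))   # 10.2.0.192.in-addr.arpa.
```

The octets are reversed so that the most general part of the address ends up on the right, just like in a normal hostname, and delegation can work the same way.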
Let's say that you have just installed a DNS and want to check whether it works. The best thing to do is to try some queries on the server and see how it does. The main tool to query a DNS is (in the Unix world) dig.
Very often I read questions on various newsgroups where somebody is trying to understand why his network doesn't work and ignores the simplest explanation: the DNS. The first thing to do is a nice dig to see whether the DNS responds and what it says. The important thing about dig is that it picks up the list of DNS servers from the resolver configuration and then queries them directly, so it tests not only the DNS but also that part of the resolver's configuration. Other tools can return different results: ping goes through the system's whole name lookup (hosts file included), so it is just plain wrong as a DNS test, and nslookup has long been considered deprecated for this job.
The simplest query done using dig is one to resolve one hostname:
# dig www.soft-land.org +all
; <<>> DiG 9.4.1 <<>> www.soft-land.org +all
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 6052
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;www.soft-land.org. IN A
;; ANSWER SECTION:
www.soft-land.org. 41192 IN CNAME gort.soft-land.org.
gort.soft-land.org. 41192 IN A 184.108.40.206
;; Query time: 4 msec
;; SERVER: 172.31.1.1#53(172.31.1.1)
What does this mean? Simple: the first part is a header that tells us that an answer has been received and the status of that answer. In our case it is NOERROR, which means that the DNS works. Note that NOERROR doesn't mean that we have received the data we asked for, only that the DNS has not encountered an error.
Then it tells us that we sent one query and got two answers, and that the server that gave us the answers is not authoritative (the aa flag is missing; that is, we got the answer from a cache, not by asking the real thing).
Then we have our answers. It tells us that 'www.soft-land.org' is in fact a CNAME for 'gort.soft-land.org' and that 'gort.soft-land.org' has ip address 220.127.116.11. Lastly, it tells us which DNS gave us the answer.
This tells me that my resolver is working correctly and so is my DNS. But I can do more. For example, I could ask which are the authoritative DNSes for that zone by asking about the NS records:
# dig -t ns soft-land.org +short
gort.soft-land.org.
Using '+short' reduces the answer to the bare minimum. Dig tells me that the only name server for that domain is gort. Then I can repeat the previous query, asking that name server directly.
# dig www.soft-land.org @18.104.22.168 +all
; <<>> DiG 9.4.1 <<>> www.soft-land.org @22.214.171.124 +all
; (1 server found)
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 12735
;; flags: qr aa rd; QUERY: 1, ANSWER: 2, AUTHORITY: 1, ADDITIONAL: 0
;; WARNING: recursion requested but not available
;; QUESTION SECTION:
;www.soft-land.org. IN A
;; ANSWER SECTION:
www.soft-land.org. 86400 IN CNAME gort.soft-land.org.
gort.soft-land.org. 86400 IN A 126.96.36.199
;; AUTHORITY SECTION:
soft-land.org. 86400 IN NS gort.soft-land.org.
;; Query time: 14 msec
;; SERVER: 188.8.131.52#53(184.108.40.206)
;; WHEN: Wed Jul 29 20:53:02 2009
;; MSG SIZE rcvd: 84
As we can see, by specifying "@..." we can send a request to a specific DNS server. The answer is the same as before, but this time the aa flag and the AUTHORITY section tell us that this is the authoritative DNS for that domain.
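If you want to check this from a script rather than by eye, the flags line can be picked apart with a few lines of Python. This is just a sketch: header_flags and is_authoritative are made-up helpers, and the two sample lines are the flags lines from the two dig outputs shown in this article.

```python
import re
from typing import Set

def header_flags(dig_output: str) -> Set[str]:
    """Extract the flag names from the ';; flags:' line of dig's output."""
    match = re.search(r"^;; flags:([^;]*);", dig_output, re.MULTILINE)
    return set(match.group(1).split()) if match else set()

def is_authoritative(dig_output: str) -> bool:
    """An answer is authoritative when the 'aa' flag is set."""
    return "aa" in header_flags(dig_output)

CACHED = ";; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0"
DIRECT = ";; flags: qr aa rd; QUERY: 1, ANSWER: 2, AUTHORITY: 1, ADDITIONAL: 0"

print(is_authoritative(CACHED))   # answer from the caching server
print(is_authoritative(DIRECT))   # answer from gort itself
```

The first call prints False and the second True. Note also that 'ra' (recursion available) is present only in the first answer, which matches the "recursion requested but not available" warning in the second one.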
Dig can also ask about all kinds of records, using the -t option to specify which record type you want. For example, we could ask about the mail exchanger for the domain:
# dig -t mx soft-land.org
;; ANSWER SECTION:
soft-land.org. 86400 IN MX 10 gort.soft-land.org.
;; ADDITIONAL SECTION:
gort.soft-land.org. 86400 IN A 220.127.116.11
For more information, I really suggest reading the documentation of dig and the RFCs.
When the DNS was invented, the internet was small and young, but when things get big, problems begin to creep up. Specifically, security problems. Like many protocols and solutions invented in the golden age of the internet, the DNS was not designed with security in mind but to be fast, simple and efficient, and this is the problem.
What kind of problems can a DNS cause? I mean, a DNS just translates a hostname into an IP address and the other way around; what does security have to do with it? Well, the answer is spam and scams.
When a user (anybody, then) types an address, the DNS resolves the corresponding IP address and allows the client to communicate with the server. If the DNS provides the wrong IP address, the client will contact the wrong server without any indication that something strange has happened.
This kind of "hijacking" of the client connection allows the attacker to drive requests to a specific server prepared to handle the connection in various ways: it could be an attempt to impersonate the server of a bank to harvest bank details and passwords, it could be a site that attempts to install a virus or other malware on the client, or it could be a "simple" spam server.
How does this hijacking work? There are basically two ways: through a virus or other malware on the client that alters the configuration of the client's resolver, or by adding false information to the DNS itself. This last method is often referred to as "DNS poisoning".
Another way to abuse the DNS is to send thousands of requests to a DNS in an attempt to crash it or slow it down to a standstill.
Let's suppose that we have to configure a DNS service for a company. The company requires a DNS both for internal use and for external use (to allow the internet to access web servers and the like). At the same time, we want to minimize possible risks. How to proceed?
The first thing to do is to check whether it is possible to split the problem and use two DNSes: one to handle the internal network and one for the external network. The internal DNS can be configured to forward all the requests to the ISP's servers or to the external one, to minimize the amount of traffic. The internal one is also configured to allow the DHCP server to update the internal zones, and provides resolution for the internal services.
If the external DNS has one or more slaves, it is important to configure the master so that it allows zone transfers only to those slaves, using the allow-transfer option of the zone definition.
A zone transfer is a resource-heavy operation; restricting it to only the right servers reduces the potential for problems.
On the external server, disable recursion. This way the server will answer queries about its own zones, but refuse freeloaders that try to use any DNS for other purposes. Allow recursion on the internal server, but use the allow-query option to restrict it to only the internal network.
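Both allow-transfer and allow-query boil down to the same check: is the address of whoever is asking in the list of addresses we trust? The idea can be sketched in Python with the standard ipaddress module. This is not how BIND implements it; the function name allowed and both lists are assumptions (172.31.0.0/16 is a guess at an internal range, 192.0.2.53 an imaginary slave).

```python
import ipaddress
from typing import Iterable

def allowed(client_ip: str, acl: Iterable[str]) -> bool:
    """True if the client address falls inside one of the listed networks."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(net) for net in acl)

INTERNAL_ACL = ["127.0.0.0/8", "172.31.0.0/16"]   # assumed internal ranges
SLAVES_ACL = ["192.0.2.53/32"]                     # assumed slave server

print(allowed("172.31.1.42", INTERNAL_ACL))    # internal client: query allowed
print(allowed("198.51.100.7", INTERNAL_ACL))   # outsider: refused
print(allowed("192.0.2.53", SLAVES_ACL))       # the slave: transfer allowed
```

Anything not explicitly on the list is refused, which is the whole point: the default answer to a stranger should be "no".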
And what if we don't have the option of two DNS servers? Well, it is still possible to run two DNS processes on the same machine, making each one listen on only one network interface (or IP address). Of course each server has to have its own configuration file.
It is also possible to use virtualization to run virtual servers, but consider that running a virtual server is heavy on the hardware, since it requires two OSes plus the virtualization software to run on the same machine.
Modern versions of BIND can run chrooted. What is that? It means that BIND will not have access to the whole filesystem, but will be restricted to a small portion of it. This method is always preferable if it is available. It does require a little more configuration, but it is always better from a security point of view.
One word about logging. BIND tends to be quite chatty and the log fills up quite fast, especially with references to "bogus" servers. A bogus server is a DNS server that provides wrong or inaccurate information because it is badly configured or plain wrong. If you can identify a server that consistently provides bad information, just define it as bogus in the configuration and it will be ignored. After a while, it is advisable to tune the logging so that only useful information is logged.
And last but not least: updates! Keep your system up to date! Both the version of the software you run and the definitions of the root servers and zone files. Every now and then check which version of BIND you're running and which one is the latest stable one, and if they are not the same, update! In most cases when a bug or vulnerability is exploited to crack or damage a server, a patch was already available. So don't be slackers.
Or more often than not "the internet doesn't work" and every time I hear this I think "yeah, right..."
If the DNS does not work, chances are that everything will work badly, and the symptoms are usually the same: no server is reachable if you try by name, but it works if you try with its IP address. The first thing to do, then, is to isolate the problem: is it a DNS problem or a more generic networking problem?
Each machine connected to a TCP/IP network has a gateway that is used every time the machine has to contact a host that does not belong to the local network. The first thing to do is to figure out which one is the gateway and see if it is reachable. To find out which is the gateway, just check the network configuration of a working machine.
If the gateway is reachable, chances are that the problem is the DNS. Check that the resolver is correctly configured; if it is, use dig and check whether the DNS responds correctly. If it does, the problem is not the DNS (firewall?).
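The checklist above is really a small decision tree, and can be written down as such. The Python below is only a way to summarize it: diagnose is a made-up function, and the three answers have to come from you (ping the gateway, look at the resolver configuration, run dig), not from the code.

```python
def diagnose(gateway_reachable: bool, resolver_configured: bool,
             dns_answers: bool) -> str:
    """Walk the checks in order and name the first likely culprit."""
    if not gateway_reachable:
        return "network problem: the gateway is not reachable"
    if not resolver_configured:
        return "resolver problem: fix the client's DNS settings"
    if not dns_answers:
        return "DNS problem: the server does not answer correctly"
    return "not the DNS: look elsewhere (firewall?)"

# Gateway fine, resolver fine, but dig gets no proper answer.
print(diagnose(True, True, False))
```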
By Vacca Anonima posted 16/10/2012 12:21
It would be good if you explained that MX records must (or rather, should) only be FQDNs with a valid matching reverse record.
Tonio-- Vacca Anonima
By Tonio posted 16/10/2012 12:24
As the only alternative to BIND, so far, I have found only PowerDNS.
Stuff made in the Low Countries-- Tonio
By Antonio Pennino posted 13/03/2015 11:29
Heaps of congratulations for this desire to pass on knowledge, extremely unusual among sysadmins, who consider what they know an "industrial secret" not to be shared with colleagues.-- Antonio Pennino
Davide Bianchi works as a Unix/Linux administrator for a hosting provider in The Netherlands.