This post is from an earlier incarnation of this blog. I’ve gotten a few requests to republish it (the original post date was 2009-06-23), so here we go.
Recently, I’ve found myself typing quickly into my IM client to a few friends, explaining how a DNS change of some kind was going to work. It occurred to me that I should probably just write something up that would explain some DNS basics.
DNS is a big topic. I’m not even going to attempt to cover all of it here. If you really want to dig into it, then I recommend that you get and read the bugs book. A lot of this book is devoted to talking about the BIND software package (more on this later), but it’s still pretty much the reference for understanding DNS. If, on the other hand, you fall into the overwhelming majority of people who need to “get it”, but don’t need every detail, hopefully this article will help you out.
As noted, I’m not going to try and get into everything here. The thing with DNS, generally speaking, is that there are a lot of very small bits of information being passed around very, very quickly, and most things here in internet-landia rely on this to continue working. I’m going to talk about the following.
- Basic concepts and A Records
- MX Records
- CNAMEs (a little)
- A brief bit about DNS server software
So without further ado, here we go.
The DNS system is, at its heart, a translation mechanism between things humans are comfortable dealing with, and things that computers are comfortable dealing with. The human comfy part is essentially the domain name. Something like chrislea.com or nata2.org… something that we can see and recognize and remember. The thing is, your computer doesn’t care about those human-readable domain names at all. The computer wants a number, since computers are pretty good at dealing with numbers. Specifically, the computer wants a number known as an IP address. Translating the domain name into that IP address is what DNS does.
The way that a domain name actually gets mapped to an IP for your computer is a little complicated, so I’m not going to talk about all the possibilities. The simplified version is that your computer will consult something called a “resolving nameserver”. These are different than “authoritative nameservers” which I’ll talk about in a bit. For now, there are two things to keep in mind.
- Resolving nameservers are computers whose job it is to answer certain computers when they ask for a domain name to IP address mapping.
- Your computer will typically use resolving nameservers that are provided by whatever ISP you are connecting to the internet through.
This means, for example, that if you are really unlucky and have Time Warner Cable as your ISP, then when your computer wants to know what IP address to use for, say, amazon.com, the Time Warner resolving nameservers are there to answer that question.
If you ask for a domain name and the resolvers don’t know the answer already, then the following sequence of things happens very quickly.
- The resolver figures out what nameserver(s) know what IP is supposed to be used for the given domain.
- The resolver asks that nameserver (which is the authoritative one) what the answer is.
- The resolver stores the answer, and sends your computer the information it just got.
That is the basic life cycle of a DNS query. In general, you’ll be asking the resolver for what’s called an “A Record”. An A Record is an IP address that corresponds to the domain name you asked for.
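To make that life cycle concrete, here is a minimal sketch of it as a toy Python model. Nothing in it touches the network: the delegation table, the nameserver name, and the IP address are all made up for illustration, and a real resolver finds the authoritative nameserver by walking the DNS hierarchy rather than by looking in a dictionary.

```python
# Toy model of the lookup flow: a resolver, a delegation table, and
# authoritative servers. All names and addresses here are made up.

# Which authoritative nameserver is responsible for which domain (hypothetical).
DELEGATIONS = {"example.com": "ns1.example-dns.test"}

# What each authoritative nameserver answers for A record queries (hypothetical).
AUTHORITATIVE_DATA = {
    "ns1.example-dns.test": {"example.com": "192.0.2.10"},
}


class Resolver:
    def __init__(self):
        self.cache = {}            # domain -> IP address already looked up
        self.upstream_queries = 0  # how often an authoritative server was asked

    def resolve(self, domain):
        # If this resolver has answered the question before, reuse the answer.
        if domain in self.cache:
            return self.cache[domain]
        # Step 1: figure out which nameserver is authoritative for the domain.
        nameserver = DELEGATIONS[domain]
        # Step 2: ask that nameserver for the A record.
        self.upstream_queries += 1
        ip = AUTHORITATIVE_DATA[nameserver][domain]
        # Step 3: store the answer and hand it back to the client.
        self.cache[domain] = ip
        return ip


resolver = Resolver()
first = resolver.resolve("example.com")
second = resolver.resolve("example.com")
print(first, second, resolver.upstream_queries)  # 192.0.2.10 192.0.2.10 1
```

The second call never reaches the authoritative server, which is the whole point of having the resolver sit in the middle.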
There are a couple of things to pay attention to here. First off, the nameservers that your domain registrar made you pick as your authoritative nameservers aren’t actually what most client computers talk to. Those authoritative ones spend most of their time answering requests from resolvers. Another important point is that the resolvers store the answers for all the computers that use them. If you’re using those Time Warner resolvers, and somebody else who uses them has gone to Digg prior to you going there, then when you go you’ll get the already stored response. The resolver doesn’t have to do the lookup each time. But the resolvers will periodically do a fresh lookup, even if they already have an answer in hand. This brings us to our next topic.
So as noted, resolvers must periodically refresh what they know. Otherwise IP addresses couldn’t change reliably online. This is where the “TTL”, which stands for “time to live”, value comes in. When a resolver gets an answer from an authoritative nameserver, it gets more than just an IP address. It also gets a TTL value, which is measured in seconds. The TTL value tells the resolver “you don’t need to bother checking back in with me for at least X seconds”, where X is the TTL. In this way, the nameservers distribute data around the internet, and this keeps the authoritative nameservers for popular sites from dying under the strain. There’s no way that the Google nameservers could stay up if they got queried every single time anybody surfed to google.com.
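Here is a small sketch of how a resolver might honor a TTL, assuming a simple in-memory cache. The `TTLCache` class and its injectable `clock` are hypothetical names invented for this example (the fake clock just makes the behavior easy to follow), and real resolvers are considerably more involved.

```python
class TTLCache:
    """Stores answers together with the time at which they stop being valid."""

    def __init__(self, clock):
        self.clock = clock   # function returning the current time in seconds
        self.entries = {}    # domain -> (ip, expires_at)

    def put(self, domain, ip, ttl_seconds):
        # Remember the answer and when the TTL runs out.
        self.entries[domain] = (ip, self.clock() + ttl_seconds)

    def get(self, domain):
        entry = self.entries.get(domain)
        if entry is None:
            return None
        ip, expires_at = entry
        if self.clock() >= expires_at:
            # TTL has run out: forget the answer so a fresh lookup happens.
            del self.entries[domain]
            return None
        return ip


now = [0]  # fake clock so the example does not have to wait an hour
cache = TTLCache(clock=lambda: now[0])
cache.put("example.com", "192.0.2.10", ttl_seconds=3600)

now[0] = 3599
fresh = cache.get("example.com")    # still inside the TTL window
now[0] = 3600
expired = cache.get("example.com")  # TTL elapsed, so the cache has nothing
print(fresh, expired)  # 192.0.2.10 None
```

When `get` returns `None`, the resolver would go back to the authoritative nameserver, which is exactly the periodic refresh described above.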
Typical values for TTL might be 12 or 24 hours. So, if you were to change the IP address of a site you control, it would take potentially up to this long for the internet to adjust to the change. If you have control over the TTL value for a domain, and you want to change where it points, you may wish to lower the TTL value ahead of time. For example, if it’s set to 12 hours, then at least 12 hours before you want to make the update, change the TTL value to something low. This way, when you do the update, the internet will pick it up much more quickly. Then, after the update has happened, you can raise the TTL value to something normal again.
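The timing is simple arithmetic, and a quick sketch shows it. This assumes every resolver honors the TTL it was given; the function names and the dates are made up for the example.

```python
from datetime import datetime, timedelta


def latest_time_to_lower_ttl(change_time, current_ttl_seconds):
    # Resolvers may hold the old record (and old TTL) for one full TTL, so
    # the TTL has to be lowered at least that long before the real change.
    return change_time - timedelta(seconds=current_ttl_seconds)


def worst_case_pickup(change_time, lowered_ttl_seconds):
    # Once the low TTL is what everyone has cached, the new IP should be
    # picked up within one lowered TTL of the change.
    return change_time + timedelta(seconds=lowered_ttl_seconds)


change_at = datetime(2009, 6, 23, 12, 0)  # when the IP will change
lower_by = latest_time_to_lower_ttl(change_at, 12 * 60 * 60)
settled_by = worst_case_pickup(change_at, 300)

print(lower_by)    # 2009-06-23 00:00:00
print(settled_by)  # 2009-06-23 12:05:00
```

So with a 12 hour TTL and a change planned for noon, the TTL would need to drop by midnight, and with a 300 second TTL the change should be visible within about five minutes.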
I recommend never lowering a TTL to anything smaller than 300 seconds, which is 5 minutes. This is because some resolvers are set up such that if they see a very low value, they assume it’s erroneous, and just ignore it and use their own default. If that happens, then you pretty much just invoked the exact opposite behavior of what you were aiming for.
So it turns out there are different kinds of lookups you can do to a DNS server. The kind of lookup people generally think of corresponds to the A Record, which is what’s used for things like websites. I’m going to talk about two other kinds of records, the first of which is the MX Record.
In short, an MX (which stands for “mail exchanger”) Record is a special record that’s for email. When the thing that actually sends an email for you (called an MTA) is trying to figure out where the email should go, it checks the MX record for the domain to the right of the @ character in the address to see what to do. The MX record will point to at least one, and possibly several, domains. Then, it checks the A record for one of those domains to see what IP address to try and deliver the mail to. You can set a priority on the MX records, so that the mail sending program knows which to try first. The record with the lowest priority number is the most preferred.
This accomplishes several things. First, it means that your mail doesn’t have to go to the same server as your website. That’s one possibility, but it could also go to a different computer altogether, such as to Google for GMail to handle, or to an exchange server, or any number of places. Second, because you can have more than one server listed to receive mail, you can get some built in resiliency. If the sending program can’t connect to the one with the highest priority it can try another one if it’s available.
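As a rough sketch of that behavior, here is how a sending program might pick a mail server. The record tables, hostnames, and the `reachable_ips` stand-in for “can I connect to it” are all hypothetical; a real MTA does actual DNS queries and SMTP connections. The one real-world detail baked in is that the lowest priority number is tried first.

```python
# Hypothetical records: MX entries are (priority, hostname) pairs.
MX_RECORDS = {
    "example.com": [(20, "backup-mail.example.net"), (10, "mail.example.com")],
}
A_RECORDS = {
    "mail.example.com": "192.0.2.25",
    "backup-mail.example.net": "198.51.100.25",
}


def delivery_targets(address):
    # The domain is whatever sits to the right of the @ character.
    domain = address.rsplit("@", 1)[1]
    # Lowest priority number first, then look up each host's A record.
    ordered = sorted(MX_RECORDS[domain])
    return [(host, A_RECORDS[host]) for _, host in ordered]


def deliver(address, reachable_ips):
    # Try each mail server in priority order; fall back if one is down.
    for host, ip in delivery_targets(address):
        if ip in reachable_ips:
            return host
    return None


print(delivery_targets("chris@example.com"))
# [('mail.example.com', '192.0.2.25'), ('backup-mail.example.net', '198.51.100.25')]

# Pretend the primary server is down and only the backup answers.
print(deliver("chris@example.com", reachable_ips={"198.51.100.25"}))
# backup-mail.example.net
```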
The last type of record I’ll mention is called a CNAME, which stands for Canonical Name Record. This type of record is an alias for another domain name. Why would you want to do this? Well, let’s find out.
Virb, which I work on, has a domain masking feature that’s in beta right now, but which will be rolling out to everybody soon. It lets you point a domain that you own at your Virb profile, and use it like it’s your own site. If I were going to do this, I’d set up a CNAME for the domain chrislea.com, and alias it to virb.com. Now, I could just set up an A record and point chrislea.com directly to the virb.com IP address, but that introduces a potential problem. If, in the future, we ever change the IP address for virb.com, I’d have to go and change my A record settings for things to keep working. By using a CNAME to alias my domain to just virb.com, we could change the IP and everything would keep working like it’s supposed to, so it’s a much better option for that sort of thing.
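A tiny sketch makes the benefit easier to see. The record table, the names, and the `resolve_a` helper below are all invented for illustration; the point is only that the alias keeps working when the target’s A record changes.

```python
# Hypothetical records: name -> (record type, value).
RECORDS = {
    "profile.example.org": ("CNAME", "hosting.example.net"),
    "hosting.example.net": ("A", "192.0.2.80"),
}


def resolve_a(name, records, max_hops=8):
    # Follow CNAME aliases until an A record turns up. The hop limit
    # guards against two aliases that point at each other.
    for _ in range(max_hops):
        record_type, value = records[name]
        if record_type == "A":
            return value
        name = value  # it was a CNAME, so look up the name it points to
    raise RuntimeError("too many CNAME hops")


before = resolve_a("profile.example.org", RECORDS)

# The hosting company moves to a new IP. Only their A record changes.
RECORDS["hosting.example.net"] = ("A", "192.0.2.99")
after = resolve_a("profile.example.org", RECORDS)

print(before, after)  # 192.0.2.80 192.0.2.99
```

The owner of the aliased name never had to touch their own settings, which is the reason to prefer the CNAME here.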
I’m going to talk about the kinds of software that are used to run DNS servers just a little bit. Most of you will never have to deal with setting up or running nameservers, but it’s still worth a bit of discussion I think just so you’re familiar.
The most popular nameserver software is called BIND. It’s an old and very mature piece of software but has some significant problems. The way BIND works is that for every domain you want to serve up DNS information for, you create a flat text file called a zone file. The syntax for zone files is fairly cryptic, which is one problem. Another problem is that when BIND starts up, it loads the information for ALL the zone files it has into memory at once. This is fine if you don’t have very many domains, but when you have a lot you run into issues. One issue is that the main process starts using tons and tons of RAM. A second is that it takes BIND a long time to read in all the files, and to parse all the information into memory structures. At (mt) Media Temple, when we had something like 150,000 domains, it would take about 20 minutes to restart the BIND system we had in place at the time. Suffice it to say, we got a new system in place.
As noted, DNS is a lookup system. You give the DNS server a domain name, and it gives you an IP address or another domain name back. This sounds a lot like a job for a database, right? Thankfully, people much smarter than me also had that same idea. Enter PowerDNS, which is awesome. PowerDNS stores all the stuff that would be in zone files in BIND in a database, most often MySQL. When the software gets a request, it does a database query to get the information it needs and serves that back out. Whereas having 150,000 domains in BIND caused annoying problems, having 150,000 domains worth of info in a database like this barely even registers.
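To show the general idea of a database-backed lookup, here is a minimal sketch. It uses an in-memory SQLite database from Python’s standard library rather than MySQL so that it runs anywhere, and the `records` table is a simplified, hypothetical layout made up for this example, not the schema PowerDNS actually uses.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL backend described above.
conn = sqlite3.connect(":memory:")

# Simplified, hypothetical table: one row per DNS record.
conn.execute(
    "CREATE TABLE records (name TEXT, type TEXT, content TEXT, ttl INTEGER)"
)
# An index on (name, type) keeps lookups fast however many rows there are.
conn.execute("CREATE INDEX records_name_type ON records (name, type)")

conn.executemany(
    "INSERT INTO records (name, type, content, ttl) VALUES (?, ?, ?, ?)",
    [
        ("example.com", "A", "192.0.2.10", 3600),
        ("example.com", "MX", "mail.example.com", 3600),
        ("www.example.com", "CNAME", "example.com", 3600),
    ],
)


def lookup(conn, name, record_type):
    # One indexed query per request; nothing has to be loaded at startup.
    cursor = conn.execute(
        "SELECT content, ttl FROM records WHERE name = ? AND type = ?",
        (name, record_type),
    )
    return cursor.fetchall()


print(lookup(conn, "example.com", "A"))  # [('192.0.2.10', 3600)]
```

Adding a domain is just another insert, and the server does not have to reparse everything it knows about on restart.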
I have to put in a quick plug here. When we were looking to move away from BIND at (mt) Media Temple, we paid Netherlabs their consulting fees to have them evaluate our current setup and recommend a new architecture for us. It was worth every penny and then some. What we did put into production was almost exactly what they outlined and it’s been working flawlessly for over a year now, and we’ve added a considerable number of new zones in that time period on top of the ones we were starting with.
There are other options as well, but I like PowerDNS a lot, so that’s what I recommend you use if you need to run your own nameservers.
There’s actually quite a bit more about DNS than what I’ve covered here. That said, I think this will probably answer a majority of questions people might have about how the DNS system works. If you really want to know more, that book I mentioned up at the start of the post is my recommendation to read.
If you need to set up nameservers, use PowerDNS. Or, if that doesn’t work for you, I urge you to use some DNS software that has a database backend.
Should you have any additional questions, please feel free to contact me.
As you already paid for expert advice, it would be great if someday you could write up an article on setting up and properly configuring PowerDNS. I’m sure there are decent articles out there already, but I also guess that you have quite a bit of information from the perspective of a huge company that most people will just never know, and it could save us a lot of headaches knowing how to do it right the first time.
There was nothing interesting at all about how we set up PowerDNS itself, we just did it exactly as the docs said. As for things past that, there were a few but they’d fall under the umbrella of intellectual property of the company so I can’t really disclose them.
Honestly, you can serve a lot of requests using PowerDNS with very little tuning. By the time you were big enough to have to think about that kind of optimization, you’d also be big enough to hire or task people specifically to work on that issue.
Thank you Chris, I stumbled around a long time with many hosts that I manage without truly understanding the A Name and CNAME records. Now it makes sense.