
The Domain Name System: The Internet's Phone Book

Summary

The Domain Name System is the infrastructure that translates human-readable addresses like “www.example.com” into the numeric IP addresses that computers actually use. Designed by Paul Mockapetris in 1983 to replace a single centrally maintained text file that had become catastrophically unscalable, DNS is now the internet’s most critical and most invisible service — the reason users can type words instead of numbers, and the reason the internet can support billions of distinct addresses. For the broader networking context, see ARPANET: Building the Network That Became the Internet and The Connected World.

The HOSTS.TXT Problem

In the early years of ARPANET, the problem of finding other computers on the network was solved with a text file. The Network Information Center (NIC) at Stanford Research Institute maintained a file called HOSTS.TXT — a simple list mapping every hostname on the network to its numeric address. If you wanted to connect to MIT’s computer, you looked up “MIT-MULTICS” in HOSTS.TXT and got its four-byte address. SRI updated the file when new hosts joined the network; every other host on the network downloaded a fresh copy periodically.

Through the 1970s, as ARPANET remained a small community of researchers, this worked well enough. The file was perhaps a few hundred lines. Network administrators knew each other, coordinated over email about name assignments, and downloaded the latest HOSTS.TXT without difficulty.

By the early 1980s, the system was breaking apart. The network had grown from dozens of hosts to thousands. CSNET and other newly connected networks were expanding the community beyond ARPA’s original research contractors. Each new host added a line to HOSTS.TXT; each new host also downloaded HOSTS.TXT, generating traffic to SRI. The update frequency could not keep pace with the growth: a hostname assigned on Monday might not appear in other hosts’ copies of the file until Wednesday or Thursday. Name conflicts proliferated as different organizations independently chose the same hostname. SRI’s machines were being hammered by download requests from thousands of hosts, each refreshing its copy of a file that was already out of date by the time it arrived.

The problem was not just technical. HOSTS.TXT represented a centralized point of control over the naming of every machine on the internet. As the network grew beyond the tight-knit ARPA community, the assumption that SRI could know about and adjudicate every hostname became untenable. A different architecture was needed — one that could scale with the network rather than requiring central coordination.

Paul Mockapetris and the Design of DNS

Paul Mockapetris was a researcher at the University of Southern California’s Information Sciences Institute (USC-ISI) — the same institution where Jon Postel worked, and which served as a central node of internet standards development. In 1982, Mockapetris was asked by Jon Postel and Zaw-Sing Su to design a replacement for HOSTS.TXT. It was not a glamorous assignment; it was a plumbing problem.

Mockapetris’s insight was hierarchical distribution. Rather than one central file, DNS would be a tree-structured namespace — the domain hierarchy — where different parts of the tree were managed by different organizations. The root of the tree would contain the top-level domains. Each top-level domain would manage its own second-level registrations. Each organization that registered a domain would manage its own subdomains and host records. No central authority needed to know the IP address of every machine; it only needed to know who was responsible for each part of the hierarchy.

The lookup mechanism was equally elegant: recursive resolution, in which a resolver walks the hierarchy on the client’s behalf. A client asking for the address of “mail.cs.mit.edu” would ask a resolver. The resolver, if it didn’t already have the answer cached, would ask the root servers who was responsible for “.edu,” then ask the .edu servers who was responsible for “mit.edu,” then ask MIT’s servers who was responsible for “cs.mit.edu,” and finally get the address of “mail.” Each step required only knowledge of the next level down. The total distributed database could contain billions of records, but no single server needed more than a fraction of them.
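The walk described above can be sketched as a toy model in Python. This is only an illustration under stated assumptions: the server names, the zone tables, and the address 192.0.2.25 are invented for the example, and nothing is sent over a network.

```python
# Toy model of DNS delegation. All server names, zone contents, and the
# address below are hypothetical; real servers exchange DNS messages instead.

# Each "server" either holds a final answer or knows which server to ask
# next for a given suffix of the name.
SERVERS = {
    "root": {"referrals": {"edu": "edu-servers"}, "answers": {}},
    "edu-servers": {"referrals": {"mit.edu": "mit-servers"}, "answers": {}},
    "mit-servers": {"referrals": {"cs.mit.edu": "cs-servers"}, "answers": {}},
    "cs-servers": {"referrals": {}, "answers": {"mail.cs.mit.edu": "192.0.2.25"}},
}


def ask(server, name):
    """Return ("answer", address) or ("referral", next_server) from one server."""
    zone = SERVERS[server]
    if name in zone["answers"]:
        return "answer", zone["answers"][name]
    # Try suffixes from longest to shortest and follow the first delegation found.
    labels = name.split(".")
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in zone["referrals"]:
            return "referral", zone["referrals"][suffix]
    raise LookupError(f"{server} knows nothing about {name}")


def resolve(name):
    """Start at the root and follow referrals until some server answers."""
    server, path = "root", []
    while True:
        path.append(server)
        kind, value = ask(server, name)
        if kind == "answer":
            return value, path
        server = value  # a referral: ask the next server down the tree


address, path = resolve("mail.cs.mit.edu")
print(address, path)
```

No table in the sketch holds more than its own slice of the namespace, which is the property that lets the real system grow without a central file.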

Mockapetris published DNS in RFC 882 and RFC 883 in November 1983, then revised and finalized the design in RFC 1034 and RFC 1035 in November 1987. Those 1987 documents, now nearly four decades old, remain the foundational standard for DNS. Virtually every DNS server running today still implements the core protocol described by Mockapetris in 1987.

The Original Top-Level Domains

By the end of the 1980s, the DNS root contained seven generic TLDs: .com (commercial), .edu (educational), .gov (US government), .mil (US military), .net (network infrastructure), .org (non-profit organizations), and .int (international organizations). Country-code TLDs like .uk, .de, and .jp were added alongside them, assigned by Jon Postel based on the ISO 3166 list of country codes. The distinction between .com, .net, and .org was meant to be categorical; in practice, .com quickly became dominant for all commercial purposes, and the distinctions collapsed.

Root Servers and the 13-Address Architecture

DNS’s hierarchical design required a set of root servers — the authoritative starting point for resolving any domain name. When a resolver cannot answer a query from its cache, it asks a root server which name server is responsible for the relevant TLD.

The root server system uses 13 IP addresses (labeled A through M), not because 13 was a magic number, but because the DNS protocol, built on UDP, originally limited responses to 512 bytes, and a list of 13 root server names with their IPv4 addresses fits in a single response of that size. The names still exist: a.root-servers.net through m.root-servers.net.
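A rough byte count shows how the 512-byte limit and the number 13 relate. The sketch below assumes the message layout from RFC 1035 (a 12-byte header, name compression with 2-byte pointers) and a response that lists every root server name plus one IPv4 address each; the function names are made up for the example.

```python
# Rough size of a DNS response that lists the root servers with IPv4 addresses.
# Field sizes follow the RFC 1035 message layout. The helper names here are
# illustrative, not part of any library.

HEADER = 12                  # fixed DNS header
QUESTION = 1 + 2 + 2         # root name "." (1 byte) + query type + class
RR_FIXED = 2 + 2 + 4 + 2     # type, class, TTL, and data length of each record
POINTER = 2                  # compressed reference to a name seen earlier
LIMIT = 512                  # classic maximum DNS payload over UDP


def encoded_length(name):
    """Bytes for a fully spelled-out name: a length byte per label plus a terminator."""
    return sum(1 + len(label) for label in name.rstrip(".").split(".")) + 1


def response_size(server_count):
    # NS records: the owner name is the root (1 byte). The first target name is
    # spelled out; later ones reuse "root-servers.net" through a pointer.
    first_ns = 1 + RR_FIXED + encoded_length("a.root-servers.net.")
    later_ns = 1 + RR_FIXED + (1 + 1) + POINTER   # one-letter label + pointer
    # A records: the owner name is a pointer to the NS target, the data is 4 bytes.
    a_record = POINTER + RR_FIXED + 4
    return (HEADER + QUESTION + first_ns
            + (server_count - 1) * later_ns
            + server_count * a_record)


size = response_size(13)
print(size, LIMIT - size)    # bytes used, bytes left under the limit
```

By this count, 13 servers take 436 bytes, so the list fits under the limit with a little headroom. The saving comes from the shared "root-servers.net" suffix, which only has to be spelled out once.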

In practice, the 13 addresses are served by many physical servers distributed around the world, using anycast routing, a technique where the same IP address is announced by multiple geographically distributed servers, and network routing automatically directs each query to the nearest copy. The “13 root servers” are actually over 1,500 server instances worldwide. The constraint imposed by UDP packet sizes in 1983 shaped the architecture of the internet’s most fundamental service for decades.

Jon Postel, IANA, and the Informal Governance Era

Through the 1980s and early 1990s, internet governance was informal to an extent that would be unimaginable today. The assignment of IP addresses, the management of TLD registries, and the administration of the root zone — the list of authoritative servers for each TLD — were handled essentially by Jon Postel at USC-ISI, under the vague authority of a contract with DARPA. The organization was called the Internet Assigned Numbers Authority (IANA), and for most of its existence it was Jon Postel plus a handful of colleagues.

This arrangement worked because the internet remained a research community with shared norms. When Postel told you that a particular IP address range or TLD belonged to a particular organization, you accepted it because he said so, and he had been saying so reliably since the late 1960s. His authority was moral and technical, not legal.

In 1998, the Clinton administration decided that the internet had grown too important for its governance to rest on one man’s judgment and a DARPA contract. The Internet Corporation for Assigned Names and Numbers (ICANN) was created as a non-profit to take over IANA functions. The transition was contentious; Postel, who had managed internet naming for thirty years, was not enthusiastic about being replaced by a formal institution.

Jon Postel died on October 16, 1998, at the age of 55, two weeks after undergoing heart surgery — and only weeks after ICANN’s formation. The informal era of internet governance ended with him.

The Domain Name Gold Rush

ICANN’s creation coincided with the dot-com boom, and the domain name system found itself at the center of a speculative frenzy. Domain names were recognized as valuable commercial property: owning “cars.com” or “insurance.com” meant controlling a highly visible internet address that countless users would type or follow. Cybersquatters registered thousands of brand names, personal names, and generic terms with the explicit intention of reselling them at inflated prices.

Network Solutions, which held the monopoly on .com, .net, and .org registrations under a government contract, charged $100 for a two-year registration. At the peak of the boom, new .com domains were being registered at over 50,000 per day.

ICANN responded with the Uniform Domain-Name Dispute-Resolution Policy (UDRP) in 1999, providing a mechanism for trademark holders to reclaim domains registered in bad faith. The policy has processed over 60,000 disputes. Courts and arbitration panels have reclaimed “madonna.com,” “panavision.com,” and many hundreds of corporate brand names from squatters.

The domain gold rush also produced the secondary market for expired and premium domains. In 2010, “sex.com” — one of the internet’s most-visited addresses in the 1990s — sold for $13 million, the highest price ever paid for a domain name at that point. “Cars.com” sold for $872 million in 2014, but as a going business rather than just a domain.

DNSSEC and the Kaminsky Bug

DNS had a serious security problem baked into its original design: responses were not cryptographically authenticated. When your resolver asked a DNS server where to find “bank.com” and received an answer, nothing in the protocol proved that the answer came from the legitimate authority for bank.com rather than from an attacker intercepting the response or poisoning a cache.

This vulnerability was known theoretically for years, but the severity was unclear until 2008, when security researcher Dan Kaminsky discovered that DNS cache poisoning was far more practical than the community had believed. An attacker could flood a resolver with forged responses using a technique that required only seconds to succeed, redirecting any domain to a malicious server without the victim’s knowledge. The attack would work against essentially every DNS resolver on the internet.

Kaminsky did something unusual: rather than publishing immediately, he contacted the major DNS software vendors and coordinated a simultaneous disclosure and patch release involving Microsoft, Cisco, ISC (BIND), and others. On July 8, 2008, patches were released by every major vendor simultaneously — an unprecedented coordination effort. The attack was publicly disclosed only after patches were deployed. It remains one of the most significant coordinated vulnerability disclosures in internet history.
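Some rough arithmetic shows why the attack was practical and why the patches helped. The details here go beyond what this article states and should be read as assumptions: a resolver of that era matched a response to its query mainly by a 16-bit transaction ID, and the 2008 patches added a randomized source port that a forger must also guess. The number of forged packets per attempt, the number of attempts, and the count of usable ports are illustrative values only.

```python
# Chance that at least one forged response is accepted by a resolver.
# Assumptions (not from the article): responses are matched on a 16-bit
# transaction ID, and after patching also on a randomized source port.

ID_SPACE = 2 ** 16       # possible transaction IDs
PORT_SPACE = 60_000      # illustrative number of usable source ports


def success_probability(guesses_per_race, races, search_space):
    """P(at least one hit) when each race allows that many distinct guesses."""
    miss_one_race = 1 - min(guesses_per_race, search_space) / search_space
    return 1 - miss_one_race ** races


# Hypothetical attacker: 100 forged responses per race, 1,000 races.
before = success_probability(100, 1_000, ID_SPACE)
after = success_probability(100, 1_000, ID_SPACE * PORT_SPACE)
print(f"fixed source port:      {before:.3f}")
print(f"randomized source port: {after:.6f}")
```

Under these assumed numbers the unpatched case succeeds most of the time, while the patched case stays far below one percent. Randomizing the port does not authenticate anything; it only enlarges the space an attacker has to guess.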

The Kaminsky Bug demonstrated what security researchers had argued for years: DNS needed cryptographic authentication. DNSSEC (DNS Security Extensions), developed through the 1990s and standardized in 2005, adds digital signatures to DNS records, allowing resolvers to verify that responses are authentic. Adoption has been slow — adding signatures to DNS records requires action by zone operators, and many organizations have not bothered — but DNSSEC deployment is gradually increasing.

DNS Over HTTPS and the Privacy Debate

Traditional DNS queries are sent in plaintext over UDP port 53, visible to any observer between the client and the resolver. Your ISP can see every domain you look up. Coffee shop networks can see them. Governments with network access can see them. This was not a problem when the internet was a small research network; it became a significant privacy issue as the internet became the primary medium for communication, commerce, and political activity.

DNS over TLS (DoT), standardized in 2016, and DNS over HTTPS (DoH), standardized in 2018, encrypt DNS queries so they cannot be observed by intermediaries. Mozilla Firefox began enabling DoH by default for US users, using Cloudflare’s resolver, in a rollout that started in 2019. Google Chrome followed with its own DoH implementation.
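To make the difference concrete, the sketch below builds a minimal DNS query in the RFC 1035 wire format and then wraps the same bytes for a DoH GET request as described in RFC 8484. The endpoint URL is a placeholder, the helper names are invented for the example, and nothing is actually sent.

```python
import base64
import struct


def build_query(name, qtype=1, query_id=0):
    """Encode a minimal DNS query (RFC 1035 wire format); qtype 1 is an A record."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Name: each label is prefixed by its length, ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # query type, class IN


def doh_get_url(endpoint, message):
    """Wrap a DNS message for a DoH GET request (base64url, padding removed)."""
    encoded = base64.urlsafe_b64encode(message).rstrip(b"=").decode("ascii")
    return f"{endpoint}?dns={encoded}"


query = build_query("www.example.com")
print(query)  # the labels "www", "example", "com" are readable in the raw bytes

# Placeholder endpoint; a real DoH resolver publishes its own URL.
print(doh_get_url("https://resolver.example/dns-query", query))
```

Sent over UDP port 53, those raw bytes cross the network as they are, so anyone on the path can read the name. The base64url step is only an encoding and hides nothing by itself; the privacy in DoH comes from the HTTPS connection that carries the request.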

The change was controversial. Encrypting DNS protects user privacy from ISPs and network observers, but it also bypasses the network-level filtering that organizations, parental control systems, and national regulators use to block access to malicious or prohibited content. ISPs complained that DoH was breaking their business models (some sell anonymized DNS query data). Corporate security teams complained that it blinded their network monitoring. In 2019, the UK’s Internet Services Providers’ Association nominated Mozilla for its “Internet Villain” award over the planned DoH default.

The debate reflects a genuine tension: DNS is both a technical protocol and a policy instrument. When your ISP’s resolver refuses to resolve “malware-site.com” because it’s on a blocklist, that’s DNS being used as a content filter. When Cloudflare’s resolver returns the real address regardless of what your ISP thinks, that’s DNS being used as a privacy tool. Both uses rely on the same 1983 protocol.

The Protocol That Holds the Internet Together

DNS is the most critical service on the internet that most users have never heard of. Without it, the web collapses into a system navigable only by number. Every email delivery depends on DNS (MX records). Every HTTPS connection starts with a DNS lookup. Every app that connects to a server requires DNS resolution to find that server.

The protocol’s creator, Paul Mockapetris, never became famous. He was inducted into the Internet Hall of Fame in 2012 and received the ACM Software System Award in 2019. He spent years at various companies working on DNS-related infrastructure and security. His design, a hierarchical, distributed, cached database with delegated authority, has proven robust enough to scale from thousands of hosts in 1983 to over 350 million registered domain names today without fundamental architectural change.

That robustness was not accidental. Mockapetris understood that the internet would grow in ways that could not be predicted, and designed DNS accordingly: no central database that would need to scale with the network, no hard limits on the number of records, no assumption about the structure of names below the TLD level. The hierarchy delegates authority to whoever needs it without requiring central knowledge of how they use it. It is a design that solved not just the 1982 problem of an overloaded HOSTS.TXT file but every scaling problem the internet would encounter for the next forty years.
