10,000-word article on DNS protocol!


Consider this question: how many ways can we humans identify ourselves? We can identify ourselves through our ID card, social security card number, or driver's license. Although we have multiple ways of identification, in a specific environment, one method may be more suitable than another. Hosts on the Internet, like humans, can be identified using multiple methods. One way to identify a host on the Internet is to use its hostname, such as www.facebook.com, www.google.com, etc. However, this is the way we humans remember things, and routers don't understand it this way. Routers prefer fixed-length, hierarchical IP addresses.

An IP address (here, an IPv4 address) is a 4-byte address with a strict hierarchical structure. In an address such as 121.7.106.83, the four bytes are separated by dots (.), and each byte is written as a decimal number from 0 to 255.
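As a small illustration of the 4-byte structure, the sketch below uses Python's standard ipaddress module to show the raw bytes behind the dotted-decimal text. The helper name ipv4_to_bytes is made up for this example.

```python
import ipaddress

def ipv4_to_bytes(text: str) -> bytes:
    """Return the 4 raw bytes behind a dotted-decimal IPv4 address."""
    return ipaddress.IPv4Address(text).packed

raw = ipv4_to_bytes("121.7.106.83")
print(len(raw), list(raw))  # 4 [121, 7, 106, 83]
```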

Routers work with IP addresses, but we humans prefer to remember hostnames. So how is a hostname that we are familiar with translated into the IP address that routers need? This is where DNS comes in.

DNS stands for Domain Name System. It is a distributed database implemented by a hierarchy of DNS servers. It is also an application layer protocol that enables hosts to query this distributed database. DNS servers are usually UNIX machines running BIND (Berkeley Internet Name Domain) software. The DNS protocol runs mainly over UDP (with TCP used for large responses and zone transfers) and uses port 53.

DNS basics

Like HTTP, FTP and SMTP, DNS is an application layer protocol. DNS uses the client-server model to run between communicating end systems, and it relies on an underlying end-to-end transport protocol to carry DNS messages between them. However, DNS is not an application that interacts directly with users. Instead, it provides a core function for user applications and other software on the Internet.

DNS is not usually a standalone protocol; it is usually used by other application layer protocols, including HTTP, SMTP, and FTP, to resolve user-supplied host names into IP addresses.

The following is an example to describe the DNS resolution process. This is similar to what the browser does when you enter a URL.

What happens when you type www.someschool.edu/index.html in your browser? In order for the user's host to send an HTTP request message to the Web server www.someschool.edu, the following operations will be performed:

  • The DNS application client is running on the same user host
  • The browser extracts the hostname www.someschool.edu from the URL above and passes this hostname to the client of the DNS application.
  • A DNS client sends a request containing a host name to a DNS server.
  • The DNS client will eventually receive a reply message containing the IP address of the target host.
  • Once the browser receives the IP address of the target host, it can initiate a TCP connection to the HTTP server process located on port 80 of that IP address.
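The steps above can be sketched in a few lines of Python. This is only an illustration, not how any real browser is implemented: host_from_url and connect_target are hypothetical helper names, the lookup table stands in for a real DNS client, and 192.0.2.10 is a documentation address rather than the real address of www.someschool.edu.

```python
from typing import Callable, Tuple
from urllib.parse import urlsplit

def host_from_url(url: str) -> str:
    """Extract the hostname from what the user typed in the address bar."""
    # urlsplit only recognises the host when a scheme (or "//") is present,
    # so add one if the user typed a bare address.
    if "//" not in url:
        url = "http://" + url
    return urlsplit(url).hostname or ""

def connect_target(url: str, resolve: Callable[[str], str]) -> Tuple[str, int]:
    """Return the (IP address, port) pair the browser would open a TCP connection to."""
    host = host_from_url(url)   # the browser extracts the hostname
    ip = resolve(host)          # the DNS client turns it into an IP address
    return ip, 80               # HTTP server process on port 80

# Stand-in for the DNS client: a fixed table with a made-up address.
fake_table = {"www.someschool.edu": "192.0.2.10"}
print(connect_target("www.someschool.edu/index.html", fake_table.__getitem__))
# ('192.0.2.10', 80)
```

In a real program the resolve argument could be socket.gethostbyname, which hands the name to the operating system's resolver.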

In addition to translating host names into IP addresses, DNS also provides the following important services:

  • Host aliasing: A host with a complex host name can have one or more other aliases. For example, a host named relay1.west-coast.enterprise.com can have two host aliases, enterprise.com and www.enterprise.com. In this case, relay1.west-coast.enterprise.com is also called the canonical host name, and the host alias is easier to remember than the canonical host name. The application can call DNS to obtain the canonical host name corresponding to the host alias and the host's IP address.
  • Mail server aliasing. Similarly, email applications can also call DNS to resolve the provided host name.
  • Load distribution: DNS is also used to distribute load among replicated servers. Busy sites such as cnn.com are replicated on multiple servers, each running on a different end system with a different IP address. For such replicated web servers, a set of IP addresses is associated with one canonical host name, and the DNS database stores this set. When a client queries DNS for the name, the server replies with the whole set but rotates the order of the addresses in each reply. Because a client typically sends its HTTP request to the first address listed, this rotation spreads the load across the replicated servers.
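The load distribution idea can be sketched as a rotating address list. This is a minimal illustration that assumes the server simply rotates the list by one position per reply; real name servers may order addresses differently. The class name RoundRobinRecordSet and the 192.0.2.x addresses are made up.

```python
from collections import deque

class RoundRobinRecordSet:
    """Addresses that share one canonical host name, rotated on every reply."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def answer(self):
        """Return the full address list, then rotate it for the next query."""
        reply = list(self._addresses)
        self._addresses.rotate(-1)  # move the first address to the end
        return reply

records = RoundRobinRecordSet(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
for _ in range(3):
    # A typical client connects to the first address in the reply.
    print(records.answer()[0])
```

Three consecutive queries see a different first address each time, so three clients would end up on three different servers.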

DNS Work Overview

Suppose that some applications (such as web browsers or mail readers) running on the user's host need to convert host names into IP addresses. These applications will call the DNS client and indicate the host name that needs to be converted. After receiving it, the DNS on the user's host will use UDP to send a DNS query message to the network through port 53. After a period of time, the DNS on the user's host will receive a DNS answer message corresponding to the host name. Therefore, from the perspective of the user's host, the DNS is like a black box, and you cannot see its internal operations. But in fact, the black box that implements the DNS service is very complex. It consists of a large number of DNS servers distributed around the world and an application layer protocol that defines how the DNS server communicates with the query host.

The simplest design for DNS would have only one DNS server containing all the mappings. This centralized design is not suitable for today's Internet, which has a huge and growing number of hosts. It would have the following problems:

  • A single point of failure: if the DNS server crashes, the entire network will be paralyzed.
  • Traffic volume: a single DNS server has to handle all DNS queries, which can be millions or tens of millions of queries.
  • Distant centralized database: a single DNS server cannot be close to all users. For example, a DNS server in the United States cannot be close to a query from Australia, where the query request will inevitably go through a slow and congested link, causing severe latency.
  • Maintenance: The maintenance cost is huge and it also requires frequent updates.

Therefore, DNS cannot be designed in a centralized manner, because such a design does not scale. Instead, a distributed design is adopted. The characteristics of this design are as follows:

Distributed, hierarchical database

First of all, the first problem that the distributed design solves is the scalability of the DNS server. Therefore, DNS uses a large number of DNS servers, which are generally organized in a hierarchical manner and distributed all over the world. No DNS server can have the mapping of all hosts on the Internet. Instead, these mappings are distributed on all DNS servers.

Generally speaking, there are three types of DNS servers: root DNS servers, top-level domain (TLD) DNS servers, and authoritative DNS servers. The hierarchical model of these servers is shown in the figure below.

Suppose a DNS client wants to know the IP address of www.amazon.com. How do the name servers above resolve it? First, the client contacts one of the root servers, which returns the IP addresses of the TLD servers for the top-level domain com. The client then contacts one of these TLD servers, which returns the IP address of an authoritative server for amazon.com. Finally, the client contacts one of the authoritative servers for amazon.com, which returns the IP address of www.amazon.com.

DNS Hierarchy

Let's now discuss the hierarchical system of domain name servers above

  • Root DNS servers: there are more than 400 root name server instances around the world. They are reached through 13 root server names (a.root-servers.net through m.root-servers.net), which are run by 12 organizations. The list of root name servers and their operators can be found at https://root-servers.org/. Root name servers provide the IP addresses of TLD servers.
  • Top-level domain (TLD) DNS servers: for each top-level domain such as com, org, net, edu and gov, and for every country-code domain such as uk, fr, ca and jp, there is a TLD server or server cluster. For a list of all top-level domains, see https://tld-list.com/. TLD servers provide the IP addresses of authoritative DNS servers.
  • Authoritative DNS servers: every organization with publicly accessible hosts on the Internet, such as Web servers and mail servers, must provide publicly accessible DNS records that map the names of these hosts to IP addresses. An organization's authoritative DNS server houses these DNS records.
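To make the three levels concrete, the sketch below splits a host name into the chain of names that the hierarchy is built on, starting at the root. The function name name_chain is hypothetical, and the trailing dot marks a fully qualified name.

```python
def name_chain(hostname: str) -> list:
    """List the names from the root down to the host itself."""
    labels = hostname.rstrip(".").split(".")
    chain = ["."]  # the root
    # Walk the labels from right to left, adding one label at a time.
    for i in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[i:]) + ".")
    return chain

print(name_chain("www.amazon.com"))
# ['.', 'com.', 'amazon.com.', 'www.amazon.com.']
```

In this picture the root servers answer for ".", the TLD servers for "com.", and the organization's authoritative servers for "amazon.com." and the hosts below it.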

DNS query steps

Below we describe the DNS query steps, from the moment a name needs to be resolved until DNS returns the IP address.

Note: Normally, DNS will cache the search information in the browser or local computer. When the same request comes, DNS search will not be performed again, but the result will be returned directly.

Typically, a DNS lookup goes through the following steps:

  1. When a user types the URL www.example.com into a browser and hits enter, the query goes to the network and is received by a DNS resolver.
  2. The DNS resolver sends a query to a root DNS server, asking for the address of the top-level domain name servers.
  3. The root DNS server notices the suffix of the requested name (com) and returns a list of IP addresses for the top-level domain (TLD) name servers for com to the DNS resolver.
  4. The DNS resolver then sends a query message to the TLD server.
  5. After receiving the request, the TLD server returns the IP address of the authoritative DNS server for the domain name to the DNS resolver.
  6. Finally, the DNS resolver sends the query directly to the authoritative DNS server.
  7. The authoritative DNS server returns the IP address to the DNS resolver.
  8. The DNS resolver responds to the web browser with the IP address.

Once the DNS lookup step returns the IP address for example.com, the browser can request the web page.
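The lookup steps can be simulated without any network traffic. The sketch below is a toy model under stated assumptions: the three "servers" are plain dictionaries, the server names root, tld-com and auth-example are invented, and 192.0.2.44 is a documentation address, not the real address of www.example.com.

```python
# Each simulated server maps a name suffix to either a referral or an answer.
SERVERS = {
    "root": {"com": ("referral", "tld-com")},
    "tld-com": {"example.com": ("referral", "auth-example")},
    "auth-example": {"www.example.com": ("answer", "192.0.2.44")},
}

def ask(server: str, name: str):
    """Return the entry whose key is the longest matching suffix of name."""
    table = SERVERS[server]
    matches = [k for k in table if name == k or name.endswith("." + k)]
    if not matches:
        return ("error", "NXDOMAIN")
    return table[max(matches, key=len)]

def resolve(name: str, max_steps: int = 10):
    """Follow referrals from the root until an answer or an error comes back."""
    server, path = "root", []
    for _ in range(max_steps):
        path.append(server)
        kind, value = ask(server, name)
        if kind == "referral":
            server = value          # steps 3 and 5: go where the referral points
        else:
            return kind, value, path
    return "error", "too many referrals", path

print(resolve("www.example.com"))
# ('answer', '192.0.2.44', ['root', 'tld-com', 'auth-example'])
```

The returned path lists the servers the resolver contacted, in the same order as steps 2 to 7.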

The whole process is shown in the figure below

DNS Resolver

The host and software that perform DNS queries are called DNS resolvers. Workstations and personal computers used by users are all resolvers. A resolver must be configured with the IP address of at least one name server. The DNS resolver is the first stop for DNS lookups and is responsible for dealing with the client that issued the initial request. The resolver starts the query sequence and ultimately translates the host name into the necessary IP address.

A DNS recursive query is not the same thing as a DNS recursive resolver. A recursive query is the request sent to a DNS resolver that needs to resolve the query. A DNS recursive resolver is the computer that accepts recursive queries and processes the response by making the necessary requests.

DNS query type

There are three types of queries that occur in a DNS lookup. By combining these queries, an optimized DNS resolution process reduces the transmission distance. Ideally, cached record data can be used, allowing the DNS name server to use non-recursive queries directly.

Recursive query: In a recursive query, a DNS client asks a DNS server (typically a DNS recursive resolver) to respond to the client with the requested resource record, or return an error message if the resolver cannot find the record.

Iterative query: In an iterative query, if the queried DNS server has no match for the queried name, it returns a referral to a DNS server that is authoritative for a lower level of the domain name space. The DNS client then sends its query to the referral address. This process continues down the query chain until an answer is found or an error or timeout occurs.

Non-recursive query: This query is usually made when a DNS resolver client queries a DNS server for a record that it has access to, either because it is authoritative for the record or because the record exists in its cache. DNS servers usually cache DNS records and are able to return cached results directly when a query comes in, preventing more bandwidth consumption and load on upstream servers.

DNS Cache

DNS caching, sometimes also called DNS resolver cache, is a temporary database maintained by the operating system that contains the most recent access records of websites and other Internet domains. In other words, DNS caching is just a technology and means for computers to cache loaded resources in order to meet fast response speeds, so that they can be directly and quickly referenced when they are accessed again. So how does DNS caching work?

How DNS caching works

Before the browser makes a request to the outside world, the computer intercepts each request and looks up the domain name in the DNS cache database, which contains a list of recently used domain names and the addresses that DNS resolved for them the first time they were requested.

DNS caching method

DNS data can be cached in various locations, each of which stores DNS records for a lifetime determined by the TTL (time to live), a field carried in every DNS record.
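A TTL-based cache can be sketched as follows. This is a minimal illustration rather than the implementation used by any browser or operating system; the class name DnsCache is hypothetical, and the clock is injectable only so that expiry can be demonstrated without waiting.

```python
import time

class DnsCache:
    """Keep each address only until its TTL has run out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (address, expiry time)

    def put(self, name, address, ttl):
        self._entries[name] = (address, self._clock() + ttl)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None                 # never cached: a real lookup is needed
        address, expires = entry
        if self._clock() >= expires:
            del self._entries[name]     # TTL elapsed: drop the stale record
            return None
        return address

now = [0.0]                             # fake clock for the demonstration
cache = DnsCache(clock=lambda: now[0])
cache.put("www.example.com", "192.0.2.44", ttl=5)
print(cache.get("www.example.com"))     # 192.0.2.44
now[0] = 5.0                            # five seconds later
print(cache.get("www.example.com"))     # None
```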

Browser Cache

Today's web browsers are designed to cache DNS records for a period of time by default. The closer the DNS cache is to the web browser, the fewer processing steps are needed to check the cache and make the request to the right IP address. When a request is made for a DNS record, the browser cache is the first place checked for the requested record.

In the Chrome browser, you can use chrome://net-internals/#dns to check the status of the DNS cache. This is based on the query under Windows. After entering the above URL on my Mac computer, I cannot check the DNS and can only clear the host cache. I don’t know why. Maybe it’s due to some setting?

Operating system kernel cache

If the browser cache has no entry, the query goes to the operating-system-level DNS resolver. This resolver is the second and last local stop before the DNS query leaves your computer.

DNS Message

All DNS servers that jointly implement the DNS distributed database store resource records (RR), which provide a mapping from host names to IP addresses. Each DNS reply message contains one or more resource records. RR records are used to respond to client queries.

A resource record is a 4-tuple consisting of the following fields:

  (Name, Value, Type, TTL)

There are different types of RR. Below is a summary of the different types of RR.

RR type    Description
A          IPv4 host record, used to map a domain name to an IPv4 address
AAAA       IPv6 host record, used to map a domain name to an IPv6 address
CNAME      Alias record, used to map an alias to the canonical domain name
MX         Mail exchanger record, used to map a domain name to its mail servers
PTR        Pointer record, used for reverse lookup (IP address to domain name resolution)
SRV        Service record, used to map the services that are available
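The 4-tuple and two of the record types can be illustrated with a small in-memory example. It reuses the host names from the host aliasing example earlier in the article; the TTL values, the address 192.0.2.7 and the function name lookup_a are assumptions made for the sketch.

```python
from typing import NamedTuple

class ResourceRecord(NamedTuple):
    name: str
    value: str
    type: str
    ttl: int  # seconds

RECORDS = [
    # An alias pointing at the canonical host name.
    ResourceRecord("www.enterprise.com", "relay1.west-coast.enterprise.com", "CNAME", 300),
    # The canonical host name pointing at an IPv4 address.
    ResourceRecord("relay1.west-coast.enterprise.com", "192.0.2.7", "A", 60),
]

def lookup_a(name, records, max_hops=8):
    """Find an IPv4 address for name, following CNAME aliases if needed."""
    for _ in range(max_hops):
        for rr in records:
            if rr.name == name and rr.type == "A":
                return rr.value
        alias = next((rr for rr in records
                      if rr.name == name and rr.type == "CNAME"), None)
        if alias is None:
            return None
        name = alias.value  # continue with the canonical name
    return None

print(lookup_a("www.enterprise.com", RECORDS))  # 192.0.2.7
```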

DNS has two types of messages, one is the query message and the other is the response message, and these two messages have the same format. The following is the DNS message format

The figure above shows the DNS message format. The six fields (transaction ID, flags, number of questions, number of answer resource records, number of authoritative name server records, and number of additional resource records) form the DNS message header, which is 12 bytes long in total.

Message Header

The message header is the basic structure of the DNS message. Below we describe each field in the header.

  • Transaction ID: The transaction ID occupies 2 bytes. It is the DNS identification, also called the identifier. For the request message and the response message, the value of this field is the same. The identifier can be used to distinguish which request the DNS response message responds to.
  • Flags: The flag field occupies 2 bytes. There are many flag fields, and they are also very important. All flag fields are listed below.

The meaning of each field is as follows

  • QR (Response): The 1-bit QR identifies whether the message is a query message or a response message. QR = 0 for a query message and QR = 1 for a response message.
  • OpCode: The 4-bit OpCode is the operation code, where 0 is a standard query, 1 is an inverse query, and 2 is a server status request.
  • AA (Authoritative Answer): This 1-bit flag is only meaningful in a response message. A value of 1 means that the responding name server is authoritative for the queried name; a value of 0 means that it is not.
  • TC (Truncated): Truncation flag. A value of 1 means that the response was too long (traditionally more than 512 bytes over UDP) and has been truncated, so only the first 512 bytes are returned.
  • RD (Recursion Desired): This flag is set in the query and copied into the response. A value of 1 asks the name server to resolve the query completely on the client's behalf, which is called a recursive query. If this bit is 0 and the name server does not have an authoritative answer, it returns a list of other name servers that can answer the query. This method is called an iterative query.
  • RA (Recursion Available): Available recursion field, this field only appears in the response message. When the value is 1, it means that the server supports recursive query.
  • zero: Reserved field. Its value must be 0 in all request and response messages.
  • AD (Authenticated Data): In a response, this flag indicates that the server has authenticated the data (DNSSEC validation).
  • CD (Checking Disabled): In a query, this flag asks the server not to perform the security (DNSSEC) check.
  • rcode (Reply code): The return code, indicating the error status of the response. 0 means no error. 1 means format error: the server could not understand the request. 2 means server failure: the name server was unable to process the request. 3 means name error: the queried domain name does not exist (only meaningful when it comes from an authoritative name server). 4 means not implemented: the name server does not support this kind of query. 5 means refused: the server refuses to answer for policy reasons, for example because it does not want to serve certain requesters.
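The header layout described above can be decoded with Python's struct module. The sketch below is illustrative only: parse_header is a hypothetical helper name, the transaction ID 0xcd28 is taken from the capture discussed in this article, and the flags value 0x0100 (a standard query with RD = 1) is an assumed value chosen to match that description.

```python
import struct

def parse_header(data: bytes) -> dict:
    """Decode the 12-byte DNS header: six 16-bit big-endian fields."""
    ident, flags, qd, an, ns, ar = struct.unpack("!6H", data[:12])
    return {
        "id": ident,
        "qr": (flags >> 15) & 0x1,      # 0 = query, 1 = response
        "opcode": (flags >> 11) & 0xF,  # 4 bits
        "aa": (flags >> 10) & 0x1,
        "tc": (flags >> 9) & 0x1,
        "rd": (flags >> 8) & 0x1,
        "ra": (flags >> 7) & 0x1,
        "z": (flags >> 6) & 0x1,        # reserved, must be 0
        "ad": (flags >> 5) & 0x1,
        "cd": (flags >> 4) & 0x1,
        "rcode": flags & 0xF,           # 4 bits
        "counts": (qd, an, ns, ar),     # questions, answers, authority, additional
    }

# Hand-built header: ID 0xcd28, RD set, one question, no other records.
query = struct.pack("!6H", 0xCD28, 0x0100, 1, 0, 0, 0)
header = parse_header(query)
print(hex(header["id"]), header["qr"], header["rd"], header["counts"])
# 0xcd28 0 1 (1, 0, 0, 0)
```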

I believe that, like me, readers get little out of simply reading a list of fields. Next, we will capture packets and look at a real DNS message.

Now we can look at the specific DNS message. Through the query, we know that this is a request message. The identifier of this message is 0xcd28. Its flags are as follows

  • QR = 0 confirms that this is a request.
  • Next is the four-bit OpCode, whose value is 0, indicating that this is a standard query.
  • Because this is a query request, the AA flag is not shown.
  • Next is the truncation flag TC, whose value is 0, meaning the message has not been truncated.
  • RD = 1 indicates that a recursive answer is desired.
  • The RA flag is not shown in the request message.
  • The reserved field is zero.
  • The 0 immediately following it means that unauthenticated data is not acceptable.
  • The rcode field carries no value in a request.

Then let's look at the response message

As you can see, the transaction ID is also 0xcd28, which means this is the response to the query request above.

We will not repeat the fields already explained for the query message. Below we only explain the content that does not appear in the request message.

  • The AA field immediately following the OpCode has appeared, and its value is 0, indicating that it is not a response from an authoritative DNS server.
  • Finally, there is the response of the rcode field. When the value is 0, it means there is no error.

Question Section

The question section is the part of the message that carries the query itself. It states what the DNS request is asking for, including the query type and query class.

The meaning of each field in this section is as follows

  • Query name: Specifies the domain name to be queried, or sometimes an IP address in the case of a reverse query.
  • Query type: The resource type of the DNS query request. Usually the query type is A, which means obtaining the IP address that corresponds to the domain name.
  • Query class: Address type, usually an Internet address, with a value of 1.

Similarly, we use Wireshark to check the question section

As you can see, this is a DNS query request for mobile-gtalk.l.google.com. The query type is A, so the response type should also be A.

As shown in the figure above, the response type is A. The query class values usually seen are 1, 254, and 255, representing the Internet class, no class, and all classes, respectively. These are the values we are interested in. Other values are usually not used in TCP/IP networks.
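The three question fields have a simple wire format: the name is written as a sequence of length-prefixed labels ending in a zero byte, followed by a 2-byte type and a 2-byte class. The sketch below builds the question for the name seen in the capture; encode_name and build_question are hypothetical helper names.

```python
import struct

def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError("each label must be 1 to 63 bytes long")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"

def build_question(name: str, qtype: int = 1, qclass: int = 1) -> bytes:
    """Query name, then query type (1 = A) and query class (1 = Internet)."""
    return encode_name(name) + struct.pack("!HH", qtype, qclass)

question = build_question("mobile-gtalk.l.google.com")
print(question[:13])   # b'\x0cmobile-gtalk'
print(len(question))   # 31
```

The first byte, 0x0c, is the length of the label "mobile-gtalk" (12 characters).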

Resource Record Section

The resource record part consists of the last three sections of the DNS message: the answer section, the authoritative name server section, and the additional information section. All three use the same format, called a resource record, as shown in the following figure

The fields in the resource record section have the following meanings:

  • Domain Name: The domain name of the DNS request.
  • Type: The type of resource record, which is the same as the query type value in the question section.
  • Class: Address type, same as the query class value in the question.
  • Lifetime: In seconds, it indicates the life cycle of the resource record.
  • Resource data length: the length of the resource data.
  • Resource data: represents the data of related resource records returned according to the query segment requirements.

The resource record part only appears in the DNS response packet. Let's take a look at the specific field examples through the response message.

The domain name value is mobile-gtalk.l.google.com, the type is A, the class is 1, the lifetime is 5 seconds, the data length is 4 bytes, and the address represented by the resource data is 63.233.189.188.
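The same values can be reproduced by parsing a hand-built record. This sketch assumes that the name is stored as a 2-byte compression pointer (0xc00c, pointing back at the name in the question section), which is common in responses but is an assumption about this particular capture. The helper names skip_name and parse_a_record are made up.

```python
import socket
import struct

def skip_name(data: bytes, offset: int) -> int:
    """Return the offset just past the name that starts at offset."""
    while True:
        length = data[offset]
        if length & 0xC0 == 0xC0:   # 2-byte compression pointer ends the name
            return offset + 2
        if length == 0:             # zero-length root label ends the name
            return offset + 1
        offset += 1 + length        # skip one ordinary label

def parse_a_record(data: bytes, offset: int = 0) -> dict:
    """Decode type, class, lifetime, data length and (for type A) the address."""
    offset = skip_name(data, offset)
    rtype, rclass, ttl, rdlength = struct.unpack("!HHIH", data[offset:offset + 10])
    rdata = data[offset + 10:offset + 10 + rdlength]
    address = socket.inet_ntoa(rdata) if rtype == 1 and rdlength == 4 else None
    return {"type": rtype, "class": rclass, "ttl": ttl,
            "length": rdlength, "address": address}

# Name pointer, then type A, class 1, lifetime 5 s, 4 bytes of data.
record = b"\xc0\x0c" + struct.pack("!HHIH", 1, 1, 5, 4) + bytes([63, 233, 189, 188])
print(parse_a_record(record))
# {'type': 1, 'class': 1, 'ttl': 5, 'length': 4, 'address': '63.233.189.188'}
```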

SOA Records

A response from an authoritative DNS server can include a record that stores important information about the zone. This is the SOA (start of authority) record. All DNS zones require an SOA record to comply with IETF standards. SOA records are also important for zone transfers.

In addition to the fields in the DNS resolver response, the SOA record also has some additional fields, as follows

Specific field meaning

  • MNAME: Primary name server, which is the name of the primary name server for the zone.
  • RNAME: Responsible person's mailbox, that is, the administrator's email address. The @ sign is written as a dot, which means that admin.example.com stands for admin@example.com.
  • Serial Number: The zone's serial number, which acts as a version number for the zone data. It is increased whenever the zone changes, which tells secondary servers that their copy needs to be updated.
  • Refresh Interval: The time (in seconds) that a secondary server should wait before asking the primary server for the SOA record to see whether it has been updated.
  • Retry Interval: The time (in seconds) that a secondary server should wait before asking an unresponsive primary name server for an update again.
  • Expiration limit: If a secondary server does not receive a response from the primary server within this period of time, it should stop responding to queries for the zone.
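Two of these fields lend themselves to a short sketch: turning RNAME back into an email address, and using the refresh and expiration values to decide what a secondary server should do. All names and numbers below are hypothetical, the RNAME conversion ignores escaped dots in the mailbox name, and the retry interval is stored but not modelled.

```python
from dataclasses import dataclass

@dataclass
class SoaRecord:
    mname: str     # primary name server for the zone
    rname: str     # administrator mailbox, with "." in place of "@"
    serial: int
    refresh: int   # seconds
    retry: int     # seconds
    expire: int    # seconds

def rname_to_email(rname: str) -> str:
    """Turn the first dot of RNAME back into an @ sign."""
    local, _, domain = rname.rstrip(".").partition(".")
    return f"{local}@{domain}"

def secondary_state(seconds_since_contact: int, soa: SoaRecord) -> str:
    """Decide what a secondary server should do, given the SOA timers."""
    if seconds_since_contact >= soa.expire:
        return "expired: stop answering for the zone"
    if seconds_since_contact >= soa.refresh:
        return "refresh due: ask the primary for its SOA record"
    return "fresh"

soa = SoaRecord("ns1.example.com", "admin.example.com", 2024010101, 7200, 900, 1209600)
print(rname_to_email(soa.rname))   # admin@example.com
print(secondary_state(8000, soa))  # refresh due: ask the primary for its SOA record
```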

The primary name server and the secondary name servers are mentioned above. The relationship between them is as follows

Here we have mainly explained RR type A (IPv4) and SOA records. There are many other types, which are not covered in detail in this article. Readers can consult "TCP/IP Illustrated, Volume 1: The Protocols" and Cloudflare's website https://www.cloudflare.com/learning/dns/dns-records/. It is worth mentioning that Cloudflare's site is a very good place to learn about network protocols.

DNS Security

Almost all network requests go through a DNS query, and like many other Internet protocols, the DNS system was not designed with security in mind and has some design limitations, which create opportunities for DNS attacks.

DNS attacks mainly include the following methods

  • The first is a DoS (denial-of-service) attack, which overloads important DNS servers such as TLD servers or root name servers so that they cannot respond to requests from resolvers, making DNS queries fail.
  • The second form of attack is DNS spoofing, which changes the content of DNS resources, such as pretending to be an official DNS server and replying with fake resource records, thus causing the host to connect to the wrong IP address when trying to connect to another machine.
  • The third form of attack is DNS tunneling, which uses other network protocols to tunnel through DNS queries and responses. Attackers can use SSH, TCP, or HTTP to pass malware or stolen information into DNS queries in a way that firewalls cannot detect, thus forming a DNS attack.
  • The fourth form of attack is DNS hijacking, in which the attacker redirects queries to other name servers. This can be done through malware or unauthorized modification of DNS servers. Although the results are similar to DNS spoofing, this is a completely different attack because it targets the DNS records of websites on name servers rather than the resolver's cache.
  • The fifth form of attack is a DDoS attack, also known as a distributed denial-of-service bandwidth flooding attack. This form of attack is effectively an upgraded version of the DoS attack.

So how to defend against DNS attacks?

One of the most well-known methods of defending against DNS threats is to adopt the DNSSEC protocol.

DNSSEC

DNSSEC is also called DNS Security Extensions. DNSSEC protects the validity of data by digitally signing it, thereby preventing attacks. It is a series of DNS security authentication mechanisms provided by IETF. DNSSEC does not encrypt data, it only verifies whether the address of the site you are visiting is valid.

DNS Firewall

Some attacks are conducted against servers, which is where a DNS firewall comes in. A DNS firewall is a tool that can provide many security and performance services to DNS servers. A DNS firewall sits between a user's DNS resolver and the authoritative name server for the website or service they are trying to access. The firewall provides rate-limited access to shut down attackers who are trying to overwhelm the server. If a server does go down due to an attack or any other reason, a DNS firewall can keep the operator's site or service up and running by providing DNS responses from the cache.
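Rate limiting can be implemented in many ways. The sketch below uses a token bucket, a common general-purpose technique chosen here purely for illustration; it is not taken from any particular DNS firewall product. The class name TokenBucket and the rate and burst values are assumptions, and the clock is injectable so the behaviour can be shown without waiting.

```python
import time

class TokenBucket:
    """Allow short bursts, but only `rate` queries per second on average."""

    def __init__(self, rate: float, burst: int, clock=time.monotonic):
        self.rate = rate
        self.burst = burst
        self._clock = clock
        self._tokens = float(burst)
        self._last = clock()

    def allow(self) -> bool:
        now = self._clock()
        # Refill according to the time elapsed, never beyond the burst size.
        self._tokens = min(self.burst, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False            # over the limit: drop or delay the query

now = [0.0]                     # fake clock for the demonstration
bucket = TokenBucket(rate=1, burst=2, clock=lambda: now[0])
print([bucket.allow() for _ in range(3)])  # [True, True, False]
now[0] = 1.0                    # one second later, one token has been refilled
print(bucket.allow())           # True
```

A firewall would typically keep one such bucket per client address, for example in a dictionary keyed by source IP.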

In addition to the above two defense methods, the operator of the DNS zone itself will take further measures to protect the DNS server, such as configuring the DNS infrastructure to prevent DDoS attacks.

More information about DNS attacks and defenses is the topic of network security, which will not be introduced in detail in this article.

Summary

In this article, I used quite a few words to introduce you to the basic overview of DNS, the working mechanism of DNS, the query method of DNS, and the cache mechanism of DNS. We also used WireShark to capture packets to introduce you to DNS messages. Finally, I introduced you to the attack methods and defense methods of DNS.

This is a relatively comprehensive article about DNS. It took me more than a week to write it. After understanding this article, you should be able to answer most questions about DNS, and I think you will have a good chance of doing well in a job interview.


This article is reprinted from the WeChat public account "Programmer cxuan", which can be followed through the following QR code. To reprint this article, please contact the programmer cxuan public account.
