Most readers will be familiar with DNS (Domain Name System), which converts a website's domain name into the corresponding IP address. When we find that we can use QQ but cannot browse the web, we suspect that the domain name server is down; when we use a hosts file provided by others to reach a web page that "does not exist", we get a sense of how fragile the domain name resolution system is.
However, there are many more stories about DNS that are worth hearing and thinking about.

DNS Origin

To access a computer on the Internet, we must know its IP address, but these addresses (such as 243.185.187.39) are just strings of numbers with no pattern, so they are hard to remember. And if a computer changes its IP, it has to notify everyone. Obviously, using IP addresses directly is a poor solution. So people came up with an alternative: give each computer a name, then maintain a mapping between computer names and addresses. We access the computer by name, and the name-to-address conversion is completed automatically by the computer.

In the early days, the conversion from name to address was very simple. Each computer kept a hosts file that listed all computer names and their corresponding IP addresses, and regularly updated it from a site that maintained this file. When we accessed a computer by name, the IP was first looked up in the hosts file, and then a connection could be established.

[Figure: hosts file management]

This is what the early ARPANET did, but as the network grew, the approach became unsustainable, for three main reasons:

1: The hosts file becomes very large;
2: Host names conflict;
3: The centralized maintenance site is overwhelmed (it is scary to think about serving hosts files to millions of machines).

To solve these problems, Paul Mockapetris proposed the Domain Name System (DNS) in 1983. It is a hierarchical, domain-based naming scheme implemented as a distributed database system. When we need to access a domain name (in effect, the computer name mentioned above), the application sends a DNS request to a DNS server, and the DNS server returns the IP address corresponding to the domain name.
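The hosts-file lookup used in the early days can be sketched roughly as follows. This is only an illustration: the function name `parse_hosts` and the sample entries are made up for this example, and a real hosts file has more edge cases than are handled here.

```python
def parse_hosts(text):
    """Build a name -> IP table from hosts-file text (one 'IP name [alias...]' per line)."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # Keep the first entry seen for a name; later duplicates are ignored.
            table.setdefault(name.lower(), ip)
    return table


# Hypothetical entries, for illustration only.
SAMPLE_HOSTS = """\
# name-to-address table
10.0.0.1   alpha
10.0.0.2   beta beta-alias
"""

hosts = parse_hosts(SAMPLE_HOSTS)
print(hosts.get("beta-alias"))  # 10.0.0.2
print(hosts.get("gamma"))       # None: unknown names simply cannot be resolved
```

Every machine needs the complete table for this to work, which is exactly why the approach stopped scaling.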
DNS solves the three problems of the hosts file approach as follows:

1: Not all name-to-IP mappings are stored on the user's computer, which keeps the hosts file from growing too large (the hosts file in each operating system is now nearly empty by default).
2: Naming rules for domain names ensure that host names are not duplicated.
3: The DNS server is no longer a single machine, but a hierarchical, well-organized cluster of servers.

The process of accessing a domain name can be simplified as shown below:

[Figure: domain name resolution process with hosts and DNS]

DNS protocol

So how is this domain name system implemented? Managing a very large and constantly changing set of domain-name-to-IP mappings is not simple, not to mention handling tens of thousands of DNS query requests. People eventually came up with a good set of protocols that specify how to build such a system. Let's take a look.

First, we need naming rules that prevent duplicate domain names. The DNS rules for domain names resemble the express delivery system in daily life: both use a hierarchical address structure. In the express delivery system, if you want to mail something to someone, the address may be: No. 12, Zhongshan West Road, Panyu District, Guangzhou City, Guangdong Province, China. A domain name looks like this: groups.google.com (Why not com.google.groups? I guess it follows the Western habit of writing addresses from the most specific part to the most general one).

On the Internet, the top level of the domain name hierarchy (equivalent to the country part of an international delivery address) is managed by ICANN (Internet Corporation for Assigned Names and Numbers). Currently there are more than 250 top-level domain names, each of which can be divided into subdomains (second-level domain names), which can be divided further (third-level domain names), and so on.
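To make the hierarchy concrete, here is a minimal sketch that lists the levels of a domain name from the top-level domain downwards. The helper name `domain_hierarchy` is made up for this example.

```python
def domain_hierarchy(name):
    """Return the levels of a domain name, from the top-level domain down to the full name."""
    labels = name.rstrip(".").lower().split(".")
    # Walk from the rightmost label (most general) to the leftmost (most specific).
    return [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]


print(domain_hierarchy("groups.google.com"))
# ['com', 'google.com', 'groups.google.com']
```

Reading the result left to right corresponds to reading a delivery address from the country down to the street.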
All these domain names can be organized into a tree, as shown in the following figure (picture from Computer Networks: 7-1):

[Figure: domain name space tree]

DNS was originally designed to map domain names to IP addresses. In theory, we only need to save one record on the domain name server for each domain name. The record here is generally called a domain name resource record, which is a five-tuple and can be expressed in the following format:

Domain_name Time_to_live Class Type Value
where:

Domain_name: indicates which domain name this record applies to;
Time_to_live: indicates the lifetime of the record, that is, how long the record may be cached at most (the caching mechanism is discussed later);
Class: generally always IN;
Type: the type of the record;
Value: the value of the record. For an A record, the value is an IPv4 address.

As we can see, the domain name resource record has a Type field that indicates the type of the record. Why is it needed? Because a domain name usually has more than just an IP address associated with it; other kinds of records may be needed as well. Common record types include A (IPv4 address), AAAA (IPv6 address), NS (name server for a zone), CNAME (alias pointing to another name), MX (mail server) and TXT (free-form text).

We cannot use a single domain name server to answer all DNS queries, because no single machine has the computing power, storage and bandwidth to serve users around the world. We can only organize a cluster of domain name servers that work together to provide domain name resolution. The first problem is how to distribute all domain name resource records sensibly across different domain name servers.

As mentioned above, the domain name space can be organized as a tree. We can further divide it into non-overlapping areas (DNS zones). For the domain name space in the figure above, one possible division is as follows:

[Figure: domain name space divided into zones]

Each zone is then associated with several domain name servers (one of them is the master; the others are slave servers that provide data backup, speed up resolution and keep the service available). These servers are called the authoritative name servers of the zone, and they store two types of domain name resource records:

1: Resource records for all domain names in the zone.
2: Resource records (mainly NS records) for the domain name servers of the parent zone and the child zones.
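The five-tuple and the two kinds of records held by a zone's authoritative server can be sketched as follows. This is a simplified illustration, not a real zone-file parser: the names `ResourceRecord`, `parse_record` and `lookup` are made up, and the example.com records with 192.0.2.x addresses are invented placeholders.

```python
from typing import NamedTuple


class ResourceRecord(NamedTuple):
    domain_name: str
    time_to_live: int  # seconds the record may be cached
    rr_class: str      # almost always "IN"
    rr_type: str       # e.g. "A", "NS"
    value: str


def parse_record(line):
    """Parse one 'Domain_name Time_to_live Class Type Value' line."""
    name, ttl, rr_class, rr_type, value = line.split(None, 4)
    return ResourceRecord(name.lower(), int(ttl), rr_class, rr_type, value.strip())


# Hypothetical zone data: records for names inside the zone,
# plus an NS record that delegates a child zone to another server.
ZONE_TEXT = """\
example.com.       86400 IN NS ns1.example.com.
ns1.example.com.   86400 IN A  192.0.2.1
www.example.com.   3600  IN A  192.0.2.10
sub.example.com.   86400 IN NS ns.sub.example.com.
"""

records = [parse_record(line) for line in ZONE_TEXT.splitlines() if line.strip()]


def lookup(records, name, rr_type):
    """Return the values of all records matching a name and type."""
    return [r.value for r in records
            if r.domain_name == name.lower() and r.rr_type == rr_type]


print(lookup(records, "www.example.com.", "A"))   # ['192.0.2.10']
print(lookup(records, "sub.example.com.", "NS"))  # ['ns.sub.example.com.']
```

The NS record for the child zone is what lets a resolver continue its search one level further down the tree.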
In this way, all domain name resource records are spread over multiple domain name servers, and the servers themselves form a hierarchical index structure, which makes domain name resolution convenient later on. The following uses a simplified domain name space as an example to illustrate how resource records are stored in domain name servers, as shown in Figure a:

[Figure a: domain name servers]

The domain name space in the figure is divided into seven DNS zones: A, B, C, D, E, F, and G. Each DNS zone has several authoritative domain name servers, which store many resource records. For DNS zone E, the records stored in its authoritative domain name server are shown in the table in the figure.

If you look closely at the figure, you may notice that zones A and B have no parent zone, and no path connects them. This leads to a troublesome problem: the authoritative domain name server of zone A may not know that zone B exists at all. A natural fix is to record the address of B's domain name server in A and A's in B, so that the two are connected. However, with more than 250 top-level domain names, every top-level zone would have to keep records for all the others, which is not practical. The domain name system takes a smarter approach: it introduces the root domain name server, which stores the authoritative domain name server records of all top-level zones. Through the root domain name server we can find the authoritative domain name servers of all top-level zones, and from there work down one level at a time. The following figure shows the distribution of the global root domain name servers.
[Figure: distribution of the global root domain name servers]

So far, the authoritative domain name servers and the root domain name server form a tree, with the root domain name server as the root and each node below it being an authoritative domain name server for a zone. The authoritative domain name servers of the DNS zones in Figure a form the following tree (in practice, an authoritative domain name server may store records for several DNS zones, so the connections between authoritative domain name servers do not strictly form a tree. For details, see RFC 1034: 4. NAME SERVERS. It is simplified to a tree below for ease of understanding):

[Figure: name server tree]

Domain name resolution

We now have a cluster of domain name servers that properly stores the correspondence between the domain name space and the domain name resource records. All that remains is to send a DNS request to a domain name server and wait for it to return the correct resource records. This process is called domain name resolution.

Strictly speaking, domain name resolution begins when the network connection is established, because every time you connect to a network, the computer automatically obtains a default DNS server. Of course, you can also use a DNS server you trust, such as 8.8.8.8 (yes, DNS servers can be trusted or untrusted; we will talk about this in the practice section). We call this domain name server the local domain name server.

Next, when we need the resource record for a domain name, we send a request to the local domain name server. If the domain name happens to be in a DNS zone under the jurisdiction of the local domain name server, the record can be returned directly. If the record is not found on the local domain name server, the entire domain name space has to be searched for the domain name.
The resource records of the entire domain name space are stored in a hierarchical, tree-like series of domain name servers, so the local domain name server must start its search at the root domain name server and work downwards. This raises a question: how does the local domain name server know where the root domain name server is? In fact, when a domain name server starts, it loads a configuration file that stores the NS records of the root domain name servers (note that root domain name server addresses are very stable, rarely change, and are few in number, so this configuration file is very small). Once the root domain name server is found, the search can proceed down one level at a time.

Still taking Figure a as an example, suppose a user in zone E wants to access math.sysu.edu.cn. The request process is as follows:

[Figure: domain name resolution process]

A simple description in words:

1: User: Hello, local domain name server, tell me the address of math.sysu.edu.cn;
2: Local domain name server: Oh, I don't know, it's not in my jurisdiction, let me ask Big Brother. Root, can you tell me the address of math.sysu.edu.cn?
3: Root domain name server: Busy, please ask B (.cn);
4: Local domain name server: Hello, B, tell me the address of math.sysu.edu.cn;
5: B: Go ask D (.edu.cn);
6: Local domain name server: Hello, D, tell me the address of math.sysu.edu.cn;
7: D: Ask F (sysu.edu.cn);
8: Local domain name server: Hello, F, tell me the address of math.sysu.edu.cn;
9: F: Let me take a look. Oh, I found it. It's XXXX.
10: Local domain name server: I finally found it after searching for so long. Hello user, I found it, it is XXXX.

If you think about it, this works just like express delivery.
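The conversation above can be simulated with a small sketch. Everything here is a toy model under stated assumptions: the `SERVERS` table mirrors the zones B, D and F from the walkthrough, the address 192.0.2.53 is an invented placeholder for the unknown XXXX, and no real DNS messages are sent.

```python
# Hypothetical delegation data modelled on the walkthrough (root -> B -> D -> F).
SERVERS = {
    "root": {"referrals": {"cn": "B"}, "answers": {}},
    "B": {"referrals": {"edu.cn": "D"}, "answers": {}},
    "D": {"referrals": {"sysu.edu.cn": "F"}, "answers": {}},
    "F": {"referrals": {}, "answers": {"math.sysu.edu.cn": "192.0.2.53"}},
}


def query(server, name):
    """Ask one server: returns ("answer", ip), ("referral", next_server) or ("not_found", None)."""
    data = SERVERS[server]
    if name in data["answers"]:
        return "answer", data["answers"][name]
    best = None
    for zone, target in data["referrals"].items():
        # Prefer the longest delegated zone that the name falls under.
        if name == zone or name.endswith("." + zone):
            if best is None or len(zone) > len(best[0]):
                best = (zone, target)
    if best is not None:
        return "referral", best[1]
    return "not_found", None


def resolve_iteratively(name, max_steps=10):
    """Play the role of the local name server: follow referrals starting at the root."""
    server, path = "root", []
    for _ in range(max_steps):
        path.append(server)
        kind, value = query(server, name)
        if kind == "answer":
            return value, path
        if kind == "not_found":
            return None, path
        server = value  # a referral: ask the next server the same question
    raise RuntimeError("too many referrals")


ip, path = resolve_iteratively("math.sysu.edu.cn")
print(path)  # ['root', 'B', 'D', 'F']
print(ip)    # 192.0.2.53
```

Note that the local name server repeats the same question at every step; each server on the way only answers with a referral to a server closer to the name.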
Suppose you mail something from the United States to Panyu District, Guangzhou City: it is first delivered to China (although there is no transit station like the root domain name server here), then passed down to Guangdong Province, then Guangzhou City, and then Panyu. The above is the iterative resolution process of the local domain name server. Recursive queries can also be used, but I will not explain them here; the principle is similar.

Cache mechanism

Now the entire domain name system can provide us with domain name resolution. When we enter a domain name, the computer sends a DNS request, and the DNS server returns the resolution result. Everything seems to work well. But can it work better? Looking back at how we usually browse websites, we can draw two interesting conclusions:

1: 80% of the time we are looking at the same 20% of websites; this is the famous 80/20 rule;
2: We jump between different pages of one website, that is, we access the same domain name over and over, which resembles the principle of locality in program access.

These two conclusions naturally suggest caching. If we cache the resolution results of the domain names we have visited on our own computers, we can read the result directly on the next visit without repeating the DNS query, saving trouble for ourselves and for the domain name server. Of course, this requires that the cached results do not change frequently, that is, that resolving a domain name ten minutes from now gives the same result as resolving it now. This holds for most domain names. However, there are inevitably some "fickle" domain names whose resolution results change frequently.
To make the cache mechanism work for both cases, the domain name resource record carries a Time_to_live field that indicates how long the record may be cached at most. Domain names that are "stable as a mountain" get a relatively large value, while those that change all the time get a small one.

Since we can cache locally, can we also cache on the domain name server? The answer is of course yes, because the two conclusions above hold for the domain name server as well. A domain name server can therefore cache the resource records it has looked up. When a user asks again, the cached result can be returned directly, without iterative or recursive resolution.
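A cache that honours Time_to_live can be sketched in a few lines. This is a minimal illustration, not how any particular resolver implements it: the class name `TTLCache`, the injectable `clock` parameter (used so the example can advance time by hand) and the example names and addresses are all assumptions made for this sketch.

```python
import time


class TTLCache:
    """Cache resolution results until their time to live runs out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (value, expiry time)

    def put(self, name, value, ttl):
        self._entries[name] = (value, self._clock() + ttl)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # expired: the caller must resolve again
            return None
        return value


# A hand-driven clock so the example does not have to wait in real time.
now = [0.0]
cache = TTLCache(clock=lambda: now[0])
cache.put("stable.example", "192.0.2.1", ttl=86400)  # "stable as a mountain"
cache.put("fickle.example", "192.0.2.2", ttl=60)     # changes often

now[0] = 600.0  # ten minutes later
print(cache.get("stable.example"))  # 192.0.2.1
print(cache.get("fickle.example"))  # None
```

The same structure works on the user's computer and on a domain name server; only the number of cached names differs.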