Network applications are the reason for the existence of computer networks. A group of early network applications mainly include email, remote access, file transfer, etc. However, with the development of computer networks and the endless needs of mankind, more and more network applications have been developed, such as instant messaging and peer-to-peer (P2P) file sharing, IP phone, video conferencing, etc. Some multi-party online games have also been developed, such as "World of Warcraft". It can be said that computer networks are the basis for the evolution of all applications. People should have a grateful heart and thank the efforts of these predecessors for making our current life so colorful. But as programmers, we should not only be able to enjoy these achievements, but also know why, so that life can be harmonious.
1. Application layer protocol principle

The core of developing a network application is writing programs that run on different end systems and communicate with each other over the network. For example, a Web application involves two different programs that communicate with each other: the browser program running on the user's host, and the Web server program running on the Web server host.

2. Network Application Architecture

There are two main types of network application architecture. The first is the client-server architecture. Here there is an always-on host, called the server, which waits for connections and serves requests from many other hosts, called clients. For example, a Web server is always waiting for requests from browsers (running on client hosts). Note that in this architecture clients do not communicate with each other; they only communicate with the corresponding server. In addition, the server has a fixed IP address. The following figure shows this architecture:

The client-server architecture has a drawback: sometimes a single server cannot keep up with the rate of client requests. For this reason, popular services are often hosted in data centers, which combine many hosts into a more powerful server. For example, search engines (Google, Bing and Baidu), Internet stores (Amazon, eBay and Alibaba), web-based email (Gmail and Yahoo), and social networks (Facebook, Instagram, Twitter and WeChat) use multiple data centers.

The other architecture is the P2P architecture. Whereas the client-server architecture relies heavily on data centers, in the P2P architecture pairs of connected hosts, called peers, communicate with each other directly. Typical applications of the P2P architecture include file sharing (BitTorrent), downloaders (Thunder), and Internet phone calls and video conferencing (Skype).
The following figure shows the P2P architecture diagram

One of the most important features of the P2P architecture is its self-scalability. For example, in a P2P file sharing application, although each peer generates workload by requesting files, each peer also adds service capacity to the system by distributing files to other peers.

3. Process Communication

We have described two architectures above: the client-server model and the P2P model. A computer can run multiple applications at the same time, so how do these applications communicate with each other? In operating system terms, it is processes rather than programs that communicate. A process can be thought of as a program running in an end system. When multiple processes run on the same end system, they communicate using an inter-process communication mechanism whose rules are determined by the operating system. Here we are not concerned with how processes on the same host communicate, but with how two processes in different end systems communicate. Let's discuss this for the two architectures.

1. Client and server processes

Network applications consist of pairs of processes that send messages to each other over the network. For example, in a Web application, a browser process exchanges messages with a Web server process; in a P2P file sharing system, a file is transferred from a process in one peer to a process in another peer. In each pair of communicating processes, one process is labeled the client and the other the server. In the Web, the browser is the client process and the Web server is the server process. As you may have guessed, in a P2P architecture a process can play both roles, acting as both a client and a server.
But in the actual communication process, it is still easy for us to distinguish, and we usually distinguish in the following way. In a communication session scenario between a pair of processes, the process that initiates communication (i.e., initiates contact with the other process at the beginning of the session) is called a client, and the process that waits for contact at the beginning of the session is called a server. 2. Interface between processes and computer networks Computers are huge and complex, and so are computer networks. An application cannot consist of only one process, but is also run by multiple processes working together and negotiating. However, how do processes distributed between multiple end systems communicate? In fact, there is a socket software interface between each process. The socket is the internal interface of the application, through which the application can send or receive data, and can open, read, write, and close it like a file. Sockets allow applications to insert I/O into the network and communicate with other applications in the network. Let's use an example to make a simple analogy between sockets and network processes: a process can be compared to a house, and its socket is equivalent to the door of the house. When a process wants to communicate with other processes, it pushes the message out of the door, and then transports the message to another house through transportation equipment, and enters the house through the door for use. The following figure is a flowchart of communication through sockets As can be seen from the figure, Socket belongs to the internal interface of the host or service process and is controlled by the application developer. Communication between two end systems will be transmitted through the TCP buffer via the network to the TCP buffer of the other end system. Socket reads messages from the TCP buffer for internal use by the application. 
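To make the client and server roles concrete, here is a minimal sketch of one exchange through sockets, written in Python as an assumption (the article itself names no language). The function names `serve_once`, `read_all`, and `exchange` are hypothetical, and both processes are simulated on the loopback address 127.0.0.1 with a thread standing in for the server process.

```python
import socket
import threading


def read_all(sock: socket.socket) -> bytes:
    """Read from the socket until the other side stops sending."""
    chunks = []
    while True:
        chunk = sock.recv(1024)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


def serve_once(listener: socket.socket) -> None:
    """Server role: wait for contact, read the message, reply, close."""
    conn, _addr = listener.accept()
    with conn:
        request = read_all(conn)
        conn.sendall(b"echo: " + request)


def exchange(message: bytes) -> bytes:
    """Run one client/server exchange over the loopback interface."""
    # The server opens its "door" first and waits for contact.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=serve_once, args=(listener,))
    server.start()

    # The process that initiates contact plays the client role.
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(message)
        client.shutdown(socket.SHUT_WR)  # signal that the request is complete
        reply = read_all(client)

    server.join()
    listener.close()
    return reply
```

Calling `exchange(b"hello")` pushes the bytes out through the client's socket and returns whatever comes back through it. The application only touches the socket; buffering and delivery are left to TCP, as described above.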
Sockets are programmable interfaces for building network applications, so sockets are also called application programming interfaces (APIs) between applications and networks. Application developers can control the internal details of sockets, but cannot control the transmission of the transport layer. They can only select the transport protocol of the transport layer and the transmission parameters of the transport layer, such as the maximum buffer and maximum message length. 3. Process addressing We mentioned above that network applications will send messages to each other, so how do you know where you should send the message? Is there a mechanism that can let you know where you can send it? This is like you want to send an email, you have written the content but you don’t know where to send it, so at this time there must be a mechanism to know the other party’s address. This mechanism can identify the other party’s unique address, which is the IP address. We will discuss the content of the IP address in detail in the following article. For now, you only need to know that IP is a 32-bit quantity that can uniquely identify the address of any host on the Internet. Is it enough to just know the IP address? We know that a computer may run multiple network applications, so how to determine which network application receives the message sent? Therefore, you also need to know the port number of the network application. For example, a web application needs to be marked with port 80, and a mail server program needs to be marked with port 25. 4. How does the application select a transportation service? We know that applications are application layer protocols that belong to the four-layer Internet protocols, and the four-layer protocols must work together to complete the work. Well, at this time we only have the application layer protocol, we need to send messages, how do we send messages? It's like you know where the destination is, how do you get there? 
Walk, take a bus, take the subway or take a taxi? There are many options for the means of transportation that applications use to send messages. We can consider whether the data transmission is reliable, throughput, timing, and security. The following are the specific things you need to consider. 1. Is data transmission reliable? We have discussed before that packets in computer networks will suffer from packet loss. The severity of the packet loss problem is related to the nature of the network application. If problems occur during the transmission of email, file transfer, remote host, or Web document, data loss may have very serious consequences. For online games and multi-person video conferencing, the impact may be relatively small. In view of this, the reliability of data transmission is also the first issue to be considered. Therefore, if a protocol provides such a service to ensure data delivery, it is considered to provide reliable data transfer, and applications that can tolerate data loss are called loss-tolerant applications. 2. Throughput In the previous article, we introduced the concept of throughput, which is the rate at which the sending process can deliver bits to the receiving process during data transmission in a network application. Applications with throughput requirements are called bandwidth-sensitive applications. Bandwidth-sensitive applications have specific throughput requirements, while elastic applications can use more or less available throughput depending on the available bandwidth at the time. 3. Timing What does timing mean? Timing can ensure that the sending and receiving of two applications in the network can be completed within the specified time. This is also a factor that applications need to consider when choosing a transportation service. This sounds natural. Your network application must have a concept of time when sending and receiving data packets. 
For example, in a game, if a data packet is delayed, your opponent may already have pushed down the tower while you are still stuck halfway there.

4. Security

Finally, the application may need the chosen transport protocol to provide one or more security services.

5. Transport services that the Internet can provide

After talking about how to select a transport service, let's talk about what services the Internet can provide. The Internet provides two transport layer protocols for applications: UDP and TCP. The following are some selection requirements for network applications. You can choose the appropriate transport layer protocol according to your needs. Let's look at the application scenarios of these two transport protocols.

1. TCP

The TCP service model has the following main characteristics:

(1) Connection-oriented service. Before application layer messages begin to flow, TCP has the client and server exchange transport layer control information. This handshake alerts the client and server so that they can prepare to receive data. After the handshake phase, a TCP connection is established. It is a full-duplex connection, that is, both processes can send and receive messages on it at the same time. When the application finishes sending messages, the connection must be torn down.

(2) Reliable data transfer. Communicating processes can rely on TCP to deliver all sent data without errors and in the proper order. Applications can rely on TCP to deliver the same byte stream to the receiving socket, with no lost or duplicated bytes.

(3) Congestion control. TCP congestion control does not necessarily bring direct benefits to the communicating processes, but it benefits the Internet as a whole. When the network between the sender and the receiver is congested, TCP congestion control throttles the sending process (client or server).
We will discuss congestion control in detail later.

2. UDP

UDP is a lightweight transport protocol that provides only minimal services. UDP is connectionless, so there is no handshake before two processes communicate. UDP also gives no guarantee that a message will reach the receiving process; like a hands-off shopkeeper, it sends the message and takes no further responsibility. In addition, messages that do reach the receiving process may arrive out of order.

The following are the protocols selected by some of the applications listed in the table above

6. Application layer protocol

Now we will discuss some application layer protocols. First, let's understand what an application layer protocol is. An application layer protocol defines how application processes running on different end systems pass messages to each other. The application layer protocol defines:
7. Application layer protocol classification
8. Web and HTTP

The Web (World Wide Web) is a global information service built on top of the Internet; its resources are identified by URLs, many of which begin with www. It is the main application of the HTTP protocol, so when we talk about the Web we are usually also talking about HTTP. HTTP is a protocol that web programmers must master, so it is worth understanding well. The name Hypertext Transfer Protocol can be divided into three terms: Hypertext, Transfer, and Protocol. The relationship between them is as follows: in terms of scope, protocol > transfer > hypertext. The following is an explanation of these three terms.

1. What is Hypertext

In the early days of computing, the information we entered could only be saved locally and could not be exchanged with other computers. The information we saved usually existed in the form of text, that is, simple characters. Text is meaningful data that a computer can parse. With the rapid development of the Internet, data can be transmitted between computers, and people are no longer satisfied with transmitting only text. They also want to transmit pictures, audio, and video, and even click on text or pictures to follow a hyperlink. The meaning of text has thus been expanded, and this expanded kind of text is called hypertext.

2. What is transfer?

As we said above, two computers are interconnected in order to communicate, and the hypertext we store is converted into binary data packets. The transmission medium (such as coaxial cable, telephone line, or optical cable) carries the binary data packets from one end system to another. This process (for a detailed explanation of the terminal, please refer to the article You said you understand the Internet, but do you know these?) is called transfer.
Usually we call the party that sends the data packet the requester, and the party that receives it the responder. The requester and the responder can exchange roles: the requester can also receive data as a responder, and the responder can also request data as a requester. The relationship between them is as follows: As shown in the figure, A and B are two different end systems that exchange information. At the beginning, A is the requester and asks to exchange information with B, and B is the responder and provides the information. Later, B can request information from A as the requester, and A can answer B's request as the responder.

3. What is a protocol?

The idea of a protocol is not limited to the Internet; in daily life a protocol is simply an agreement. For example, a couple agreeing on a place to eat is an agreement. If you succeed in applying for a job, the company signs a labor contract with you, and this employment relationship between the two parties is also an agreement. Note that a promise a person makes to himself is not an agreement; an agreement requires more than one party.

So what is a network protocol? Network protocols are specifications for transmitting and managing information in a network (including the Internet). Just as people follow certain rules when communicating with each other, computers follow certain rules when communicating with each other, and these rules are called network protocols. Without network protocols the Internet would be chaotic. In human society people cannot do whatever they want, because their behavior is constrained by the law; similarly, the end systems in the Internet cannot send whatever they want, because they are constrained by communication protocols.

So we can summarize: what is HTTP?
You can answer it with the following classic summary: HTTP is a convention and specification in the computer world for transmitting hypertext data such as text, pictures, audio, and video between two points.

9. Persistent and non-persistent connections

HTTP can use both persistent and non-persistent connections. Let's look at each in turn.

1. Non-persistent connections

Let's first discuss HTTP with non-persistent connections. Are you curious about what happens when you enter a URL in your browser? How is the content you want displayed? Let's explore this with an example. Let's assume that the URL you visit is http://www.someSchool.edu/someDepartment/home.index. When we enter the URL and press Enter, the following operations are performed inside the browser:
At this point, the whole process of typing the URL and pressing Enter is over. The above describes a simple request-response exchange; real exchanges may be much more complicated. The steps above illustrate the use of non-persistent connections, where each TCP connection is closed after the server sends the object. Each TCP connection carries exactly one request message and one response message.

2. HTTP with persistent connections

Non-persistent connections have some disadvantages. First, a new connection must be established and maintained for each requested object. For each such connection, TCP buffers must be allocated and TCP variables must be kept in both the client and the server, which puts a serious burden on a Web server that may be serving hundreds or even thousands of client requests at the same time.

When HTTP 1.1 persistent connections are used, the server keeps the TCP connection open after sending a response. Subsequent request and response messages between the same client and server can be sent over the same connection. Generally speaking, the HTTP server closes a connection if it has not been used for a certain (configurable) time interval.

3. HTTP message format

We have described the HTTP request and response process above. The process is relatively simple, but there is a lot more to explore, such as what an HTTP message looks like and how it is composed. Let's discuss it below. An HTTP message consists of three main parts:
The start line and the header fields are collectively called the request header or response header, and the message body is also called the entity or the body. The HTTP protocol stipulates that each message must have a header but may have no body; that is, the header information is required and the entity can be omitted. There must be a blank line (CRLF) between the header and the body. If I use a picture to represent it, I think it should be like this:

Let's use the example above to look at the HTTP request message. As shown in the picture, this is the request for http://www.someSchool.edu/someDepartment/home.index. We can learn a lot by looking at this HTTP message. First, we see that the message is written in plain ASCII text, which makes it readable by humans. Then, we can see that each line ends with a carriage return and line feed, and the last header line is followed by an additional carriage return and line feed. The start line of a request message consists of three fields: method, URL, and HTTP version.

4. HTTP Request Method

HTTP request methods are generally divided into 8 types, which are:
The most commonly used methods are GET and POST; for now it is enough to be aware of the others. The following is a list of methods supported by HTTP1.0 and HTTP1.1

5. HTTP Request URL

The HTTP protocol uses URIs to locate resources on the Internet. It is because of this function of URIs that resources anywhere on the Internet can be accessed. The URL carries the identifier of the requested object. In the above example, the browser is requesting the object /someDepartment/home.index. Let's parse a URL using a complete example:

http://www.example.com:80/path/to/myfile.html?key1=value1&key2=value2#SomewhereInTheDocument

This URL is quite complicated. Once you understand it, other URLs will not be a problem.

The first part is the protocol. http:// tells the browser which protocol to use. For most web resources, HTTP or its secure version, HTTPS, is used. Browsers also know how to handle other protocols. For example, mailto: instructs the browser to open an email client, and ftp: instructs the browser to handle a file transfer.

The second part is the host. www.example.com is the domain name, which indicates which host on the network the request should be sent to. You can also make a request directly to the host's IP address, but this is not common.

The third part is the port. As we mentioned earlier, a TCP connection to a process on another host is addressed by host + port. The port represents the entry point for accessing resources on a Web server. If the Web server uses the standard port of the protocol (80 for HTTP, 443 for HTTPS), this part is usually omitted. Otherwise, the port is a required part of the URI.
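As a small illustration, the example URL can be split into its parts with Python's standard `urllib.parse` module. This is only a sketch; the helper name `describe` is hypothetical, and Python is an assumption since the article names no language. Besides the protocol, host, and port discussed so far, it also shows the path, query, and fragment parts of the same URL.

```python
from urllib.parse import parse_qs, urlsplit

URL = (
    "http://www.example.com:80/path/to/myfile.html"
    "?key1=value1&key2=value2#SomewhereInTheDocument"
)


def describe(url: str) -> dict:
    """Split a URL into the components named in the text."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,           # protocol, e.g. "http"
        "host": parts.hostname,           # domain name or IP address
        "port": parts.port,               # None when the URL omits it
        "path": parts.path,               # resource path on the server
        "query": parse_qs(parts.query),   # key -> list of values
        "fragment": parts.fragment,       # anchor, kept by the browser
    }


if __name__ == "__main__":
    for name, value in describe(URL).items():
        print(f"{name}: {value}")
```

Note that `parse_qs` maps each key to a list, because a key may legally appear more than once in a query string.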
The parts above identify the server to contact; the parts that follow identify the specific resource being requested.

The fourth part is the path. /path/to/myfile.html is the path of the resource on the web server. It starts with the first / after the port and ends before the ?. Each / in between represents a level in the hierarchy. The resource requested by this URL is an html page.

Following the path are the query parameters. ?key1=value1&key2=value2 are additional parameters provided to the web server. A GET request usually carries its parameters in the URL in this way, whereas a POST request usually carries them in the message body rather than after the path. The parameters are a list of key/value pairs separated by & symbols: key1=value1 is the first pair and key2=value2 is the second.

Following the parameters is the anchor. #SomewhereInTheDocument is an anchor to some part of the resource itself. An anchor represents a kind of "bookmark" within the resource, which tells the browser to display the content located at that "bookmarked" point. For example, in an HTML document, the browser will scroll to the point where the anchor is defined; in a video or audio document, the browser will go to the time represented by the anchor. It is important to note that the part after the # sign, also called the fragment identifier, is never sent to the server with the request.
10. E-mail on the Internet

Email has been popular for as long as the Internet has existed. Like regular mail, email is an asynchronous communication medium: people can send and read emails when it is convenient for them, without having to coordinate with anyone beforehand. Modern email has many powerful features, including messages with attachments, hyperlinks, HTML formatted text and pictures. The following is a general overview of the email system

From the figure we can see that it has three main components: user agents, mail servers, and the Simple Mail Transfer Protocol (SMTP). Let's describe the process of sending and receiving mail. User agents allow users to read, reply to, forward, save, and compose messages. Microsoft Outlook and Apple Mail are examples of email user agents. When a user finishes writing an email, his user agent sends it to his mail server, where it is placed in the server's outgoing message queue. When the recipient wants to read the email, his user agent retrieves the message from his mailbox on his mail server.

Mail servers form the core of the entire mail system. Each recipient has a mailbox on a mail server, which manages and maintains the messages sent to him. A typical message starts at the sender's user agent, travels to the sender's mail server, and then to the recipient's mail server, where it is placed in the recipient's mailbox. When the recipient wants to read mail from the mailbox, the mail server authenticates the user. If the mail cannot be delivered to the recipient's server, the sender's mail server keeps it in a message queue and tries to send it again later, usually about every 30 minutes. If delivery still fails after a period of time, the server deletes the mail from the queue and notifies the sender by email.
SMTP is the main application layer protocol in Internet email. SMTP also uses TCP as the transport layer protocol to ensure the reliability of data transmission. 1. SMTP protocol transmission process To describe the basic operation of SMTP, let's look at the following common scenario. Let's assume that Alice wants to send a simple ASCII message to Bob:
The email mentioned above is actually a message, which refers to a series of ASCII codes. Before SMTP transmits emails, binary multimedia data needs to be encoded into ASCII codes for transmission. SMTP generally does not use an intermediate mail server to send mail, even if the two mail servers are located on opposite sides of the world. The TCP connection usually connects Alice's mail server directly to Bob's mail server. Now you know the general process of sending mail between two mail servers. So, how does SMTP send mail from Alice's mail server to Bob's mail server? It is mainly divided into the following three stages:
Let's analyze an actual SMTP exchange between an SMTP client (C) and an SMTP server (S). The client's host name is crepes.fr, and the server's host name is hamburger.edu. The ASCII text lines starting with C: are the lines the client sends into its TCP socket, and the lines starting with S: are the lines the server sends into its TCP socket. Once the connection is established, the following process begins
In the above example, a client sent a message ("Do you like ketchup? How about pickles?") from the mail server crepes.fr to the mail server hamburger.edu. As part of the conversation, the client sent five commands: HELO (short for HELLO), MAIL FROM, RCPT TO, DATA, and QUIT. These commands are self-explanatory, meaning that the name of each command already describes what it does.

The above is a simple SMTP exchange, including connection establishment, mail delivery and connection release. First, a TCP connection is established: the SMTP server listens for connection requests on TCP port 25. The client then sends a HELO command to identify itself, and the server responds. Next, the client sends a MAIL FROM command to indicate the sender's email address.

Several HTTP-like status codes are involved in this process. 250 means OK, similar to 200 in HTTP. When a command succeeds, the server returns code 250; when it fails, it returns a code such as 500 (the command cannot be recognized), 550 (the mailbox is unavailable), 451 (an error occurred during processing), 452 (the storage space is insufficient), or 421 (the service is not available). Code 354 tells the client to start entering the message.

SMTP has limitations. It can only carry messages in 7-bit ASCII format, so by itself it cannot carry text in languages such as Chinese, French, or German that need other characters, nor voice or video data. SMTP is supplemented by MIME, which converts non-ASCII data into Network Virtual Terminal (NVT) ASCII so that it can be transmitted through SMTP.

2. Comparison between SMTP and HTTP

HTTP is the first application layer protocol we learned and SMTP is the second, so let's compare the two.
Both protocols are used to deliver files from one host to another: HTTP delivers files from a web server to a web client (usually a browser), and SMTP delivers files (that is, email messages) from one mail server to another. There are several important differences between the two protocols.
11. DNS: the Internet's directory service

Consider a question: in how many ways can we humans be identified? We can be identified by ID cards, social security numbers, or driver's licenses. Although there are multiple means of identification, in a given context one may be more suitable than another. Like humans, hosts on the Internet can be identified in several ways. One way is by hostname, such as www.facebook.com or www.google.com. Hostnames are easy for humans to remember, but routers cannot work with them. Routers prefer fixed-length, hierarchical IP addresses. Do you still remember what an IP address is? Briefly, an IP address consists of 4 bytes and has a strict hierarchy. In an address such as 121.7.106.83, the bytes are separated by dots, and each is written as a decimal number from 0 to 255. (We will discuss IP in detail later.)

So routers prefer IP addresses, while humans prefer to remember hostnames. How, then, is a hostname that we are familiar with translated into an IP address that routers can use? This is where DNS comes in. DNS stands for Domain Name System. It is a distributed database implemented by a hierarchy of DNS servers, and it is also an application layer protocol that lets hosts query that distributed database. DNS servers are usually UNIX machines running BIND (Berkeley Internet Name Domain) software. The DNS protocol runs on UDP and uses port 53.

1. Basic DNS Overview

Like HTTP, FTP and SMTP, DNS is an application layer protocol. DNS runs between communicating end systems using the client-server model and relies on the underlying end-to-end transport protocol to carry DNS messages between them. However, DNS is not an application that users interact with directly. Instead, DNS provides a core function for user applications and other software on the Internet.
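As a brief aside on the address format described above, the 4-byte dotted notation is just a readable spelling of one 32-bit number. The sketch below uses Python's standard `ipaddress` module to convert in both directions; the helper names `to_int` and `to_dotted` are hypothetical, and Python itself is an assumption since the article names no language.

```python
import ipaddress


def to_int(dotted: str) -> int:
    """Convert dotted-decimal IPv4 text to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(dotted))


def to_dotted(value: int) -> str:
    """Convert a 32-bit integer back to dotted-decimal IPv4 text."""
    return str(ipaddress.IPv4Address(value))


if __name__ == "__main__":
    number = to_int("121.7.106.83")
    # Each byte occupies 8 bits, most significant byte first.
    print(number, to_dotted(number))
```

Each of the four decimal numbers fills 8 bits of the value, which is why each is limited to the range 0 to 255.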
DNS is usually not used on its own; it is typically invoked by other application layer protocols, including HTTP, SMTP, and FTP, to resolve user-supplied hostnames into IP addresses. The following example describes the DNS resolution process, which is similar to what a browser does when you enter a URL. What happens when you type www.someschool.edu/index.html into your browser? For the user's host to send an HTTP request message to the web server www.someschool.edu, the following steps take place:
In addition to translating hostnames into IP addresses, DNS also provides the following important services:
2. How DNS works: an overview
DNS is a complex system, and here we only look at the main aspects of its operation. Suppose that an application running on the user's host (such as a web browser or a mail reader) needs to translate a hostname into an IP address. The application calls the DNS client on that host and specifies the hostname to be translated. The DNS client then sends a DNS query message into the network, using UDP and port 53. After some delay, the DNS client on the user's host receives a DNS reply message containing the requested mapping. From the perspective of the user's host, DNS is therefore a black box whose internal operations are invisible.
In fact, the black box that implements the DNS service is very complex. It consists of a large number of DNS servers distributed around the world and an application layer protocol that defines how DNS servers and querying hosts communicate. The simplest design for DNS would be a single DNS server on the Internet that contains all the mappings. Such a centralized design is unsuitable for today's Internet, which has a huge and still growing number of hosts. A centralized design has the following problems.
Therefore, DNS cannot be designed in a centralized manner, because a centralized design does not scale. It uses a distributed design instead, which has the following characteristics. The first problem the distributed design solves is the scalability of the DNS servers. DNS uses a large number of servers, which are organized hierarchically and distributed all over the world. No single DNS server holds the mappings for all hosts on the Internet. Instead, the mappings are spread across the DNS servers. Generally speaking, there are three types of DNS servers: root DNS servers, top-level domain (TLD) DNS servers, and authoritative DNS servers. The hierarchy of these servers is shown in the figure below. Suppose that a DNS client wants to know the IP address of www.amazon.com. How do the servers above resolve it? First, the client contacts one of the root servers, which returns the IP addresses of the TLD servers for the top-level domain com. The client then contacts one of these TLD servers, which returns the IP address of an authoritative server for amazon.com. Finally, the client contacts one of the authoritative servers for amazon.com, which returns the IP address for www.amazon.com. Let's now discuss the tiers of this server hierarchy.
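The three-step walk for www.amazon.com described above can be sketched with a toy, in-memory model. This is only an illustration under stated assumptions: the server names, the dictionaries, and the address 192.0.2.10 (from a documentation-only range) are all invented, and no real DNS traffic is sent.

```python
# Toy stand-ins for the three tiers. Every name and address here is
# hypothetical and exists only to illustrate the order of the lookups.
ROOT = {"com": "tld-com.example"}
TLD = {"tld-com.example": {"amazon.com": "ns.amazon.example"}}
AUTHORITATIVE = {"ns.amazon.example": {"www.amazon.com": "192.0.2.10"}}


def resolve(hostname: str):
    """Walk root -> TLD -> authoritative and return (address, trail)."""
    labels = hostname.split(".")
    tld = labels[-1]                # e.g. "com"
    domain = ".".join(labels[-2:])  # e.g. "amazon.com"
    trail = []

    # Step 1: a root server points us at a TLD server for "com".
    tld_server = ROOT[tld]
    trail.append(f"root -> {tld_server}")

    # Step 2: the TLD server points us at an authoritative server.
    auth_server = TLD[tld_server][domain]
    trail.append(f"tld -> {auth_server}")

    # Step 3: the authoritative server returns the final mapping.
    address = AUTHORITATIVE[auth_server][hostname]
    trail.append(f"authoritative -> {address}")
    return address, trail


address, trail = resolve("www.amazon.com")
for step in trail:
    print(step)
```

Each tier only knows enough to hand the client to the next tier, which is why no single server needs the mappings for every host.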
The DNS server hierarchy consists mainly of the three types above. In addition, there is another important type of DNS server: the local DNS server. Strictly speaking, the local DNS server does not belong to the hierarchy, but it is crucial. Each ISP (Internet Service Provider), such as a residential ISP or an institutional ISP, has a local DNS server. When a host connects to an ISP, the ISP provides the host with the IP addresses of one or more of its local DNS servers. Users can easily find the IP address of their DNS server by checking the network connection settings. When the host sends a DNS request, the request goes to the local DNS server, which acts as a proxy and forwards it into the DNS server hierarchy.
3. DNS cache
The DNS cache, sometimes also called the DNS resolver cache, is a temporary database maintained by the operating system that records recent lookups of websites and other Internet domains. In other words, DNS caching is a technique by which a computer stores results it has already obtained so that they can be reused quickly when the same name is accessed again. So how does DNS caching work? DNS cache workflow: before the browser sends a request to the outside world, the computer intercepts the request and looks up the domain name in the DNS cache database, which contains a list of recently used domain names and the addresses that DNS resolved for them when they were first requested.
4. DNS records and messages
All DNS servers that jointly implement the distributed DNS database store resource records (RRs), which provide mappings from hostnames to IP addresses. Each DNS reply message contains one or more resource records, which are used to answer client queries. A resource record is a 4-tuple containing the following fields:
RRs come in different types. Here is a summary table of the different types of RRs: DNS message: DNS has two types of messages, query messages and response messages, and both have the same format. The following is the DNS message format. The message format is explained as follows:
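As a rough illustration of the shared message format, the sketch below builds the bytes of a DNS query for an A record. It assumes the classic layout from RFC 1035 (a 12-byte header of six 16-bit fields, followed by a question made of length-prefixed labels, a type, and a class); the function name `build_query` and the query ID 0x1234 are arbitrary choices for this example, and nothing is sent over the network.

```python
import struct


def build_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a DNS query message asking for the A record of hostname."""
    flags = 0x0100  # standard query with the "recursion desired" bit set
    # Header: ID, flags, then question, answer, authority, additional counts.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)

    # Question name: each label is prefixed by its length, ended by a 0 byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.split(".")
    ) + b"\x00"

    # Question type 1 (A, a host address) and class 1 (IN, Internet).
    question = qname + struct.pack("!HH", 1, 1)
    return header + question


message = build_query("www.someschool.edu")
print(len(message), message.hex())
```

A response reuses the same header layout with the answer count filled in and the resource records appended after the question, which is why one format serves both message types.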
I will publish a separate article that introduces the specific DNS records in detail.
12. P2P file distribution
The protocols discussed above all use the client-server model, which relies heavily on always-on infrastructure servers. P2P uses a peer-to-peer model instead, which depends only minimally on always-on infrastructure servers. P2P stands for peer-to-peer; it is a computer network with a distributed architecture. In a P2P system, all computers and devices are called peers, and they work with each other. Every peer in the network is equal to every other peer. There is no privileged peer in the network, and there is no master administrator device.
In a sense, peer networks are the most egalitarian networks in the computing world. Every peer is equal and has the same rights and obligations as the others. A peer is both a client and a server. In fact, every resource available in a peer network is shared among the peers without any central server. The shared resources in a P2P network can include processing power, disk storage capacity, or network bandwidth.
1. What is P2P used for
The main goal of P2P is to share resources and help computers and devices work together to provide specific services or perform specific tasks. As mentioned earlier, P2P is used to share various computing resources, such as network bandwidth or disk storage space. However, the most common use of peer networks is file sharing on the Internet. Peer networks are well suited to file sharing because they allow the connected computers to receive and send files at the same time. BitTorrent is the most widely known P2P file-sharing protocol.
2. The role of P2P network
P2P networks have some characteristics that make them useful:
13. Video streaming and content distribution network
1. Internet Video
In streaming video applications, the underlying media is pre-recorded video such as movies, TV shows, recorded sports events, or user-generated videos. These pre-recorded videos are stored on servers, and users send requests to the servers to watch them on demand. Many Internet companies now offer streaming video, including Netflix, YouTube, Amazon, and Youku. A video is a sequence of images, usually displayed at a constant rate (such as 24 or 30 images per second). An uncompressed, digitally encoded image consists of an array of pixels, with each pixel encoded into a number of bits that represent brightness and color. An important feature of video is that it can be compressed, so video quality can be traded off against bit rate.
2. HTTP Streaming and DASH
In HTTP streaming, a video is simply stored on an HTTP server as an ordinary file with a specific URL. When a user wants to watch a video, the client establishes a TCP connection with the server and sends an HTTP GET request for that URL. The server sends the video file in an HTTP response as fast as the underlying network protocols and traffic conditions allow. Although HTTP streaming has been widely deployed in practice, it has a serious flaw: all clients receive the same encoding of the video, even though the bandwidth available to a client changes dynamically and can vary greatly over time. This has led to the development of a new type of HTTP streaming, often called Dynamic Adaptive Streaming over HTTP (DASH). In DASH, the video is encoded in several different versions, each with a different bit rate. DASH allows clients with different Internet access rates to stream video at different encoding rates. A client on a 3G connection can receive a low bit rate version, and a client on a fiber connection can receive a high bit rate version.
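The rate adaptation idea behind DASH can be sketched as follows: given the available versions and a bandwidth estimate, the client requests the highest bit rate it can sustain. This is a simplified sketch rather than how any particular player works; the function name `pick_version`, the example URLs, and the 0.8 safety margin are all assumptions made for illustration.

```python
def pick_version(versions, measured_kbps, safety=0.8):
    """Pick the highest-bit-rate version that fits the bandwidth budget.

    versions is a list of (bitrate_kbps, url) pairs. Only a fraction
    (safety) of the measured bandwidth is budgeted, to leave headroom
    for fluctuations. If nothing fits, fall back to the lowest bit rate.
    """
    budget = measured_kbps * safety
    affordable = [v for v in versions if v[0] <= budget]
    if not affordable:
        return min(versions, key=lambda v: v[0])
    return max(affordable, key=lambda v: v[0])


# Hypothetical versions of one video, each with its own URL.
VERSIONS = [
    (300, "http://video.example/clip_300k"),
    (1000, "http://video.example/clip_1000k"),
    (3000, "http://video.example/clip_3000k"),
]

print(pick_version(VERSIONS, measured_kbps=1500))
print(pick_version(VERSIONS, measured_kbps=200))
```

Re-running the selection as the bandwidth estimate changes is what lets a client switch versions partway through a video.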
With DASH, each video version is stored on the HTTP server, and each version has a different URL. The HTTP server also has a manifest file that provides the URL and bit rate of each version.
3. Content distribution network
Today, many Internet video companies distribute on-demand streams at rates of megabits per second to millions of users every day. For an Internet video company, perhaps the most direct way to provide a streaming video service is to build a single, hyperscale data center, store all videos in it, and stream them from there to customers around the world. There are three problems with this approach:
To cope with the challenge of distributing huge amounts of video data to users all over the world, almost all major video streaming companies use a Content Distribution Network (CDN). A CDN manages servers in multiple geographical locations, stores copies of the videos on those servers, and attempts to direct each user request to the CDN location that will provide the best user experience. So where should the servers be placed? There are two principles of server placement. The first, enter deep, aims to get close to users: it improves the latency and throughput that users experience by reducing the number of links and routers between the end user and the CDN cluster. The second, bring home, brings the ISPs to the CDN by building large clusters at a small number (for example, 10) of key locations. A bring-home design usually incurs lower maintenance and management overhead than an enter-deep design. A CDN can be a private CDN, that is, owned by the content provider itself, or a third-party CDN, which distributes content on behalf of multiple content providers.
4. CDN distribution process
We discussed where CDN servers are placed above, so how does a CDN work? When a browser in the user's host is instructed to retrieve a specific video (identified by a URL), the CDN must be able to intercept the request in order to perform the following operations:
Most CDNs use DNS to intercept and redirect requests. Here is the specific workflow of a CDN. Suppose a content provider, NetCinema, hires a third-party CDN company, KingCDN, to distribute videos to its customers. On NetCinema's web pages, each video is assigned a URL that includes the string video and a unique identifier for the video itself. For example, a video might have the URL http://video.netcinema.com/6Y7B23V, and the process of retrieving it is as follows:
5. Cluster selection policy for CDN
The core of any CDN deployment is the cluster selection strategy, that is, the mechanism for dynamically directing clients to a server cluster or data center within the CDN. A simple strategy is to assign each client to the geographically closest cluster. This strategy ignores the fact that delay and available bandwidth on Internet paths vary over time, and it always assigns the same cluster to a given client. Another strategy is real-time measurement, in which the CDN periodically measures the delay and packet loss between its clusters and clients and selects a cluster based on the results.
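The two strategies can be contrasted with a small sketch. Everything here is an assumption made for illustration: the cluster names, the coordinates, the measurement numbers, and the rule of ignoring clusters whose loss exceeds 1% are not taken from any real CDN.

```python
import math


def distance_km(a, b):
    """Great-circle distance between two (lat, lon) points (haversine)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def closest_cluster(client, clusters):
    """Geographic strategy: always pick the nearest cluster."""
    return min(clusters, key=lambda name: distance_km(client, clusters[name]))


def best_measured_cluster(measurements, max_loss=0.01):
    """Measurement strategy: lowest delay among clusters with low loss.

    measurements maps a cluster name to (delay_ms, loss_fraction). If
    every cluster exceeds max_loss, all of them are considered instead.
    """
    usable = {n: m for n, m in measurements.items() if m[1] <= max_loss}
    pool = usable or measurements
    return min(pool, key=lambda name: pool[name][0])


# Hypothetical cluster locations as (latitude, longitude).
CLUSTERS = {
    "frankfurt": (50.11, 8.68),
    "virginia": (38.95, -77.45),
    "singapore": (1.35, 103.82),
}
# Hypothetical periodic measurements for one client: (delay ms, loss).
MEASURED = {
    "frankfurt": (80.0, 0.05),
    "virginia": (95.0, 0.0),
    "singapore": (180.0, 0.0),
}

client = (48.86, 2.35)  # a client near Paris
print(closest_cluster(client, CLUSTERS))
print(best_measured_cluster(MEASURED))
```

In this made-up scenario the nearest cluster is suffering packet loss, so the two strategies disagree, which is exactly the case the geographic strategy cannot detect.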