The transport layer sits between the application layer and the network layer. It is the fourth layer of the OSI model and an important part of the network architecture. It is mainly responsible for end-to-end communication across the network and plays a vital role in communication between applications running on different hosts. Let's discuss the protocols of the transport layer.

Transport Layer Overview

The transport layer of a computer network is much like a highway. A highway carries people or goods from one end to the other, while the transport layer carries messages from one end to the other. Here, "end" refers to an end system. In a computer network, any device that can exchange information can be called an end system, such as mobile phones, network media devices, computers, and operators' equipment.

While carrying messages, the transport layer follows certain protocol specifications, such as the limit on how much data can be sent in one transmission and the choice of transport protocol. The transport layer lets two otherwise unrelated hosts communicate logically, so that it looks as if the two hosts were directly connected.

Transport layer protocols are implemented in end systems, not in routers. Routers only identify addresses and forward packets. This is like a courier delivering a package: the courier only carries it to the address, and what happens next is up to the recipient at that address, that is, the person in Room xxx, Unit xxx, Building xxx.

How does TCP determine which port a message is for? Remember the structure of a data packet? Let's review it here.

As a data packet passes through each layer, that layer's protocol attaches its own header to the packet. A complete packet header diagram is shown above.
After the data is handed to the transport layer, a TCP header is attached to it, which contains the source port number and the destination port number.

At the sending end, the transport layer converts the messages it receives from the sending application process into transport layer packets, which are called segments in computer networking. The transport layer generally splits the application message into smaller pieces, adds a transport layer header to each piece to form a segment, and sends it toward the destination.

For sending, the available transport layer protocols (that is, the means of transportation) are mainly TCP and UDP. The choice between these two protocols and their characteristics are the focus of our discussion.

TCP and UDP prerequisites

In the TCP/IP protocol suite, the most representative protocols implementing the transport layer are TCP and UDP. To talk about them, we must first define them.

TCP stands for Transmission Control Protocol. The name tells us that TCP controls transmission, and this control is what makes it reliable. TCP provides a reliable, connection-oriented service to the application layer and can deliver data reliably to the receiver.

UDP stands for User Datagram Protocol. As the name suggests, UDP focuses on datagrams. It gives the application layer a way to send datagrams directly, without establishing a connection.

Why are there so many terms in computer networking for a piece of data?

Different layers use different names. As mentioned above, transport layer packets are called segments, and TCP packets are also called segments. UDP packets, however, are called datagrams, and network layer packets are also called datagrams.
However, for the sake of uniformity, TCP and UDP packets are both generally called segments in computer networking. This is simply a convention, and there is no need to worry too much about the naming.

Sockets

Before TCP or UDP sends a message, the message has to pass through a door. This door is the socket. The socket connects upward to the application layer and downward to the transport and network layers. An operating system provides interfaces (application programming interfaces, or APIs) to applications and to hardware. In computer networking, the socket is likewise an interface, with its own API.

When communicating over TCP or UDP, the socket API is widely used. It is used to set the IP address and port number and to send and receive data.

So there is no inherent coupling between sockets and TCP/IP. Sockets simply make TCP/IP easier to use. How? You can directly use the following methods of the socket API.

Socket Type

There are three main types of sockets. Let's take a look at each one.
Socket Processing

To communicate over a computer network, at least two end systems are required, and therefore at least one pair of sockets. The following is the communication process of the socket.
Just as file descriptors are used to access files, socket descriptors are used to access sockets.
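The communication process can be sketched on a single machine. The following is a minimal illustration, not the article's own code: it assumes Python's standard `socket` module, uses the loopback address `127.0.0.1`, and the helper name `run_echo_server` is made up for this example. It also prints the integer socket descriptor that the operating system hands back.

```python
import socket
import threading


def run_echo_server(listener: socket.socket) -> None:
    """Accept one connection and echo back whatever arrives (hypothetical helper)."""
    conn, _peer = listener.accept()       # blocks until a client connects
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)


# Server side: create the socket, bind it to an address, start listening.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))           # port 0 asks the OS for any free port
listener.listen(1)
server_addr = listener.getsockname()

worker = threading.Thread(target=run_echo_server, args=(listener,))
worker.start()

# Client side: create the socket, connect, send, receive, close.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.settimeout(5.0)
client.connect(server_addr)
client_fd = client.fileno()               # the integer socket descriptor
client.sendall(b"hello")
reply = client.recv(1024)
client.close()

worker.join()
listener.close()
print("socket descriptor:", client_fd, "reply:", reply)
```

The server and client each hold one socket of the pair; everything the application does with the network goes through those two descriptors.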
Although the sockets API sits between the application layer and the transport layer in the communication model, it is not itself part of that model. The sockets API allows applications to interact with the transport layer and the network layer.

Before we continue, let's take a short detour and talk briefly about IP.

Let's talk about IP

IP is short for Internet Protocol, the network layer protocol of the TCP/IP suite. IP was originally designed to solve two kinds of problems:

Improve network scalability: achieve large-scale network interconnection.
Decouple the application layer from the link layer so that the two can evolve independently.

IP is the core of the TCP/IP protocol suite and the foundation of the Internet. To achieve large-scale interconnection, IP emphasizes adaptability, simplicity, and operability, and sacrifices some reliability. IP guarantees neither delivery time nor reliable delivery: packets may be lost, duplicated, delayed, or delivered out of order.

The layer below TCP is IP. Since IP is unreliable, how can we make sure the data arrives accurately? That is a matter of TCP's transmission mechanisms, which we will cover later when we discuss TCP.

Port Number

Before talking about port numbers, let's talk about file descriptors and the relationship between sockets and port numbers.

To make resources easier to use and to improve the performance, utilization, and stability of the machine, our computers run a layer of software called the operating system, which manages the resources the computer can use. When a program wants to use a resource, it asks the operating system, which then allocates and manages the resource on the program's behalf.
Usually, when a program wants to access a kernel device or a file, it calls a system function. The system opens the device or file and returns a file descriptor, fd (an integer ID). From then on, the program accesses the device or file only through this descriptor, which can be thought of as a number standing for the open file or device.

When a program wants to use the network, it needs the corresponding kernel operations and the network card, so it asks the operating system, which creates a socket and returns the socket's ID. From then on, the program only has to operate on this socket ID to use network resources. Every process that communicates over the network has at least one socket. Writing data to the socket is equivalent to sending data to the network, and reading from the socket is equivalent to receiving data. Each of these sockets also has an identifier: the port number.

A port number is a 16-bit non-negative integer ranging from 0 to 65535. This range is divided into three port number segments and is allocated by the Internet Assigned Numbers Authority (IANA).
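As a rough illustration of the three segments, the sketch below uses the ranges commonly published by IANA: 0 to 1023 (well-known), 1024 to 49151 (registered), and 49152 to 65535 (dynamic or private). The function name `classify_port` and the label strings are assumptions made for this example.

```python
def classify_port(port: int) -> str:
    """Return the name of the IANA range a port number falls into.

    The label strings are illustrative; only the numeric boundaries matter.
    """
    if not 0 <= port <= 65535:
        raise ValueError(f"port must fit in 16 bits, got {port}")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "dynamic/private"


for example_port in (22, 80, 8080, 49152):
    print(example_port, classify_port(example_port))
```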
A computer can run multiple applications. When a segment arrives at the host, which application should it be handed to? How do we know that it goes to the HTTP server rather than the SSH server?

Is it decided by the port number? When a message reaches the server, the port number distinguishes the applications, so it seems the port number should be enough.

Let me give an example that refutes this. If two pieces of data arrive at the server and both were sent from port 80, how do you tell them apart? Or if two pieces of data arrive at the server on the same port but over different protocols, how do you tell them apart?

Clearly, the port number alone is not enough to identify a message.

On the Internet, packets are generally distinguished by the source IP address, destination IP address, source port number, and destination port number (together with the protocol in use). If any of these differs, the segments are treated as belonging to different communications. These values are also the basis for multiplexing and demultiplexing.

Determine the port number

Before actual communication, the port number has to be determined. There are two ways to do so:

Standard port numbers

Standard port numbers are statically assigned. Each program has its own port number, and each port number has its own purpose. A port number is a 16-bit number between 0 and 65535. Port numbers in the range 0 to 1023 are the statically assigned ones: for example, HTTP uses port 80, FTP uses port 21, and SSH uses port 22. Port numbers of this kind have a special name: Well-Known Port Numbers.

Dynamically assigned port numbers

The second way is dynamic allocation. Here the client application does not need to set its port number itself. The operating system allocates a non-conflicting port number to each application.
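Dynamic allocation can be observed directly. In the sketch below (an illustration assuming Python's standard `socket` module and the loopback interface), neither client sets a source port, so the operating system picks one for each connection. Because both connections go to the same server address, the two source ports have to differ.

```python
import socket

# A listening socket on a port chosen by the OS (port 0 means "any free port").
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(2)
server_addr = listener.getsockname()

# Two connections from the same client host. Neither socket calls bind(),
# so the OS assigns each one its own source port.
first = socket.create_connection(server_addr, timeout=5.0)
second = socket.create_connection(server_addr, timeout=5.0)

first_port = first.getsockname()[1]
second_port = second.getsockname()[1]
print("server port:", server_addr[1])
print("client source ports:", first_port, second_port)

for sock in (first, second, listener):
    sock.close()
```

The exact port numbers vary from run to run and between operating systems; only the fact that they differ is guaranteed here.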
This dynamic allocation of port numbers makes it possible to tell connections apart even when the TCP connections are initiated by the same client.

Multiplexing and Demultiplexing

We have seen that every socket on a host is assigned a port number. When a segment arrives at the host, the transport layer checks the destination port number in the segment and directs it to the corresponding socket. The data in the segment then passes through the socket into the process attached to it. Delivering the data of an arriving segment to the correct socket is called demultiplexing. Gathering data from the different sockets at the sender, adding headers to form segments, and passing them to the network layer is called multiplexing.

There are two kinds of multiplexing and demultiplexing: connectionless and connection-oriented.

Connectionless multiplexing and demultiplexing

The developer writes code that decides whether a socket uses a well-known port number or a dynamically assigned one. Suppose a process using port 10637 on host A wants to send data to port 45438 on host B over UDP. The data is generated at the application layer, processed by the transport layer, and then encapsulated by the network layer into an IP datagram. The IP datagram is delivered to host B over the link layer on a best-effort basis, and host B then examines the port number in the segment to determine which socket it belongs to. This series of processes is shown below.

A UDP socket is identified by a two-tuple consisting of a destination IP address and a destination port number. Therefore, if two UDP segments have different source IP addresses and/or different source port numbers but the same destination IP address and destination port number, both are directed to the same destination process through the same socket.

Here is a question to think about: when host A sends a message to host B, why does B need to know the source port number?
For example, if I tell a girl that I am interested in her, does she need to know which part of me sent the message? Isn't it enough for her to know that I am interested? In fact she does need to know, because if she wants to show that she is interested too, she might answer with a kiss, and then she needs to know where to deliver it. In other words, in a segment from A to B, the source port number serves as part of the return address: when B needs to send a segment back to A, it uses the source port number of the A-to-B segment as the destination port, as shown in the following figure.

Connection-oriented multiplexing and demultiplexing

If connectionless multiplexing and demultiplexing refer to UDP, then connection-oriented multiplexing and demultiplexing refer to TCP. The difference between them lies in how a socket is identified: a UDP socket is identified by a two-tuple, while a TCP socket is identified by the four-tuple mentioned above, namely source IP address, destination IP address, source port number, and destination port number. When a TCP segment arrives at a host from the network, the host uses these four values to demultiplex it to the corresponding socket.

The figure above shows the process of connection-oriented multiplexing and demultiplexing. In the figure, host C sends two HTTP requests to host B, and host A sends one HTTP request to host B. Hosts A, B, and C each have their own unique IP address. Host B can demultiplex the two HTTP connections from host C because the two requests use different source port numbers, so to host B they are two separate requests. Host A and host C have different IP addresses, so host B can also tell their requests apart.

UDP

Finally, we come to the UDP protocol. Let's go!

UDP stands for User Datagram Protocol. UDP gives applications a way to send encapsulated IP datagrams without establishing a connection.
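A small loopback experiment shows both points at once: UDP needs no connection, and the source port works as the return address. This is a sketch that assumes Python's standard `socket` module; the variable names are chosen for this example only.

```python
import socket

# Receiver: a UDP socket bound to an OS-chosen port on the loopback interface.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(5.0)
receiver_addr = receiver.getsockname()

# Sender: no bind(), no connect(), no handshake. The first sendto() makes
# the OS assign a source port implicitly.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.settimeout(5.0)
sender.sendto(b"ping", receiver_addr)
sender_port = sender.getsockname()[1]

# recvfrom() returns the payload plus the sender's (IP, port) pair,
# which the receiver then uses as the destination of its reply.
payload, return_addr = receiver.recvfrom(1024)
receiver.sendto(b"pong", return_addr)
reply, _ = sender.recvfrom(1024)

print("return address seen by receiver:", return_addr)
print("reply received by sender:", reply)

sender.close()
receiver.close()
```

Without the source port in the segment, the receiver would have no way to address the reply to the right socket on the sending host.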
If the application developer chooses UDP instead of TCP, the application deals almost directly with IP. UDP takes the data passed from the application, attaches the source and destination port number fields used for multiplexing and demultiplexing along with a few other fields, and passes the resulting segment to the network layer. The network layer encapsulates the segment in an IP datagram and delivers it to the target host on a best-effort basis. The key point is that when a datagram is sent to the target host over UDP, no handshake takes place between the sender's and receiver's transport layer entities. This is why UDP is called a connectionless protocol.

UDP Features

UDP is commonly used as the transport layer protocol for streaming media, voice communication, and video conferencing. The familiar DNS protocol also runs over UDP. The main reasons why these applications or protocols choose UDP are as follows:
Note that an application that uses UDP is not necessarily unreliable. An application can achieve reliable data transfer itself by adding acknowledgment and retransmission mechanisms. The biggest advantage of using UDP is therefore its speed.

UDP message structure

Let's look at the structure of a UDP message. Each UDP message consists of two parts: the UDP header and the UDP data area. The header consists of four 16-bit (2-byte) fields, which hold the source port, destination port, message length, and checksum.
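The header layout, and the 16-bit ones' complement sum behind the checksum field, can be sketched with Python's `struct` module. This is a simplified illustration rather than a full UDP implementation: the helper names are invented for the example, and the checksum here covers only the header and payload, whereas real UDP also includes a pseudo-header with the IP addresses.

```python
import struct


def ones_complement_sum16(data: bytes) -> int:
    """Add 16-bit big-endian words, wrapping any carry back into the low bit."""
    if len(data) % 2:
        data += b"\x00"                    # pad odd-length data with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)   # end-around carry
    return total


def checksum16(data: bytes) -> int:
    """Invert the ones' complement sum: every 1 becomes 0 and every 0 becomes 1."""
    return ~ones_complement_sum16(data) & 0xFFFF


def pack_udp_header(src_port: int, dst_port: int, length: int, checksum: int = 0) -> bytes:
    """Pack the four 16-bit header fields in network byte order."""
    return struct.pack("!HHHH", src_port, dst_port, length, checksum)


payload = b"hello"
length = 8 + len(payload)                  # 8-byte header plus data

# Simplification: checksum over header and payload only (no pseudo-header).
draft = pack_udp_header(10637, 45438, length) + payload
segment = pack_udp_header(10637, 45438, length, checksum16(draft)) + payload

fields = struct.unpack("!HHHH", segment[:8])
receiver_sum = ones_complement_sum16(segment)
print("header fields:", fields)
print("receiver-side sum: {:016b}".format(receiver_sum))
```

On the receiving side, summing every 16-bit word including the checksum gives all ones when nothing was corrupted, which is the check the worked example in the text describes.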
In the worked example, the first two 16-bit words are added first. Then the third 16-bit word is added to that result. This last addition overflows. In ones' complement arithmetic the carry out of the most significant bit is wrapped around and added back to the least significant bit. The result is then inverted, turning every 1 into a 0 and every 0 into a 1. The inverse of 1000 0100 1001 0101 is therefore 0111 1011 0110 1010, which is the checksum. At the receiving end, all four 16-bit values, including the checksum, are added together. If the data contains no errors, the result is 1111 1111 1111 1111. Any other result means that an error was introduced during transmission.

Here is a question to think about: why does UDP provide error detection at all?

This follows the end-to-end design principle, which states that the probability of the various errors that can occur during transmission should be reduced to an acceptable level.

Transferring a file from host A to host B, that is, communication between hosts A and B, takes three steps: first, host A reads the file from disk and groups the data into packets. Then the packets travel to host B over the network connecting the two hosts. Finally, host B receives the packets and writes them to disk.

This seemingly simple process is actually complex, and many things can disrupt it: disk read and write errors, buffer overflows, memory errors, network congestion, and so on. Any of these can corrupt or lose packets, which shows that the network used for communication is unreliable.

Since communication goes through these three steps, can we add an error detection and correction mechanism to one of them to check the data?

The network layer is not the place for it, because its main job is to move data as quickly as possible. It does not need to concern itself with data integrity.
The integrity and correctness of the data can be left to the end systems to check. In data transmission, therefore, the network layer can only be asked to provide best-effort delivery, and it cannot be expected to guarantee data integrity.

UDP is unreliable because, although it provides error detection, it cannot recover from errors and has no retransmission mechanism.

This article is reprinted from the WeChat public account "Programmer cxuan". To reprint this article, please contact the Programmer cxuan public account.