Some time ago, I was busy with a robot communication project that relied on an important protocol family: TCP/IP (Transmission Control Protocol/Internet Protocol). I have always found the design of TCP/IP ingenious. It is arguably the most important communication protocol on the planet, and it is everywhere, from WeChat instant messaging to automatic control in industries such as aerospace. It is thanks to TCP/IP that every message we send over the Internet can reach the other party intact. (For example, have you ever wondered what happens to your message after you open WeChat in Beijing and send a greeting to a friend in Shanghai, and why it arrives accurately?) The story of TCP/IP is half the history of computer networking, so I think everyone who works in or around computing should spend some time studying it.

The simplest approach is the best one. Today we will use the packet capture tool Wireshark to capture some packets, then take them apart and gradually uncover the mystery of TCP/IP!

In December 1997, US President Bill Clinton awarded the National Medal of Technology to Robert Kahn (center), one of the fathers of TCP/IP.

The Internet originated from a decentralized command system that the U.S. Department of Defense set out to research during the U.S.-Soviet rivalry. It is made up of countless nodes. When some nodes are destroyed, the remaining ones can still communicate with each other through other nodes. Because of this mesh-like topology, the information you send from Beijing to Shanghai can take multiple paths. The magic of TCP/IP is that it ensures your information still arrives accurately.

What is a "protocol"? The Chinese word is "协议". "协" originally refers to cooperation among multiple people, and "议" was a style of writing in classical Chinese. Together, they mean a written form that multiple people cooperate on and abide by.
Later, the meaning was extended to a rule that both parties agree to abide by. There are many agreements in everyday life, such as the tripartite agreement you sign when you graduate or the lease you sign when you rent a house. Going back further, after Qin Shihuang unified China, he decreed "same track for vehicles, same script for writing" to make communication across the country easier. This was probably the earliest national standard in Chinese history, and it too is a kind of protocol. Likewise, in spy movies both sides exchange passwords when passing on information. These are agreed on in advance, and both sides must abide by them.

Protocols make communication much easier. For example, when we make a phone call we usually say "Hello?" and wait for the other party to answer before the real conversation starts. The same is true of TCP. It performs a three-way handshake to establish a connection, and the two parties exchange data only after both have confirmed that everything is in order:

Three-way handshake diagram

With that diagram in mind, let's start our packet capture journey!

Preparation before packet capture

Step 1: download the packet capture tool Wireshark. It is easy to find through a search engine such as Baidu and can be downloaded directly.

Step 2: ping Baidu's domain name, www.baidu.com, from our computer to learn the IP address of a Baidu server. Baidu certainly has more than one server in the country, so the address each person gets may differ. The address I got is 180.101.49.12.
Step 3: open Wireshark, select your network interface in the capture options, and click Start. Wireshark will begin capturing packets.

Step 4: open a browser, type the IP address of the Baidu server we just pinged into the address bar, and press Enter. The Baidu homepage opens. In fact, a URL we enter is normally resolved to an IP address by a DNS server before it is accessed. What we want to capture is the messages exchanged between the two sides after we press Enter.

Step 5: enter ip.addr == 180.101.49.12 and tcp in Wireshark's display filter box to filter out irrelevant traffic:

We can see that a lot of packets were captured, but we only need the first three, because those are the messages exchanged while the connection is being established (the three-way handshake).

Before analyzing these three messages, let's recall the message structure at each layer of the TCP/IP network model:

Each layer of the model adds a header to the data it receives and passes the result to the next layer:

This is the model for sending a WeChat message (everything that runs over TCP/IP works this way). Of course, WeChat does not transmit directly from one phone to another. The message first goes to a Tencent server and is then forwarded to the other party, but the principle is the same. Each layer encapsulates the data it receives, usually by adding a header, and passes it to the layer below. Finally, the transmission medium of the physical layer (such as optical fiber) carries it to the other computer. The receiver then works in reverse, peeling off the header of each layer in turn until only the data part remains, which is the actual data we want.
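The wrapping and unwrapping described above can be sketched as a toy model in Python. This is only an illustration under loud assumptions: the "headers" here are short labeled placeholders (TCP|, IP|, ETH|) that I made up, not real header layouts, and the function names encapsulate and decapsulate are hypothetical.

```python
# Toy model of layered encapsulation.
# ASSUMPTION: the "headers" are labeled placeholder bytes for illustration,
# not real TCP, IP, or Ethernet header formats.
LAYERS = [b"TCP|", b"IP|", b"ETH|"]  # order in which headers are added when sending


def encapsulate(payload: bytes) -> bytes:
    """Sender side: each layer prepends its header and hands the result down."""
    frame = payload
    for header in LAYERS:
        frame = header + frame
    return frame


def decapsulate(frame: bytes) -> bytes:
    """Receiver side: peel the headers off in the reverse order."""
    for header in reversed(LAYERS):
        if not frame.startswith(header):
            raise ValueError(f"expected header {header!r}")
        frame = frame[len(header):]
    return frame


wire = encapsulate(b"hello")
print(wire)               # outermost header comes first on the wire
print(decapsulate(wire))  # the receiver recovers the original data
```

The point of the sketch is the ordering: the last header added (the Ethernet one) is the first thing the receiver sees, which is why a capture has to be read from the lowest layer upward.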
Therefore, every packet we capture sits inside an Ethernet frame at the bottom of the stack: a frame header and trailer wrapped around the IP header, the TCP header, and the data. So we must start the analysis from the lowest layer.

Data Link Layer

The Ethernet frame header structure at the data link layer is as follows (for ease of description, the field widths in the picture are not drawn to scale):

A brief explanation of several fields:
Preamble: alerts the receiving system that a frame is coming. It is the pattern 10101010 repeated seven times.
Start frame delimiter (SFD): indicates that the frame proper begins with the next byte. Its bit pattern is 10101011.
Length/Type: either the frame length (in bytes) or an identifier of the upper-layer protocol. Which one it is depends on the type of Ethernet frame.

Let's first look at the data of the first handshake.

The first red box, b0 95 8e 0b 15 38, is the destination MAC address. Strictly speaking, this is the address of the next hop on my local network (normally the gateway router) rather than the network card of the Baidu server itself, because a MAC address is only meaningful within a single link. The second red box, 10 63 c8 ff ff ff, is the source MAC address, which is the network card address of my own computer. The last three bytes, ff ff ff, are placeholders, and I also blurred them in the picture above, because I would rather not publish my real network card address. The third red box, 08 00, indicates that the payload is IPv4.

Wait, you may ask, why are the preamble and the start frame delimiter not displayed? This is because the network card strips the preamble and the start frame delimiter first and then performs a CRC check on the frame. If the check fails, the frame is discarded. If it passes, the card decides whether the destination hardware address of the frame meets its receiving conditions (the destination is its own hardware address, the broadcast address, a multicast address it accepts, and so on). If it does, the frame is handed to the device driver for further processing. Only then can the packet capture software see the data. Therefore, what the packet capture software captures is the frame without the preamble, the start frame delimiter, and the FCS.

Network Layer

Next, let's look at the network layer, that is, the header structure of the IP layer:

Here is a brief explanation of each field:
Version: 4 for IPv4, 6 for IPv6 (whose header structure of course differs from IPv4).
Header Length (IHL): the total length of the IP header. It is usually 20 bytes, but the optional Options part can extend it, so it ranges from 20 to 60 bytes. Note that the unit of this field is the 32-bit word (one 32-bit word is 4 bytes), so when the field is 1111 (15) the header has its maximum length of 60 (15*4) bytes. Pay attention to the unit of this field. It is unusual and easy to get wrong.
Type of Service (TOS): indicates the priority of the message.
Total Length: the total length of the IP header plus data, in bytes. The field is 16 bits wide, so an IP packet can be at most 65535 bytes. An Ethernet link typically allows only 1500 bytes (the MTU), and a larger packet has to be fragmented. Since a packet consists of the IP header (20 to 60 bytes) and the data, a single packet on such a link never carries more than 1480 bytes of data.
Identification: the value used to reassemble a fragmented IP packet, together with the Flags and Fragment Offset fields. If the original packet is larger than the MTU (typically 1500 bytes on Ethernet), it must be fragmented so that each fragment fits within the MTU. In the classic implementation, the IP software keeps a counter in memory. Every time a packet is produced, the counter is incremented by 1 and its value is written to the Identification field. After fragmentation, every fragment carries the same identification value as the original packet. Together with the source and destination addresses, this lets the receiver tell which fragments belong together and reassemble them into the original data.
Flags: three bits of information about fragmentation. The first bit is not used. The second bit is Don't Fragment (DF). When the DF bit is 1, routers must not fragment the packet. The third bit is More Fragments (MF). When a packet is fragmented, every fragment except the last has its MF bit set to 1 and the last fragment has it set to 0, so the receiver knows it has reached the final fragment when it sees MF set to 0. For example, if a packet is split into two fragments, the flags of the first are 001 and the flags of the second are 000.
Fragment Offset: the position of this fragment's data within the original packet, measured in units of 8 bytes, so that the receiver can put the fragments back in order.
Time to Live (TTL): the maximum number of routers a packet is allowed to pass through, that is, the number of hops it may take. The default initial TTL varies between operating systems. Its purpose is to prevent packets from being forwarded forever when a routing loop forms. The TTL is reduced by 1 each time the packet passes a router, and when it reaches 0 the packet is discarded.
Protocol: identifies the upper-layer protocol whose data the packet carries, for example 1 for ICMP, 6 for TCP, and 17 for UDP.
Header Checksum: the value used to check whether the IP header has been damaged. It covers only the IP header, not the data.
Options: an optional field (0 to 40 bytes). It is rarely used and serves purposes such as control, forwarding requirements, and testing.

There is a lot of information in the network layer, so we will pick out only a few important items. The c0 a8 00 65 in the second line converts to 192.168.0.101 in dotted decimal, which is the IP address of my computer. Note that this is a LAN (private) address. The b4 65 31 0c at the end converts to 180.101.49.12, which is the destination IP, that is, the IP address of the Baidu server.

Transport Layer

The job of the transport layer is to make sure that data gets from the sending node to the destination node reliably. If we look at its header structure, we can already see the handshake information it carries.
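Before moving on to the TCP fields, here is a minimal Python sketch of how the Ethernet and IPv4 header fields discussed so far could be unpacked from raw bytes. The function names parse_ethernet and parse_ipv4 are my own, and SAMPLE_FRAME is hand-built for illustration: only the two MAC addresses, the 08 00 type, and the two IP addresses come from the capture in this article. The remaining header values (total length, identification, TTL, and a zero checksum) are made up, and IP options are ignored.

```python
import socket
import struct


def parse_ethernet(frame: bytes) -> dict:
    """Unpack the 14-byte Ethernet header (no preamble, SFD, or FCS)."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return {
        "dst_mac": dst.hex(":"),   # bytes.hex(sep) needs Python 3.8+
        "src_mac": src.hex(":"),
        "ethertype": ethertype,    # 0x0800 means IPv4
    }


def parse_ipv4(packet: bytes) -> dict:
    """Unpack the fixed 20-byte part of an IPv4 header (options are ignored)."""
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    return {
        "version": ver_ihl >> 4,
        "header_len": (ver_ihl & 0x0F) * 4,         # field counts 32-bit words
        "total_len": total_len,
        "id": ident,
        "df": bool(flags_frag & 0x4000),            # Don't Fragment bit
        "mf": bool(flags_frag & 0x2000),            # More Fragments bit
        "frag_offset": (flags_frag & 0x1FFF) * 8,   # field counts 8-byte units
        "ttl": ttl,
        "protocol": proto,                          # 1 ICMP, 6 TCP, 17 UDP
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }


# ASSUMPTION: hand-built sample. MACs, type, and IPs are from the article;
# total length 52, id 0x1234, DF set, TTL 64, and checksum 0 are invented.
SAMPLE_FRAME = bytes.fromhex(
    "b0958e0b1538" "1063c8ffffff" "0800"   # dst MAC, src MAC, type
    "45000034" "12344000" "40060000"       # ver/IHL, TOS, length, id, flags, TTL, proto, checksum
    "c0a80065" "b465310c"                  # source IP, destination IP
)

eth = parse_ethernet(SAMPLE_FRAME)
ip = parse_ipv4(SAMPLE_FRAME[14:])  # the IP header starts right after the Ethernet header
print(eth)
print(ip)
```

Two details from the field descriptions show up directly in the code: the header length is multiplied by 4 because the field counts 32-bit words, and the fragment offset is multiplied by 8 because it counts 8-byte units.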
The two header fields that matter most for the handshake are:
Sequence number: indicates the position of this segment's first byte of data within the whole stream of data being sent.
Acknowledgment number: indicates the sequence number of the next data the receiver expects to receive.

Let's go back to the captured packet data, peel off one more layer by removing the IP header, and look at the first handshake:

First handshake: the client randomly generates an initial sequence number, here a large value, sets the SYN flag in the control bits to 1, and sends the segment to the server.

Second handshake: after the server receives the message from the client, it responds to the client. Since the server is now the sender, the source and destination port numbers are swapped. The server also randomly generates a sequence number of its own. It adds 1 to the sequence number it received from the client in the first handshake and sends the result as the acknowledgment number. Both the ACK and SYN flags are set to 1.

Third handshake: the client responds again after receiving the server's reply. As you can see, its sequence number is the acknowledgment number the server sent in the second handshake, and its acknowledgment number is the sequence number the server sent in the second handshake plus 1.

In this way, the three-way handshake is completed. The two parties have established a connection and can communicate with each other.

Well, that's all for today. In fact, the three-way handshake is just the tip of the iceberg in TCP. Actual transmission involves much more, such as timeout retransmission, checksums, and the window mechanism. But let's stop here today. Readers who are new to this can digest it first, and I will cover the rest when I have time.
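As a recap of the sequence and acknowledgment arithmetic in the three handshakes, here is a small Python sketch. It only models the numbers and flags and does not open a real connection. The function name three_way_handshake and the example initial sequence numbers 1000 and 5000 are assumptions chosen for readability, not values from the capture.

```python
SEQ_MODULUS = 2 ** 32  # sequence numbers are 32 bits wide and wrap around


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three handshake segments as dicts of flags, seq, and ack."""
    # 1st handshake: client -> server, SYN only. The ACK flag is not set,
    # so the acknowledgment field carries no meaning yet (modeled as 0).
    syn = {"flags": {"SYN"}, "seq": client_isn, "ack": 0}

    # 2nd handshake: server -> client, SYN and ACK.
    # The server acknowledges the client's sequence number plus 1.
    syn_ack = {
        "flags": {"SYN", "ACK"},
        "seq": server_isn,
        "ack": (client_isn + 1) % SEQ_MODULUS,
    }

    # 3rd handshake: client -> server, ACK only.
    # seq equals the ack number the server just sent;
    # ack equals the server's sequence number plus 1.
    ack = {
        "flags": {"ACK"},
        "seq": syn_ack["ack"],
        "ack": (server_isn + 1) % SEQ_MODULUS,
    }
    return [syn, syn_ack, ack]


# ASSUMPTION: small fixed ISNs for readability; real ones are random 32-bit values.
for step, segment in enumerate(three_way_handshake(1000, 5000), start=1):
    print(step, sorted(segment["flags"]), segment["seq"], segment["ack"])
```

When comparing this with a capture, keep in mind that the numbers in the capture are whatever random initial values the two hosts picked, but the plus-1 relationships between the three segments should be the same.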