Preface

The countdown to Chinese New Year has begun, and today's article is the last one in the networking series. Networking knowledge comes up often in interviews, so it pays to build a solid foundation.

Here are twelve networking questions for you. Can you answer them? I have summarized some common networking questions below. If you can answer them all, you can skip this article.
The process of network communication and the protocols used along the way

I made an animation specifically for this question in an earlier article, which you can go back to: "This is how network data is transmitted (combined with animation analysis)". Here is a brief summary.

Client:
Server side:
TCP connection process: why a three-way handshake and a four-way wave?

Connection phase (three-way handshake):
SYN (synchronize sequence numbers) is the flag used when TCP establishes a connection. When it is set to 1, the segment is part of connection setup.
SEQ (sequence number) numbers the data a side sends, so the receiver can keep it in order.
ACK (acknowledgment number) tells the peer which sequence number the receiver expects next, which confirms that everything before it has arrived.
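To make the three fields concrete, here is a minimal sketch of the three segments exchanged during the handshake. The `Segment` class, the `three_way_handshake` function and the initial sequence numbers 1000 and 5000 are made up for illustration; a real TCP stack picks its initial sequence numbers itself and carries many more header fields.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """A heavily simplified TCP header (hypothetical, for illustration only)."""
    syn: bool       # SYN flag
    ack_flag: bool  # ACK flag
    seq: int        # sequence number
    ack: int = 0    # acknowledgment number, meaningful only when ack_flag is set


def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    """Return the three segments exchanged while opening a connection."""
    # 1) Client -> server: SYN carrying the client's initial sequence number.
    syn = Segment(syn=True, ack_flag=False, seq=client_isn)
    # 2) Server -> client: SYN + ACK. A SYN consumes one sequence number,
    #    so the server acknowledges client_isn + 1.
    syn_ack = Segment(syn=True, ack_flag=True, seq=server_isn, ack=syn.seq + 1)
    # 3) Client -> server: ACK for the server's SYN.
    ack = Segment(syn=False, ack_flag=True, seq=client_isn + 1, ack=syn_ack.seq + 1)
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```

The second segment is where the server's reply and the server's own SYN are merged into one message.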
Why is a three-way handshake needed? The main reason is that both parties need to confirm that their messages are actually reaching the other side.

A sends a message to B, and B replies to indicate that it was received. This confirms that A's messages get through. B sends a message to A, and A replies to indicate that it was received. This confirms that B's messages get through. In other words, four messages are enough to confirm that sending works in both directions. B's reply and B's own message can be merged into one, which is why the handshake takes three steps.

Data transmission phase:

One change in the data transmission phase is that the ACK number is no longer SEQ + 1 but SEQ + data length. For example:
This is the header information of a data transmission. ACK indicates the byte at which the next incoming segment should start, so it equals the SEQ of the segment just received plus its data length. SEQ equals the ACK carried in the peer's previous segment. TCP communication is bidirectional, so every segment that carries data has both a SEQ and an ACK.
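The SEQ and ACK bookkeeping can be sketched in a few lines. This is only an illustration under simplifying assumptions: the `exchange` function and the starting numbers 1001 and 5001 are hypothetical, nothing is lost or reordered, and each segment is assumed to arrive before the other side sends.

```python
def exchange(start_seq_a, start_seq_b, messages):
    """Trace SEQ and ACK for an alternating exchange between A and B.

    messages is a list of (sender, payload) pairs, where sender is "A" or "B".
    Returns a list of (sender, seq, ack, length) tuples.
    """
    # The next byte number each side will send.
    next_seq = {"A": start_seq_a, "B": start_seq_b}
    trace = []
    for sender, payload in messages:
        receiver = "B" if sender == "A" else "A"
        seq = next_seq[sender]    # equals the ACK the peer sent last
        ack = next_seq[receiver]  # peer's last SEQ + length of its data
        trace.append((sender, seq, ack, len(payload)))
        next_seq[sender] = seq + len(payload)
    return trace


trace = exchange(1001, 5001, [("A", b"x" * 100), ("B", b"y" * 200), ("A", b"z" * 50)])
for sender, seq, ack, length in trace:
    print(f"{sender}: SEQ={seq} ACK={ack} LEN={length}")
```

In this run B answers A's 100 bytes with ACK = 1001 + 100 = 1101, and A's next segment starts at exactly that SEQ.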
Disconnection phase (four-way wave):

As in the connection phase, the TCP header has a dedicated flag, FIN, that is used to close the connection.
MSL stands for Maximum Segment Lifetime, the longest time any segment may exist on the network. A segment that exceeds this time is discarded.

Why does closing take four steps? A sends a disconnect message to B, and B replies to indicate that it was received. This confirms that A's side has closed. B sends a disconnect message to A, and A replies to indicate that it was received. This confirms that B's side has closed. The difference from the connection phase is that B's acknowledgment and B's own disconnect message cannot be merged here. When A wants to disconnect, B may still have data to process and send, so B has to finish its normal work before sending its disconnect message.
Common status codes

200 OK - The client request succeeded.
301 Moved Permanently - The resource (web page, etc.) has been permanently moved to another URL.
302 Found - Temporary redirect.
400 Bad Request - The client request has a syntax error and cannot be understood by the server.
404 Not Found - The requested resource does not exist, for example because the URL is wrong.
500 Internal Server Error - An unexpected error occurred inside the server.
503 Service Unavailable - The server cannot currently process the client's request and may return to normal after a period of time.

Talk about the differences and scenarios between the TCP protocol and the UDP protocol

Let me start with two scenarios, which may make this easier to understand.

1) The first scenario is browsing the web. (TCP scenario)
When we visit a web page, all of its data must be displayed correctly. If a packet is lost along the way, it has to be retransmitted; showing only part of the page is not acceptable. (Data correctness)
Similarly, the content of a web page must arrive in order. In a lottery page, for example, the prize cannot be shown before the draw. (Data order)
For a process with such strict data requirements, the two parties need to establish a reliable connection: they go through the three-way handshake before any data is transmitted, and every data packet needs an acknowledgment. (Connection-oriented)
Data on such a connection is transmitted as a byte stream. Think of it as a pipe: you can send and receive data however you like, as long as it goes through the pipe.
So TCP is the choice for scenarios that need accurate data, correct order, and a stable, reliable connection.

2) The second scenario is playing games. (UDP scenario)
The most important thing in a game is real-time response. If I cast a skill and the hit reaches you long afterwards, the game is unplayable.
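The difference also shows up directly in the Socket API. Below is a small sketch using Python's standard `socket` module over the loopback interface. The helper names `udp_roundtrip` and `tcp_roundtrip` are made up for this example. UDP is used without any connection and hands back one datagram per send, while TCP first establishes a connection and then delivers a single ordered byte stream.

```python
import socket


def udp_roundtrip(messages):
    """Send each message as its own UDP datagram and return what arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as rx, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as tx:
        rx.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        rx.settimeout(2.0)
        for message in messages:
            # No handshake and no acknowledgment: just fire the datagram.
            tx.sendto(message, rx.getsockname())
        # Each recvfrom returns exactly one datagram. On a real network,
        # datagrams may be lost or reordered.
        return [rx.recvfrom(2048)[0] for _ in messages]


def tcp_roundtrip(messages):
    """Send the messages over one TCP connection and return the byte stream."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        # create_connection performs the three-way handshake.
        with socket.create_connection(listener.getsockname()) as client:
            conn, _ = listener.accept()
            with conn:
                conn.settimeout(2.0)
                for message in messages:
                    client.sendall(message)
                expected = sum(len(m) for m in messages)
                received = b""
                # TCP is a byte stream: message boundaries are not preserved,
                # so keep reading until every byte has arrived.
                while len(received) < expected:
                    chunk = conn.recv(2048)
                    if not chunk:
                        break
                    received += chunk
                return received


print(udp_roundtrip([b"skill-1", b"skill-2"]))
print(tcp_roundtrip([b"<html>", b"</html>"]))
```

Note that the TCP receiver cannot tell where one `sendall` ended and the next began. If message boundaries matter, the application has to add its own framing on top.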
If you are still a little confused, you can read this answer, which uses a very vivid metaphor (Adam and Eve): https://www.zhihu.com/question/51388497?sort=created

Socket and WebSocket

Although the two names are similar, they are actually not on the same level.
Some background: after the TCP connection is established, WebSocket performs a handshake over HTTP. The client sends an HTTP GET request that tells the server "I want to establish a WebSocket connection, please get ready". This is done by adding the relevant fields to the request headers (for example Upgrade: websocket). The server then answers "understood", switches the connection protocol to WebSocket, and the long-lived connection begins.

If the two have to be related somehow, it is that the WebSocket protocol also runs over a TCP connection, and TCP connections are used through the Socket API.

HTTPS connection establishment process

After HTTP and TCP/IP, let's talk about HTTPS. A previous article covered how HTTPS keeps data transmission secure: https://mp.weixin.qq.com/s/dbmwBVxHkvQ0fzWaSdtPYg
The main tool there is the digital certificate. Now let's look at the complete HTTPS connection establishment, also called the TLS handshake:
To establish communication, the client sends the first message, called Client Hello, after the TCP handshake. Its content includes a random number (randomC), the cipher suites (key exchange algorithm, i.e. the asymmetric encryption algorithm, plus the symmetric encryption algorithm and the hash algorithm), and a Session ID (used for session resumption). The cipher suite list tells the server which algorithms the client supports. The server compares it with the algorithms it supports itself and picks the best combination that both sides support.
The Server Hello message includes a random number (randomS), the cipher suite chosen after the comparison, and the Session ID (used for resuming the session). At this point both parties hold two random numbers; we will see what they are used for later. As mentioned above, the server decides the cipher suite and sends the three chosen algorithms back to the client. The Certificate message carries the digital certificate, which I will not go into here. The Server Hello Done message marks the end of this stage: everything the server needed to send has been sent.
1) First, the client verifies the certificate it received: the digital signature, the certificate chain, the validity period, and the certificate status.
2) After the certificate is verified, the client generates a random number, the pre-master secret, encrypts it with the server public key from the certificate, and sends it. The server decrypts it with its own private key.
3) At this point, the client and the server share three random numbers: randomC, randomS, and the pre-master secret.
4) The client and the server then use the three random numbers to generate the symmetric key with a fixed algorithm.
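Step 4 can be sketched as follows. This is a simplified illustration modeled on the TLS 1.2 pseudo-random function with HMAC-SHA256, not a usable TLS implementation: a real stack goes on to expand the resulting 48-byte master secret into separate encryption and MAC keys. The function names are chosen for this example.

```python
import hashlib
import hmac
import os


def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """HMAC-SHA256 expansion: stretch secret and seed into `length` bytes."""
    output = b""
    a = seed  # A(0)
    while len(output) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()  # A(i) = HMAC(secret, A(i-1))
        output += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return output[:length]


def derive_master_secret(pre_master: bytes, random_c: bytes, random_s: bytes) -> bytes:
    """Combine the three random values into a 48-byte master secret."""
    return p_sha256(pre_master, b"master secret" + random_c + random_s, 48)


# Both sides hold the same three values after the key exchange,
# so they compute the same secret without ever sending it.
random_c, random_s, pre_master = os.urandom(32), os.urandom(32), os.urandom(48)
client_secret = derive_master_secret(pre_master, random_c, random_s)
server_secret = derive_master_secret(pre_master, random_c, random_s)
print(client_secret == server_secret, len(client_secret))
```

Only the pre-master secret travels encrypted. randomC and randomS are sent in the clear, and their job is to make every session's key different.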
This step corresponds to the Session ID in the two hello messages above. A session ID is generated here. If the connection is dropped later, the session can be restored through this Session ID without sending the certificate and generating the key again.
Once both parties have the symmetric key, they can use it to encrypt and decrypt data and communicate normally.

Extension: why use an asymmetric encryption algorithm to negotiate a symmetric key?
First, network data transmission needs to be fast. As long as security is guaranteed, symmetric encryption is preferred over asymmetric encryption, which is more time-consuming.
Second, once symmetric encryption has been chosen for data transmission, delivering the symmetric key safely becomes the security problem. That part is handled with asymmetric encryption, together with the certificate chain mechanism, to protect the transmission of the key material.

Please explain why digital signatures are authentic and reliable

A digital signature, the electronic signature mentioned above, is an application of asymmetric encryption. A brief review of how it is used: A encrypts the hash value of the data with its private key. The resulting ciphertext is called the signature. A then sends the signature and the data itself to B. B decrypts the signature with the public key and compares the result with the hash value of the received data. If they match, the signature was indeed made by A, because only A holds the private key.

In practice, the data being transmitted is the server's public key. The hash value of that public key is signed with a different private key, and the signature is sent along with the public key. The client uses the matching public key to decrypt the signature. If the decrypted value matches the hash value of the data (the server's public key), the source is proven to be genuine and not forged.
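The sign-and-verify flow can be tried out with textbook RSA using only Python's standard library. This is a toy sketch for intuition, with two assumptions to keep in mind: the key is built from two small, publicly known primes and there is no padding, so it offers no security at all. Real systems rely on a vetted cryptography library and keys of 2048 bits or more.

```python
import hashlib

# Toy key pair built from two known Mersenne primes (insecure, demo only).
P = 2**89 - 1
Q = 2**107 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def digest_as_int(data: bytes) -> int:
    """Hash the data and map the hash to a number below the modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes) -> int:
    """A: 'encrypt' the hash of the data with the private key."""
    return pow(digest_as_int(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """B: 'decrypt' the signature with the public key and compare hashes."""
    return pow(signature, E, N) == digest_as_int(data)


message = b"server public key and host name"
signature = sign(message)
print(verify(message, signature))              # genuine data
print(verify(b"forged public key", signature))  # tampered data
```

Anyone can run `verify` because it only needs the public values `N` and `E`, but producing a matching signature requires `D`.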
Certificate chain security mechanism

A certificate authority (CA) is the authoritative institution responsible for issuing and managing digital certificates. As a trusted third party in e-commerce transactions, it is responsible for verifying the legitimacy of public keys in the public key system.

In practice, the server submits its public key and some information about itself to the CA, and the CA returns a digital certificate to the server, which includes:
The server then passes this certificate to the client during the connection phase. How does the client verify it? As attentive readers will know, every client, whether a computer or a mobile phone, ships with its own system root certificates, which cover the authority that issued the server's digital certificate. The client uses the public key in the root certificate to decrypt the signature of the digital certificate, then compares the result with the hash value of the data in the certificate. If they match, the source is genuine and the data has not been modified.

Of course, a man in the middle can also apply for a certificate from a CA. However, the certificate contains the server's host name, and this host name (domain name, IP) lets the client verify which host the data really comes from.

To expand: there is actually one more layer between the server certificate and the root certificate, called the intermediate certificate. Open any web page and click the lock icon in the upper left corner to see the certificate details. You can see that a complete SSL/TLS certificate generally has three layers:
The connection establishment process is time-consuming, so how can we optimize it?
HTTP 2.0 was first tested for interoperability in August 2013. On the open Internet, HTTP 2.0 is used only for https:// URLs, while http:// URLs continue to use HTTP/1. The goal is to increase the use of encryption on the open Internet and provide strong protection against active attacks. HTTP/2 has the following main features:
Binary framing. Data is transmitted in binary format, which is easier to parse and optimize than text.
Multiplexing. All communication with the same domain name is completed on a single connection, and a single connection can carry any number of bidirectional data streams.
Header compression. HTTP/2 uses HPACK (a compression format designed specifically for HTTP/2 headers) to compress message headers, which saves the network traffic they would otherwise occupy.
This was mentioned earlier. To avoid repeating the whole connection process when reconnecting after a disconnect, the Session ID records the session so that it can be located and reused. This removes the need to send the certificate and generate the key again.
This is an optimization proposed by Google. In the second stage of the TLS handshake, right after the client has verified the certificate and sent the pre-master secret, it sends application data along with it, such as the request for a web page. After receiving the pre-master secret, the server generates the symmetric key, uses it to decrypt the application data directly, and responds to the client. In effect, two steps are merged into one. The client does not wait for the server's confirmation before sending application data. It sends the data together with the pre-master secret in the second stage, which shortens the handshake and saves time.
OCSP is an online query service that checks the revocation status (legitimacy) of a certificate. One step of certificate verification is checking that the certificate is still legitimate. The server can query this through OCSP in advance and send the result to the client together with the certificate. The client then does not need to check the certificate's status separately, which makes the TLS handshake more efficient. This feature is called OCSP Stapling.

Extension: if we look beyond connection establishment and consider the entire HTTPS transmission process, what else can be optimized? You can take a look at this article: https://www.cnblogs.com/evan-blog/p/9898046.html

Talk about the difference between HTTP and HTTPS

After the long explanation above, the difference between the two should be very clear:
How to achieve chunked transfer and resumable downloads?

Chunked transfer

Under normal circumstances, the server closes the connection after sending all the data. To avoid this, the Connection field in the request header is generally set to keep-alive, which means the connection stays open until some message carries Connection: close.

Another way to keep a TCP connection open is to transmit the data in chunks. Chunked transfer means that the data the server sends to the client can be split into several parts and sent one after another.

How to use it:
Purpose: let the client start responding quickly and reduce waiting time, and keep the connection open.

However, chunked transfer exists only in HTTP/1.1. HTTP/2 supports multiplexing, and a single connection can carry any number of bidirectional data streams, so the chunked transfer feature is no longer needed.

Resumable download

This means the client continues downloading or uploading a file from the point where it was last interrupted. Even if a network problem interrupts the download or upload, nothing is lost, which keeps the user experience good.

How to use it:
Actual use process:
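A resume request can be sketched with the standard HTTP Range mechanism: the client sends a `Range: bytes=start-` header and the server answers `206 Partial Content` with a `Content-Range` header. The `serve_range` function below is a hypothetical, much simplified server side that only understands a single `bytes=start-` or `bytes=start-end` range and serves from memory.

```python
def serve_range(data: bytes, range_header: str):
    """Return (status, headers, body) for a single-range request.

    Suffix ranges (bytes=-500) and multiple ranges are not handled here.
    """
    unit, _, spec = range_header.partition("=")
    start_text, _, end_text = spec.partition("-")
    unsatisfiable = (416, {"Content-Range": f"bytes */{len(data)}"}, b"")
    if unit.strip() != "bytes" or not start_text.strip().isdigit():
        return unsatisfiable
    start = int(start_text)
    # An open-ended range ("bytes=4-") runs to the last byte.
    end = int(end_text) if end_text.strip().isdigit() else len(data) - 1
    end = min(end, len(data) - 1)
    if start > end:
        return unsatisfiable
    body = data[start:end + 1]
    headers = {
        "Content-Range": f"bytes {start}-{end}/{len(data)}",
        "Content-Length": str(len(body)),
    }
    return 206, headers, body


# The client already has the first 4 bytes and asks for the rest.
resource = b"0123456789"
already_downloaded = resource[:4]
status, headers, body = serve_range(resource, f"bytes={len(already_downloaded)}-")
print(status, headers["Content-Range"], already_downloaded + body == resource)
```

The client simply appends the returned body to what it already has on disk. A careful client would also check that the file has not changed on the server since the first attempt.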
What are the ways to transfer pictures via HTTP?

This question is really about understanding Content-Type. There are three methods:
Form type file transfer request. Set content-type to multipart/form-data to send binary format files. Supports uploading multiple files and text parameters. This is the most common practice.
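For reference, this is roughly what such a request body looks like on the wire. The `build_multipart` helper is written for this example and hand-assembles the body to show the format. In real code an HTTP client library normally does this for you, and field names or file names containing quotes would need escaping, which is skipped here.

```python
import uuid


def build_multipart(fields, files):
    """Build a multipart/form-data body.

    fields: dict of name -> text value
    files:  dict of name -> (filename, content_type, content_bytes)
    Returns (content_type_header, body_bytes).
    """
    boundary = uuid.uuid4().hex  # random separator between the parts
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n'
            f"\r\n"
            f"{value}\r\n".encode("utf-8")
        )
    for name, (filename, content_type, content) in files.items():
        head = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
            f"Content-Type: {content_type}\r\n"
            f"\r\n"
        ).encode("utf-8")
        # The file bytes are written as-is, with no text encoding.
        parts.append(head + content + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))  # closing boundary
    return f"multipart/form-data; boundary={boundary}", b"".join(parts)


content_type, body = build_multipart(
    {"title": "my cat"},
    {"image": ("cat.png", "image/png", b"\x89PNG...")},
)
print(content_type)
print(body)
```

Because every part has its own headers, text parameters and several binary files can travel in the same request.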
This method is to directly convert the image into a binary stream for transmission, and the server side can directly read the data in the stream and convert it into an image. But this method has a disadvantage that only one picture can be uploaded at a time.
Another way is to convert the image into a Base64 string and transmit that. It is sent just like an ordinary text parameter, with a Content-Type such as application/x-www-form-urlencoded or text/plain.

This article is reprinted from the WeChat public account "Ma Shang Ji Mu". To reprint this article, please contact the WeChat public account "Ma Shang Ji Mu".