Preface

I've been reading about HTTP recently, mainly the book "HTTP in Pictures", which I think is pretty good, so I took some notes based on my own understanding. I had also read "HTTP: The Definitive Guide" before, but that book covers too much and is not suitable for beginners. For beginners I recommend "HTTP in Pictures".

What is HTTPS

HTTPS (full name: Hyper Text Transfer Protocol over Secure Socket Layer) is an HTTP channel designed with security as its goal. Simply put, it is the secure version of HTTP. HTTPS adds a security layer (SSL or TLS) between the application layer (HTTP) and the transport layer (TCP). Its purpose is to address several shortcomings of the HTTP protocol:
Several Disadvantages of HTTP

Communications are in plain text (not encrypted), so the content may be eavesdropped.
The HTTP protocol does not encrypt the communication or the data; everything is sent in plain text. An HTTP request passes through many routers, proxies, and other devices on its way. If someone captures the packets at any point along that path (which is not difficult), they can read the request data. If the request contains sensitive data such as a bank card number, mobile phone number, or password, that data can leak.

The integrity of the message cannot be proven, so it may have been tampered with.
Because the HTTP protocol cannot prove the integrity of a message, there is no way to detect whether a request or response was altered between the moment it was sent and the moment the other party received it. In other words, there is no way to confirm that the request/response received is the same as the request/response sent.

The identity of the communicating party is not verified, so impersonation is possible.
The HTTP protocol itself is very simple: no matter who sends the request, a response is returned. If the identity of the communicating party is not confirmed, risks such as the following exist:
Even meaningless requests will be accepted.
DoS attacks caused by massive numbers of requests cannot be prevented.

HTTP + encryption + integrity protection + authentication = HTTPS

Let's take a look at how HTTPS solves the problems above.

Using communication encryption to solve the problem of plain text transmission in HTTP

Unlike HTTP's plain text communication, HTTPS communication is encrypted. Even if someone captures the packets in transit, they cannot read their content because they do not have the key, so the transmitted data is protected.
In HTTPS communication, the client and the server each hold a copy of the same communication key (call it key A). When the client sends a request, it uses key A to encrypt the request into ciphertext. After receiving the request, the server uses key A to decrypt it, obtains the plaintext sent by the client, and processes it. The response works the same way in the other direction. Because the data is encrypted in transit, anyone who captures the packets cannot decrypt them without key A; all they see is garbled bytes.

An encryption method that uses the same key for both encryption and decryption is called symmetric encryption (also called shared-key encryption, because the key is shared). Content encrypted with key A can only be decrypted with key A; no other key will work. Commonly used symmetric encryption algorithms include DES, 3DES, and AES.

There is another encryption method called asymmetric encryption (also called public key encryption). Asymmetric encryption uses two keys, a public key and a private key. The public key is public and may be known to everyone. The private key is kept secret and is known only to its owner.
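The "same key on both sides" idea can be illustrated with a small sketch. This is a toy cipher written only for these notes (a SHA-256 based keystream XORed with the data), not AES and not what HTTPS actually uses; the names `keystream`, `xor_cipher`, and `key_a` are made up for the example, and a real cipher would also need a fresh nonce per message.

```python
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """XOR the data with the keystream; the same call encrypts and decrypts."""
    stream = keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


key_a = secrets.token_bytes(32)             # the shared "key A"
request = b"card=1234&password=secret"

ciphertext = xor_cipher(key_a, request)     # client encrypts with key A
plaintext = xor_cipher(key_a, ciphertext)   # server decrypts with the same key A

# Someone who captured the ciphertext but holds a different key gets garbage.
wrong_key = secrets.token_bytes(32)
garbled = xor_cipher(wrong_key, ciphertext)

print(plaintext == request, garbled == request)
```

The point of the sketch is only that one shared key both encrypts and decrypts, and that a party without that key cannot recover the request.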
HTTPS uses not only symmetric encryption but also asymmetric encryption. During HTTPS communication, the client holds the public key and the server holds the private key. Asymmetric encryption is used for several key operations, such as verifying identity, negotiating the symmetric key used for communication, and encrypting the digests sent along with the data.
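To make the public/private key relationship concrete, here is a minimal sketch using textbook RSA with tiny primes. It is for illustration only: the numbers, the helper `apply_key`, and the sample values are assumptions of this example, and real implementations use keys of 2048 bits or more together with padding schemes.

```python
# Textbook RSA with tiny primes (insecure, illustration only).
p, q = 61, 53
n = p * q                  # modulus shared by both keys
phi = (p - 1) * (q - 1)
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent (needs Python 3.8+)

public_key = (e, n)        # known to everyone
private_key = (d, n)       # known only to the server


def apply_key(key, value):
    """Raise `value` to the key's exponent modulo n."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


# Direction 1: anyone encrypts with the public key, only the server can decrypt.
secret = 1234                                   # must be smaller than n
encrypted = apply_key(public_key, secret)
decrypted = apply_key(private_key, encrypted)

# Direction 2: the server "encrypts" (signs) with the private key,
# and anyone can check the result with the public key.
digest_value = 999
signature = apply_key(private_key, digest_value)
recovered = apply_key(public_key, signature)

print(decrypted == secret, recovered == digest_value)
```

The first direction is what is used to send a secret to the server; the second is the basis of signatures, since only the holder of the private key could have produced a value that the public key maps back to the digest.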
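As shown in the figure above, during the request and response process a message digest is sent along with the encrypted data. This digest can be used to verify whether the data has been tampered with. Take the response as an example: the server computes a digest of the response data and sends it together with the data, and the client computes its own digest over the data it receives.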
The client then compares the digest it calculated with the digest sent by the server. If the two match, the response data received by the client is the same as the response data sent by the server. In other words, the data is complete: nothing was lost and nothing was tampered with.

The premise, however, is that the digest sent by the server has not itself been tampered with. If someone altered the data and replaced the digest to match, the comparison would still succeed and the change would go unnoticed. Therefore, HTTPS uses asymmetric encryption to encrypt the digest so that it cannot be tampered with.
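The digest comparison described above can be sketched as follows. SHA-256 is an assumption of this example (the notes do not name a hash algorithm), and `digest` and `is_intact` are hypothetical helper names.

```python
import hashlib
import hmac


def digest(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_intact(received_data: bytes, received_digest: str) -> bool:
    """Recompute the digest on the receiving side and compare it."""
    return hmac.compare_digest(digest(received_data), received_digest)


# Server side: send the data together with its digest.
response = b"balance=100"
sent_digest = digest(response)

# Client side, three cases.
ok = is_intact(response, sent_digest)               # nothing changed
tampered = is_intact(b"balance=999", sent_digest)   # data changed, digest not
forged = is_intact(b"balance=999", digest(b"balance=999"))  # both replaced

print(ok, tampered, forged)
```

The third case passes the check even though the data was altered, which is exactly why an unprotected digest is not enough on its own.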
The client then compares the digest it calculated with the decrypted digest sent by the server. If the two match, the response data received by the client is the same as the response data sent by the server. In other words, the data is complete: nothing was lost and nothing was tampered with.

Why can asymmetric encryption ensure that the digest sent by the server is not modified? Only the server has the private key, so only the server can encrypt with it (this operation is usually called signing). Data encrypted with the private key can only be decrypted with the matching public key; conversely, anything that decrypts correctly with the public key must have been encrypted with the private key. So if the client can decrypt the encrypted digest with the public key, the digest was encrypted by the server. Since the private key is known only to the server, nobody else can produce a valid replacement for the encrypted digest.

I actually have a question here. In the request direction, the client would encrypt the digest with the public key and the server would decrypt it with the private key. That only proves the digest was encrypted with the public key, and the public key is public, so couldn't someone else tamper with the digest? (As far as I can tell, real TLS avoids this problem: after the handshake, each message is protected by a message authentication code keyed with secrets shared only by the client and the server, rather than by a digest encrypted with the public key.)

Verify the server's identity using a digital certificate

How does HTTPS confirm the authenticity of the server? In other words, how do we confirm that the server the client is talking to is the one our domain name points to, and not another server pretending to be it? HTTPS solves this with asymmetric encryption. The server has a private key that only the server holds.
Therefore, as long as the client has the server's public key, the server can encrypt data with its private key and send it to the client. If the client can decrypt that data, the public key and private key are a matching pair, and since the public key belongs to the server we want, the server is genuine.

Then the question arises: how does the client get the server's public key? Build it into the client? With so many websites, that is not realistic. Send it to the client when the connection is established? That is in fact what HTTPS does. But the public key could be tampered with while it is being sent. If it were replaced with another server's public key, the client would from then on be communicating with an impostor. What then?

HTTPS introduces an authoritative third party to guarantee that the public key really belongs to the server. To use HTTPS, the server administrator obtains a certificate from a CA (certificate authority). The CA packages the server's domain name, public key, company information, and other content into the certificate and signs the certificate with the CA's own private key. The server then sends the certificate to the client. If the client verifies that the certificate is valid and that the domain name in the certificate matches the domain name of the current connection, then the public key in the certificate is valid and is the public key of the server the domain name points to. This ensures that the public key we get is genuine. The certificate issued by the CA is called a digital certificate.

The question arises again: how does the client verify that the certificate sent by the server is valid?
The certificate is signed by the CA with its private key, and most clients have the public keys of these authoritative organizations (CAs) built in. The client can therefore use the CA's public key to decrypt the signature on the certificate, then compute the digest of the certificate itself using the algorithm the certificate specifies. If the two digests match, the certificate is valid. Because only the CA has the private key, nobody else can forge this signature.

To sum up, HTTPS sends a digital certificate to the client when the connection is established. After the client verifies the digital certificate, it can confirm the identity of the server. It can also use the public key in the certificate to encrypt a randomly generated shared key A, and in this way agree with the server on the shared key A used to encrypt data for the rest of the communication.

HTTPS handshake process
The SSL protocol uses both public key encryption and symmetric encryption. Although symmetric encryption is faster than public key encryption, public key encryption provides better identity authentication. The SSL handshake protocol is very effective in allowing clients and servers to complete mutual identity authentication. The main process is as follows:

① The client requests an HTTPS connection to the server. The client transmits to the server the version number of the client's SSL protocol, the types of encryption algorithm it supports, a generated random number, and other information required for communication between the server and the client.

② The server confirms and returns the certificate. The server sends the version number of the SSL protocol, the type of encryption algorithm, a random number, and other related information to the client, and also sends its own certificate to the client.

③ The client verifies the certificate sent by the server. The client checks the legitimacy of the server using the information the server sent. This includes: whether the certificate has expired, whether the CA that issued the server certificate is trusted, whether the public key of the issuer's certificate can correctly decrypt the "issuer's digital signature" on the server certificate, and whether the domain name on the server certificate matches the actual domain name of the server. If verification fails, the connection is dropped; if it passes, the process continues with step ④.

④ After verification, the client generates a random key A, encrypts it with the public key, and sends it to the server. The server's public key is taken from the certificate verified in step ③. Once the random key is encrypted with this public key, only the owner of the server (who holds the private key) can decrypt it, which keeps it secure.
⑤ The server uses its private key to decrypt the random key A, and both sides then use key A to encrypt the communication.

The handshake above does not include verification of the client's identity because in most cases HTTPS only verifies the server's identity. To verify the client's identity as well, the client needs its own certificate and must send it during the handshake, and this certificate costs money.
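From the client's point of view, the whole handshake above is usually carried out by a TLS library. The following is a minimal sketch using Python's standard `ssl` module; the function names `make_client_context` and `fetch_connection_info` are made up for this example, and the host passed in is whatever server you want to test against. Note that the library decides the details of the key exchange, and newer TLS versions typically agree on the key with a Diffie-Hellman exchange rather than by sending an encrypted key A, so this only loosely mirrors steps ④ and ⑤.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Create a client context that verifies the server's certificate.

    The default context loads the system's trusted CA certificates,
    requires a valid certificate, and checks the host name (step 3).
    """
    return ssl.create_default_context()


def fetch_connection_info(host: str, port: int = 443) -> dict:
    """Connect to host, run the TLS handshake, and report what was negotiated."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        # wrap_socket performs the handshake; it raises
        # ssl.SSLCertVerificationError if the certificate check fails.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return {
                "protocol": tls.version(),      # negotiated protocol version
                "cipher": tls.cipher(),         # negotiated cipher suite
                "certificate": tls.getpeercert(),  # verified server certificate
            }
```

Calling `fetch_connection_info` with a real host name requires network access; if the certificate is expired, issued by an untrusted CA, or issued for a different domain name, the handshake fails instead of returning a result.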