After reading this article on HTTPS, you will have no problem arguing with the interviewer

Let's learn about HTTPS. First, a question: why do we need HTTPS when we already have HTTP? Thinking about it gave me another idea. Why do we recite so-called standard answers in interviews instead of expressing our own ideas and insights? Is there really a single right or wrong in technology?

1. Why HTTPS Appears

The emergence of a new technology must be to solve a certain problem, so what problem does HTTPS solve for HTTP?

1. What problem does HTTPS solve?

A simple answer is that HTTP is not secure. Because HTTP transmits data in plaintext, anyone on the path can intercept, modify, or forge a request. HTTP does not verify the identity of the communicating parties, so either side of the exchange may be an impostor. And neither the sender nor the receiver verifies the integrity of a message, so tampering goes undetected. HTTPS was created to solve these three problems.

2. What is HTTPS

Do you remember how HTTP is defined? HTTP is a Hypertext Transfer Protocol, which is a convention and specification for transmitting hypertext data such as text, pictures, audio, and video between two points in the computer world. Let's take a look at how HTTPS is defined.

HTTPS stands for Hypertext Transfer Protocol Secure. It is used to exchange information securely between two end systems on a computer network; in effect, it adds the word "secure" to HTTP. So we can define HTTPS as follows: HTTPS is a convention and specification in the computer world for securely transmitting hypertext data such as text, pictures, audio, and video between two points. HTTPS is an extension of HTTP, and HTTP by itself does not guarantee the security of transmission. So who does? In HTTPS, the communication is encrypted using Transport Layer Security (TLS) or its predecessor, Secure Sockets Layer (SSL). That is, HTTP + SSL (TLS) = HTTPS.

3. What does HTTPS do?

The HTTPS protocol provides three key guarantees:

  • Encryption: HTTPS encrypts data to protect it from eavesdroppers. When a user browses a website, no one can monitor the information exchanged with the website, track the user's activities and access records, or steal user information.
  • Data integrity: data cannot be modified in transit without detection. What the user sends reaches the server intact, so the server receives exactly what the user sent.
  • Authentication: confirming the other party's true identity, that is, proving that you are you (compare it to face recognition). It prevents man-in-the-middle attacks and builds user trust.

With these three guarantees in place, users can exchange information with the server securely. So, given all these benefits, how can you tell whether a website uses HTTPS or HTTP? Look at the browser's address bar: the URL of an HTTPS site starts with https:// and is usually marked with a padlock icon, while the URL of an HTTP site starts with http:// and is typically flagged as not secure.

The HTTPS protocol itself is actually very simple. Its RFC (RFC 2818) is very short, only 7 pages long. It specifies the new protocol name and the default port number 443. Everything else, including the request-response model, message structure, request methods, URIs, header fields, and connection management, is taken directly from HTTP, with nothing new.

That is to say, except for the protocol name and the default port number (HTTP default port 80), the HTTPS protocol is the same as HTTP in terms of syntax and semantics. HTTPS also has everything that HTTP has. So, how does HTTPS achieve the security that HTTP cannot? The key lies in the S, that is, SSL/TLS.
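The port difference can be seen with a minimal Python sketch. The helper name `effective_port` is made up for illustration; `HTTP_PORT` and `HTTPS_PORT` are constants from the standard `http.client` module.

```python
from http.client import HTTP_PORT, HTTPS_PORT
from urllib.parse import urlsplit

# Default ports per scheme: 80 for HTTP, 443 for HTTPS.
DEFAULT_PORTS = {"http": HTTP_PORT, "https": HTTPS_PORT}


def effective_port(url: str) -> int:
    """Return the explicit port in the URL, or the default for its scheme."""
    parts = urlsplit(url)
    return parts.port or DEFAULT_PORTS[parts.scheme]


print(effective_port("http://example.com/"))
print(effective_port("https://example.com/"))
print(effective_port("https://example.com:8443/admin"))
```

Apart from the scheme and the port, the URLs are handled identically, which matches the point that HTTPS reuses HTTP's syntax and semantics.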

2. What is SSL/TLS

1. Understanding SSL/TLS

TLS (Transport Layer Security) is the successor to SSL (Secure Sockets Layer). Both are protocols used for authentication and encryption between two computers on the Internet.

Note: In practice, the names SSL and TLS are often used interchangeably.

We all know that one of the most important steps for some online businesses (such as online payment) is to create a trustworthy transaction environment that lets customers conduct transactions with peace of mind. SSL/TLS provides this. It works by binding the identity information of websites and companies to cryptographic keys through digital documents called X.509 certificates. Each key pair consists of a private key and a public key. The private key is kept secret, generally on the server, and is used to decrypt information encrypted with the public key. The public key is public: everyone who interacts with the server can hold it. Information encrypted with the public key can only be decrypted with the private key.

What is X.509: X.509 is a standard format for public key certificates, a document that securely associates a cryptographic key with a person or organization.

The main applications of X.509 are as follows:

  • SSL/TLS and HTTPS for authenticated and encrypted web browsing
  • Signed and encrypted email via the S/MIME protocol
  • Code signing: signing software applications with digital certificates for safe distribution and installation. By digitally signing software with a certificate issued by a well-known public certificate authority (such as SSL.com), developers can assure end users that the software they wish to install was published by a known and trusted developer and has not been tampered with or compromised since being signed.
  • Document signing
  • Client authentication
  • Government-issued electronic ID cards (see https://www.ssl.com/article/pki-and-digital-certificates-for-government/ for details)

We will discuss this later.

2. The core of HTTPS is HTTP

HTTPS is not a new application layer protocol. It is simply HTTP with its communication interface replaced by SSL/TLS. Normally, HTTP communicates directly with TCP. With HTTPS, HTTP first communicates with SSL, and SSL then communicates with TCP. In other words, HTTPS is HTTP wrapped in a layer of SSL.

SSL is an independent protocol that can be used not only by HTTP but also by other application layer protocols, such as SMTP (email protocol) and Telnet (remote login protocol).

3. Explore HTTPS

You might object: why such a grand title? Isn't HTTPS just HTTP with TLS/SSL added? Why claim to "explore HTTPS"? Fair enough. What we are really exploring here is TLS.

SSL is the Secure Sockets Layer, which sits at the fifth layer (the session layer) of the OSI seven-layer network model. In 1999, the IETF (Internet Engineering Task Force) standardized SSL under a new name, TLS (Transport Layer Security). So far there have been four versions of TLS: 1.0, 1.1, 1.2 and 1.3. Currently, 1.2 is the most widely used, so the following discussion is based on TLS 1.2.

TLS is used to provide confidentiality and data integrity between two communicating applications. TLS consists of several sub-protocols, such as the record protocol, handshake protocol, alert protocol, change cipher spec protocol, and extension protocol. It combines symmetric encryption, asymmetric encryption, identity authentication, and many other cutting-edge cryptographic techniques (if you think a technology is simple, you just haven't learned it well. Every technology has its beauty. Great engineers appreciate it rather than belittle it).

After talking for so long, we still haven't seen the naming convention of TLS. Let's take a TLS example to see the structure of TLS.

(Please refer to https://www.iana.org/assignments/tls-parameters/tls-parameters.xhtml).

ECDHE-ECDSA-AES256-GCM-SHA384

What does this mean? I was a little confused when I first saw it, but there is a pattern, because TLS cipher suite names are fairly standardized. The basic format is a string composed of key exchange algorithm - signature algorithm - symmetric encryption algorithm - digest algorithm. Sometimes a block cipher mode is included as well. Let's look at what this one means.

It uses ECDHE for key exchange, ECDSA for signing and authentication, AES with a 256-bit key as the symmetric encryption algorithm, GCM as the block cipher mode, and SHA384 as the digest algorithm.
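The naming pattern can be sketched as a small parser. This is only an illustration: the `CipherSuite` type and `parse_suite` function are hypothetical names, and the sketch assumes the five-part format described above. Suite names that do not follow this shape (TLS 1.3 suites, for example) are rejected.

```python
from typing import NamedTuple


class CipherSuite(NamedTuple):
    key_exchange: str   # e.g. ECDHE
    signature: str      # e.g. ECDSA
    cipher: str         # e.g. AES256 (algorithm plus key length)
    mode: str           # e.g. GCM
    digest: str         # e.g. SHA384


def parse_suite(name: str) -> CipherSuite:
    """Split a five-part suite name written with '-' or '_' separators."""
    parts = name.replace("_", "-").split("-")
    if len(parts) != 5:
        raise ValueError(f"unexpected cipher suite format: {name!r}")
    return CipherSuite(*parts)


suite = parse_suite("ECDHE-ECDSA-AES256-GCM-SHA384")
print(suite.key_exchange, suite.signature, suite.cipher, suite.mode, suite.digest)
```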

TLS fundamentally uses two forms of encryption: symmetric and asymmetric.

1. Symmetric encryption

Before understanding symmetric encryption, let's first understand cryptography. In cryptography, there are several concepts: plaintext, ciphertext, encryption, and decryption:

  • Plaintext is generally a meaningful set of characters or bits, or a message that can be obtained through some public encoding. Plaintext is usually denoted by m or p.
  • Ciphertext is what plaintext becomes after it has been encrypted.
  • Encryption is the process of converting original information (plaintext) into ciphertext.
  • Decryption is the process of restoring encrypted information to plaintext.

Symmetric encryption, as the name implies, means that the same key is used for both encryption and decryption. As long as the key stays secret, the entire communication is confidential.
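The "same key in both directions" property can be shown with a toy cipher. This sketch is NOT AES or any real algorithm and must not be used for real data: it derives a keystream from SHA-256 and XORs it with the message, purely to illustrate the idea. The function names are made up for the example.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def toy_crypt(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


key = b"shared secret key"
ciphertext = toy_crypt(key, b"transfer 100 to Alice")
recovered = toy_crypt(key, ciphertext)            # same key restores the plaintext
garbage = toy_crypt(b"some other key", ciphertext)  # wrong key does not
print(recovered)
```

Anyone who learns the key can run the same function and read the message, which is exactly the weakness discussed later in this section.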

There are many symmetric encryption algorithms, such as DES, 3DES (also called TDEA), AES, ChaCha20, Blowfish, RC2, RC4, RC5, IDEA, SKIPJACK, etc. Currently, the most commonly used ones are AES (with 128-, 192- or 256-bit keys) and ChaCha20.

DES stands for Data Encryption Standard, a symmetric key algorithm used for digital data encryption. Although its short key length of 56 bits makes it too insecure for modern applications, it has been very influential in the development of encryption technology.

3DES is an encryption algorithm derived from the original Data Encryption Standard (DES). It became important after the 1990s, but later became less important due to the emergence of more advanced algorithms.

AES-128, AES-192 and AES-256 all belong to AES. The full name of AES is Advanced Encryption Standard. It is a replacement for the DES algorithm. It has high security and good performance. It is the most widely used symmetric encryption algorithm.

ChaCha20 is another encryption algorithm, designed by Daniel J. Bernstein and promoted by Google. Its key length is fixed at 256 bits. Its pure software performance exceeds that of AES, so it was once popular on mobile clients. ARMv8 later added AES hardware acceleration, so ChaCha20 no longer has a clear advantage there, but it is still considered a good algorithm.

(Others can be searched by yourself)

2. Block cipher modes

Symmetric encryption algorithms also have the concept of a block cipher mode of operation. The mode is what allows the algorithm to encrypt plaintext of any length with a fixed-length key. In TLS, the GCM mode can only be used with AES, CAMELLIA and ARIA, and AES is by far the most popular and widely deployed choice.

There were several modes at first, such as ECB, CBC, CFB, and OFB, but they were all found to have security weaknesses, so they are basically no longer used in TLS. The newer modes are called AEAD (Authenticated Encryption with Associated Data), which add authentication on top of encryption. Commonly used ones are GCM, CCM, and Poly1305 (paired with ChaCha20).

For example, in ECDHE_ECDSA_AES128_GCM_SHA256, AES128 indicates a 128-bit key, while AES256 would indicate a 256-bit key. GCM is a modern AEAD mode of operation for block ciphers with 128-bit blocks.

We talked about symmetric encryption above. Both parties use the same key, which means the sender must encrypt the original data and also get the key to the receiver before the receiver can decrypt anything. What problem does this cause? It is like the boy in "Little Soldier Zhang Ga" delivering an encrypted letter while also carrying the password needed to decrypt it. If Zhang Ga is caught by the enemy on the way, the letter is completely exposed. Symmetric encryption therefore carries a risk: the key itself has to be delivered safely.

3. Asymmetric encryption

Asymmetric encryption is also called public key encryption. Compared with symmetric encryption, it is a newer approach that solves the key delivery problem: only the public key travels over the network, so even if it is intercepted, the data is not exposed. There are two keys in asymmetric encryption, a public key and a private key. The public key is used for encryption and the private key is used for decryption. The public key can be used by anyone, while the private key is known only to its owner.

Text encrypted with the public key can only be decrypted with the private key. Likewise, text encrypted with the private key can be decrypted with the public key. The public key does not need to be kept secret, since it has to be transmitted across the network anyway. This is how asymmetric encryption solves the key exchange problem. The website keeps the private key and distributes the public key freely on the Internet. If you want to log in to the website, you only need to encrypt your data with the public key. The ciphertext can only be decrypted by the private key holder. Attackers cannot recover the plaintext because they do not have the private key.

The design of asymmetric encryption algorithms is much more difficult than that of symmetric algorithms (we will not discuss specific encryption methods). Common ones include DH, DSA, RSA, ECC, etc.

Among them, RSA is the most important and best-known algorithm. It appears in cipher suite names such as DHE_RSA_CAMELLIA128_GCM_SHA256. Its security is based on integer factorization: the product of two very large prime numbers is used as the material for generating the key pair, and it is very difficult to deduce the private key from the public key.
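The arithmetic behind RSA can be followed with deliberately tiny numbers. This is textbook RSA with the well-known small example primes 61 and 53, with no padding, so it is trivially breakable and only illustrates the mechanics. Real keys use primes that are hundreds of digits long. The function names are illustrative.

```python
# Toy RSA key generation with tiny primes (for illustration only).
p, q = 61, 53
n = p * q                  # public modulus: 3233
phi = (p - 1) * (q - 1)    # 3120; computing this requires knowing p and q
e = 17                     # public exponent, coprime with phi
d = pow(e, -1, phi)        # private exponent (modular inverse, Python 3.8+)


def encrypt(m: int) -> int:
    """Encrypt an integer m < n with the public key (n, e)."""
    return pow(m, e, n)


def decrypt(c: int) -> int:
    """Decrypt with the private key (n, d)."""
    return pow(c, d, n)


c = encrypt(65)
print(d, c, decrypt(c))    # prints: 2753 2790 65
```

An attacker who sees only `n` and `e` must factor `n` back into `p` and `q` to compute `d`. That is easy for 3233 and believed to be infeasible for moduli of 2048 bits or more.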

ECC (Elliptic Curve Cryptography) is also a type of asymmetric encryption algorithm. It is based on the mathematical problem of elliptic curve discrete logarithm and uses specific curve equations and base points to generate public and private keys. ECDHE is used for key exchange and ECDSA is used for digital signatures.

TLS uses a hybrid encryption method that uses symmetric and asymmetric encryption to achieve confidentiality.

4. Hybrid Encryption

Asymmetric algorithms such as RSA are very slow, while symmetric algorithms such as AES are relatively fast. TLS therefore combines the two in a hybrid scheme. At the beginning of communication, an asymmetric algorithm such as RSA or ECDHE is used to solve the key exchange problem. With RSA key exchange, one party uses a random number to generate the session key for the symmetric algorithm and encrypts it with the other party's public key. The other party receives the ciphertext and uses its private key to decrypt it and obtain the session key. In this way, both parties have securely exchanged a symmetric key, and the rest of the communication is encrypted with the fast symmetric algorithm.
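The RSA-style flow described above can be sketched end to end. Everything here is a toy under stated assumptions: the key pair is built from two publicly known Mersenne primes, the symmetric cipher is a SHA-256 keystream XOR rather than AES, and `client_send` and `server_receive` are hypothetical names. It shows the shape of the exchange, not how a TLS library implements it.

```python
import hashlib
import secrets
from math import gcd
from typing import Tuple

# Toy server key pair from two known Mersenne primes (insecure on purpose).
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537
PHI = (P - 1) * (Q - 1)
assert gcd(E, PHI) == 1
D = pow(E, -1, PHI)  # private exponent, kept by the server (Python 3.8+)


def xor_stream(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256 derived keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def client_send(message: bytes, public_key: Tuple[int, int]) -> Tuple[int, bytes]:
    """Pick a random session key, wrap it with the public key, encrypt the data."""
    n, e = public_key
    session_key = secrets.token_bytes(16)               # 128-bit random key
    wrapped = pow(int.from_bytes(session_key, "big"), e, n)
    return wrapped, xor_stream(session_key, message)


def server_receive(wrapped: int, ciphertext: bytes, private_key: Tuple[int, int]) -> bytes:
    """Unwrap the session key with the private key, then decrypt the data."""
    n, d = private_key
    session_key = pow(wrapped, d, n).to_bytes(16, "big")
    return xor_stream(session_key, ciphertext)


wrapped, ciphertext = client_send(b"GET /account HTTP/1.1", (N, E))
print(server_receive(wrapped, ciphertext, (N, D)))
```

The slow asymmetric operation runs once, on 16 bytes, while the bulk data only passes through the fast symmetric step.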

Now that we have achieved confidentiality using hybrid encryption, can we transmit data securely? This is not enough. On the basis of confidentiality, we must also add integrity and identity authentication features to achieve true security. The main means of achieving integrity is the digest algorithm.

5. Digest Algorithm

How is integrity achieved? In TLS, the main means is the digest algorithm. Even if you have not heard of digest algorithms, you probably know MD5. The full name of MD5 is Message Digest Algorithm 5. It is a cryptographic hash algorithm that produces a 128-bit value from input of any length. Although MD5 is known to be insecure, it is still used today, most commonly to verify the integrity of files. It has also been used in other security protocols and applications, such as SSH, SSL, and IPSec. Some applications strengthen MD5 by adding a salt to the plaintext or by applying the hash function multiple times.

What is salting? In cryptography, a salt is a piece of random data used as an additional input to a one-way function that hashes data or passwords. Salts are used to protect passwords in storage.
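As an example of salting, here is a minimal sketch using PBKDF2 from Python's standard `hashlib` module. The function names, the 16-byte salt, and the iteration count are choices made for this illustration, not recommendations taken from the article.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple


def hash_password(password: str, salt: Optional[bytes] = None,
                  iterations: int = 100_000) -> Tuple[bytes, bytes]:
    """Return (salt, digest); a fresh random salt is generated if none is given."""
    if salt is None:
        salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes,
                    iterations: int = 100_000) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)


salt_a, digest_a = hash_password("correct horse")
salt_b, digest_b = hash_password("correct horse")
print(digest_a != digest_b)                                 # same password, different salts
print(verify_password("correct horse", salt_a, digest_a))
```

Because each stored password has its own salt, two users with the same password end up with different digests, and precomputed lookup tables are of no use to an attacker.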

What does one-way mean? It means the algorithm has no key for decryption and only works in one direction. The output cannot be decrypted, and the original text cannot be recovered from it.

Let's go back to the discussion of the digest algorithm. In fact, you can understand the digest algorithm as a special compression algorithm that can compress data of any length into a string of fixed length, which is like adding a lock to the data.

In addition to MD5, SHA-1 (Secure Hash Algorithm 1) is another commonly used digest algorithm, but SHA-1 is also insecure and has been deprecated in TLS. Currently, TLS recommends SHA-1's successor: SHA-2.

SHA-2 stands for Secure Hash Algorithm 2. It was introduced in 2001 and made significant changes to SHA-1. The SHA-2 family includes six hash functions, whose digests (hash values) are 224, 256, 384 or 512 bits long: SHA-224, SHA-256, SHA-384, SHA-512, and the truncated variants SHA-512/224 and SHA-512/256. The first four generate 28-byte, 32-byte, 48-byte, and 64-byte digests respectively.
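The digest lengths listed above can be checked with Python's standard `hashlib` module. The sketch also hashes two messages that differ by a single punctuation mark, to show how sensitive a digest is to small changes. The sample messages are made up.

```python
import hashlib

# Digest size in bytes for the four main SHA-2 functions.
sizes = {name: hashlib.new(name).digest_size
         for name in ("sha224", "sha256", "sha384", "sha512")}
print(sizes)

# Two messages that differ only in the final punctuation mark.
digest_a = hashlib.sha256(b"Pay Bob 100 yuan.").hexdigest()
digest_b = hashlib.sha256(b"Pay Bob 100 yuan!").hexdigest()
print(digest_a)
print(digest_b)

# Count how many of the 64 hex characters differ between the two digests.
differing = sum(1 for x, y in zip(digest_a, digest_b) if x != y)
print(differing, "of", len(digest_a), "hex characters differ")
```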

With SHA-2, data integrity can be checked. Even if you change a punctuation mark or add a space in a file, the generated digest will be completely different. However, a plain SHA-2 digest involves no secret: an attacker who modifies the message can simply recompute the digest and replace it too. So a digest alone is still not secure enough. What should we use instead?

A more secure approach is HMAC. Before understanding what HMAC is, you need to know what MAC is.

MAC stands for message authentication code, which is generated from the message and key through the MAC algorithm. The MAC value allows the verifier (who also possesses the secret key) to detect any changes to the message content, thereby protecting the data integrity of the message.

HMAC (hash-based message authentication code) is a specific kind of MAC. It is computed from the message and a secret key using a cryptographic hash function. Any cryptographic hash function, such as SHA-256, can be used in the calculation of HMAC.
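A minimal sketch of computing and checking an HMAC with Python's standard `hmac` module follows. The key and messages are made-up sample values, and `make_tag` and `check_tag` are illustrative helper names.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute HMAC-SHA256 over the message with the shared secret key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def check_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


key = b"secret shared by both parties"
message = b"amount=100&to=alice"
tag = make_tag(key, message)

print(check_tag(key, message, tag))                   # True: message is untouched
print(check_tag(key, b"amount=900&to=mallory", tag))  # False: message was modified
print(check_tag(b"attacker guess", message, tag))     # False: wrong key
```

Unlike a bare digest, the tag cannot be recomputed by someone who changes the message but does not know the key.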

Now that we have solved the integrity problem, there is only one problem left, which is authentication. How is authentication done? When we send data to the server, hackers (attackers) may disguise themselves as any party to steal information. It can pretend to be you to send information to the server, or it can pretend to be the server and receive the information you send. So how do we solve this problem?

6. Authentication

How do you prove that you are you? Earlier we described encrypting with the public key and decrypting with the private key. The private key is held by you alone, so it can uniquely identify you. We can therefore reverse the order: encrypt with the private key and decrypt with the public key. Combining the private key with a digest algorithm gives us a digital signature, and with it, authentication.
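The "private key plus digest" idea can be sketched with toy textbook RSA. As an assumption for illustration, the key pair is built from two publicly known Mersenne primes and no padding scheme is used, so this is insecure and only demonstrates the principle. Real signatures use schemes such as RSA-PSS or ECDSA. The function names are hypothetical.

```python
import hashlib

# Toy signer key pair (insecure on purpose: the primes are publicly known).
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (Python 3.8+)


def digest_as_int(message: bytes) -> int:
    """SHA-256 digest of the message, reduced to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Sign: transform the digest with the PRIVATE key."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Verify: undo the transform with the PUBLIC key and compare digests."""
    return pow(signature, E, N) == digest_as_int(message)


signature = sign(b"I am example.com")
print(verify(b"I am example.com", signature))   # True
print(verify(b"I am evil.example", signature))  # False
```

Only the private key holder can produce a signature that verifies, while anyone with the public key can check it.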

By now, we have achieved confidentiality, integrity, and authentication by using symmetric encryption, asymmetric encryption, and digest algorithms. So is it safe? Not yet. Digital signatures still leave one problem open. The private key is your own, but anyone can publish a public key and claim that it is yours. Public keys therefore have to be certified before they are published, which is the public key trust problem.

So CA was introduced. The full name of CA is Certificate Authority. You must let CA issue certified public keys to solve the trust problem of public keys.

There are only a handful of widely trusted CAs in the world, and they issue three types of certificates: DV, OV, and EV. The difference lies in the degree of trust. DV is the lowest and only validates control of the domain name. EV is the highest: it involves strict legal and audit checks and can prove the identity of the website owner (browsers used to display the company name in the address bar for such sites, for example Apple's and GitHub's). CAs themselves form a hierarchy of trust, from root CAs down to intermediate CAs.

Typically, an applicant for a digital certificate will generate a key pair consisting of a private and public key, and a Certificate Signing Request (CSR). The CSR is an encoded text file that contains the public key and other information that will be included in the certificate (such as domain name, organization, email address, etc.). Key pair and CSR generation is usually done on the server where the certificate will be installed, and the type of information included in the CSR depends on the verification level of the certificate. Unlike the public key, the applicant's private key is kept safe and should never be shown to the CA (or anyone else).

After the CSR is generated, the applicant sends it to the CA. The CA verifies that the information it contains is correct and, if so, digitally signs the certificate with the CA's own private key (the issuing key) before sending it to the applicant.
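On the client side, trust in CA-signed certificates is enforced by the TLS library. As a minimal sketch, Python's standard `ssl` module creates a client context that requires a certificate chaining to a trusted CA and checks that it matches the host name. The `peer_certificate` helper is a hypothetical name, and calling it needs network access.

```python
import socket
import ssl

# Default client context: loads the system's trusted CA certificates.
ctx = ssl.create_default_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED)  # server certificate is mandatory
print(ctx.check_hostname)                    # certificate must match the host name


def peer_certificate(host: str, port: int = 443) -> dict:
    """Connect to host and return its verified certificate as a dict.

    Raises ssl.SSLCertVerificationError if the certificate is not signed
    by a trusted CA or does not match the host name.
    """
    with socket.create_connection((host, port), timeout=5) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


# Example usage (requires network access):
# cert = peer_certificate("example.com")
# print(cert["subject"], cert["issuer"])
```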

Summary

In this article, we covered why HTTPS appeared, what problems it solves for HTTP, how HTTPS relates to HTTP, what TLS and SSL are and what problems they solve, and how truly secure data transmission is achieved.
