Development History
1. A long time ago, the Web was basically just about browsing documents. Since it was only browsing, the server had no need to record who viewed which documents over a given period. Each request was a fresh HTTP exchange: one request plus one response. In particular, I (the server) did not have to remember who had just sent an HTTP request; every request was brand new to me. Those were happy times.
2. However, with the rise of interactive Web applications such as online shopping sites and sites that require login, we immediately faced a problem: managing sessions. We must remember who has logged in and who has put which items in their shopping cart. In other words, I have to tell each person apart, which is a big challenge because HTTP is stateless. The solution I came up with was to give everyone a session ID. Put simply, it is a random string, and everyone receives a different one. Every time someone makes an HTTP request to me, the string is sent along with the request, so I can tell who is who.
3. This makes everyone happy except the server. Each client only needs to keep its own session ID, while the server has to keep everyone's. If the server gets a lot of traffic, there can be tens of thousands or even hundreds of thousands of them. This is a huge overhead for the server and severely limits its scalability. For example, suppose I use two machines to form a cluster. Xiao F logs in through machine A, so his session ID is saved on machine A. What if Xiao F's next request is forwarded to machine B? Machine B does not have his session ID. One trick is sticky sessions, which pin Xiao F's requests to machine A, but this does not help if machine A goes down and his requests have to go to machine B anyway. So I had to replicate sessions, copying session IDs back and forth between the two machines. It was exhausting.
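The two-machine problem described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `SessionStore` class and the names `machine_a` and `machine_b` are hypothetical, and a real server would keep more than a user ID per session.

```python
import secrets


class SessionStore:
    """In-memory session table kept by one server (hypothetical helper)."""

    def __init__(self):
        self._sessions = {}  # session ID -> user ID

    def create(self, user_id):
        # The session ID is a long random string that is hard to guess.
        session_id = secrets.token_hex(16)
        self._sessions[session_id] = user_id
        return session_id

    def lookup(self, session_id):
        # Returns the user ID, or None if this server has never seen the ID.
        return self._sessions.get(session_id)


# Two machines in a cluster, each with its own private session table.
machine_a = SessionStore()
machine_b = SessionStore()

sid = machine_a.create("xiao_f")   # Xiao F logs in through machine A
print(machine_a.lookup(sid))       # machine A recognises him
print(machine_b.lookup(sid))       # machine B has no record of the session
```

Because each table lives in one machine's memory, the second lookup returns nothing, which is exactly why sticky sessions, replication or a shared store become necessary.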
Later, a tool called Memcached offered a solution: store the session IDs in one central place that all machines can access. That removes the need for replication, but it introduces a single point of failure. If the machine responsible for sessions goes down, everyone has to log in again, and I will probably be scolded to death. I also tried clustering this single machine to improve reliability, but no matter what, this little session remained a heavy burden for me.
4. So some people kept thinking: why should I store this damn session at all? How much better it would be to let each client store it! But if I don't keep these session IDs, how can I verify that a session ID sent by a client was really generated by me? Without verification, I cannot tell whether the sender is a legitimate user, and people with bad intentions could forge session IDs and do whatever they want. So the key point is verification! For example, once Xiao F has logged in, I send him a token that contains his user ID. The next time Xiao F sends me an HTTP request, he includes this token in an HTTP header. On its own, though, this is no different from a session ID: anyone can forge it, so I have to find a way to prevent forgery. The answer is to sign the data. For example, I use the HMAC-SHA256 algorithm and a key that only I know to compute a signature over the data, and I use the data together with the signature as the token. Since no one else knows the key, the token cannot be forged. I do not store the token. When Xiao F sends it to me, I use the same HMAC-SHA256 algorithm and the same key to compute the signature of the data again and compare it with the signature in the token. If they match, I know that Xiao F has logged in and can read his user ID directly. If they differ, the data has been tampered with, and I tell the sender: sorry, not authenticated.
The data in the token is in plain text (I do encode it with Base64, but that is not encryption), so others can still read it, and I must not put sensitive information such as passwords in it. Of course, if someone's token is stolen, there is nothing I can do; I will treat the thief as a legitimate user. This is the same as having a session ID stolen. With this approach I no longer store session IDs. I just generate tokens and verify them, trading CPU time for session storage space! Freed from the burden of session IDs, I have nothing to worry about. My cluster can now scale out horizontally with ease: if traffic increases, I just add machines. This stateless feeling is really great!
Cookie
A cookie is a very specific thing: a piece of data that can be stored persistently in the browser. It is simply a data storage feature implemented by the browser. The server generates the cookie and sends it to the browser; the browser saves it as key-value pairs, for example in a text file in some directory, and sends it back to the server the next time it requests the same website. Because cookies live on the client, browsers impose restrictions so that they cannot be abused or take up too much disk space, which is why the number of cookies per domain is limited.
Session
Session literally means conversation. When you are talking to someone, how do you know that the person is Zhang San and not Li Si? The other person must have some characteristic (such as appearance) that shows he is Zhang San. The same principle applies to sessions: the server needs to know who is currently sending it requests.
To make this distinction, the server assigns a different identifier to each client, and the client includes this identifier in every request so that the server knows who the request comes from. There are many ways for the client to store this identifier; browser clients use cookies by default. The server uses the session to hold the user's information temporarily, and the session is destroyed after the user leaves the website or the session times out. Storing user information this way is safer than storing it in cookies, but sessions have a flaw: if the web servers are load balanced, the session is lost when the next request reaches a different server, unless sessions are shared or replicated.
Token
Token-based authentication is ubiquitous on the Web. For most Internet companies that expose Web APIs, tokens are the preferred way to handle authentication for multiple users. The following features make token-based authentication attractive for your application:
Those who use token-based authentication
Most of the APIs and web applications you have seen use tokens, for example Facebook, Twitter, Google+ and GitHub.
The Origin of Token
Before introducing the principles and advantages of token-based authentication, let's look at how authentication was done before.
Server-based authentication
We all know that HTTP is stateless, which means the program has to verify each request in order to identify the client. Previously, programs identified requests using login information stored on the server, usually in a session. With the rise of the Web, applications and mobile clients, this approach has gradually shown its problems, especially in scalability.
Some problems exposed by server-based authentication
1. Session: every time a user is authenticated, the server needs to create a record to store the information. As more and more users log in, the memory overhead keeps growing.
2. Scalability: storing login information in sessions in the server's memory brings scalability problems.
3. CORS (Cross-Origin Resource Sharing): when we need to use our data across multiple mobile devices, cross-origin resource sharing can be a headache. When Ajax is used to fetch resources from another domain, the requests may be blocked.
4. CSRF (Cross-Site Request Forgery): users are vulnerable to cross-site request forgery attacks, in which a site they are logged in to (a banking website, for example) can be abused through requests triggered from other websites they visit.
Among these problems, scalability is the most prominent, so we need a more effective method.
Token-based verification principle
Token-based authentication is stateless: we do not store user information on the server or in a session. This removes many of the problems that come with storing information on the server side.
Having no session means that you can add or remove machines as needed without worrying about where a user is logged in. The process of token-based authentication is as follows:
Each request requires a token. The token should be sent in an HTTP header so that the HTTP requests stay stateless. We also set the server's Access-Control-Allow-Origin: * header so that the server accepts requests from all domains. Note that when * is specified in the ACAO header, requests must not carry credentials such as HTTP authentication, client-side SSL certificates or cookies.
Implementation ideas:
1. The user logs in; after successful verification, a token is returned to the client.
2. The client receives the token and saves it locally.
3. Each time the client calls the API, it sends the token to the server.
4. The server verifies the token in a filter. If verification succeeds, the requested data is returned; if it fails, an error code is returned.
Once we have authenticated in the program and obtained a token, we can do many things with it. We can even create a permission-based token and pass it to third-party applications, which can then access our data (only with the specific token we allow, of course).
Advantages of Tokens
Stateless and scalable
Tokens stored on the client are stateless and scalable. Because there is no state and no session information is stored, the load balancer can send a user to any of our servers. If we kept the logged-in user's information in a session, every request from that user would have to be sent to the server where they logged in (this is called session affinity), which can cause congestion when there are many users. But don't worry: tokens solve these problems, because the token itself holds the user's verification information.
Security
Sending the token in the request instead of a cookie helps prevent CSRF (Cross-Site Request Forgery). Even if you use a cookie to store the token on the client, the cookie is just a storage mechanism and is not used for authentication.
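As a rough illustration of carrying the token in a request header rather than a cookie, the server-side filter step might look like the sketch below. Several things here are assumptions rather than anything the article specifies: the `Authorization: Bearer <token>` header format, the `authenticate` function, the 401 status code, and the stand-in `demo_verify` function, which takes the place of real signature checking.

```python
from typing import Callable, Mapping, Optional, Tuple


def authenticate(
    headers: Mapping[str, str],
    verify: Callable[[str], Optional[str]],
) -> Tuple[int, Optional[str]]:
    """Return (HTTP status, user ID) for the headers of one request.

    Simplification: the header name is looked up case-sensitively.
    """
    value = headers.get("Authorization", "")
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return 401, None  # no token was sent in the header
    user_id = verify(token.strip())
    if user_id is None:
        return 401, None  # token failed verification
    return 200, user_id


# Stand-in verifier for the demo; a real one would check a signature.
def demo_verify(token: str) -> Optional[str]:
    return {"token-for-xiao-f": "xiao_f"}.get(token)


print(authenticate({"Authorization": "Bearer token-for-xiao-f"}, demo_verify))
print(authenticate({"Cookie": "token=token-for-xiao-f"}, demo_verify))
```

The second call is rejected because the filter only looks at the header the client sets explicitly; a cookie that the browser attaches automatically is ignored.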
Because no information is stored in a session, there is no session data to manipulate. Tokens are time-limited, so users need to re-authenticate after a period of time. We do not necessarily have to wait for a token to expire on its own, either: tokens can be revoked, and token revocation can invalidate a specific token or a group of tokens that share the same authorization grant.
Extensibility
Tokens let us build applications that share permissions with other applications. For example, we can link social accounts to major ones such as Facebook or Twitter. When we log in to Twitter through a service (Buffer, say), we are allowing Buffer to post to our Twitter stream. With tokens, you can grant selective permissions to third-party applications. When users want another application to access their data, we can build our own API that hands out tokens with specific permissions.
Multi-platform cross-domain
We mentioned CORS (Cross-Origin Resource Sharing) earlier. As applications and services expand, all kinds of devices and applications need access. If our API serves only data, we can also make the design choice to serve assets from a CDN. This eliminates the issues CORS brings up, once we set a quick header configuration for our application. As long as the user has a valid token, data and resources can be requested from any domain.
When creating tokens based on a standard, you can set a number of options. We will describe these in more detail in a later article, but the standard to use is JSON Web Tokens. Up-to-date libraries and documentation exist for JSON Web Tokens in many languages, which means you can switch your authentication mechanism in the future.
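Expiry, revocation and per-token permissions can be sketched together as follows. This is a simplified illustration, not the JSON Web Token wire format: the claim names `sub`, `exp` and `jti` are borrowed from JWT, while `scopes`, `issue`, `check`, `SECRET_KEY` and `REVOKED_IDS` are hypothetical names chosen for the example.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET_KEY = b"server-side-secret"  # assumption: known only to the server
REVOKED_IDS = set()                 # hypothetical list of revoked token IDs


def _sign(body: bytes) -> bytes:
    digest = hmac.new(SECRET_KEY, body, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest)


def issue(user_id, scopes, ttl_seconds, token_id, now=None):
    """Create a signed token that expires ttl_seconds after `now`."""
    now = time.time() if now is None else now
    claims = {"sub": user_id, "scopes": scopes,
              "exp": now + ttl_seconds, "jti": token_id}
    body = base64.urlsafe_b64encode(json.dumps(claims).encode("utf-8"))
    return (body + b"." + _sign(body)).decode("ascii")


def check(token, required_scope, now=None):
    """Return the user ID if the token is valid for required_scope, else None."""
    now = time.time() if now is None else now
    try:
        body, signature = token.encode("ascii").split(b".")
        if not hmac.compare_digest(signature, _sign(body)):
            return None  # forged or tampered token
        claims = json.loads(base64.urlsafe_b64decode(body))
    except ValueError:
        return None  # malformed token
    if now >= claims["exp"] or claims["jti"] in REVOKED_IDS:
        return None  # expired or revoked
    if required_scope not in claims["scopes"]:
        return None  # token does not carry this permission
    return claims["sub"]


# A third-party app gets a token that may only read, valid for one hour.
third_party = issue("xiao_f", ["read"], 3600, "token-1")
print(check(third_party, "read"))    # the user ID
print(check(third_party, "write"))   # None: permission not granted
REVOKED_IDS.add("token-1")
print(check(third_party, "read"))    # None: token was revoked
```

Note the trade-off: expiry needs no server-side storage, but revoking a token before it expires means the server has to remember at least the revoked IDs.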