Let’s talk about the stories behind Cookie, Session and Token

Hello everyone, I am Director Dabai(●—●).

Today I want to talk to you about cookies, sessions, and tokens.

This is what one of my reader friends encountered when he was interviewing for an internship position at WeChat. I would like to share it with you.

Without further ado, let’s drive.

1. Upgrade the interactive experience of the website

As netizens, we use browsers to visit various websites every day to meet our daily work and life needs.

Today the interactive experience is very smooth, but it was not like this in the early days, when every visit was a one-off transaction.

1.1 Stateless HTTP Protocol

What is the stateless http protocol?

HTTP is a stateless protocol, which means the protocol itself keeps no memory of earlier transactions. It cannot remember what was done before: each request is completely independent of the others and carries no context information.

Statelessness means that if later processing needs earlier information, the client must retransmit it, which can increase the amount of data sent with each request.

If you don’t understand, think about this scene from “Charlotte’s Troubles”:

You probably understand it now. If we kept using the raw stateless HTTP protocol, we might have to log in again every time we moved to another page. What's the point of that?

Therefore, it is necessary to solve the statelessness of the http protocol and improve the interactive experience of the website, otherwise there will be no way to reach the stars and the sea.

1.2 Solution

The only two parties interacting in the whole thing are the client and the server, so we must start with these two parties.

  • The client pays the bill. Each time the client makes a request, it encapsulates the necessary information and sends it to the server, which then checks and processes it.

  • The server pays the bill. After the client's first request, the server starts to record it. Then the client only needs to send the most basic and minimum information in subsequent requests. Not too much information is needed.

2. Cookie solution

Cookies are always stored on the client. Depending on where the client keeps them, they can be divided into memory cookies and hard disk cookies. Memory cookies (session cookies) are kept in the browser's memory and disappear when the browser is closed, so they are short-lived. Hard disk cookies (persistent cookies) are written to disk with an expiration time and are deleted only when the user clears them manually or the expiration time is reached, so they are long-lived.

2.1 Cookie definition and function

HTTP Cookie (also called Web Cookie or Browser Cookie) is a small piece of data sent by the server to the user's browser and saved locally. It will be carried and sent to the server the next time the browser makes a request to the same server.

Cookies are usually used to inform the server whether two requests come from the same browser, such as keeping the user logged in. Cookies make it possible to record stable status information based on the stateless HTTP protocol.

Cookies are mainly used in the following three aspects:

  • Session state management (such as user login status, shopping cart, and other information that needs to be recorded)
  • Personalization settings (such as user-defined settings, themes, etc.)
  • Browser behavior tracking (such as tracking and analyzing user behavior, etc.)

2.2 Server creates Cookie

When a server receives an HTTP request, it can add a Set-Cookie header to the response.

After receiving the response, the browser usually saves the cookie and then sends it back to the server in the Cookie request header on each subsequent request. In addition, the cookie's expiration time or validity period (Expires, Max-Age), domain (Domain), path (Path), and applicable site (SameSite) can be specified as needed.
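As a minimal sketch of how a server might assemble such a header, the snippet below uses Python's standard `http.cookies` module. The helper name `build_set_cookie` and the chosen attribute values are assumptions made for illustration, not something prescribed by this article.

```python
from http.cookies import SimpleCookie


def build_set_cookie(name, value, max_age=None):
    """Return the value of one Set-Cookie header (hypothetical helper)."""
    cookie = SimpleCookie()
    cookie[name] = value
    morsel = cookie[name]
    morsel["path"] = "/"
    morsel["httponly"] = True    # keep the cookie away from page scripts
    morsel["samesite"] = "Lax"   # restrict cross-site sending
    if max_age is not None:
        # With Max-Age the browser persists the cookie until it expires;
        # without it the cookie only lives for the browser session.
        morsel["max-age"] = max_age
    return morsel.OutputString()


session_cookie = build_set_cookie("yummy_cookie", "choco")
persistent_cookie = build_set_cookie("tasty_cookie", "strawberry", max_age=3600)
print("Set-Cookie:", session_cookie)
print("Set-Cookie:", persistent_cookie)
```

The first cookie has no lifetime attribute and therefore behaves like a memory cookie, while the second carries `Max-Age` and behaves like a hard disk cookie.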

2.3 Cookie interaction between B/S

The server sends cookie information to the user's browser using the Set-Cookie response header.

A simple cookie might look like this:

    Set-Cookie: <cookie-name>=<cookie-value>

    HTTP/1.0 200 OK
    Content-type: text/html
    Set-Cookie: yummy_cookie=choco
    Set-Cookie: tasty_cookie=strawberry

Every time the client makes a new request to the server, the browser will send the previously saved cookie information to the server through the Cookie request header.

    GET /sample_page.html HTTP/1.1
    Host: www.example.org
    Cookie: yummy_cookie=choco; tasty_cookie=strawberry
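On the server side, the Cookie request header has to be split back into name/value pairs. A small sketch with the standard `http.cookies` module follows; the function name `parse_cookie_header` is hypothetical.

```python
from http.cookies import SimpleCookie


def parse_cookie_header(header_value):
    """Turn the value of a Cookie request header into a plain dict."""
    jar = SimpleCookie()
    jar.load(header_value)
    return {name: morsel.value for name, morsel in jar.items()}


cookies = parse_cookie_header("yummy_cookie=choco; tasty_cookie=strawberry")
print(cookies)
```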

Let me visit Taobao and capture the traffic to see the real process:

[picture: packet capture of a Taobao request]

2.4 Problems

Cookies are often used to mark users or authorized sessions. After being sent by the browser, they may be hijacked and used for illegal activities, which may cause the authorized user's session to be attacked, thus posing a security issue.

Another situation is cross-site request forgery (CSRF). Simply put, suppose you visit a phishing website while you are still logged in to a bank website. When you perform certain operations on the phishing website, it can make your browser send requests to the bank website, and the browser automatically attaches the bank's cookies. The attacker never needs to read the cookies to initiate illegal operations such as a money transfer.

Cross-site request forgery (CSRF), also known as one-click attack or session riding, is a method of attack that forces a user to perform unintended actions on a currently logged-in web application. Compared to cross-site scripting (XSS), XSS exploits the user's trust in a given website, while CSRF exploits the website's trust in the user's web browser.

To put it simply, a cross-site request forgery attack is when an attacker uses some technical means to trick the user's browser into visiting a website that the user has already authenticated with and performing some operations there (such as sending emails or messages, or even financial operations such as transferring money and purchasing goods).
Since the browser has been authenticated, the visited website will think it is a real user operation and run it. This takes advantage of a vulnerability in user authentication on the web: simple authentication can only ensure that the request is sent from a user's browser, but it cannot ensure that the request itself is voluntarily issued by the user.

However, there are many countermeasures, especially on financial sites such as banks: any sensitive operation must be explicitly confirmed by the user, and cookies that carry sensitive information are given only a short lifetime.
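One widely used countermeasure that this article does not spell out is a per-session anti-CSRF token: the server embeds a random value in its own forms and rejects state-changing requests that do not echo it back. The sketch below assumes a dict-like server-side session; the function names are hypothetical.

```python
import hmac
import secrets


def issue_csrf_token(session):
    """Store a random token in the session and return it for embedding in a form."""
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token
    return token


def is_request_genuine(session, submitted_token):
    """Accept a state-changing request only if it echoes the session's token."""
    expected = session.get("csrf_token")
    if not expected or not submitted_token:
        return False
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(expected.encode("utf-8"),
                               submitted_token.encode("utf-8"))


session = {}
form_token = issue_csrf_token(session)
print(is_request_genuine(session, form_token))  # request from the site's own form
print(is_request_genuine(session, "forged"))    # request forged by another site
```

A forged request still carries the victim's cookies, but the attacking page cannot read the token, so it cannot supply a matching value.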

At the same time, cookies are limited in size and number. Sending a lot of information with every request consumes extra traffic, and cookies alone cannot hold complex state.

Special note: The above problems only exist when Cookies are used to achieve interactive states, but they are not problems with Cookies themselves.

Just think about it: kitchen knives can be used to cook, but they can also be used to commit certain violent acts. Can you say that kitchen knives should be abolished?

3. Session Solution

3.1 Concept of Session Mechanism

If Cookie is a client behavior, then Session is a server behavior.

With the Cookie mechanism, after the first interaction with the server, all the information needed to maintain state is stored on the client, which reads it and sends it to the server with each later request.

A session represents a conversation between a server and a browser and is completely controlled by the server, which implements functions such as assigning IDs, storing session information, and retrieving sessions.

The Session mechanism stores all the user's activity information, context information, login information, etc. on the server side, and only generates a unique identification ID and sends it to the client. Subsequent interactions will not have repeated user information transmission, but will be replaced by a unique identification ID, which we will call Session-ID for now.

3.2 Simple interaction process

  • When the client makes its first request, the server creates a session object for it and computes a session ID through a special algorithm to identify that session.
  • When the browser next requests a resource, it places the session ID in the request header. The server parses the request to obtain the session ID, then uses it to look up the session and determine the requester's identity and context information.

3.3 Session Implementation

First of all, it should be clear that there is no direct relationship between Session and Cookie. It can be considered that Cookie is just a way to implement the Session mechanism. Other methods can be used without Cookie.

The relationship between Session and Cookie is like the relationship between overtime and overtime pay. They seem to be closely related, but in fact they have nothing to do with each other.

There are two main ways to implement sessions: cookies and URL rewriting. Cookies are the preferred method because all modern browsers enable cookies by default. However, every browser also has a setting that lets the user disable cookies, so the Session mechanism needs a fallback.

The technique of appending a session identification number as a parameter to the URL address of a hyperlink is called URL rewriting.

    Original URL:  http://taobao.com/getitem?name=baymax&action=buy
    Rewritten URL: http://taobao.com/getitem?sessionid=1wui87htentg&name=baymax&action=buy
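A rough sketch of URL rewriting with Python's `urllib.parse` is shown below. The query parameter name `sessionid` and both function names are assumptions for illustration; real frameworks choose their own parameter names.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def rewrite_url(url, session_id, param="sessionid"):
    """Return the URL with the session ID added as a query parameter."""
    parts = urlsplit(url)
    # Drop any stale session parameter, then put the current one first.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != param]
    query.insert(0, (param, session_id))
    return urlunsplit(parts._replace(query=urlencode(query)))


def session_id_from_url(url, param="sessionid"):
    """Recover the session ID on the server side, or None if it is absent."""
    return dict(parse_qsl(urlsplit(url).query)).get(param)


rewritten = rewrite_url("http://taobao.com/getitem?name=baymax&action=buy",
                        "1wui87htentg")
print(rewritten)
print(session_id_from_url(rewritten))
```

Because the ID travels in the URL, it can end up in browser history, logs, and shared links, which is one reason cookies remain the preferred carrier.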

3.4 Problems

Since the session information is stored on the server, if the number of users is large, the space occupied by the session information cannot be ignored.

For large websites, clustered and distributed server deployments are necessary. If session information is stored locally on each machine, load balancing becomes a problem: a request may reach machine A, which stores the session, while the next request goes to machine B, which has no session information for that client.

In this case there are several options, none of them free: create the session again on machine B, which wastes resources; introduce a highly available session cluster; introduce a session proxy to share the information; or use custom hashing so that the same client is always routed to machine A. All of this is a bit complicated.
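The hashing option can be sketched as follows: the load balancer derives the target machine from the session ID, so repeated requests from the same client land on the same machine. The server names and the function `pick_server` are hypothetical, and plain modulo hashing is used here only for brevity.

```python
import hashlib

SERVERS = ["machine-A", "machine-B", "machine-C"]  # hypothetical backend pool


def pick_server(session_id, servers=SERVERS):
    """Map a session ID to one server deterministically."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


first = pick_server("1wui87htentg")
again = pick_server("1wui87htentg")
print(first, again, first == again)
```

With modulo hashing, adding or removing a machine remaps most sessions; schemes such as consistent hashing are commonly used to reduce that churn.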

4. Token Solution

A token is a credential that is generated by the server and issued to the client. It is a time-limited means of verifying identity.

Token avoids the massive information storage problem brought by the Session mechanism, and also avoids some security issues of the Cookie mechanism. It has a wide range of uses in modern mobile Internet scenarios, cross-domain access and other scenarios.

4.1 Simple interaction process

  • The client submits the user's account and password to the server
  • The server verifies it and generates a token value and returns it to the client as an identity token for subsequent request interactions.
  • After the client gets the token value returned by the server, it can save it locally and carry the token with it every time it requests the server and submit it to the server for identity verification.
  • After receiving the request, the server parses the key information from the token, generates a signature using the same algorithm, key, and user parameters, and compares it with the signature sent by the client. If they match, the request is accepted; otherwise it is rejected.
  • After verification, the server can obtain the corresponding user information based on the uid in the Token and respond to the business request.

4.2 Token design concept

Taking JSON Web Token (JWT) as an example, the token mainly consists of three parts:

  • The header records the signing algorithm used.
  • The payload records user information, the expiration time, and so on.
  • The signature is generated from the header, the payload, and the key, using the algorithm named in the header. It is the basis on which the server verifies the token.

The header and payload are not encrypted; they are only Base64URL-encoded. After receiving the token, the server decodes the header and payload to obtain the algorithm, user, expiration time, and other information, then computes a signature with its own secret key and compares it with the signature sent by the client to decide whether the client's identity is legitimate.

In this way, CPU time spent computing signatures is traded for storage space. The server-side key is therefore critical: once it leaks, anyone can forge tokens and the entire mechanism collapses. Since the token itself travels with every request, HTTPS should also be used to protect it in transit.

4.3 Characteristics of the Token Solution

  • Token can be shared across sites to achieve single sign-on
  • The Token mechanism does not require much server-side storage. The token itself contains the user's information, so state only needs to be stored on the client, which makes the server easy to scale.
  • The security of the token mechanism depends on the security of the server-side encryption algorithm and key
  • Token mechanism is not a panacea

5. Summary

Cookies, Sessions, and Tokens are the products of different stages of development, and each has its own advantages and disadvantages. There is no obvious opposition between the three. Instead, they often appear together, which is why they are easily confused.

Cookies focus on the storage of information, mainly client-side behavior, while Session and Token focus on identity authentication, mainly server-side behavior.

The three solutions are still viable in many scenarios. Only by understanding the scenarios can you choose the appropriate solution. There is no silver bullet.

That’s all I have to say. See you next time.
