When setting up caching, most people think about it from a performance perspective. But if a cache is configured carelessly or incorrectly, it can also harm the security of our website and the privacy of our users.

Straight to the point

As usual, I'll state the recommended configuration first and go into detail afterwards: send Cache-Control: private and Vary: Cookie with sensitive responses.
So why are these two settings recommended, and what risks does our website face without them? Let me explain.

Review of HTTP caching

When it comes to caching, you probably think right away of the two caching methods and their corresponding headers. Let's quickly review them.

Normally, the browser sends a request to the server, and the server returns a response. However, a server may have to answer requests from thousands of clients, many of them duplicates, which puts a lot of pressure on it. So we usually place some caching between the client and the server: if the response to a repeated request is already stored in the cache and certain conditions are met, it is served directly from the cache and the request never reaches the server.

HTTP caching is generally divided into two types: strong cache and negotiated cache.

Strong Cache

With strong caching, as long as the cached data has not expired, the client can use it directly without contacting the server. Whether a cached response has expired depends mainly on two HTTP headers: Expires and Cache-Control.
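As a rough sketch of the strong-cache decision, the function below checks whether a stored response is still fresh using Cache-Control: max-age and, failing that, Expires. The function name `is_fresh` and its parameters are made up for illustration. Real caches also take into account things like the Age header, s-maxage and heuristic freshness, which are deliberately left out here.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def is_fresh(headers, stored_at, now):
    """Return True if a stored response may be reused without contacting the server.

    `headers` is a dict of response headers; `stored_at` and `now` are
    timezone-aware datetimes (an assumption of this sketch).
    """
    cache_control = headers.get("Cache-Control", "")
    directives = [d.strip().lower() for d in cache_control.split(",") if d.strip()]

    # no-store / no-cache: never reuse the stored copy without asking the server.
    if "no-store" in directives or "no-cache" in directives:
        return False

    # max-age takes precedence over Expires when both are present.
    for directive in directives:
        if directive.startswith("max-age="):
            try:
                max_age = int(directive.split("=", 1)[1])
            except ValueError:
                return False
            return now - stored_at < timedelta(seconds=max_age)

    expires = headers.get("Expires")
    if expires:
        try:
            return now < parsedate_to_datetime(expires)
        except (TypeError, ValueError):
            return False  # an unparsable Expires value is treated as expired

    return False  # no explicit freshness information: go back to the server
```

For example, a response stored with `Cache-Control: private, max-age=60` is treated as fresh for 60 seconds after `stored_at` and as expired afterwards.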
Negotiated Cache

Negotiated caching, as the name implies, requires negotiating with the server. On the first request, the server returns a cache identifier together with the data, and the client stores both in its cache. When the data is requested again, the client sends the stored identifier to the server, which uses it to check whether the resource has changed. If it has not, the server returns a 304 status code to tell the client that its cached copy can be used. Two pairs of HTTP headers are mainly used for this check: Last-Modified / If-Modified-Since and ETag / If-None-Match.
The server will compare the received If-Modified-Since with the last modification time of the resource to determine whether to use the cache.
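The If-Modified-Since comparison might look roughly like the sketch below. The function name `respond` and its parameters are invented for illustration, and `last_modified` is assumed to be a timezone-aware UTC datetime. A real server would also send the body, other headers, and so on.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond(last_modified, request_headers):
    """Return (status, headers) for a GET request, honouring If-Modified-Since."""
    # HTTP dates have one-second resolution, so drop microseconds before comparing.
    last_modified = last_modified.replace(microsecond=0)
    response_headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    since = request_headers.get("If-Modified-Since")
    if since:
        try:
            if last_modified <= parsedate_to_datetime(since):
                return 304, response_headers  # the client's copy is still current
        except (TypeError, ValueError):
            pass  # unparsable or timezone-less date: fall through to a full response
    return 200, response_headers
```

On the first request there is no If-Modified-Since header, so the sketch answers 200 with a Last-Modified value. When the client later echoes that value back, it gets 304.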
The server compares the received If-None-Match with the resource's unique identifier (its ETag) to determine whether the cache can be used.

Common Misconceptions About Caching

The knowledge above is probably what everyone recites, but have you ever thought seriously about this question: is the cached data we get always stored in the browser? In fact, it is not. There are usually multiple levels of caches. Some are dedicated to a single user and some are shared by many users; some are controlled by the server, some by the user, and some by an intermediary layer.
In addition, we often use locally configured proxies, which can cache even HTTPS resources once their certificates are trusted.

Spectre Vulnerability

So how can a cache threaten the security of our website and the privacy of our users? Let's look at a very famous vulnerability: Spectre. Attackers can exploit Spectre to read memory in an operating system process that they should not be able to read, which means they can reach cross-origin data without authorization. The risk is greatest with APIs that interact closely with the computer's hardware.
To this end, browsers once disabled high-risk APIs such as SharedArrayBuffer. Many readers are curious about how the attack actually works: how can a few JavaScript APIs be used to access data without permission? I will write a separate article about that next time.

How does caching affect Spectre?

So what does Spectre have to do with caching? A simple way to understand it: normally, if we open a page that is subject to cross-origin restrictions, we cannot read its data. However, if Cache-Control is set to public, the data may be stored in a public cache (such as our local proxy cache). We still have no permission to access the data, but it now sits in the cache, and once it is there an attacker can use the Spectre vulnerability to get at it.

Why can Spectre be used to gain unauthorized access to cached data? Let's take a simple example.

Say we have a website whose login password is conardli. If an attacker wants to crack it by brute force, and knows only that it consists of eight lowercase letters, they may need up to 26 to the 8th power (about 209 billion) guesses. That is a very large number, and cracking the password this way is practically impossible.

Now suppose the password is stored in a region of memory that the attacker has no access to. The attacker sets up a separate region of memory holding all 26 letters and makes sure none of it is currently in the cache. The attacker then reads out of bounds into the area where our password is stored and touches the letter c. Because of the permission check, the read is rejected and the attacker gets nothing back. But even though the access was rejected, the letter c has been cached. The attacker now goes back and walks through their own 26 letters, and finds that access to c has become faster. So the first character of the password is c.
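To make the reasoning in the password example concrete, here is a toy simulation of it. It is only a model: there is no real CPU cache, no speculative execution and no real timing, just a set that remembers which "slots" have been touched and two made-up access costs. All names (`ToyCache`, `rejected_read`, `recover`) are invented for this sketch.

```python
import string

FAST, SLOW = 1, 100  # made-up access costs for cached vs. uncached memory


class ToyCache:
    """Remembers which keys have been touched; stands in for a CPU cache."""

    def __init__(self):
        self.lines = set()

    def flush(self):
        self.lines.clear()

    def access(self, key):
        """Return the simulated cost of touching `key`, then mark it as cached."""
        cost = FAST if key in self.lines else SLOW
        self.lines.add(key)
        return cost


def rejected_read(cache, secret, index):
    """Stand-in for the out-of-bounds read: the caller gets nothing back,
    but the probe slot selected by the secret letter still ends up cached."""
    cache.access(("probe", secret[index]))
    return None


def recover(secret):
    """Recover `secret` one letter at a time by looking for the one fast slot."""
    cache = ToyCache()
    guessed = []
    for index in range(len(secret)):
        cache.flush()  # start with none of the 26 probe slots cached
        rejected_read(cache, secret, index)
        timings = {c: cache.access(("probe", c)) for c in string.ascii_lowercase}
        guessed.append(min(timings, key=timings.get))
    return "".join(guessed)
```

In this model, recovering conardli takes 26 probes per letter, 208 in total, instead of up to 26 to the 8th power guesses. That gap is why a side channel like this is so much more dangerous than brute force.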
I will only touch on it briefly here. In the next article I will go through the Spectre vulnerability in detail, because it is quite clever. If you are interested, please let me know in the comment section.

Recommended configuration for the website

Because of the problems above, we recommend the following two settings for all important website data.

Disable Public Cache

Set Cache-Control: private. This keeps the response out of all public caches (such as proxies), which reduces the chance of an attacker reading across boundaries into shared cache memory. Note that private is not an exclusive value: it can be combined with max-age, for example, and its performance is not much different from public. You can open Google's website and check its response headers to see this in practice.

Set the appropriate secondary cache key

By default, the browser cache uses the URL and the request method as the cache key. This means that on a website that requires login, responses for different users are stored under the same cache entry, because their request URLs and methods are identical. This is clearly a problem, and we can avoid it by setting Vary: Cookie, so that when the user's identity information changes, a different cache entry is used.

Of course, if your resource is a public CDN resource that everyone may access, you can configure caching more freely. If your resource data is sensitive, the two settings above are recommended.
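The effect of both settings can be sketched with a very simplified shared cache. `ToySharedCache` and its methods are made up for illustration, and header names are assumed to be lowercase dict keys. It ignores expiry and everything else a real cache does. It only shows that a response marked private is not stored, and that Vary adds the listed request headers to the cache key.

```python
def _directives(response_headers):
    """Return the set of Cache-Control directive names in a response."""
    value = response_headers.get("cache-control", "")
    return {d.strip().lower().split("=", 1)[0] for d in value.split(",") if d.strip()}


class ToySharedCache:
    """A toy proxy-style cache keyed by method + URL, plus any Vary headers."""

    def __init__(self):
        # (method, url) -> (vary header names, {secondary key: body})
        self.entries = {}

    @staticmethod
    def _secondary(names, request_headers):
        # Secondary key: the request's value for every header named in Vary.
        return tuple(request_headers.get(n, "") for n in names)

    def store(self, method, url, request_headers, response_headers, body):
        if _directives(response_headers) & {"private", "no-store"}:
            return False  # a shared cache must not keep this response
        vary = response_headers.get("vary", "")
        names = tuple(sorted(n.strip().lower() for n in vary.split(",") if n.strip()))
        old_names, variants = self.entries.get((method, url), (names, {}))
        if old_names != names:
            variants = {}  # the Vary list changed: drop the old variants
        variants[self._secondary(names, request_headers)] = body
        self.entries[(method, url)] = (names, variants)
        return True

    def lookup(self, method, url, request_headers):
        names, variants = self.entries.get((method, url), ((), {}))
        return variants.get(self._secondary(names, request_headers))
```

Without Vary, a page stored for one user's cookie is returned to a request carrying a different cookie, because both map to the same key. With Vary: Cookie, the second user's lookup misses. With Cache-Control: private, `store` refuses to keep the response at all.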