There are many ways to read data from http.Request.Body or http.Response.Body. The most common is ioutil.ReadAll, which reads all the data at once; if the data is JSON, you can also use json.NewDecoder to build a decoder directly from the io.Reader. But if you analyze such a program with pprof, you will often find that bytes.makeSlice allocates a large amount of memory and is frequently ranked first. Today, let's talk about how to read HTTP body data efficiently and elegantly.
Background

We have many API services, all of which use JSON as the data format: the request body is a single JSON string. When a request reaches the server, it goes through some business processing and then calls further services. All services communicate with each other over HTTP (why not RPC? Because every service is also open to third parties, and HTTP + JSON is easier for partners to integrate with). Most request bodies are between 1K and 4K, and most responses between 1K and 8K.

In the early days, all services used ioutil.ReadAll to read body data. As traffic grew, pprof analysis showed that bytes.makeSlice was consistently ranked first, accounting for about 1/10 of the program's total memory allocation. I decided to optimize this problem. The following is a record of the entire optimization process.

pprof analysis

Here we use the API in https://github.com/thinkeridea/go-extend/blob/master/exnet/exhttp/expprof/pprof.go to expose a /debug/pprof monitoring interface in the production environment. The standard library's net/http/pprof package is not used because it registers its routes automatically and leaves the API open indefinitely; this package lets you control whether the API is open and closes it automatically after a specified time, to avoid tool sniffing. After the service deployment had stabilized (about a day and a half later), I downloaded the allocs data with curl and used the following command to view the analysis.
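A typical invocation looks like this (the host, port, and endpoint path here are placeholders for the service's actual pprof address, which the article does not show):

```shell
# Download the allocation profile from the pprof endpoint (address is a placeholder).
curl -o allocs "http://127.0.0.1:6060/debug/pprof/allocs"

# Open it in interactive mode, ranking by total allocated space;
# `top` then lists the heaviest allocators.
go tool pprof -alloc_space allocs
```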
From the results, we can see that a total of 1358.61GB was allocated during the collection period; the top 10 entries account for 44.50%, with bytes.makeSlice alone close to 1/10. So let's see who is calling bytes.makeSlice.
From the figure above, we can see that the method ultimately calling bytes.makeSlice is ioutil.ReadAll (for brevity, the callers above ioutil.ReadAll are not shown), and about 90% of the calls are ioutil.ReadAll reading HTTP body data. Don't rush to an optimization the moment you find the hotspot; let's first see why ioutil.ReadAll causes so much memory allocation.
The above is the code of the standard library's ioutil.ReadAll. On every call it creates a fresh var buf bytes.Buffer and grows it via buf.Grow(int(capacity)) only to bytes.MinRead, i.e. 512 bytes. With a starting buffer that small, reading a typical body takes 2 to 16 allocations as the buffer repeatedly grows. This is unbearable; we should create a buffer ourselves.

Take a look at the flame graph 🔥. The part marked with a red frame is ioutil.ReadAll, and its color is noticeably brighter.

Optimizing the reading method

Create a buffer large enough up front, to avoid repeated expansion due to insufficient capacity.
Well, that's about right. Why initialize to 4096? It's an average: most of our data is smaller than 4096, and even for larger bodies at most one extra allocation is usually needed. But is this really a good idea? Of course not: this buffer is created for every request. Shouldn't we reuse it? Using sync.Pool to build a buffer pool is even better. Here is the simplified code for the optimized read path:
Does the way sync.Pool is used look a bit odd? It's mainly the defer plus api.pool.Put(buffer); buffer = nil. Let me explain. To raise the reuse rate of the buffer, it is put back into the pool as soon as it is no longer needed. The defer checks buffer != nil mainly to cover error paths in the business logic where the buffer has not yet been returned to the pool; writing api.pool.Put(buffer) after every error branch is not a good approach and is easy to forget. When we do know the buffer will not be used again, api.pool.Put(buffer); buffer = nil returns it as early as possible, which raises the reuse rate and reduces the creation of new buffers.

Is that all? Not quite. As mentioned before, the service also builds outgoing requests. Let's see how to optimize building a request.
This example is similar to the previous one, except that the buffer is used not only to read http.Response.Body but also to back a jsoniter.NewEncoder that serializes the request into a JSON string used as the body parameter of http.NewRequest. Calling jsoniter.Marshal directly would also allocate a lot of memory: jsoniter uses an internal buffer too, with a default size of 512. The code is as follows:
And after serialization, a data copy will be made:
Since we are using a buffer anyway, let's handle both steps with it ^_^. This avoids several more memory allocations. Before reading http.Response.Body, be sure to call buffer.Reset(). This basically completes the optimization of reading http.Request.Body and http.Response.Body; the concrete effect can be checked after it has run online for a while and stabilized.

Effect analysis

After running online for a day, let's take a look at the results.
Wow! bytes.makeSlice finally disappeared from the top 10. That’s great. Let’s take a look at other calls to bytes.makeSlice.
From the figure, we can see that the allocations in bytes.makeSlice are now very small, and most of what remains comes from http.Request.ParseForm reading http.Request.Body via ioutil.ReadAll. The optimization worked very well. The flame graph is even more intuitive: compared with before the optimization, ioutil.ReadAll is no longer visible at all.

Problems encountered during optimization

I'm ashamed to say there was a mistake during the optimization that took the production environment down for 2 minutes. Service was quickly restored through automated deployment and an immediate rollback; afterwards I analyzed the code, fixed the problem, and only then fully rolled the optimization out. Let me summarize what went wrong.

When building the HTTP request, I split the optimization into two parts: serializing the JSON and reading http.Response.Body. Sticking to the idea of returning the buffer to the pool as early as possible, and knowing that http.DefaultClient.Do(req) is a relatively slow network call, I put the buffer back before it, then acquired a fresh buffer when reading http.Response.Body. The code is as follows:
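A simplified reconstruction of the faulty ordering (the actual production code is not shown; names are illustrative). The key point is that http.NewRequest does not consume the buffer, it only records it as the body, so returning the buffer to the pool before Do creates a data race:

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"sync"
)

var pool = sync.Pool{
	New: func() interface{} { return bytes.NewBuffer(make([]byte, 0, 4096)) },
}

// buggyCall demonstrates the mistake: the buffer is put back into the
// pool right after NewRequest, but Do is what actually reads the body.
func buggyCall(url string, body []byte) (*http.Request, error) {
	buffer := pool.Get().(*bytes.Buffer)
	buffer.Reset()
	buffer.Write(body)

	req, err := http.NewRequest(http.MethodPost, url, buffer)
	if err != nil {
		pool.Put(buffer)
		return nil, err
	}

	// BUG: the body has not been read yet. Another goroutine can Get()
	// this buffer and Reset() it before Do runs, so Do sees an empty
	// body: "http: ContentLength=2090 with Body length 0".
	pool.Put(buffer)

	// ... http.DefaultClient.Do(req) would only read the body here ...
	return req, nil
}

func main() {
	// Placeholder URL; no request is actually sent in this sketch.
	req, err := buggyCall("http://example.com/api", []byte(`{"n":1}`))
	fmt.Println(req.ContentLength, err)
}
```

The fix is simply to keep the buffer out of the pool until Do has returned and the body has been fully read.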
An error appeared immediately after going online: http: ContentLength=2090 with Body length 0. When sending a request, the data read from the buffer turned out to be missing or incomplete. What was going on? I rolled back immediately to restore service, then analyzed http.DefaultClient.Do(req) and http.NewRequest. http.NewRequest does not read any data from the buffer; it only creates a req.GetBody, and the data is actually read inside http.DefaultClient.Do. Because the buffer was returned to the pool before http.DefaultClient.Do, other goroutines could obtain it and Reset it, causing a data race, which naturally led to incomplete reads. I was genuinely embarrassed by how little I knew about http.Client; I'll try to go through its source code when I have time.

Summary

Use a buffer of an appropriate size to reduce memory allocation, and use sync.Pool to reuse buffers. It's worth writing this logic yourself rather than relying on third-party packages: even when a third-party package uses the same techniques, it must copy the data it returns to avoid data races. For example, although jsoniter uses sync.Pool and a buffer internally, it still copies the data on return; and a general-purpose package cannot choose an initial buffer size that fits your particular workload, where too small means extra copying and too large wastes memory. Used well, buffers and sync.Pool can greatly improve a program's performance, and the combination is simple to use and does not complicate the code.