There are many ways to read data from http.Request.Body or http.Response.Body. The most common is ioutil.ReadAll, which reads all the data at once; if the data is JSON, you can also use json.NewDecoder to build a decoder directly from the io.Reader. But if you analyze such a program with pprof, you will often find that bytes.makeSlice allocates a large amount of memory and is frequently ranked first. Today, let's talk about how to read HTTP body data efficiently and elegantly.
Background

We have many API services, all of which use JSON as the data format: the request body is a single JSON string. When a request reaches the server, it goes through some business processing and then calls further services. All services communicate with each other over HTTP (why not RPC? Because every service is also open to third parties, and HTTP + JSON is easier for partners to integrate with). Most request bodies are between 1K and 4K, and most responses between 1K and 8K.

In the early days, all services used ioutil.ReadAll to read body data. As traffic grew, pprof analysis showed that bytes.makeSlice was consistently ranked first, accounting for about 1/10 of the program's total memory allocation. I decided to optimize this problem. The following is a record of the entire optimization process.

pprof analysis

Here we use the API in https://github.com/thinkeridea/go-extend/blob/master/exnet/exhttp/expprof/pprof.go to expose a /debug/pprof monitoring interface in the production environment. The standard library's net/http/pprof package is not used because it registers its routes automatically and leaves the API open indefinitely; this package lets you control whether the API is open and closes it automatically after a specified time, to avoid tool sniffing. After the service deployment had stabilized (about a day and a half later), I downloaded the allocs data with curl and used the following command to view the analysis.
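A typical invocation looks like this (the host, port, and endpoint path here are placeholders for the service's actual pprof address, which the article does not show):

```shell
# Download the allocation profile from the pprof endpoint (address is a placeholder).
curl -o allocs "http://127.0.0.1:6060/debug/pprof/allocs"

# Open it in interactive mode, ranking by total allocated space;
# `top` then lists the heaviest allocators.
go tool pprof -alloc_space allocs
```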
From the results, we can see that a total of 1358.61GB was allocated during the collection period; the top 10 entries account for 44.50%, with bytes.makeSlice alone close to 1/10. So let's see who is calling bytes.makeSlice.
From the figure above, we can see that the method ultimately calling bytes.makeSlice is ioutil.ReadAll (for brevity, the callers above ioutil.ReadAll are not shown), and about 90% of the calls are ioutil.ReadAll reading HTTP body data. Don't rush to an optimization the moment you find the hotspot; let's first see why ioutil.ReadAll causes so much memory allocation.
The above is the code of the standard library's ioutil.ReadAll. On every call it creates a fresh var buf bytes.Buffer and grows it via buf.Grow(int(capacity)) only to bytes.MinRead, i.e. 512 bytes. With a starting buffer that small, reading a typical body takes 2 to 16 allocations as the buffer repeatedly grows. This is unbearable; we should create a buffer ourselves.

Take a look at the flame graph 🔥. The part marked with a red frame is ioutil.ReadAll, and its color is noticeably brighter.

Optimizing the reading method

Create a buffer large enough up front, to avoid repeated expansion due to insufficient capacity.
Well, that's about right. Why initialize to 4096? It's an average: most of our data is smaller than 4096, and even for larger bodies at most one extra allocation is usually needed. But is this really a good idea? Of course not: this buffer is created for every request. Shouldn't we reuse it? Using sync.Pool to build a buffer pool is even better. Here is the simplified code for the optimized read path:
Does the way sync.Pool is used look a bit odd? It's mainly the defer plus api.pool.Put(buffer); buffer = nil. Let me explain. To raise the reuse rate of the buffer, it is put back into the pool as soon as it is no longer needed. The defer checks buffer != nil mainly to cover error paths in the business logic where the buffer has not yet been returned to the pool; writing api.pool.Put(buffer) after every error branch is not a good approach and is easy to forget. When we do know the buffer will not be used again, api.pool.Put(buffer); buffer = nil returns it as early as possible, which raises the reuse rate and reduces the creation of new buffers.

Is that all? Not quite. As mentioned before, the service also builds outgoing requests. Let's see how to optimize building a request.
This example is similar to the previous one, except that the buffer is used not only to read http.Response.Body but also to back a jsoniter.NewEncoder that serializes the request into a JSON string used as the body parameter of http.NewRequest. Calling jsoniter.Marshal directly would also allocate a lot of memory: jsoniter uses an internal buffer too, with a default size of 512. The code is as follows:
And after serialization, a data copy will be made:
Since we are using a buffer anyway, let's handle both steps with it ^_^. This avoids several more memory allocations. Before reading http.Response.Body, be sure to call buffer.Reset(). This basically completes the optimization of reading http.Request.Body and http.Response.Body; the concrete effect can be checked after it has run online for a while and stabilized.

Effect analysis

After running online for a day, let's take a look at the results.
Wow! bytes.makeSlice finally disappeared from the top 10. That’s great. Let’s take a look at other calls to bytes.makeSlice.
From the figure, we can see that the allocations in bytes.makeSlice are now very small, and most of what remains comes from http.Request.ParseForm reading http.Request.Body via ioutil.ReadAll. The optimization worked very well. The flame graph is even more intuitive: compared with before the optimization, ioutil.ReadAll is no longer visible at all.

Problems encountered during optimization

I'm ashamed to say there was a mistake during the optimization that took the production environment down for 2 minutes. Service was quickly restored through automated deployment and an immediate rollback; afterwards I analyzed the code, fixed the problem, and only then fully rolled the optimization out. Let me summarize what went wrong.

When building the HTTP request, I split the optimization into two parts: serializing the JSON and reading http.Response.Body. Sticking to the idea of returning the buffer to the pool as early as possible, and knowing that http.DefaultClient.Do(req) is a relatively slow network call, I put the buffer back before it, then acquired a fresh buffer when reading http.Response.Body. The code is as follows:
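A simplified reconstruction of the faulty ordering (the actual production code is not shown; names are illustrative). The key point is that http.NewRequest does not consume the buffer, it only records it as the body, so returning the buffer to the pool before Do creates a data race:

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"sync"
)

var pool = sync.Pool{
	New: func() interface{} { return bytes.NewBuffer(make([]byte, 0, 4096)) },
}

// buggyCall demonstrates the mistake: the buffer is put back into the
// pool right after NewRequest, but Do is what actually reads the body.
func buggyCall(url string, body []byte) (*http.Request, error) {
	buffer := pool.Get().(*bytes.Buffer)
	buffer.Reset()
	buffer.Write(body)

	req, err := http.NewRequest(http.MethodPost, url, buffer)
	if err != nil {
		pool.Put(buffer)
		return nil, err
	}

	// BUG: the body has not been read yet. Another goroutine can Get()
	// this buffer and Reset() it before Do runs, so Do sees an empty
	// body: "http: ContentLength=2090 with Body length 0".
	pool.Put(buffer)

	// ... http.DefaultClient.Do(req) would only read the body here ...
	return req, nil
}

func main() {
	// Placeholder URL; no request is actually sent in this sketch.
	req, err := buggyCall("http://example.com/api", []byte(`{"n":1}`))
	fmt.Println(req.ContentLength, err)
}
```

The fix is simply to keep the buffer out of the pool until Do has returned and the body has been fully read.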
An error appeared immediately after going online: http: ContentLength=2090 with Body length 0. When sending a request, the data read from the buffer turned out to be missing or incomplete. What was going on? I rolled back immediately to restore service, then analyzed http.DefaultClient.Do(req) and http.NewRequest. http.NewRequest does not read any data from the buffer; it only creates a req.GetBody, and the data is actually read inside http.DefaultClient.Do. Because the buffer was returned to the pool before http.DefaultClient.Do, other goroutines could obtain it and Reset it, causing a data race, which naturally led to incomplete reads. I was genuinely embarrassed by how little I knew about http.Client; I'll try to go through its source code when I have time.

Summary

Use a buffer of an appropriate size to reduce memory allocation, and use sync.Pool to reuse buffers. It's worth writing this logic yourself rather than relying on third-party packages: even when a third-party package uses the same techniques, it must copy the data it returns to avoid data races. For example, although jsoniter uses sync.Pool and a buffer internally, it still copies the data on return; and a general-purpose package cannot choose an initial buffer size that fits your particular workload, where too small means extra copying and too large wastes memory. Used well, buffers and sync.Pool can greatly improve a program's performance, and the combination is simple to use and does not complicate the code.