TCP source code analysis - three-way handshake Connect process

This article is reprinted from the WeChat public account "Linux Kernel Matters", written by songsong001. Please contact the Linux Kernel Matters public account to reprint this article.

This series analyzes the Linux kernel's implementation of the TCP protocol. Because TCP is relatively complex, the analysis is split across several articles. This article covers the three-way handshake that establishes a TCP connection, starting with the client's connect() path.

TCP is arguably the most complex protocol in the TCP/IP protocol stack. Its complexity comes from being connection-oriented and guaranteeing reliable transmission.

As shown in the figure below, TCP sits at the transport layer of the TCP/IP protocol stack (layer 4 in OSI numbering) and is built on top of the IP protocol at the network layer.

Because IP is a connectionless and unreliable protocol, TCP must maintain connection state for each CS (Client - Server) pair in order to provide connection-oriented, reliable transmission. A TCP connection is therefore not a physical link; it is only state kept at the two endpoints.

This article focuses on how the Linux kernel implements TCP. If you are not familiar with the principles of the protocol itself, refer to the well-known book "TCP/IP Illustrated".

Three-way handshake process

TCP is built on the connectionless IP protocol. To provide a connection-oriented service, the two endpoints negotiate connection state through a procedure called the three-way handshake, shown in the figure below:

The process of establishing a connection is as follows:

  • The client sends a SYN packet to the server (carrying the client's initial sequence number) and sets its connection state to SYN_SENT.
  • When the server receives the SYN packet, it replies with a SYN+ACK packet (carrying the server's initial sequence number) and sets its connection state to SYN_RCVD.
  • When the client receives the SYN+ACK packet, it sets its connection state to ESTABLISHED (the connection is established from the client's point of view) and replies with an ACK packet.
  • When the server receives the ACK packet, it sets its connection state to ESTABLISHED as well.

After the above process is completed, a TCP connection is established.
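
The four steps above can be modeled as a small state machine. The following is a minimal sketch rather than kernel code: the hs_state values and the hs_active_open() and hs_on_segment() helpers are hypothetical names chosen for illustration, and only the handshake transitions described above are covered (no timeouts, resets, or simultaneous open).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state and flag names, for illustration only. */
enum hs_state { HS_CLOSED, HS_LISTEN, HS_SYN_SENT, HS_SYN_RCVD, HS_ESTABLISHED };

#define HS_FLAG_SYN 0x02u
#define HS_FLAG_ACK 0x10u

/* Step 1 (client): an active open sends a SYN and enters SYN_SENT. */
enum hs_state hs_active_open(enum hs_state s)
{
    return s == HS_CLOSED ? HS_SYN_SENT : s;
}

/*
 * Advance the state when a segment with the given flags arrives.
 * *reply receives the flags of the segment to send back (0 means none).
 */
enum hs_state hs_on_segment(enum hs_state s, unsigned flags, unsigned *reply)
{
    assert(reply != NULL);
    *reply = 0;

    switch (s) {
    case HS_LISTEN:            /* step 2 (server): SYN in, SYN+ACK out */
        if (flags == HS_FLAG_SYN) {
            *reply = HS_FLAG_SYN | HS_FLAG_ACK;
            return HS_SYN_RCVD;
        }
        break;
    case HS_SYN_SENT:          /* step 3 (client): SYN+ACK in, ACK out */
        if (flags == (HS_FLAG_SYN | HS_FLAG_ACK)) {
            *reply = HS_FLAG_ACK;
            return HS_ESTABLISHED;
        }
        break;
    case HS_SYN_RCVD:          /* step 4 (server): ACK in, nothing out */
        if (flags == HS_FLAG_ACK)
            return HS_ESTABLISHED;
        break;
    default:
        break;
    }
    return s;                  /* unexpected segment: stay in the same state */
}
```

Feeding each side's reply into the other side's hs_on_segment() call walks both endpoints from SYN_SENT and LISTEN to ESTABLISHED in three segments.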

TCP Header

To analyze the TCP implementation, we first need to understand the TCP header. The following figure shows the format of the TCP header:

The following describes the functions of each field in the TCP header:

  • Source port: the port the sending program is bound to.
  • Destination port: the port the receiving program is bound to.
  • Sequence number: the sequence number of the first data byte in this segment (for a SYN packet, the initial sequence number).
  • Acknowledgment number: the next sequence number the sender of this segment expects to receive, which acknowledges all data before it (valid when the ACK flag is 1).
  • Header length: the length of the TCP header, in units of 32-bit words.
  • Flags: indicate the type of the TCP segment (SYN, ACK, FIN, RST, and so on).
  • Window size: used for flow control; advertises how much data the sender of this segment is currently able to receive.
  • Checksum: used to detect whether the segment was corrupted in transit.
  • Urgent pointer: rarely used; gives the offset of urgent data (valid when the URG flag is 1).
  • Options: the optional part of the TCP header.

Let's take a look at how the Linux kernel defines the structure of the TCP header, as follows:

    struct tcphdr {
        __u16 source;   // source port
        __u16 dest;     // destination port
        __u32 seq;      // sequence number
        __u32 ack_seq;  // acknowledgment number
        __u16 doff:4,   // header length, in 32-bit words
              res1:4,   // reserved
              res2:2,   // reserved
              urg:1,    // URG: the urgent pointer is valid
              ack:1,    // ACK: the acknowledgment number is valid
              psh:1,    // PSH: push the data to the receiver promptly
              rst:1,    // RST: reset the connection
              syn:1,    // SYN: synchronize sequence numbers
              fin:1;    // FIN: the sender has finished sending
        __u16 window;   // sliding window
        __u16 check;    // checksum
        __u16 urg_ptr;  // urgent pointer
    };

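From the definition above, we can see that the fields of struct tcphdr correspond one-to-one to the fields of the TCP header. The bit-field order shown here is the big-endian layout; the kernel header selects the order with a preprocessor conditional according to the target's bit-field endianness.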
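
Because bit-field layout depends on the compiler and the target, user-space code that inspects TCP headers often decodes the raw bytes instead. The following is a minimal sketch under that assumption; struct tcp_hdr_view and tcp_parse_header() are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded view of a TCP header, in host byte order. */
struct tcp_hdr_view {
    uint16_t source, dest;
    uint32_t seq, ack_seq;
    uint8_t  doff;    /* header length in 32-bit words */
    uint8_t  flags;   /* byte 13: res2 (2 bits), URG, ACK, PSH, RST, SYN, FIN */
    uint16_t window, check, urg_ptr;
};

/* Read big-endian (network order) integers from a byte buffer. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t rd32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 0 on success, -1 if the buffer is too short or doff is invalid. */
int tcp_parse_header(const uint8_t *buf, size_t len, struct tcp_hdr_view *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 20)                      /* minimum header: 5 words */
        return -1;

    out->source  = rd16(buf + 0);
    out->dest    = rd16(buf + 2);
    out->seq     = rd32(buf + 4);
    out->ack_seq = rd32(buf + 8);
    out->doff    = (uint8_t)(buf[12] >> 4);
    out->flags   = buf[13];
    out->window  = rd16(buf + 14);
    out->check   = rd16(buf + 16);
    out->urg_ptr = rd16(buf + 18);

    if (out->doff < 5 || (size_t)out->doff * 4 > len)
        return -1;
    return 0;
}
```

A SYN packet decoded this way has flags equal to 0x02 and, without options, doff equal to 5.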

Client connection process

A TCP connection is initiated by the client. When the client program calls the connect() system call, the kernel establishes a TCP connection with the server program. The prototype of the connect() system call is as follows:

    int connect(int sockfd, const struct sockaddr *addr, socklen_t addrlen);

Here are the functions of the various parameters of the connect() system call:

  • sockfd: the socket file descriptor returned by the socket() system call.
  • addr: the remote IP address and port to connect to.
  • addrlen: the length of the structure that addr points to.
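
For reference, a typical user-space call looks like the following. This is a minimal sketch for IPv4 only; tcp_connect_ipv4() is a hypothetical helper name, and error handling is reduced to returning -1.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns a connected socket descriptor, or -1 on failure. */
int tcp_connect_ipv4(const char *ip, uint16_t port)
{
    struct sockaddr_in addr;
    int fd;

    assert(ip != NULL);

    /* Fill in the remote address: this is the "addr" parameter. */
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);              /* port in network byte order */
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -1;                            /* not a valid dotted-quad address */

    fd = socket(AF_INET, SOCK_STREAM, 0);     /* this is the "sockfd" parameter */
    if (fd < 0)
        return -1;

    /* On a blocking socket, connect() returns once the handshake
     * has completed or failed. */
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The caller owns the returned descriptor and should close() it when done.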

When the client calls connect(), the call traps into the kernel, which runs the sys_connect() kernel function. The sys_connect() function is implemented as follows:

    int sys_connect(int fd, struct sockaddr *uservaddr, int addrlen)
    {
        struct socket *sock;
        char address[MAX_SOCK_ADDR];
        int err;
        ...
        // Look up the socket object that corresponds to the file descriptor
        sock = sockfd_lookup(fd, &err);
        ...
        // Copy the remote IP address and port from user space into the kernel
        err = move_addr_to_kernel(uservaddr, addrlen, address);
        ...
        // For a TCP socket, sock->ops->connect points to inet_stream_connect()
        err = sock->ops->connect(sock, (struct sockaddr *)address, addrlen,
                                 sock->file->f_flags);
        ...
        return err;
    }

The sys_connect() kernel function mainly completes three steps:

  • Call the sockfd_lookup() function to obtain the socket object that corresponds to the file descriptor fd.
  • Call the move_addr_to_kernel() function to copy the remote IP address and port from user space into the kernel.
  • Call sock->ops->connect(), which for a TCP socket is the inet_stream_connect() function, to perform the connection.
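
The last step is an indirect call through a table of function pointers, which is how one generic entry point can serve different protocol families. The following is a minimal user-space sketch of that pattern; every demo_ name is hypothetical and greatly simplified compared with the kernel's structures.

```c
#include <assert.h>
#include <stddef.h>

struct demo_socket;

/* Hypothetical, trimmed-down analogue of a per-family operations table. */
struct demo_proto_ops {
    int (*connect)(struct demo_socket *sock, const char *addr);
};

struct demo_socket {
    const struct demo_proto_ops *ops;  /* chosen when the socket is created */
    int connect_calls;                 /* bookkeeping for the example only */
};

/* Stand-in for a stream (TCP-like) connect implementation. */
int demo_stream_connect(struct demo_socket *sock, const char *addr)
{
    (void)addr;
    sock->connect_calls++;
    return 0;
}

const struct demo_proto_ops demo_stream_ops = {
    .connect = demo_stream_connect,
};

/* Generic entry point: it does not know which protocol it is calling. */
int demo_sys_connect(struct demo_socket *sock, const char *addr)
{
    if (sock == NULL || sock->ops == NULL || sock->ops->connect == NULL)
        return -1;
    return sock->ops->connect(sock, addr);
}
```

Supporting another protocol in this sketch would only require a second operations table; demo_sys_connect() stays unchanged.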

We continue to analyze the implementation of the inet_stream_connect() function:

    int inet_stream_connect(struct socket *sock, struct sockaddr *uaddr,
                            int addr_len, int flags)
    {
        struct sock *sk = sock->sk;
        int err;
        ...
        if (sock->state == SS_CONNECTING) {
            ...
        } else {
            // Try to bind a local port automatically
            if (inet_autobind(sk) != 0)
                return (-EAGAIN);
            ...
            // For TCP, sk->prot->connect points to tcp_v4_connect()
            err = sk->prot->connect(sk, uaddr, addr_len);
            if (err < 0)
                return (err);
            sock->state = SS_CONNECTING;
        }
        ...
        // Non-blocking socket whose connection is not yet established:
        // return -EINPROGRESS instead of waiting
        if (sk->state != TCP_ESTABLISHED && (flags & O_NONBLOCK))
            return (-EINPROGRESS);

        // Wait for the connection process to complete
        if (sk->state == TCP_SYN_SENT || sk->state == TCP_SYN_RECV) {
            inet_wait_for_connect(sk);
            if (signal_pending(current))
                return -ERESTARTSYS;
        }
        sock->state = SS_CONNECTED; // Mark the socket as connected
        ...
        return (0);
    }

The main operations of the inet_stream_connect() function are as follows:

  • Call the inet_autobind() function to try to automatically bind to a local port.
  • Call the tcp_v4_connect() function to perform the TCP protocol connection operation.
  • If the socket is non-blocking and the connection has not yet been established, return -EINPROGRESS immediately.
  • Otherwise, call the inet_wait_for_connect() function to wait for the connection to the server to complete; if a signal arrives during the wait, return -ERESTARTSYS.
  • Set the socket status to SS_CONNECTED, indicating that the connection has been established.
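
From user space, the EINPROGRESS branch is what makes a connect with a timeout possible. The following is a minimal sketch of the usual pattern (non-blocking connect(), then poll(), then SO_ERROR); connect_with_timeout() is a hypothetical helper name, and the returned socket is left in non-blocking mode.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns a connected descriptor, or -1 on error or timeout. */
int connect_with_timeout(const struct sockaddr_in *addr, int timeout_ms)
{
    struct pollfd pfd;
    socklen_t len;
    int fd, flags, err = 0;

    assert(addr != NULL);

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* Switch the socket to non-blocking mode. */
    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        goto fail;

    if (connect(fd, (const struct sockaddr *)addr, sizeof(*addr)) == 0)
        return fd;                 /* connected immediately */
    if (errno != EINPROGRESS)
        goto fail;                 /* a real error, not "handshake in progress" */

    /* The socket becomes writable when the handshake finishes or fails. */
    pfd.fd = fd;
    pfd.events = POLLOUT;
    pfd.revents = 0;
    if (poll(&pfd, 1, timeout_ms) != 1)
        goto fail;                 /* timeout or poll() error */

    /* SO_ERROR reports whether the connection attempt succeeded. */
    len = sizeof(err);
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0 || err != 0)
        goto fail;
    return fd;

fail:
    close(fd);
    return -1;
}
```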

Of the steps above, the most important is the call to tcp_v4_connect(), which performs the TCP-specific part of the connection. Let's analyze the implementation of the tcp_v4_connect() function:

    int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
    {
        struct tcp_opt *tp = &(sk->tp_pinfo.af_tcp);
        struct sockaddr_in *usin = (struct sockaddr_in *)uaddr;
        struct sk_buff *buff;
        struct rtable *rt;
        u32 daddr, nexthop;
        int tmp;
        ...
        nexthop = daddr = usin->sin_addr.s_addr;
        ...
        // 1. Look up the route used to send data
        tmp = ip_route_connect(&rt, nexthop, sk->saddr,
                               RT_TOS(sk->ip_tos)|RTO_CONN|sk->localroute,
                               sk->bound_dev_if);
        ...
        dst_release(xchg(&sk->dst_cache, rt)); // 2. Cache the route in sk

        // 3. Allocate an skb (socket buffer) for the SYN packet
        buff = sock_wmalloc(sk, (MAX_HEADER + sk->prot->max_header), 0, GFP_KERNEL);
        ...
        sk->dport = usin->sin_port; // 4. Set the destination port
        sk->daddr = rt->rt_dst;     // 5. Set the destination IP address
        ...
        // 6. If no source IP address was specified, use the route's source address
        if (!sk->saddr)
            sk->saddr = rt->rt_src;
        sk->rcv_saddr = sk->saddr;
        ...
        // 7. Choose the initial TCP sequence number
        tp->write_seq = secure_tcp_sequence_number(sk->saddr, sk->daddr, sk->sport,
                                                   usin->sin_port);
        ...
        // 8. Reset the upper bound on the TCP maximum segment size
        tp->mss_clamp = ~0;
        ...
        // 9. Call tcp_connect() to continue the connection operation
        tcp_connect(sk, buff, rt->u.dst.pmtu);
        return 0;
    }

The tcp_v4_connect() function only prepares for the connection, as follows:

  • Call the ip_route_connect() function to look up the route used to send data, and save it in the socket object's route cache.
  • Call the sock_wmalloc() function to allocate an skb (socket buffer) for the SYN packet.
  • Set the destination port and destination IP address.
  • If the source IP address is not specified, use the source IP address from the route.
  • Call the secure_tcp_sequence_number() function to choose the initial TCP sequence number.
  • Reset the upper bound (clamp) on the TCP maximum segment size.
  • Call the tcp_connect() function to send a SYN packet to the server program.
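
The internals of secure_tcp_sequence_number() are not shown here. As an illustration of the general idea only, the scheme described in RFC 6528 computes the initial sequence number as a clock value plus a keyed hash of the connection's addresses and ports. The sketch below follows that shape; demo_initial_seq() is a hypothetical name, and FNV-1a is used purely as a stand-in because it is short. It is not a cryptographic hash and must not be used for this purpose in real code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FNV-1a over a byte range. Stand-in for a keyed cryptographic hash: NOT secure. */
static uint32_t fnv1a(uint32_t h, const void *data, size_t len)
{
    const uint8_t *p = data;

    while (len--) {
        h ^= *p++;
        h *= 16777619u;
    }
    return h;
}

/*
 * ISN = clock + F(secret, saddr, daddr, sport, dport), modulo 2^32.
 * The hash makes the offset differ per connection; the clock keeps
 * successive connections between the same endpoints moving forward.
 */
uint32_t demo_initial_seq(uint32_t saddr, uint32_t daddr,
                          uint16_t sport, uint16_t dport,
                          uint32_t secret, uint32_t clock_ticks)
{
    uint32_t h = 2166136261u;          /* FNV offset basis */

    h = fnv1a(h, &secret, sizeof(secret));
    h = fnv1a(h, &saddr, sizeof(saddr));
    h = fnv1a(h, &daddr, sizeof(daddr));
    h = fnv1a(h, &sport, sizeof(sport));
    h = fnv1a(h, &dport, sizeof(dport));
    return h + clock_ticks;            /* unsigned arithmetic wraps modulo 2^32 */
}
```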

Since the first step of the TCP three-way handshake is for the client to send a SYN packet to the server, we mainly focus on the implementation of the tcp_connect() function, whose code is as follows:

    void tcp_connect(struct sock *sk, struct sk_buff *buff, int mtu)
    {
        struct dst_entry *dst = sk->dst_cache;
        struct tcp_opt *tp = &(sk->tp_pinfo.af_tcp);

        skb_reserve(buff, MAX_HEADER + sk->prot->max_header); // Reserve space for all protocol headers

        tp->snd_wnd = 0;
        tp->snd_wl1 = 0;
        tp->snd_wl2 = tp->write_seq;
        tp->snd_una = tp->write_seq;
        tp->rcv_nxt = 0;
        sk->err = 0;
        // Set the TCP header length
        tp->tcp_header_len = sizeof(struct tcphdr) +
            (sysctl_tcp_timestamps ? TCPOLEN_TSTAMP_ALIGNED : 0);
        ...
        tcp_sync_mss(sk, mtu); // Set the maximum TCP segment size
        ...
        TCP_SKB_CB(buff)->flags = TCPCB_FLAG_SYN; // Mark this segment as a SYN packet
        TCP_SKB_CB(buff)->sacked = 0;
        TCP_SKB_CB(buff)->urg_ptr = 0;
        buff->csum = 0;
        TCP_SKB_CB(buff)->seq = tp->write_seq++;    // Sequence number of the SYN
        TCP_SKB_CB(buff)->end_seq = tp->write_seq;  // Sequence number after the SYN (seq + 1)
        tp->snd_nxt = TCP_SKB_CB(buff)->end_seq;

        // Initialize the size of the sliding window
        tp->window_clamp = dst->window;
        tcp_select_initial_window(sock_rspace(sk)/2, tp->mss_clamp,
                                  &tp->rcv_wnd, &tp->window_clamp,
                                  sysctl_tcp_window_scaling, &tp->rcv_wscale);
        ...
        tcp_set_state(sk, TCP_SYN_SENT); // Set the socket state to SYN_SENT

        // Call tcp_v4_hash() to add the socket to the tcp_established_hash hash table
        sk->prot->hash(sk);

        tp->rto = dst->rtt;
        tcp_init_xmit_timers(sk); // Initialize the retransmission timers
        ...
        // Add the skb to write_queue so that it can be retransmitted
        __skb_queue_tail(&sk->write_queue, buff);
        TCP_SKB_CB(buff)->when = jiffies;
        ...
        // Call tcp_transmit_skb() to build the SYN packet and send it to the server
        tcp_transmit_skb(sk, skb_clone(buff, GFP_KERNEL));
        ...
    }

Although tcp_connect() is fairly long, its logic is simple: it initializes the connection's send state, fills in the control block of the SYN segment, and then sends the segment to the server. The main work of the tcp_connect() function is listed below:

  • Set the SYN flag in the skb's control block (marking this as a SYN packet); tcp_transmit_skb() later copies the flags into the TCP header.
  • Set the segment's sequence number (seq) and the sequence number that follows it (end_seq). A SYN consumes one sequence number, so end_seq is seq + 1.
  • Initialize the sliding window size.
  • Set the socket state to SYN_SENT; see the three-way handshake diagram above.
  • Call the tcp_v4_hash() function to add the socket to the tcp_established_hash hash table, which is used to quickly find the socket object by IP address and port.
  • Initialize the retransmission timers.
  • Add the skb to the write_queue queue so that it can be retransmitted on timeout.
  • Call the tcp_transmit_skb() function to build the SYN packet and send it to the server program.
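
The sequence-number bookkeeping in tcp_connect() is easy to misread, so here it is in isolation. This is a minimal sketch that mirrors only the seq, end_seq, snd_una, and snd_nxt assignments from the code above; struct demo_tcb, struct demo_seg, and both functions are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, trimmed-down send state and segment control block. */
struct demo_tcb { uint32_t write_seq, snd_una, snd_nxt; };
struct demo_seg { uint32_t seq, end_seq; int syn; };

/* Mirror of the assignments tcp_connect() makes for the SYN segment. */
void demo_queue_syn(struct demo_tcb *tp, struct demo_seg *seg, uint32_t iss)
{
    tp->write_seq = iss;            /* initial sequence number */
    tp->snd_una = tp->write_seq;    /* oldest unacknowledged sequence number */

    seg->syn = 1;
    seg->seq = tp->write_seq++;     /* the SYN occupies sequence number iss ... */
    seg->end_seq = tp->write_seq;   /* ... so the next one is iss + 1 */
    tp->snd_nxt = seg->end_seq;     /* next sequence number to send */
}

/* The SYN is acknowledged when the peer's ACK number equals snd_nxt. */
int demo_syn_acked(const struct demo_tcb *tp, uint32_t ack)
{
    return ack == tp->snd_nxt;
}
```

With an initial sequence number of 1000, the SYN carries seq 1000 and the server's SYN+ACK is expected to acknowledge 1001. Because the fields are unsigned 32-bit values, an initial sequence number of 0xffffffff wraps to an end_seq of 0.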

Note: The Linux kernel uses the tcp_established_hash hash table to store the socket objects of TCP connections. The key of the hash table is the connection's IP addresses and ports, so the socket for an incoming packet can be found quickly from those values. As shown in the following figure:
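
A lookup structure of this kind can be sketched as a chained hash table keyed by the connection's addresses and ports. The code below is an illustration under assumptions, not the kernel's implementation: the table size, the mixing function, and all demo_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_HTABLE_SIZE 256   /* power of two, so the hash can be masked */

/* Hypothetical socket record: the key fields plus a chain pointer. */
struct demo_sock {
    uint32_t laddr, raddr;     /* local and remote IP address */
    uint16_t lport, rport;     /* local and remote port */
    struct demo_sock *next;    /* next socket in the same bucket */
};

static struct demo_sock *demo_table[DEMO_HTABLE_SIZE];

/* Fold the addresses and ports into a bucket index. */
static unsigned demo_hashfn(uint32_t laddr, uint16_t lport,
                            uint32_t raddr, uint16_t rport)
{
    uint32_t h = (laddr ^ lport) ^ (raddr ^ rport);

    h ^= h >> 16;
    h ^= h >> 8;
    return h & (DEMO_HTABLE_SIZE - 1);
}

/* Add a socket at the head of its bucket's chain. */
void demo_hash_insert(struct demo_sock *sk)
{
    unsigned i = demo_hashfn(sk->laddr, sk->lport, sk->raddr, sk->rport);

    sk->next = demo_table[i];
    demo_table[i] = sk;
}

/* Find the socket whose addresses and ports all match, or NULL. */
struct demo_sock *demo_hash_lookup(uint32_t laddr, uint16_t lport,
                                   uint32_t raddr, uint16_t rport)
{
    struct demo_sock *sk = demo_table[demo_hashfn(laddr, lport, raddr, rport)];

    for (; sk != NULL; sk = sk->next) {
        if (sk->laddr == laddr && sk->lport == lport &&
            sk->raddr == raddr && sk->rport == rport)
            return sk;
    }
    return NULL;
}
```

When a packet arrives, the receiver would call demo_hash_lookup() with the packet's destination address and port as the local pair and its source address and port as the remote pair.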

As the analysis above shows, the SYN packet is actually built and sent by the tcp_transmit_skb() function, so let's analyze its implementation:

    void tcp_transmit_skb(struct sock *sk, struct sk_buff *skb)
    {
        if (skb != NULL) {
            struct tcp_opt *tp = &(sk->tp_pinfo.af_tcp);
            struct tcp_skb_cb *tcb = TCP_SKB_CB(skb);
            int tcp_header_size = tp->tcp_header_len;
            struct tcphdr *th;
            ...
            // Make room for the TCP header at the front of the skb
            th = (struct tcphdr *)skb_push(skb, tcp_header_size);
            skb->h.th = th;

            skb_set_owner_w(skb, sk);

            // Build the TCP protocol header
            th->source = sk->sport;                // source port
            th->dest = sk->dport;                  // destination port
            th->seq = htonl(TCP_SKB_CB(skb)->seq); // sequence number
            th->ack_seq = htonl(tp->rcv_nxt);      // acknowledgment number
            th->doff = (tcp_header_size >> 2);     // header length, in 32-bit words
            th->res1 = 0;
            *(((__u8 *)th) + 13) = tcb->flags;     // the flags occupy byte 13 of the header

            if (!(tcb->flags & TCPCB_FLAG_SYN))
                th->window = htons(tcp_select_window(sk)); // sliding window size

            th->check = 0;                         // checksum, filled in below
            th->urg_ptr = ntohs(tcb->urg_ptr);     // urgent pointer
            ...
            // Calculate the TCP checksum
            tp->af_specific->send_check(sk, th, skb->len, skb);
            ...
            tp->af_specific->queue_xmit(skb); // Call ip_queue_xmit() to send the packet
        }
    }

The implementation of the tcp_transmit_skb() function is relatively simple. It builds the TCP protocol header, computes the checksum, and then calls the ip_queue_xmit() function to hand the packet over to the IP layer for sending.
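
To make the header layout and the checksum step concrete, the following user-space sketch builds a 20-byte SYN header byte by byte and fills in the standard TCP checksum (the one's-complement sum over the IPv4 pseudo-header and the segment). It is an illustration, not the kernel's send_check() implementation; the demo_ functions are hypothetical, addresses are passed as host-order 32-bit values, and no TCP options are included.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write big-endian (network order) integers into a byte buffer. */
static void wr16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)(v & 0xff);
}

static void wr32(uint8_t *p, uint32_t v)
{
    wr16(p, (uint16_t)(v >> 16));
    wr16(p + 2, (uint16_t)(v & 0xffff));
}

/* Add the 16-bit big-endian words of a buffer to acc; an odd trailing
 * byte is treated as the high byte of a zero-padded word. */
static uint32_t sum16(const uint8_t *p, size_t len, uint32_t acc)
{
    while (len > 1) {
        acc += (uint32_t)((p[0] << 8) | p[1]);
        p += 2;
        len -= 2;
    }
    if (len)
        acc += (uint32_t)(p[0] << 8);
    return acc;
}

/* TCP checksum over the IPv4 pseudo-header plus the TCP segment. */
uint16_t demo_tcp_checksum(uint32_t saddr, uint32_t daddr,
                           const uint8_t *seg, size_t len)
{
    uint8_t pseudo[12];
    uint32_t acc;

    wr32(pseudo, saddr);
    wr32(pseudo + 4, daddr);
    pseudo[8] = 0;
    pseudo[9] = 6;                          /* IP protocol number for TCP */
    wr16(pseudo + 10, (uint16_t)len);       /* TCP length: header + data */

    acc = sum16(pseudo, sizeof(pseudo), 0);
    acc = sum16(seg, len, acc);
    while (acc >> 16)                       /* fold the carries back in */
        acc = (acc & 0xffff) + (acc >> 16);
    return (uint16_t)~acc;
}

/* Build a SYN header without options into hdr[0..19]. */
void demo_build_syn(uint8_t hdr[20], uint32_t saddr, uint32_t daddr,
                    uint16_t sport, uint16_t dport,
                    uint32_t seq, uint16_t window)
{
    memset(hdr, 0, 20);                     /* ack_seq, check, urg_ptr start at 0 */
    wr16(hdr + 0, sport);
    wr16(hdr + 2, dport);
    wr32(hdr + 4, seq);
    hdr[12] = 5 << 4;                       /* doff = 5 words, res1 = 0 */
    hdr[13] = 0x02;                         /* flags byte: only SYN set */
    wr16(hdr + 14, window);
    wr16(hdr + 16, demo_tcp_checksum(saddr, daddr, hdr, 20));
}
```

A receiver verifies the segment by running the same computation over the received bytes, checksum field included; for an intact segment demo_tcp_checksum() then returns 0.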

At this point, the client has sent a SYN packet to the server, which means that the first step of the TCP three-way handshake has been completed.
