Preface

This article analyzes the IPIP network mode of Calico, a network component for k8s. The goal is to understand the calixxxx, tunl0 and other devices created in IPIP mode, and how cross-node communication works. It may seem a bit dry, but please take a few minutes to read it. If you lose track of the earlier parts by the time you reach the later ones, read it twice. Those few minutes will be worth it.

1. Introduction to calico

Calico is another popular network choice in the Kubernetes ecosystem. While Flannel is recognized as the simplest choice, Calico is known for its performance and flexibility. Calico is more comprehensive: it provides not only network connectivity between hosts and pods, but also network security and management. The Calico CNI plugin wraps Calico's functionality in the CNI framework.

Calico is a pure Layer 3 network solution based on BGP, and it integrates well with cloud platforms such as OpenStack, Kubernetes, AWS and GCE.

Calico uses the Linux kernel to implement an efficient virtual router (vRouter) on each compute node, which is responsible for data forwarding. Each vRouter advertises the routes of the containers running on its node to the entire Calico network through the BGP protocol, and automatically sets up the forwarding rules needed to reach other nodes.

Calico ensures that all traffic between containers is connected through IP routing. When Calico nodes are networked this way, they can use the data center's existing network structure (L2 or L3) directly. There is no need for additional NAT, tunnels or overlay networks, and no extra packet encapsulation and decapsulation, which saves CPU and improves network efficiency.

In addition, Calico provides a variety of network policies based on iptables, implements the Kubernetes Network Policy, and can restrict network reachability between containers.
Calico official website: https://www.projectcalico.org/

2. Calico architecture and core components

The architecture diagram is as follows:

Calico core components:
3. Working principle of calico

Calico treats the protocol stack of each operating system as a router, and all containers as network endpoints connected to that router. It runs a standard routing protocol, BGP, between these routers and lets them learn how to forward packets in this topology. The Calico solution is therefore a pure Layer 3 solution: Layer 3 of each machine's protocol stack provides the Layer 3 connectivity between two containers on the same host and between containers across hosts.

4. Two network modes of calico

1) IPIP

A tunnel that encapsulates an IP packet inside another IP packet. Its function is basically equivalent to a bridge based on the IP layer. Ordinary bridges work at the MAC layer and do not need IP at all, whereas ipip uses the routers at both ends to build a tunnel, joining two otherwise unconnected networks with a point-to-point connection. The source code of ipip can be found in the kernel at net/ipv4/ipip.c.

2) BGP

Border Gateway Protocol (BGP) is a core decentralized autonomous routing protocol on the Internet. It achieves reachability between autonomous systems (AS) by maintaining IP routing tables, or 'prefix' tables, and is a path-vector routing protocol. BGP does not use the metrics of a traditional interior gateway protocol (IGP); it uses paths, network policies or rule sets to decide routes. It is therefore more accurately described as a path-vector protocol than as a conventional metric-based routing protocol.

5. IPIP Network Mode Analysis

Since IPIP mode is used in my personal environment, I will analyze this mode here.
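To make "an IP packet inside another IP packet" concrete before looking at the real environment, here is a minimal Python sketch of IP-in-IP encapsulation at the byte level. It is only an illustration, not what the kernel's ipip module literally does: the helper names (`ipv4_header`, `ipip_encapsulate`, `parse_ipv4`) are made up for this example, the source pod address 10.20.105.200 is a hypothetical placeholder (the article does not state the IP of demo-hello-perf), and the 4-byte payload stands in for a real ICMP message. The node addresses and the destination pod address are the ones used later in this article.

```python
import socket
import struct

IPPROTO_ICMP = 1
IPPROTO_IPIP = 4  # IANA protocol number for IPv4-in-IPv4


def checksum(data: bytes) -> int:
    """Internet checksum (RFC 1071): one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_header(src: str, dst: str, proto: int, payload_len: int) -> bytes:
    """Build a minimal 20-byte IPv4 header without options."""
    fields = [
        (4 << 4) | 5,        # version 4, header length 5 * 4 = 20 bytes
        0,                   # type of service
        20 + payload_len,    # total length
        0,                   # identification
        0,                   # flags / fragment offset
        64,                  # TTL
        proto,               # protocol of the payload
        0,                   # checksum placeholder
        socket.inet_aton(src),
        socket.inet_aton(dst),
    ]
    header = struct.pack("!BBHHHBBH4s4s", *fields)
    fields[7] = checksum(header)
    return struct.pack("!BBHHHBBH4s4s", *fields)


def ipip_encapsulate(inner_packet: bytes, node_src: str, node_dst: str) -> bytes:
    """Prepend an outer IPv4 header (protocol 4) carrying the node addresses."""
    outer = ipv4_header(node_src, node_dst, IPPROTO_IPIP, len(inner_packet))
    return outer + inner_packet


def parse_ipv4(packet: bytes):
    """Return (src, dst, protocol, payload) of an IPv4 packet."""
    header_len = (packet[0] & 0x0F) * 4
    src = socket.inet_ntoa(packet[12:16])
    dst = socket.inet_ntoa(packet[16:20])
    return src, dst, packet[9], packet[header_len:]


# Inner packet: pod to pod. The source pod IP is a hypothetical placeholder
# and b"ping" stands in for a real ICMP message.
payload = b"ping"
inner = ipv4_header("10.20.105.200", "10.20.42.31", IPPROTO_ICMP, len(payload)) + payload

# Outer packet: node to node, using the host addresses from this article.
tunneled = ipip_encapsulate(inner, "172.16.36.5", "172.16.35.4")

outer_src, outer_dst, outer_proto, outer_payload = parse_ipv4(tunneled)
inner_src, inner_dst, inner_proto, _ = parse_ipv4(outer_payload)
print("outer:", outer_src, "->", outer_dst, "proto", outer_proto)
print("inner:", inner_src, "->", inner_dst, "proto", inner_proto)
```

The point of the sketch is that the inner packet is carried unchanged: the tunnel only prepends one more 20-byte IPv4 header whose protocol field is 4, and the receiving end strips it again.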
Perform a ping test. Here, ping the demo-hello-sit pod from the demo-hello-perf pod.
Enter the pod demo-hello-perf to view the routing information in this pod
According to the routing information, ping 10.20.42.31 matches the first entry. The first route means: packets destined for any network segment are sent out through the eth0 network card to the gateway 169.254.1.1. The routing information on node2.perf, the host node of demo-hello-perf, is as follows:
You can see a route with a destination of 10.20.42.0. When the ping packet reaches this host node, it matches that route, which goes through tunl0. The route means: all packets destined for the 10.20.42.0/26 network segment are sent to the gateway 172.16.35.4. Because the demo-hello-perf pod runs on 172.16.36.5 and the demo-hello-sit pod runs on 172.16.35.4, the packets are sent to the node 172.16.35.4 through the device tunl0. The routing information on node1.sit, the host node of demo-hello-sit, is as follows:
When the node's network card receives the packet, it finds that the destination IP is 10.20.42.31 and matches the route with Destination 10.20.42.31. This route means: 10.20.42.31 is a directly connected device on this machine, and packets destined for it are sent to cali04736ec14ce. Why is there a device with such a strange name, and what is it? This device is one end of a veth pair. When demo-hello-sit is created, calico creates a veth pair for it. One end is the network card inside demo-hello-sit, and the other end is the cali04736ec14ce we see here. Let's verify this. Enter the demo-hello-sit pod and check the number after device 4: 122964
Then log in to the host where the demo-hello-sit pod is located and check:
The peer device number seen inside the demo-hello-sit pod is the same as the number of cali04736ec14ce on the node, 122964. The route on the node therefore hands packets for the cali04736ec14ce device over to the demo-hello-sit pod, and at this point the ping packet has reached its destination. Also note the routing table of the host where the demo-hello-sit pod is located: it contains a route with a destination of 10.20.105.192.
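The three route lookups walked through above (inside the pod, on node2.perf, on node1.sit) all follow the same longest-prefix-match rule. The Python sketch below replays them. It is a simplified model, not the kernel's routing code: the `Route` type and `lookup` function are names invented for this example, and the tables contain only the entries described in the text (reconstructed from the prose, not copied from real command output). The /32 on the 169.254.1.1 and cali04736ec14ce entries is an assumption about how those directly connected routes are expressed.

```python
import ipaddress
from typing import List, NamedTuple, Optional


class Route(NamedTuple):
    dest: ipaddress.IPv4Network
    gateway: Optional[str]  # None means the destination is directly connected
    dev: str


def route(dest: str, gateway: Optional[str], dev: str) -> Route:
    return Route(ipaddress.ip_network(dest), gateway, dev)


def lookup(table: List[Route], ip: str) -> Route:
    """Pick the matching route with the longest prefix."""
    addr = ipaddress.ip_address(ip)
    matches = [r for r in table if addr in r.dest]
    if not matches:
        raise LookupError(f"no route to {ip}")
    return max(matches, key=lambda r: r.dest.prefixlen)


# Inside the demo-hello-perf pod: everything goes to 169.254.1.1 via eth0.
POD_TABLE = [
    route("0.0.0.0/0", "169.254.1.1", "eth0"),
    route("169.254.1.1/32", None, "eth0"),  # assumed /32 link route
]

# On node2.perf: the remote pod block is reached through the tunnel.
NODE2_PERF_TABLE = [
    route("10.20.42.0/26", "172.16.35.4", "tunl0"),
]

# On node1.sit: the pod itself is directly connected through its veth peer.
NODE1_SIT_TABLE = [
    route("10.20.42.31/32", None, "cali04736ec14ce"),  # assumed /32 host route
]

target = "10.20.42.31"
for name, table in [
    ("pod demo-hello-perf", POD_TABLE),
    ("node2.perf", NODE2_PERF_TABLE),
    ("node1.sit", NODE1_SIT_TABLE),
]:
    hop = lookup(table, target)
    via = hop.gateway if hop.gateway else "directly connected"
    print(f"{name}: {target} matches {hop.dest} -> {via}, dev {hop.dev}")
```

Read top to bottom, the output follows the packet's path: out of the pod through eth0, into the tunnel on node2.perf, and finally onto the veth peer on node1.sit.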
Check the routing information in the demo-hello-sit pod again. It is the same as that in the demo-hello-perf pod.

Based on the examples above, the IPIP network mode wraps the pod's IP packet in another IP layer. Its characteristic is that all cross-node pod traffic is sent through the tunnel device tunl0, and tunl0 adds an extra outer IP header to each packet.

6. Packet capture analysis

Ping the demo-hello-sit pod from the demo-hello-perf pod, and run tcpdump on the host where the demo-hello-sit pod is located.
Ping demo-hello-sit in the demo-hello-perf pod
After finishing the packet capture, download icmp_ping.cap to a local Windows machine for analysis.

The packet has 5 layers in total, two of which are IP (Internet Protocol) network layers: the network between the pods and the encapsulating network between the hosts.

In the capture, the red box marks the hosts of the two pods and the blue box marks the IP addresses of the two pods. src shows the IP address of the host of the pod that initiates the ping and the IP address of that pod; dst shows the IP address of the host of the pod being pinged and the IP address of that pod.

Following the encapsulation order, the ICMP packet that demo-hello-perf sends to demo-hello-sit is wrapped in one extra layer that travels between the hosts.

You can see that every datagram has two IP network layers: the inner layer is the IP packet between the pod containers, and the outer layer is the IP packet between the two host nodes. This is because tunl0 is a tunnel endpoint device, and it adds a layer of encapsulation when data arrives so that the packet can be delivered to the tunnel device on the other side.

The specific contents of the two layers are as follows:

Communication between pods is forwarded through the Layer 3 IPIP tunnel. Compared with the Layer 2 tunnel of VxLAN, the IPIP tunnel has less overhead, but its security is also worse.

7. Access from pod to svc

View service
Capture packets on the host of pod demo-hello-sit
Test access, curl demo-hello-perf's svc address and port in demo-hello-sit
Finish capturing packets, download the svc.cap file and open it in Wireshark.

The Src and Dst values in Wireshark are the same as in the pod-to-pod access above: they are still the intranet IP addresses of the two pods' hosts and the two pods' own IP addresses, and the traffic still travels over ipip.

Through the above examples, you should now understand how communication works in IPIP network mode!