Hello everyone, I am Fei Ge!

Server virtualization technology is now very mature, and many companies in the industry have migrated to containers. The code we develop is likely to run in containers, so it is important to understand in depth how container networks work. That understanding will help you deal with problems when they come up.

Network virtualization can be summarized in one sentence: using software to simulate real physical network connections. Docker, for example, is an independent network environment simulated on the host machine purely in software. Today we will build a virtual network by hand, access external network resources from inside it, and listen on a port to provide services to the outside.

After reading this article, I believe you will have a better understanding of Docker virtual networks. Okay, let's get started!

1. Review of basic knowledge

1.1 veth, bridge and namespace

A veth in Linux is a pair of virtual network card devices, very similar to the familiar lo. When data is sent from one end, the kernel looks up the other half of the pair, so the data is received at the other end. However, veth only solves one-to-one communication. For details, see "Easily understand the basics of Docker network virtualization-veth device!".

If many veth pairs need to communicate with each other, a virtual switch called a bridge needs to be introduced. Each veth pair can connect one end to a port of the bridge. The bridge forwards data between its ports like a switch, so that the veths on each port can communicate with each other.

Namespace solves the isolation problem. By default, every virtual network card device, process, socket, routing table and other network-stack object belongs to the default namespace init_net. However, we want different virtualization environments to be isolated from each other.
Taking Docker as an example, container A cannot use the devices, routing tables, sockets and other resources of container B, or even see them. Only in this way can different containers reuse resources without affecting the normal operation of other containers.

With veth, namespace and bridge, we can virtualize multiple network environments on one Linux system, and they can communicate with each other and with the host machine.

However, after these three articles we still have one problem left to solve: communication between the virtualized network environment and the external network. Take a Docker container as an example. The service you start in the container will almost certainly need to access an external database. In addition, you may need to expose port 80 to provide services to the outside world. In Docker, for example, a port mapping makes the web service on port 80 of the container accessible to the external network.

Today's article solves these two problems: accessing the external network from the virtual network, and providing services inside the virtual network to the external network. Solving them requires routing and NAT.

1.2 Routing

Routing is involved whenever Linux sends data packets, both for packets sent locally and for packets forwarded through the current machine.

Let's first look at local sending, which we discussed in the article "25 pictures, 10,000 words, disassembling the Linux network packet sending process". Routing is actually very simple: it chooses which network card (including virtual network card devices) the data is written to. Which one is chosen? The rules are specified in the routing table. Linux can have multiple routing tables; the most important and commonly used are local and main.
The local routing table records the routing rules for the IP addresses of the local network card devices in the current network namespace.
Other routing rules are generally recorded in the main routing table. You can view the local table with ip route list table local, and the main table with ip route list table main or the shorter route -n command.

Now let's look at the forwarding of data packets that pass through the current machine. Forwarding also involves routing. If Linux receives a data packet whose destination address is not a local address, it can forward the packet out of one of its network card devices. Just as with local sending, it reads the routing table and, according to the rules there, chooses the device to forward the packet from.

However, it is worth noting that forwarding is disabled by default on Linux. That is, if the destination address is not a local IP address, the packet is discarded by default. Some simple configuration is required before Linux can do the work of a router and forward data packets.

1.3 iptables and NAT

The Linux kernel network stack runs almost entirely in kernel mode, but to meet various user-level needs the kernel exposes some hooks through which user space can intervene. iptables is a very commonly used tool for intervening in kernel behavior this way. It relies on five hook points embedded in the kernel, commonly known as the five chains.

When Linux receives data, the packet enters ip_rcv at the IP layer. A routing decision follows. If the packet is for the local machine, it enters ip_local_deliver for local reception and is finally handed to the TCP protocol layer. Two hooks are embedded in this process. The first is PRE_ROUTING, which runs the rules of the various tables in the PREROUTING chain of iptables. Once the packet is found to be for local reception, LOCAL_IN is executed, which runs the INPUT rules configured in iptables.
When sending data, after the routing table has been searched to find the exit device, the packet is passed to the device layer through functions such as __ip_local_out and ip_output. In these two functions the packet passes through the rules of the OUTPUT and POSTROUTING chains respectively.

In the forwarding case, Linux receives a data packet, finds that it is not a local packet, and looks for a suitable device to forward it by searching its own routing table. The packet is first handed from ip_rcv to the ip_forward function, and is finally sent out in the ip_output function. In this process it passes through the rules of PREROUTING, FORWARD and POSTROUTING in turn.

To sum up, the five chains of iptables sit at the following positions in the kernel network module: (1) PREROUTING, (2) INPUT, (3) FORWARD, (4) OUTPUT, (5) POSTROUTING. The receiving process goes through 1 and 2, the sending process goes through 4 and 5, and the forwarding process goes through 1, 3 and 5. With this picture in mind, the relationship between iptables and the kernel becomes much clearer.

iptables has four tables, divided by function: raw, mangle, nat and filter. The nat table implements the NAT (Network Address Translation) function we often talk about. NAT is divided into SNAT (Source NAT) and DNAT (Destination NAT).

SNAT solves the problem of intranet addresses accessing external networks. It works by modifying the source IP in POSTROUTING.

DNAT solves the problem of making services in the intranet accessible to the outside world. It works by modifying the destination IP in PREROUTING.

2. Realize virtual network extranet communication

Based on the basic knowledge above, we will build a virtual network similar to Docker's purely by hand, and make it able to communicate with the external network.

1. Experimental Environment Preparation

Let's create a virtual network environment in a namespace named net1.
The host machine's IP is in the 10.162 network segment, which can reach external machines. The virtual network is assigned the 192.168.0 network segment, which is private and is not recognized by external machines.

The process of building this virtual network is as follows. First create a netns and name it net1.
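This first step can be sketched as follows, assuming the iproute2 ip command is available. The namespace name net1 comes from the text; the run helper and the function name create_netns are hypothetical. By default the helper only prints each command; set APPLY=1 and run as root to actually execute them.

```shell
# Hypothetical helper: print each command; execute it only when APPLY=1 (needs root).
run() { if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi; }

create_netns() {
    run ip netns add net1    # create the network namespace net1
    run ip netns list        # list namespaces; net1 should appear
}

create_netns
```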
Create a veth pair (veth1 - veth1_p), put the veth1 end into net1, configure an IP for it, and start it.
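A minimal sketch of this step, under a few assumptions: the address 192.168.0.2 for veth1 is the one used later in the article, the /24 prefix length is inferred from "the 192.168.0 network segment", and the run helper and the function name create_veth are hypothetical. Commands are only printed unless APPLY=1 is set (root required).

```shell
# Hypothetical helper: print each command; execute it only when APPLY=1 (needs root).
run() { if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi; }

create_veth() {
    # Create the pair: veth1 and its peer veth1_p.
    run ip link add veth1 type veth peer name veth1_p
    # Move the veth1 end into the net1 namespace.
    run ip link set veth1 netns net1
    # Assign an address inside net1 (the /24 prefix is an assumption) and bring it up.
    run ip netns exec net1 ip addr add 192.168.0.2/24 dev veth1
    run ip netns exec net1 ip link set veth1 up
}

create_veth
```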
Create a bridge and set an IP address for it. Then plug the other end of veth, veth1_p, into the bridge. Finally, start both the bridge and veth1_p.
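The bridge step might look like the sketch below. The bridge name br0 and its address 192.168.0.1 appear later in the article; the /24 prefix length, the use of ip link instead of brctl, the run helper and the function name create_bridge are assumptions made for illustration. Commands are only printed unless APPLY=1 is set (root required).

```shell
# Hypothetical helper: print each command; execute it only when APPLY=1 (needs root).
run() { if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi; }

create_bridge() {
    # Create the bridge device (iproute2 form of "brctl addbr br0").
    run ip link add br0 type bridge
    # Give the bridge an address in the virtual network (the /24 prefix is an assumption).
    run ip addr add 192.168.0.1/24 dev br0
    # Plug the host-side end of the veth pair into the bridge.
    run ip link set dev veth1_p master br0
    # Bring both devices up.
    run ip link set veth1_p up
    run ip link set br0 up
}

create_bridge
```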
In this way, we have created a virtual network on Linux. The creation process is the same as in the article "'Switch' implemented by software on Linux - Bridge!", except that today, for the sake of convenience, only one network is created, while the previous article created two.

2. Request external resources

Now suppose we want to access the external network from the network environment net1 above. The external network here means the network outside the host of the virtual network.

We assume that the IP address of the other machine we want to access is 10.153.*.*. The last two parts of 10.153.*.* are hidden because it is my internal network. You can replace them with your own IP address during the experiment.

Let's ping it directly from net1.
It reports that the network is unreachable. What's going on? Searching the kernel source code for this error message gives a clue.
When ip_route_output_flow fails with ENETUNREACH, the sending function exits with that error. The comment on the ENETUNREACH macro definition reads "Network is unreachable". ip_route_output_flow is mainly responsible for route selection, so we infer that there may be a problem with routing, and take a look at the routing table of this namespace.
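One way to inspect the routing table inside the namespace is sketched below. ip netns exec runs a command in net1; the run helper and the function name show_net1_routes are hypothetical. The command is only printed unless APPLY=1 is set (root required).

```shell
# Hypothetical helper: print each command; execute it only when APPLY=1 (needs root).
run() { if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi; }

show_net1_routes() {
    # Show the main routing table as seen from inside net1
    # (the same information as "route -n" run inside the namespace).
    run ip netns exec net1 ip route show
}

show_net1_routes
```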
No wonder. It turns out that the routing table of the net1 namespace only has a rule for the 192.168.0.* network segment. The IP we pinged is 10.153.*.*, and according to this routing table no exit can be found, so naturally the send fails.

Let's add a default routing rule to net1. Whenever no other rule matches, the packet is sent out through veth1 by default. At the same time, specify that the next hop is the bridge (192.168.0.1) that veth1 is connected to.
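A sketch of adding that default route, using the next hop 192.168.0.1 and the device veth1 named in the text. The run helper and the function name add_default_route are hypothetical; commands are only printed unless APPLY=1 is set (root required).

```shell
# Hypothetical helper: print each command; execute it only when APPLY=1 (needs root).
run() { if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi; }

add_default_route() {
    # Anything that matches no other rule leaves via veth1, with the bridge as next hop.
    run ip netns exec net1 ip route add default via 192.168.0.1 dev veth1
    # Re-check the table; a "default" entry should now be present.
    run ip netns exec net1 ip route show
}

add_default_route
```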
Try pinging again.
Ok, still no connection. The route above only gets the data packet from veth1 to the bridge correctly. Next, the packet has to be forwarded from the bridge to the eth0 network card, so we have to enable two forwarding-related settings on the host.
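As a sketch, and assuming the two settings meant here are the kernel's IPv4 forwarding switch and the default policy of the FORWARD chain in the filter table, they could be enabled like this. The run helper and the function name enable_forwarding are hypothetical; commands are only printed unless APPLY=1 is set (root required).

```shell
# Hypothetical helper: print each command; execute it only when APPLY=1 (needs root).
run() { if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi; }

enable_forwarding() {
    # Let the kernel forward IPv4 packets that are not addressed to the host itself.
    run sysctl -w net.ipv4.ip_forward=1
    # Let forwarded packets through the FORWARD chain by default.
    run iptables -P FORWARD ACCEPT
}

enable_forwarding
```

Setting the FORWARD policy to ACCEPT is convenient for an experiment; on a shared machine a narrower rule limited to br0 would be the more careful choice.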
However, there is still a problem: external machines do not know the IPs of the 192.168.0.* network segment. They only communicate with each other through addresses such as 10.153.*.*. Think about how our computers at work can access the Internet normally even though they have no external IP. The external network only knows external IPs. Yes, this is the NAT technology we mentioned above.

Our requirement this time is to let the internal virtual network access the external network, so we need SNAT. It replaces the source IP (192.168.0.2) of requests from the namespace with the host's IP, which the external network knows, so that the external network can be accessed normally.
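One possible SNAT rule is sketched below. It uses the MASQUERADE target, a form of source NAT that rewrites the source to the address of the outgoing interface, which is a substitution chosen here for illustration. The 192.168.0.0/24 source range assumes a /24 virtual network, and the run helper and the function name enable_snat are hypothetical. The command is only printed unless APPLY=1 is set (root required).

```shell
# Hypothetical helper: print each command; execute it only when APPLY=1 (needs root).
run() { if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi; }

enable_snat() {
    # In POSTROUTING, rewrite the source IP of packets that come from the
    # virtual network and do not leave through br0.
    run iptables -t nat -A POSTROUTING -s 192.168.0.0/24 ! -o br0 -j MASQUERADE
}

enable_snat
```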
Let’s try pinging again. Yay, it works!
At this point, we can use tcpdump to capture packets and check. The packets captured on the bridge still carry the original source IP and destination IP. If you check on eth0, you will find that the source IP has been replaced with the IP on eth0 that can communicate with the external network.

Now the container can access resources on the external network through the host's network card. To summarize the sending process: the packet leaves net1 through veth1 by the default route, reaches the bridge, is forwarded by the host toward eth0, and has its source IP rewritten by SNAT in POSTROUTING before it goes out.

3. Open container ports

Let's consider another requirement: providing the services in this namespace to the external network.

Just as in the problem above, the IP 192.168.0.2 in our virtual network environment is unknown to the outside world. Only the host machine knows who it is, so we need NAT again.

This time we want the external network to reach an internal address, so we need a DNAT configuration. One difference between DNAT and SNAT configuration is that you must state explicitly which host port corresponds to which port in the container. In Docker, for example, the port correspondence is specified with -p.
We configure the DNAT rule on the host with iptables.
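A sketch of such a rule, built from the values given in the text: traffic not arriving from br0, TCP port 8088 on the host, destination 192.168.0.2:80. The run helper and the function name enable_dnat are hypothetical; the command is only printed unless APPLY=1 is set (root required).

```shell
# Hypothetical helper: print each command; execute it only when APPLY=1 (needs root).
run() { if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi; }

enable_dnat() {
    # In PREROUTING, redirect TCP traffic for host port 8088 that does not
    # come in through br0 to port 80 of the address inside net1.
    run iptables -t nat -A PREROUTING ! -i br0 -p tcp -m tcp --dport 8088 \
        -j DNAT --to-destination 192.168.0.2:80
}

enable_dnat
```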
The rule tells the host, before routing, that TCP traffic for port 8088 that does not come from br0 should be forwarded to 192.168.0.2:80. Next, start a server in the net1 environment.
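Any program that listens on port 80 inside net1 will do. The sketch below assumes python3 is installed and uses its built-in http.server module as a stand-in server; the run helper and the function name start_server are hypothetical. The command is only printed unless APPLY=1 is set (root required).

```shell
# Hypothetical helper: print each command; execute it only when APPLY=1 (needs root).
run() { if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi; }

start_server() {
    # Listen on port 80 inside the net1 namespace.
    # When executed for real, this stays in the foreground until interrupted.
    run ip netns exec net1 python3 -m http.server 80
}

start_server
```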
Choose an external machine, such as 10.143.*.*, and from it telnet to port 8088 on the host (10.162.*.*). It works!
Start a packet capture with # tcpdump -i eth0 host 10.143.*.*. You can see that in the request the destination is the IP and port of the host machine.

But after the data packet reaches the host protocol stack, it hits the DNAT rule we configured, and the host forwards it to br0. Since there is not much traffic on the bridge, you can capture packets there directly without filtering: # tcpdump -i br0.

The destination IP and port captured on br0 have been replaced.

Of course, the bridge knows that 192.168.0.2 is veth1. Therefore, the service listening on port 80 on veth1 receives the request from the outside world! To summarize the receiving process: the request arrives on eth0 for port 8088 of the host, DNAT in PREROUTING rewrites the destination to 192.168.0.2:80, the host forwards the packet to br0, and the bridge delivers it to veth1.

Conclusion

Many companies in the industry have migrated to containers, and the code we develop is likely to run in them. A deep understanding of how container networks work will therefore help you deal with problems in the future.

At the beginning of this article, we briefly reviewed the basics of veth, bridge, namespace, routing, iptables and so on. Veth implements connection, bridge implements forwarding, namespace implements isolation, the routing table controls device selection when sending, and iptables implements NAT and other functions.

Then, based on this knowledge, we built a virtual network environment purely by hand. This virtual network can access external network resources and can expose a port for the external network to call. This is the basic principle of how Docker container networking works.

I packaged the whole experiment into a Makefile and put it here: https://github.com/yanfeizhang/coder-kung-fu/tree/main/tests/network/test07

Finally, let's expand on this. Today we discussed Docker network communication. Docker containers provide external services through port mapping. When external machines access container services, they still have to go through the IP of the container's host.
In Kubernetes, there are higher requirements for cross-host network communication, and containers between different hosts must be able to directly interconnect. Therefore, the network model of Kubernetes is also more complex.