Glossary

1. Network namespace: Linux introduces network namespaces in the network stack, isolating independent network protocol stacks into separate namespaces that cannot communicate with each other directly; Docker uses this feature to achieve network isolation between containers.
2. Veth device pair: also called a virtual network interface pair. Veth device pairs are introduced to enable communication between different network namespaces.
3. Iptables/Netfilter: Netfilter executes the various mounted rules (filtering, modification, dropping, etc.) and runs in kernel mode; iptables runs in user mode and helps maintain Netfilter's rule tables in the kernel. Together they implement the flexible packet-processing mechanism of the Linux network protocol stack.
4. Bridge: a bridge is a Layer 2 network device that connects different ports supported by Linux and implements many-to-many communication, similar to a switch.
5. Routing: the Linux system contains a complete routing function. When the IP layer processes data sending or forwarding, it uses the routing table to decide where to send each packet.

A daunting network model

Kubernetes re-abstracts the network inside the cluster to flatten the entire cluster network. When we understand the network model, we can completely separate it from the physical nodes. We use pictures to explain and get a basic impression first. The following key abstract concepts are emphasized.

A Service

Service is a resource object introduced by Kubernetes to shield the dynamic changes of its backend instances (Pods) and to load-balance across multiple instances. A Service is usually bound to a Deployment and defines the access entry address of the service; applications (Pods) can reach the group of Pod replicas behind it through this entry address. The mapping between a Service and its backend Pod replicas is achieved through a Label Selector.

The type of a Service determines how the Service is exposed. Depending on the type, the service can be visible only inside the Kubernetes cluster or exposed outside the cluster. There are three types of Service: ClusterIP, NodePort, and LoadBalancer. The specific usage scenarios are explained below. View in the test environment:
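As a hedged sketch of what such a test-environment Service and the Endpoints object Kubernetes maintains for it might look like: the service name web-svc, the selector labels, and the port values are illustrative assumptions, while the two Pod addresses are the ones discussed just below.

```yaml
# Hedged sketch of a ClusterIP Service; name, labels and ports are assumptions.
apiVersion: v1
kind: Service
metadata:
  name: web-svc            # assumed name
spec:
  type: ClusterIP
  selector:
    app: web               # Label Selector that picks the backend Pods
  ports:
    - port: 80             # Service port, reachable via the Cluster IP
      targetPort: 80       # container port of the backend Pods
---
# The Endpoints object Kubernetes normally maintains automatically for the
# Service above; the two addresses are the Pod instances referenced below.
apiVersion: v1
kind: Endpoints
metadata:
  name: web-svc            # must match the Service name
subsets:
  - addresses:
      - ip: 172.16.2.125
      - ip: 172.16.2.229
    ports:
      - port: 80
```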
In the above information, the svc backend proxies two Pod instances: 172.16.2.125:80 and 172.16.2.229:80.

Two IPs

Kubernetes abstracts the concepts of Cluster IP and Pod IP to describe the IP objects of its network model.

Pod IP is the IP address of each Pod in the Kubernetes cluster. It is allocated by the Docker Engine based on the address segment of the docker0 bridge and belongs to a virtual Layer 2 network. Pods in Kubernetes can communicate with each other directly: when a container in one Pod accesses a container in another Pod, the communication goes through the Pod IP.

Cluster IP only applies to a Service and has no corresponding entity object, so a Cluster IP cannot be pinged. Its function is to provide a unified access entry for the Service's backend instances. When the Cluster IP is accessed, the request is forwarded to a backend instance; the default method is round-robin. The Cluster IP is maintained, like the Service itself, by the kube-proxy component. There are two main implementation methods, iptables and IPVS; since version 1.8, kube-proxy has supported IPVS. In the example above, the svc information contains the Cluster IP.

Node IP is not listed separately here because it is simply the IP of the physical machine's network card; Node IP can therefore be understood as the physical machine's IP.

Three Ports

In Kubernetes, objects at multiple levels (containers, Pods, Services, and the cluster) communicate with each other, so the ports at each level of the network model are distinguished as follows.

port: this is not the general TCP/IP port concept; it refers specifically to the port of a Service in Kubernetes. It is the access port between Services, such as the default port 3306 of a MySQL Service. It is only accessible to containers inside the cluster; the service cannot be reached from outside the cluster through this port.

nodePort: nodePort provides a way for external machines to access services inside the cluster. For example, if a web application needs to be reached by outside users, you configure type=NodePort and, say, nodePort=30001. Other machines can then access the service through a browser at scheme://node:30001, such as http://node:30001.

targetPort: targetPort is the port of the container (the most fundamental port entry). It is consistent with the port exposed when the container is created (EXPOSE in the Dockerfile). For example, the official nginx image on docker.io exposes port 80.

Let's take an example to see how to configure the Service ports:
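A hedged sketch of what such a Service manifest might look like; the namespace abcdocker and type NodePort come from the description below, while the service name, labels, and port numbers are illustrative assumptions.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-svc            # assumed name
  namespace: abcdocker     # namespace mentioned in the text
spec:
  type: NodePort           # note the capitalization of the type value
  selector:
    app: web               # assumed label selector for the backend Pods
  ports:
    - port: 80             # Service port, reachable inside the cluster
      targetPort: 80       # container port (EXPOSE in the Dockerfile)
      nodePort: 30001      # host port opened on every node (assumed value)
```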
Here is a yaml file of a Service deployed in the namespace abcdocker. NodePort is configured here, so its type is NodePort; pay attention to the capitalization. If NodePort is not needed, fill in ClusterIP as the type, which means the service only supports access from inside the cluster.

Internal cluster communication

Single-node communication

Communication within a single node of the cluster covers two situations: communication between multiple containers in the same Pod, and communication between different Pods on the same node. Since cross-node access is not involved, the traffic is not forwarded through the physical network card. By looking at the routing table, we can also get a glimpse of this:
1. Communication within a Pod

As shown in the following figure, the network namespace is shared within the same Pod, so containers can reach each other simply by accessing 127.0.0.1:(port).

The veth* in the figure refers to one end of a veth pair (the other end is not labeled, but veth devices always appear in pairs). One end of the veth pair is attached to the docker0 bridge by the Docker daemon, and the other end is placed into the network namespace of the container, where it appears as eth0.

The figure demonstrates communication between containers in bridge mode. Docker1 sends a request to Docker2; both Docker1 and Docker2 are connected to docker0 through their own veth pairs. When the request passes through docker0, since the containers and docker0 belong to the same subnet, the request is forwarded through Docker2's veth pair directly to Docker2. This process does not cross nodes, so it does not pass through eth0.

2. Inter-Pod communication on the same node

Since the network namespace is shared within a Pod (created by the pause container), communication between Pods on the same node is essentially communication between containers on the same node. At the same time, the default route of a Pod on the same node is the address of docker0. Since the Pods are attached to the same docker0 bridge and share the same address segment, they should be able to communicate directly. Let's see how this process is actually implemented.

As shown in the figure above, container 1 and container 2 in Pod1 share a network namespace, so requests leaving the Pod go through the veth pair between Pod1 and the docker0 bridge (shown as eth0 and ethx in the figure). When a container in another Pod is accessed, the requested address is the Pod IP rather than the container's IP. In fact, this is still communication within the same subnet and can be forwarded directly through the veth pairs.

Cross-node communication

CNI: Container Network Interface

CNI is a standard that aims to provide network standardization for container platforms. Different container platforms (such as Kubernetes, Mesos, and rkt) can call different network components through the same interface. Kubernetes currently supports many types of CNI components, such as bridge, calico, calico-ipam, dhcp, flannel, host-local, ipvlan, loopback, macvlan, portmap, ptp, sample, tuning, and vlan.

In the Docker world, the mainstream cross-host communication solutions fall into the following categories:

1) Tunnel-based overlay networks: different companies or organizations implement different tunnel types. Docker's native overlay network is based on VXLAN tunnels; OVN is implemented through Geneve or STT tunnels; recent versions of flannel also implement an overlay network based on VXLAN by default (see the configuration sketch below).

2) Overlay networks based on packet encapsulation: cross-host networking is implemented on a Docker cluster by encapsulating data packets, for example in UDP. Typical implementations include early versions of Weave and flannel.

3) Layer-3 SDN: cross-host networking is implemented directly at Layer 3 based on routing, with network isolation achieved through iptables. The typical solution is Project Calico. For environments that do not support Layer-3 routing, Project Calico also provides a cross-host network implementation based on IPIP encapsulation.
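As a concrete illustration of the first option, flannel's backend type and the Pod network it carves up per node are usually declared in a small net-conf.json, commonly shipped as a ConfigMap alongside the flanneld DaemonSet. A hedged sketch follows; the CIDR 10.244.0.0/16 is only the conventional default and the namespace may differ between flannel releases.

```yaml
# Sketch of the ConfigMap consumed by flanneld; the object name follows the
# upstream kube-flannel manifest, the network CIDR is an assumed/typical value.
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-flannel-cfg
  namespace: kube-system     # newer releases use a dedicated kube-flannel namespace
data:
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
```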
Communication

Cross-node communication within a cluster involves communication between different subnets, which docker0 alone cannot handle; it needs the help of a CNI network plug-in. The figure shows how flannel achieves cross-node communication.

Simply put, flannel's user-mode process flanneld creates a flannel.1 device on each node and, based on the globally unified cluster information in etcd or the apiserver, allocates a globally unique network segment to each node to avoid address conflicts. At the same time, a veth pair is created between docker0 and flannel.1, so docker0 hands packets off to flannel.1. Flanneld maintains a global node network table. After receiving a request through flannel.1, it re-encapsulates the request into a UDP packet according to the node table and hands it to eth0; the packet enters the physical network through eth0 and is sent to the destination node. On the other end, the process is reversed: flanneld unpacks the data and sends it to docker0, which then delivers it to the container in the destination Pod.

External access to the cluster

There are many ways to access a cluster from outside, such as LoadBalancer, Ingress, and NodePort. NodePort and LoadBalancer are two basic Service types that directly expose services to the outside world, while Ingress provides Layer-7 load balancing: its basic principle is to forward external traffic to internal Services, which then forward it to the backend endpoints. In daily use, we can choose different methods according to specific business needs. Here we mainly introduce NodePort and Ingress.

NodePort

By setting the Service type to NodePort, you can expose the service through a specified port on the hosts in the cluster. Note that the service can be accessed through that port on every host in the cluster, and requests sent to the host port are routed by Kubernetes to a Pod that provides the service. With this service type, you can access the service from outside the Kubernetes cluster network via host IP:port. Here is an example for influxdb; the template can also be modified for other cases:
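A hedged sketch of such an influxdb Service of type NodePort; 8086 is InfluxDB's default HTTP port, while the nodePort value 30123 and the selector label are illustrative assumptions.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: influxdb
spec:
  type: NodePort
  selector:
    name: influxdb        # assumed label on the influxdb Pods
  ports:
    - port: 8086          # Service port inside the cluster
      targetPort: 8086    # InfluxDB's default HTTP port in the container
      nodePort: 30123     # assumed host port; must be in the 30000-32767 range
```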
Ingress

Ingress is recommended for production environments. It acts as a Layer-7 load balancer and HTTP proxy, and can distribute incoming traffic to different backend services based on different URLs. External clients only see a single server such as foo.bar.com, which shields the implementation details of the multiple services behind it. This approach simplifies client access and increases the flexibility of backend implementation and deployment: backend service deployment can be adjusted without affecting the client. The deployment yaml can refer to the following template:
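A hedged sketch of such an Ingress: the host test.name.com and the routing of /test to service s1 and /name to service s2 follow the description below, while the Ingress name, the backend port 80, and the API version shown are assumptions (older clusters use extensions/v1beta1 with a slightly different backend syntax).

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: test-ingress          # assumed name
spec:
  rules:
    - host: test.name.com     # virtual host that clients connect to
      http:
        paths:
          - path: /test
            pathType: Prefix
            backend:
              service:
                name: s1      # /test is routed to backend service s1
                port:
                  number: 80  # assumed service port
          - path: /name
            pathType: Prefix
            backend:
              service:
                name: s2      # /name is routed to backend service s2
                port:
                  number: 80  # assumed service port
```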
Here we define an Ingress template: the service is accessed through the virtual host test.name.com, and two paths are defined under it, where /test is routed to the backend service s1 and /name is routed to the backend service s2. Multiple Ingresses can be defined in a cluster to handle forwarding for different services.

An Ingress Controller is needed to manage the Ingress rules in the cluster. The Ingress Controller interacts with the Kubernetes API to dynamically perceive changes to the Ingress rules in the cluster and read them; based on those custom rules, which specify which domain name corresponds to which Service, it generates an nginx configuration and writes it into the nginx-ingress-controller Pod. This Pod runs an nginx service; the controller writes the generated nginx configuration to the /etc/nginx.conf file and then reloads it so the configuration takes effect. The Ingress Controller template provided by Kubernetes is as follows:
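The upstream template is not reproduced verbatim here. As a hedged, simplified sketch, a Deployment for an nginx ingress controller might look roughly like the following; the namespace, image tag, service account, and arguments are illustrative assumptions, so refer to the official ingress-nginx manifests for the authoritative version.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-ingress-controller
  namespace: ingress-nginx            # assumed namespace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-ingress-controller
  template:
    metadata:
      labels:
        app: nginx-ingress-controller
    spec:
      serviceAccountName: nginx-ingress-serviceaccount   # assumed RBAC setup
      containers:
        - name: nginx-ingress-controller
          # Image tag is an assumption; pick the version matching your cluster.
          image: registry.k8s.io/ingress-nginx/controller:v1.9.4
          args:
            - /nginx-ingress-controller
          env:
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          ports:
            - containerPort: 80       # HTTP traffic handled by nginx
            - containerPort: 443      # HTTPS traffic handled by nginx
```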
Summary and Outlook

This article illustrated the Kubernetes network model from the perspective of one Service, two IPs, and three ports, and explained how the Kubernetes cluster is accessed from within and from outside the cluster.