The author has developed a simple, stable, and scalable delayed message queue framework for high-concurrency scenarios, with precise timed tasks and delayed queue processing. In the more than half a year since it was open-sourced, it has provided precise timed scheduling solutions for more than a dozen small and medium-sized enterprises and has withstood the test of production environments. To benefit more readers, the open-source framework is available at: https://github.com/sunshinelyz/mykit-delay

Preface

Programmers should not stay only at the CRUD level of applications; they also need to understand some of the underlying computer fundamentals. Many readers have been turned down in interviews because of computer-network questions. Today I will share some computer-network knowledge with you, hoping it will be of substantial help! The article has been collected in: https://github.com/sunshinelyz/technology-binghe and https://gitee.com/binghe001/technology-binghe

Network seven-layer architecture (ISO/OSI protocol reference model)
TCP/IP Principles

The TCP/IP protocol is not the collective name of just the two protocols TCP and IP; it refers to the entire TCP/IP protocol family of the Internet. From the perspective of the protocol layering model, TCP/IP consists of four layers: the network interface layer, the internet layer, the transport layer, and the application layer.

Network Access Layer

The network access layer is not described in detail in the TCP/IP reference model; the model only points out that the host must use some protocol to connect to the network.

Internet Layer

The internet layer is a key part of the entire architecture. Its function is to let a host send packets into any network and have them travel independently to the destination. The packets may pass through different networks, and the order of arrival may differ from the order of sending; if the upper layer needs the data in order, it must handle the ordering of packets itself. The internet layer uses the Internet Protocol (IP).

Transport Layer (TCP/UDP)

The transport layer enables peer entities on the source and destination machines to hold conversations. Two end-to-end protocols are defined at this layer: the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP). TCP is a connection-oriented protocol that provides reliable message transmission and connection services for upper-layer applications. To this end, in addition to basic data transmission, it also provides reliability assurance, flow control, multiplexing, priority, and security control. UDP is a connectionless, unreliable transport protocol, mainly used by applications that do not need TCP's ordering and flow-control functions.

Application Layer

The application layer includes all higher-level protocols, such as the virtual terminal protocol (TELNET), the File Transfer Protocol (FTP), the Simple Mail Transfer Protocol (SMTP), the Domain Name System (DNS), the Network News Transfer Protocol (NNTP), and the HyperText Transfer Protocol (HTTP).

The relationship between the four-layer protocols and the corresponding standard seven-layer protocols

TCP three-way handshake / four-way wave

Three-way handshake

First handshake: host A sends a packet with the SYN flag set to 1 and a randomly generated sequence number seq = 1234567 to server host B. From SYN = 1, B knows that A wants to establish a connection.

Second handshake: after receiving the request, host B must acknowledge the connection request, so it sends back a packet with ack number = (host A's seq + 1), SYN = 1, ACK = 1, and a randomly generated seq = 7654321.

Third handshake: after receiving it, host A checks whether the ack number is correct (the seq it sent the first time plus 1) and whether the ACK flag is 1. If so, host A sends ack number = (host B's seq + 1) with ACK = 1; once host B receives this confirmation, the connection is established.

Four-way wave

(1) TCP requires three segments to establish a connection and four to release it. This is due to TCP's half-close. Because a TCP connection is full-duplex (data can flow in both directions at the same time), each direction must be closed separately. This unidirectional closing is called a half-close. When one party finishes sending its data, it sends a FIN to notify the other party that it will terminate the connection in that direction.
(2) Closing the connection from the client to the server: first, client A sends a FIN to close data transmission from the client to the server, and then waits for the server's confirmation. The termination flag FIN = 1, the sequence number seq = u.

(3) After the server receives the FIN, it sends back an ACK whose acknowledgment number ack is the received sequence number plus 1.

(4) Closing the connection from the server to the client: the server also sends a FIN to the client.

(5) After receiving the FIN, the client sends back an ACK to confirm it, setting the acknowledgment number ack to the received sequence number plus 1.

The party that closes first performs an active close, while the other party performs a passive close. After host A sends its FIN, it enters the termination-wait (FIN-WAIT) state. After server B receives the connection release segment from host A, it immediately sends a confirmation to host A and enters the CLOSE-WAIT state. The TCP server process then notifies the higher-level application process, and the connection in the direction from A to B is released. This is the "half-closed" state: A can no longer send to B, but B can still send to A. When B has no more data to send to A, its application process notifies TCP to release the connection; B then sends a connection release segment to A and waits for confirmation. After A sends the confirmation, it enters the TIME-WAIT state. Note that the TCP connection has not yet been released at this point; A enters the CLOSED state only after the 2MSL set by the time-wait timer has passed.

Why is there a TIME_WAIT state? First, it ensures that A's last ACK can reach B: if that ACK is lost, B retransmits its FIN and A can resend the ACK. Second, waiting 2MSL lets any old duplicate segments of this connection disappear from the network, so they cannot interfere with a later connection that reuses the same address and port pair.
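To make the handshake and teardown concrete, here is a minimal Python sketch (the host, port, and the `ss` observation command are assumptions: any reachable TCP service and a Linux machine will do). The kernel performs the three-way handshake inside connect() and starts the four-way wave inside close(); because this side closes first, its socket ends up in TIME_WAIT.

```python
import socket

# Minimal sketch of where the handshake and teardown happen in application code.
# Host and port are placeholders; any reachable TCP service will do.
HOST, PORT = "example.com", 80

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((HOST, PORT))   # the kernel performs the three-way handshake here (SYN, SYN+ACK, ACK)
print("connected from local endpoint:", s.getsockname())

s.close()                 # the kernel starts the four-way wave here; because this side
                          # closes first, the socket eventually sits in TIME_WAIT for 2MSL

# On Linux, the intermediate connection states can be observed with:  ss -tan
```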
TCP/IP Status

LISTENING: after a service (for example, FTP) starts, its socket is first in the listening state.

ESTABLISHED: the connection has been established and the two machines are communicating.

CLOSE_WAIT: if the other party actively closes the connection, or a network anomaly interrupts it, our side's state changes to CLOSE_WAIT. At this point we need to call close() to close the connection correctly.

TIME_WAIT: we actively call close() to disconnect and, after receiving the other party's confirmation, the state changes to TIME_WAIT.

SYN_SENT: SYN_SENT indicates that a connection has been requested. When you want to access a service on another computer, you must first send a synchronization (SYN) segment to its port; the state is then SYN_SENT. If the connection succeeds, it becomes ESTABLISHED.

TCP long connections and short connections

Reasons for using long and short connections

When the TCP protocol is used for network communication, a connection must be established between the server and the client before any actual read or write operation. When the read and write operations are finished and the connection is no longer needed, both parties can release it. Establishing a connection requires a three-way handshake and releasing it requires a four-way wave, so every connection costs resources and time.

HTTP long connections and short connections

HTTP long and short connections are essentially TCP long and short connections. HTTP/1.0 uses short connections by default: each time the client and server perform an HTTP operation, a connection is established and then terminated when the task completes. When a browser accesses an HTML page or another type of web page that contains other web resources (JavaScript files, images, CSS files, etc.), it re-establishes an HTTP session every time it encounters such a resource. Starting from HTTP/1.1, persistent connections are used by default. When the HTTP protocol is used with persistent connections, this line is added to the response header:

Connection: keep-alive
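As a small illustration of this reuse, here is a sketch using Python's built-in http.client (the host is a placeholder; note that in HTTP/1.1 keep-alive is the default, so some servers simply omit the Connection header rather than echoing it back).

```python
import http.client

# Sketch: two HTTP/1.1 requests over one HTTPConnection object.
# http.client speaks HTTP/1.1, so the underlying TCP connection is reused
# unless the server answers with "Connection: close". Host is a placeholder.
conn = http.client.HTTPConnection("example.com", 80, timeout=5)

for path in ("/", "/"):
    conn.request("GET", path, headers={"Connection": "keep-alive"})
    resp = conn.getresponse()
    resp.read()     # drain the body so the same connection can carry the next request
    print(resp.status, "Connection header:", resp.getheader("Connection"))

conn.close()
```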
When a persistent connection is used, the TCP connection between the client and the server that carries the HTTP data is not closed after a web page has been loaded; when the client accesses the server again, it continues to use the established connection. Keep-Alive does not hold the connection forever: it has a hold time, which can be configured in the server software (such as Apache). For a persistent connection to work, both the client and the server must support it.

TCP long connections

A long connection means that multiple data packets can be sent consecutively over one TCP connection. If no data packet is sent for a while, both parties need to send probe packets to keep the connection alive; in general the two sides must maintain the connection themselves (no RST segment is sent and no four-way wave takes place). The pattern is: connect → transfer data → keep the connection alive (heartbeat) → transfer data → keep the connection alive (heartbeat) → ... → close the connection (one TCP connection channel serves many read/write exchanges). This requires a long connection to send packets (heartbeats) periodically when there is no data traffic, to maintain the connection state. TCP's keep-alive function exists mainly for server applications: the server wants to know whether the client host has crashed so that it can release the resources it holds on behalf of that client. If the client has disappeared, leaving a half-open connection on the server while the server is waiting for data from the client, the server would otherwise wait forever. The keep-alive function tries to detect such half-open connections on the server side (a minimal socket-option sketch appears at the end of this section).

TCP short connections

A short connection means that a TCP connection is established only when the two communicating parties have data to exchange, and it is torn down once the data has been sent. It is relatively simple to manage: every existing connection is a useful connection, and no extra control mechanism is required. The pattern is: connect → transfer data → close the connection.

Application scenarios

Long connections are mostly used for frequent, point-to-point communication where the number of connections cannot be too large. Each TCP connection requires a three-way handshake, which takes time; if every operation had to connect first and then operate, processing would be much slower. With a long connection, after the first operation establishes the connection, subsequent operations simply send data over it without setting up a new TCP connection each time. Database connections, for example, use long connections; if short connections were used for such frequent communication, socket errors would occur, and creating sockets frequently also wastes resources.

Short connections are generally used by HTTP services such as web sites (HTTP/1.0 uses short connections by default; HTTP/1.1 keep-alive connections are long connections bounded by a hold time and a limit on the number of requests), because long connections consume server resources. For web sites with tens of thousands or even hundreds of millions of clients, short connections save more resources. If long connections were used and tens of thousands of users were online at the same time, each occupying a connection, you can imagine the consequences. Therefore, when concurrency is high but each user does not operate frequently, short connections are the better choice.
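For the TCP keep-alive mechanism mentioned above, the sketch below shows how an application can turn it on. The fine-grained TCP_KEEPIDLE/TCP_KEEPINTVL/TCP_KEEPCNT options are Linux-specific (hence the guard), and the numbers are arbitrary illustration values; many services instead implement their own application-level heartbeat messages.

```python
import socket

# Sketch: enable TCP keep-alive probes on a socket.
# SO_KEEPALIVE is portable; the probe-timing options below are Linux-specific.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

if hasattr(socket, "TCP_KEEPIDLE"):
    # Start probing after 60 s of idle time, probe every 10 s, give up after 5 failures.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 10)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)

# The socket can now be used for connect()/accept() as usual. If the peer host
# disappears, the kernel's keep-alive probes eventually report the dead
# (half-open) connection instead of letting the application wait forever.
```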
HTTP Principles

HTTP is a stateless protocol. Stateless means that no persistent connection needs to be maintained between the client (web browser) and the server: once the client sends a request and the server returns a response, the connection is closed and the server retains no information about it. HTTP follows the request/response model: the client (browser) sends a request to the server, and the server processes the request and returns an appropriate response. Every HTTP exchange is built as such a pair of request and response.

Address parsing

For example, suppose a client browser requests this page: http://www.lydms.com:8080/index.htm. The address is decomposed into the protocol name, host name, port, object path, and so on. For our address, the result is:

Protocol name: http
Host name: www.lydms.com
Port: 8080
Object path: /index.htm

In this step, the Domain Name System (DNS) is used to resolve the domain name www.lydms.com and obtain the host's IP address.

Encapsulating the HTTP request packet

The parts above are combined with the local machine's own information and encapsulated into an HTTP request packet.

Encapsulating into TCP packets and establishing a connection

The request is encapsulated into TCP packets and a TCP connection is established (the TCP three-way handshake).

The client sends a request

After the connection is established, the client sends a request to the server. The request consists of a Uniform Resource Identifier (URL) and the protocol version number, followed by MIME information including request modifiers, client information, and possibly content.

Server response

After receiving the request, the server returns the corresponding response, in the form of a status line containing the protocol version and a success or error code, followed by MIME information including server information, entity information, and possibly content.

The server closes the TCP connection

Normally, once the web server has returned the response data to the browser, it closes the TCP connection. However, if the browser or the server adds the line Connection: keep-alive to its header, the TCP connection remains open after the response, and the browser can continue to send requests over the same connection. Keeping the connection alive saves the time needed to establish a new connection for each request and also saves network bandwidth.
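The parsing and DNS steps above can be sketched with Python's standard library. The URL is the article's own example and may not actually resolve, so the lookup is wrapped in a try block.

```python
import socket
from urllib.parse import urlsplit

# Sketch: decompose the example URL from the text and resolve the host via DNS.
url = "http://www.lydms.com:8080/index.htm"
parts = urlsplit(url)

print("Protocol name:", parts.scheme)    # http
print("Host name:   ", parts.hostname)   # www.lydms.com
print("Port:        ", parts.port)       # 8080
print("Object path: ", parts.path)       # /index.htm

# DNS resolution step: ask the resolver for the host's IP address(es).
# (The example domain may not resolve in practice; this is illustration only.)
try:
    for family, _, _, _, sockaddr in socket.getaddrinfo(parts.hostname, parts.port,
                                                        proto=socket.IPPROTO_TCP):
        print("Resolved address:", sockaddr[0])
except socket.gaierror as e:
    print("DNS lookup failed:", e)
```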
HTTPS

HTTPS (full name: HyperText Transfer Protocol over Secure Socket Layer) is an HTTP channel whose goal is security. Put simply, it is the secure version of HTTP: an SSL layer is added beneath HTTP. The security foundation of HTTPS is SSL, and the port it uses is 443. The process is roughly as follows.

The relationship between SSL and TLS

SSL is the abbreviation of "Secure Sockets Layer". It was designed by Netscape in the mid-1990s. Why was the SSL protocol invented? Because the HTTP protocol originally used on the Internet is plain text, which has many shortcomings: the transmitted content can be sniffed and tampered with. The SSL protocol was invented to solve these problems. By 1999, SSL had become a de facto standard on the Internet thanks to its widespread use, and the IETF standardized it that year, renaming it TLS (short for "Transport Layer Security"). Many people write the two names together (SSL/TLS) because they can be seen as different stages of the same thing.

Establishing a connection and obtaining a certificate

After the SSL client establishes a TCP connection with the server (port 443), it requests a certificate during the handshake: the client sends the server a message containing, among other things, the list of algorithms it supports. The SSL server replies with a packet that selects the algorithms to be used for this communication, and then the server returns its certificate to the client. (The certificate contains server information: the domain name, the company that applied for the certificate, and the public key.)

Certificate verification

After receiving the certificate returned by the server, the client determines which certificate authority issued it and uses that authority's public key to check whether the signature is valid. The client also checks that the domain name listed in the certificate is the domain name it is connecting to.

Data encryption and transmission

If the certificate is confirmed to be valid, the client generates a symmetric key, encrypts it with the server's public key, and sends it to the server; the server decrypts it with its private key. From then on, the two computers communicate using symmetric encryption.
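Here is a hedged sketch of the certificate step using Python's ssl module. The host name is a placeholder; create_default_context() already performs the certificate-chain and host-name checks described above, so getpeercert() only returns data after verification has succeeded.

```python
import socket
import ssl

# Sketch: open a TCP connection to port 443, perform the TLS handshake,
# and print a few fields of the certificate the server presented.
HOST = "example.com"                      # placeholder host

context = ssl.create_default_context()    # loads trusted CA certificates and enables
                                          # certificate and host-name verification

with socket.create_connection((HOST, 443), timeout=5) as tcp_sock:
    with context.wrap_socket(tcp_sock, server_hostname=HOST) as tls_sock:
        cert = tls_sock.getpeercert()     # populated only after successful verification
        print("TLS version :", tls_sock.version())
        print("Cipher      :", tls_sock.cipher()[0])
        print("Subject     :", cert.get("subject"))
        print("Issuer      :", cert.get("issuer"))
        print("Valid until :", cert.get("notAfter"))
```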
CDN Principles

A CDN generally includes a distribution service system, a load balancing system, and a management system.

Distribution service system

Its basic working unit is the individual cache server. It is responsible for responding directly to user requests and delivering content to users quickly, and also for content updates so that it stays synchronized with the origin site. Depending on content and service types, the distribution service system is divided into several sub-service systems, such as web acceleration, streaming media acceleration, and application acceleration. Each sub-service system is a distributed service cluster composed of cache clusters that have similar functions and are geographically close. Besides synchronizing and updating content and responding to user requests, the distribution service system also reports the health status, response status, and content cache status of each cache device to the upper-level management and scheduling system, so that the scheduling system can decide, according to the configured policy, which cache device should answer a given user request.

Load balancing system

The load balancing system is the core of the whole CDN. It schedules all user requests and determines the final address served to the user, and it is implemented hierarchically. The most basic two-level scheduling system consists of global load balancing (GSLB) and local load balancing (SLB). GSLB chooses the node that will serve the user based on the user's address and the requested content, mainly on the principle of proximity; it is generally implemented through DNS resolution or application-layer redirection (HTTP 3xx redirects). SLB is responsible for load balancing inside a node: when a user request is scheduled from GSLB to SLB, SLB redirects it according to the working status of each cache device in the node and the distribution of content. SLB can be implemented with layer-4 scheduling (LVS), layer-7 scheduling (Nginx), or link load scheduling.

Management system

It is divided into operation management and network management subsystems. The network management subsystem handles device management, topology management, link monitoring, and fault management for the CDN, giving administrators visual, centralized management of the whole network's resources; it is usually implemented as a web interface. Operation management is the business management of the CDN: it handles the collection, organization, and delivery work needed to interact with external systems at the business level, including user management, product management, billing management, and statistical analysis.

TCP/IP protocol suite

Within the TCP/IP protocol suite, IP provides only a connectionless, unreliable service, while TCP requires a three-way handshake before data is transmitted. The main functions of IP include encapsulating upper-layer data (such as TCP and UDP data) or same-layer data (such as ICMP data) into IP datagrams and delivering them to the final destination, fragmenting data so that it can be carried by the link layer, and determining the route by which datagrams reach destinations in other networks.

Application layer protocol—File Transfer Protocol (FTP)

FTP is used to transfer files between computers. Much of the FTP service on the Internet is anonymous FTP, which sets aside a special user name, anonymous, for public use; after logging in to an FTP server anonymously, it works the same as regular FTP. For security, most anonymous FTP servers only allow downloading, not uploading. FTP establishes two TCP connections from the client to the server: a control connection, used mainly to transfer commands and parameters (port 21), and a data connection, used mainly to transfer files (port 20). (A short sketch using Python's ftplib appears at the end of this application-layer overview.)

Application layer protocol—Remote login protocol (Telnet)

The remote login service is supported by the Telnet protocol, which connects the user's computer to a remote host. Programs run on the remote computer: the information the user types is sent to the remote host via the Telnet protocol, the host listens for the user's requests on its TCP port, processes them, and returns the results to the client via Telnet, which displays them on the screen after appropriate conversion. Because the telnet command is used for remote login, the service is called Telnet remote login. It consists of three parts: client software, server software, and the Telnet communication protocol.

Application layer protocol—E-mail protocol (SMTP)

E-mail exchanges letters in electronic form between computers. Based on the client/server model, it consists of three parts: e-mail client software, e-mail servers, and the communication protocols. When an e-mail is sent, it first reaches the mail server where the sender is registered, then passes through several computers and routers on the network to reach the destination mail server, where it enters the recipient's mailbox; finally the recipient goes online, starts a mail client, and the message is downloaded to his or her computer, completing delivery.

SMTP: Simple Mail Transfer Protocol.
MIME: Multipurpose Internet Mail Extensions.
PEM: Privacy Enhanced Mail.
POP: Post Office Protocol, used to hold e-mail that users have not yet retrieved. It is a simple plain-text protocol; each transfer is a complete e-mail message, and partial transfer is not provided.
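Going back to the FTP section above, here is a minimal anonymous-login sketch with Python's built-in ftplib. The server address is a placeholder for any server that permits anonymous read-only access; the control connection goes to port 21, while the directory listing travels over a separate data connection.

```python
from ftplib import FTP, error_perm

# Sketch: anonymous FTP session. The host is a placeholder; substitute any
# server that allows anonymous read-only access.
HOST = "ftp.example.com"

try:
    with FTP(HOST, timeout=10) as ftp:    # control connection on TCP port 21
        ftp.login()                        # login() defaults to user "anonymous"
        print("Server welcome:", ftp.getwelcome())
        print("Current dir   :", ftp.pwd())
        ftp.retrlines("LIST")              # directory listing arrives over the data connection
except (error_perm, OSError) as e:
    print("FTP session failed:", e)
```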
Transport layer protocol—TCP

On top of the unreliable datagram service provided by IP, TCP provides applications with a reliable, connection-oriented, full-duplex data transmission service. When TCP establishes or closes a connection between the source and destination hosts, it confirms that the establishment or closure succeeded through the handshake exchanges (the three-way handshake to establish, the four-way wave to close). Although TCP provides reliable data transmission, it does so at the cost of extra communication overhead. TCP uses retransmission: when data is sent, a timer is started, and if no acknowledgment is received within the specified time, the data is sent again (a toy illustration of this timer-and-retransmit idea, built over UDP, follows at the end of this transport-layer discussion).

Transport layer protocol—UDP

The User Datagram Protocol is an unreliable, connectionless protocol. Compared with TCP, the connection-oriented protocol at the same layer, UDP is connectionless and has no error-recovery mechanism of its own. TCP helps provide reliable connections; UDP helps achieve higher transmission rates. UDP is not responsible for retransmitting lost packets, ordering received data, eliminating duplicate IP datagrams, or establishing and terminating connections (all of these are the responsibility of the UDP application). TCP suits applications that need reliable interactive sessions (FTP, etc.); UDP suits applications that perform their own error detection or do not need it (DNS, SNMP).
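TCP's timer-and-retransmit behaviour lives in the kernel, but the idea can be illustrated at the application level. The toy below (localhost only, all names and numbers invented for illustration) runs a UDP "server" in a thread that deliberately ignores the first datagram, as if it or its acknowledgment were lost, so the sender's timer expires and it retransmits. UDP itself provides none of this, which is exactly the point of the comparison above.

```python
import socket
import threading
import time

# Toy stop-and-wait retransmission over UDP on localhost.
SERVER_ADDR = ("127.0.0.1", 9999)

def server():
    srv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    srv.bind(SERVER_ADDR)
    dropped_once = False
    while True:
        data, addr = srv.recvfrom(1024)
        if not dropped_once:
            dropped_once = True              # simulate loss: send no ACK the first time
            continue
        srv.sendto(b"ACK " + data, addr)     # acknowledge the retransmission
        break
    srv.close()

threading.Thread(target=server, daemon=True).start()
time.sleep(0.2)                              # give the server time to bind

client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.settimeout(1.0)                       # start a timer, like TCP's retransmission timeout

for attempt in range(1, 6):
    client.sendto(b"seq-1 hello", SERVER_ADDR)
    try:
        reply, _ = client.recvfrom(1024)
        print(f"attempt {attempt}: received {reply!r}")
        break
    except socket.timeout:
        print(f"attempt {attempt}: timed out, retransmitting")

client.close()
```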
Internet layer protocol—IP

IP provides only a connectionless, unreliable service and delegates services such as error detection and flow control to protocols at other layers. The main functions of IP are: encapsulating upper-layer data into IP datagrams and delivering them to their final destination; fragmenting and reassembling data so that it can be carried by the link layer; and determining the route by which datagrams reach destinations in other networks.

Internet layer protocol—ICMP

The Internet Control Message Protocol is a protocol for sending error and control messages. ICMP makes IP more robust; its messages are themselves carried in IP datagrams. The ping tool uses ICMP messages to test whether a target is reachable. There are five types of error messages (source quench, time exceeded, destination unreachable, redirect, and fragmentation required) and four types of information messages (echo request, echo reply, address mask request, and address mask reply).

Internet layer protocols—ARP and RARP

The Address Resolution Protocol (ARP) and the Reverse Address Resolution Protocol (RARP): ARP maps an IP address to a physical (hardware) address, and RARP maps a physical address to an IP address. Each device has a unique physical address (provided by its network card). To hide the differences between underlying protocols and physical addresses, the IP protocol uses IP addresses, so during transmission IP addresses and physical addresses must be converted into each other.

Network interface layer protocol—Ethernet (IEEE 802.3)

Ethernet IEEE 802.3: standard LAN, speed 10 Mbps, transmission medium copper cable.
Ethernet IEEE 802.3u: Fast Ethernet, speed 100 Mbps, transmission medium twisted pair.
Ethernet IEEE 802.3z: Gigabit Ethernet, speed 1000 Mbps, transmission medium optical fiber or twisted pair.

Network interface layer protocol—Token Ring (IEEE 802.5)
Network interface layer protocol—Fiber Distributed Data Interface (FDDI)

FDDI uses optical fiber as the transmission medium and adopts a dual-ring architecture in which data on the two rings flows in opposite directions. One ring is called the primary ring and the other the secondary ring. Under normal circumstances the primary ring carries the data and the secondary ring is idle; the purpose of the dual-ring design is to provide high reliability and stability. The transmission media defined by FDDI are single-mode and multi-mode optical fiber.

Network interface layer protocol—Point-to-Point Protocol (PPP)

PPP is mainly used for wide-area connections such as dial-up Internet access. Its advantages are that it is simple, provides user authentication, and can handle IP address allocation, among other things; it is a general solution for simple connections between hosts, bridges, and routers. Running PPP over Ethernet, so that Ethernet users can be authenticated for access, is called PPPoE; it is currently the most widely used technical standard for ADSL access. Running PPP over an ATM network to manage user authentication is called PPPoA. PPPoA and PPPoE work on the same principle but differ in their operating environments.

Others—ADSL (Asymmetric Digital Subscriber Line)

There are three ways to dial up to the Internet through an ADSL modem: dedicated line (static IP), PPPoA, and PPPoE. ADSL offers dedicated bandwidth, is safe and reliable, has low cost, reuses the existing telephone lines, and allows the telephone and the ADSL modem (Internet access) to be used independently.

Others—IPv4 and IPv6

IPv4: 32-bit addresses, so the number of representable IP addresses is 2^32, about 4.3 billion.
IPv6: 128-bit addresses, so the number of representable IP addresses is 2^128, about 3.4 × 10^38.

This article is reprinted from the WeChat public account "Glacier Technology". You can follow it through the QR code below. To reprint this article, please contact the Glacier Technology public account.