A graphic guide to selecting network equipment

Hello everyone, I am Xiao Fu.

The illustrated guides to network equipment, Layer 2 switches, Layer 3 switches, firewalls, and Wi-Fi introduced the common devices and what each of them does. However, every type of device comes in many brands and models, and when it is time to actually buy equipment, the sheer variety can be dazzling. This article explains how to go about device selection and choose the network equipment that best suits your needs.

Product Type

Network products include routers, switches, firewalls, wireless APs, and so on. When choosing a product type, consider whether the network actually needs it: Do you need a router for route selection? Do you need a core switch? Do you need a firewall for security control? How will these devices be arranged in the network?

Product Type:

  • Router
  • Layer 2 switch
  • Layer 3 switch
  • Firewall
  • Wireless AP
  • Load balancer
  • Bandwidth controller
  • Proxy
  • ......

When building a new network, choose products according to the product types the network requires. When replacing some equipment in an existing network, you can choose the same type of device, but if the performance is sufficient, a Layer 3 switch or firewall can replace a router, and a Layer 3 switch can replace a Layer 2 switch. When selecting security equipment, note that some firewalls provide content-based security controls; you can consider using such a firewall to replace standalone antivirus, URL filtering, and IDS/IPS appliances in order to reduce costs.

Select the device model according to your needs

After determining the product type, you need to select a specific device model based on the required functions. Generally, the following aspects will be considered.

Network interface and interface rate

  • Whether the number of interfaces on the WAN side and LAN side meets the requirements, and whether the interface types and rates, such as RJ-45 10/100/1000BASE-T or SFP 1000BASE-SX, meet the requirements.
  • When using VLANs or virtual routers, whether the number of sub-interfaces (logical interfaces) meets the requirements.
  • When using link aggregation (IEEE 802.3ad), whether the aggregated bandwidth meets the requirements.

Performance

  • Whether the throughput (forwarding rate) is sufficient.
  • Which processing is handled in hardware such as an ASIC or FPGA, and which is handled in software on the CPU. From this you can determine which traffic can be processed at high performance and which cannot.
  • When performing content scanning, how large a file can be scanned.
  • When multiple functions run at the same time, whether the device's CPU and memory usage stay within limits.

Software Features

  • Which protocols are supported and which vendor-specific functions are available.
  • Network capabilities.
  • Management functions.
  • Reporting capabilities.

Ease of migration

  • When replacing equipment, it is easier to migrate using existing equipment from the same manufacturer and running the same operating system.

After-sales service

  • Whether the product can be maintained on-site or must be sent back to the manufacturer.
  • Support time for after-sales service.

Network latency

Network devices such as routers forward packets. There is always some delay between the moment a packet leaves its source and the moment it reaches its destination.

When the network carries real-time traffic such as voice and video, the delay between each pair of routers must be measured. Reducing end-to-end network delay should be an important goal of the overall network design.

ITU-T Recommendation G.114 defines the following types of delay.

Types of delay:

  • End-to-end delay: the time it takes a packet to reach its destination after leaving the source. The more network devices the packet passes through, the larger this value.
  • Processing delay: the time from when a packet enters the device's input interface until it is placed in the output interface's queue; generally only a few microseconds.
  • Packetization delay: data is divided into multiple pieces for transmission, and the encoding, compression, and packaging operations generally take tens of milliseconds.
  • Queuing delay: the time a packet waits in the outbound interface queue. Under QoS control, low-priority packets wait longer in the queue, meaning a larger delay; usually a few milliseconds.
  • Serialization delay: the time needed to convert a packet into electrical, optical, or electromagnetic signals as it is sent out an interface, calculated as "packet size ÷ bandwidth". The higher the interface rate, the lower the serialization delay. A 64-byte packet sent over a 64 kbit/s link takes 8 milliseconds to serialize.
  • Propagation delay: the time it takes a signal to reach the next device over a medium such as cable or radio waves; it depends on the propagation speed of the medium. In optical fiber, a signal generally takes about 6 microseconds to travel 1 km.
  • Network delay: the time it takes a packet to cross a WAN or the Internet. For IP telephony, the average network delay is usually kept below 70 milliseconds.
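The two formula-based delays above can be sketched in a few lines. This is only an illustration; the 6 µs/km fiber figure is taken from the text, and the other values are the text's examples.

```python
def serialization_delay_ms(frame_bytes: int, bandwidth_bps: int) -> float:
    """Serialization delay = packet size / interface bandwidth, in milliseconds."""
    return frame_bytes * 8 / bandwidth_bps * 1000

def propagation_delay_us(distance_km: float, us_per_km: float = 6.0) -> float:
    """Propagation delay over a medium; fiber takes roughly 6 microseconds per km."""
    return distance_km * us_per_km

# 64-byte packet over a 64 kbit/s link -> 8 ms, matching the example above
print(serialization_delay_ms(64, 64_000))  # 8.0
print(propagation_delay_us(100))           # 600.0 (µs over 100 km of fiber)
```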

Latency

The delay from when a network device receives data until it sends it out again is called latency. The smaller the latency, the stronger the device's packet-processing capability. Latency corresponds to the sum of processing delay, queuing delay, and serialization delay.

In a throughput test setup, latency is the time it takes a packet sent from the tester to pass through the router and return to the tester. The latency of network equipment is generally a few microseconds.

Jitter

Packets are sent at certain time intervals. The phenomenon in which these intervals become longer or shorter during actual transmission is called jitter. For example, the source may send a packet every 5 milliseconds, while the receiver observes intervals of 4, 3, 6, 5, and 7 milliseconds. VoIP and streaming applications can absorb some jitter through buffering, but excessive jitter causes sudden breaks in sound and picture. For real-time two-way streaming such as video conferencing, a network with delay below 150 milliseconds and jitter below 35 milliseconds is recommended. One-way video streaming, by contrast, can buffer on the receiving side and therefore tolerates more than 10 times the network delay allowed for two-way communication.
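As a rough illustration of the example above, jitter can be gauged as the average deviation of the received intervals from the nominal sending interval. Note this is a simplification for clarity; real VoIP equipment uses a smoothed estimator (RFC 3550) rather than a plain average.

```python
def mean_jitter_ms(nominal_ms: float, intervals_ms: list) -> float:
    """Average absolute deviation of received inter-packet intervals
    from the nominal sending interval."""
    deviations = [abs(i - nominal_ms) for i in intervals_ms]
    return sum(deviations) / len(deviations)

# The example above: packets sent every 5 ms, received at 4, 3, 6, 5, 7 ms
print(mean_jitter_ms(5, [4, 3, 6, 5, 7]))  # 1.2
```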

Packet loss

The phenomenon in which a packet transmitted on the network fails to reach its destination is called packet loss, expressed as a percentage called the packet loss rate. For an IP telephony network, the packet loss rate should usually be below 0.1%.

Round trip time

The time from when a source sends a packet until it receives the response generated at the destination is called the round-trip time (RTT). RTT is commonly checked with the ping command, which sends an ICMP Echo Request and waits for the ICMP Echo Reply, and it is measured in milliseconds.

If delays on the Internet path are long, use the QoS function of a router or dedicated QoS device to prioritize packet forwarding, so that delay-sensitive applications are forwarded first and queuing delay is minimized.

Performance Testing

The performance of network equipment can be statistically tested using test instruments. In the product catalog, there will be parameters such as bit/s and pps, and some manufacturers will also explain the test environment under which the value was obtained.

The device being tested is called the DUT (Device Under Test). The tester gradually increases the number of packets sent to the DUT from its transmit port (Tx) and counts the packets the DUT returns to its receive port (Rx). When the DUT reaches its performance limit, it begins to drop packets and the number of packets received by the tester decreases.

The maximum rate at which the DUT can forward continuously without packet loss is called the NDR (non-drop rate), which product catalogs generally list as the maximum throughput.

RFC2544 defines the test method for network device throughput and recommends the data frame size used in the test. For example, in an Ethernet environment, it is recommended to use data frames of 64, 128, 256, 512, 1024, 1280, and 1518 bytes for testing.

When testing routers, test instruments often use IMIX (Internet Mix) traffic, combinations of frame sizes that simulate real Internet communication.

By generating all kinds of packets from Layer 2 through Layer 7, a tester can simulate a network environment with millions of connected clients. Such tests clearly establish performance indicators like a device's maximum throughput and maximum number of concurrent sessions.

Maximum throughput

The maximum throughput, in Mbit/s, refers to the throughput when continuously processing frames 1518 bytes long. Of the 1518 bytes, removing the 18 bytes of Ethernet header and FCS leaves a 1500-byte IP packet; removing the 20-byte IP header leaves 1480 bytes of IP payload, which is what is ultimately processed.

Routers forward data in units of packets (frames) rather than bytes. Therefore, the router's maximum pps (how many packets it can process per second) multiplied by the 1518-byte frame size gives the router's maximum throughput.

Product performance also varies with the IP payload. Testing with UDP, whose header is simpler, yields better throughput figures than testing with TCP.
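The byte accounting above can be sketched as follows. The constants are the Ethernet and IPv4 header sizes from the text; the 81,274 pps figure in the usage line is an assumed value, the approximate gigabit wire rate for 1518-byte frames once preamble and inter-frame gap are included.

```python
ETH_OVERHEAD = 18   # Ethernet header (14 bytes) + FCS (4 bytes)
IP_HEADER = 20      # IPv4 header without options

def ip_payload_bytes(frame_bytes: int = 1518) -> int:
    """IP payload carried in one Ethernet frame."""
    return frame_bytes - ETH_OVERHEAD - IP_HEADER

def max_throughput_bps(pps: int, frame_bytes: int = 1518) -> int:
    """Maximum throughput = maximum pps x frame size in bits."""
    return pps * frame_bytes * 8

print(ip_payload_bytes())          # 1480
print(max_throughput_bps(81_274))  # ~987 Mbit/s at wire rate for 1518-byte frames
```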

Switch performance

The data frame forwarding of Layer 2 switches and Layer 3 switches is completed through ASIC, so the switching capacity and switching capability in the product catalog can be regarded as the actual performance indicators of the device.

Switching capacity

Switching capacity, also known as backplane capacity, is the bandwidth capacity of data transmission inside the switch. When the traffic is higher than the switch capacity, the switch will be unable to process it due to insufficient buffer or internal bandwidth, resulting in data frame loss and increased packet loss rate.

Switching capability

Besides bit/s, which expresses a switch's capacity, pps (packets per second) can be used to express its switching capability, that is, the number of frames it can process per unit time.

The switch examines the frame header to determine the destination MAC address, checks the frame trailer for errors, and finally matches the frame against the access control list, filtering or forwarding it on a match. As the number of frames grows, so does the amount of processing the switch must do; the same applies to routers. Therefore, for the same traffic in bit/s, the smaller the frames, the greater the processing workload and the higher the system load.

The minimum Ethernet frame is 64 bytes; adding the 8-byte preamble and SFD (start frame delimiter) and the 12-byte IFG (inter-frame gap) between frames gives 84 bytes in total. In other words, forwarding one minimum-size frame requires the switch to handle 672 bits.

The theoretical maximum forwarding rate is called wire speed. For 1000 Mbit/s Ethernet, the wire speed is 1,000,000,000 bit/s ÷ 672 bit ≈ 1,488,000 pps, i.e. about 1.488 Mpps. The wire speed of 10 Gigabit Ethernet is about 14,880,000 pps (14.88 Mpps).

A switch has multiple interfaces. If a switch has 24 10/100/1000BASE-T ports, the required switching capability is 24 × 1.488 Mpps = 35.712 Mpps. If the switching capability is below this value, blocking occurs and the ports cannot all reach wire speed at the same time.
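The wire-speed arithmetic above can be reproduced with a short sketch; the RFC 2544 frame sizes mentioned earlier are included for comparison.

```python
PREAMBLE_SFD = 8  # preamble + start frame delimiter, bytes
IFG = 12          # inter-frame gap, bytes

def wire_speed_pps(link_bps: int, frame_bytes: int) -> float:
    """Theoretical maximum frames per second on one link."""
    return link_bps / ((frame_bytes + PREAMBLE_SFD + IFG) * 8)

# Minimum-size frames on Gigabit Ethernet: ~1.488 Mpps per port
print(round(wire_speed_pps(1_000_000_000, 64)))       # 1488095
# 24-port gigabit switch: required switching capability ~35.71 Mpps
print(round(24 * wire_speed_pps(1_000_000_000, 64)))  # 35714286
# RFC 2544 frame sizes: larger frames mean fewer frames at the same link rate
for size in (64, 128, 256, 512, 1024, 1280, 1518):
    print(size, round(wire_speed_pps(10_000_000_000, size)))
```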

In practice, most of the traffic through a switch is TCP or UDP application data. Real-time UDP packets are generally 100 to 300 bytes long, and TCP applies bandwidth controls such as window sizing, so actual rates often fall short of the theoretical wire speed.

MAC table capacity

A Layer 2 switch manages MAC addresses in a MAC address table, and a Layer 3 switch additionally manages IP addresses in a Layer 3 table. If the number of entries exceeds the table's capacity, the device cannot forward packets normally and packet loss results. When testing device performance, keep the number of addresses in the test traffic within the range the MAC table supports.

Broadcast Storm

A broadcast storm is a phenomenon in which switches connected in a loop forward frames around it endlessly. It consumes excessive network bandwidth and switch resources and can eventually paralyze the entire network. The problem can be avoided with the spanning tree function, which breaks loops by placing redundant ports in a blocking state. However, a broadcast storm can still occur when a DoS attack, an operating system bug, or a NIC failure prevents spanning tree from working properly.

In that case, the switch's broadcast storm control function can be used. It monitors the frames on each port and, when their rate exceeds a preset upper limit, discards the excess. The limit is specified in pps and can be configured separately for unicast, multicast, and broadcast.

If there is no loop, a broadcast frame is delivered once to every terminal in the broadcast domain. If there is a loop, however, broadcast frames circulate in it endlessly, their number keeps growing, and eventually they consume the entire network's bandwidth.

Router performance

Router performance is expressed in terms of forwarding capacity per unit time, or throughput. The unit of throughput is bit/s (bits per second), or pps (packets per second).

For routers with the same pps performance, the larger the packets forwarded, the higher the bit/s value of the router. For example, a router with a processing capacity of 100 pps can process 64-byte (512-bit) packets at a rate of 51.2 kbit/s, but can process 1500-byte (12000-bit) packets at a rate of 1.2 Mbit/s.
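The arithmetic in this example is simply pps multiplied by packet size in bits, which a couple of lines can confirm:

```python
def forwarding_rate_bps(pps: int, packet_bytes: int) -> int:
    """Throughput in bit/s for a router forwarding fixed-size packets."""
    return pps * packet_bytes * 8

print(forwarding_rate_bps(100, 64))    # 51200   -> 51.2 kbit/s
print(forwarding_rate_bps(100, 1500))  # 1200000 -> 1.2 Mbit/s
```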

Firewall performance

Number of concurrent online sessions

The firewall uses the session table to manage sessions and control traffic in units of sessions. The number of entries recorded in the session table indicates the number of simultaneous online sessions that the firewall can handle.

Small desktop firewalls can generally support tens of thousands of sessions, and firewalls used by telecom operators can manage millions of sessions simultaneously.

Session lifetime

When a UDP or TCP packet permitted by the security policy reaches the firewall, the firewall creates the corresponding session information. If the session carries no traffic for a certain period, it is deleted. That period is called the session lifetime.

After the session information is deleted, the firewall must recreate it when another packet belonging to that session arrives. For UDP this simply means regenerating the session entry, but for TCP any packet other than a SYN is discarded. When its non-SYN packets are dropped by the firewall, the client application must start over and perform a new 3-way handshake with the server to re-establish the TCP connection.

The session lifetime can be set to different values for different protocols. Typically the TCP session lifetime is 1 hour, while the lifetime for UDP and other IP protocols is 30 seconds.

If the lifetime is too long, or no lifetime is set, a TCP connection that never receives a FIN or RST stays open, UDP sessions never end, and the session information is retained indefinitely.

Since the session table holds a limited number of entries, sessions that are never cleared will eventually fill it. Once the table is full, new sessions cannot be created and communication fails.
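A minimal sketch of such a session table, using the lifetimes cited above (TCP 1 hour, UDP and others 30 seconds). All names and the two-entry capacity are illustrative, not any vendor's actual implementation.

```python
# Per-protocol session lifetimes in seconds, from the text
LIFETIME = {"tcp": 3600, "udp": 30}

class SessionTable:
    def __init__(self, max_entries: int):
        self.max_entries = max_entries
        self.sessions = {}  # flow key -> (protocol, last_seen timestamp)

    def touch(self, key, proto: str, now: float) -> bool:
        """Refresh or create a session; False means the table is full."""
        self._expire(now)
        if key not in self.sessions and len(self.sessions) >= self.max_entries:
            return False  # no room: new sessions fail, communication breaks
        self.sessions[key] = (proto, now)
        return True

    def _expire(self, now: float):
        """Drop sessions idle longer than their lifetime."""
        self.sessions = {
            k: (p, t) for k, (p, t) in self.sessions.items()
            if now - t < LIFETIME.get(p, 30)
        }

table = SessionTable(max_entries=2)
assert table.touch(("10.0.0.1", 1234, "8.8.8.8", 53), "udp", now=0.0)
assert table.touch(("10.0.0.2", 1234, "8.8.8.8", 53), "udp", now=1.0)
assert not table.touch(("10.0.0.3", 1, "8.8.8.8", 53), "udp", now=2.0)  # table full
assert table.touch(("10.0.0.3", 1, "8.8.8.8", 53), "udp", now=40.0)    # idle ones expired
```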

Sessions per second

Router performance is generally described with bit/s and pps. For firewalls, one more indicator must be added: new sessions per second, that is, how many sessions can be established within one second. Establishing a session involves monitoring the TCP 3-way handshake, generating session information if the handshake is normal, and recording it in the session table. If this value falls short of the network's needs, new sessions cannot be established and users will find the network very slow to respond.

VPN

Firewalls or security devices will support site-to-site IPsec-VPN, remote access IPsec-VPN or SSL-VPN functions. Some products also support SSL (HTTPS) decryption for user communications.

Encryption and decryption increase the system load, so performance drops compared with plaintext communication. Offloading encryption to an ASIC avoids this degradation, but almost all devices process it in software on the CPU, so performance still falls significantly as encrypted traffic grows.

Wireless AP Performance

In real environments, because of the CSMA/CA mechanism, radio interference, and signal strength that varies with distance, an AP cannot achieve its theoretical maximum throughput.

CSMA/CA

IEEE 802.11 wireless AP uses CSMA/CA communication.

In CSMA/CA, CS stands for carrier sense: before sending, a terminal listens to the medium, and if another terminal is transmitting, it waits until that transmission finishes. MA stands for multiple access: multiple terminals share a single transmission medium. CA stands for collision avoidance: once the medium becomes free, a terminal waits an additional random period before sending. This mechanism staggers transmissions from multiple nodes and effectively reduces the chance of collisions.

Wired networks can detect collisions promptly through changes in the electrical signal, but wireless networks cannot detect collisions quickly and reliably, so they can only avoid them with the CSMA/CA mechanism.

ACK data frame

After receiving a data frame, the AP returns an ACK frame; when the sender receives the ACK, the exchange is complete. If the wireless signal is poor and the receiver never gets the data frame, no ACK is returned and the sender retransmits the frame. Likewise, if the receiver gets the frame and returns an ACK that the sender never receives, the sender retransmits and the receiver returns the ACK again. The distance between terminal and AP and the radio conditions determine the probability of retransmission.

In actual environments, the probability of resending data is about 20%.

Site investigation

By conducting a site survey with a spectrum analyzer, you can check the wireless interference in the coverage area, including the effects of reflections, external radio waves, and noise, and then deploy and configure the APs optimally.

Generally, on-site investigation can be completed by following the steps below.

1. Prepare a floor plan of the office space.

2. Test the radio waves emitted from adjacent APs to understand the radio wave conditions at the current location.

3. Determine the number of APs, radio wave strength, and channels used through simulation, and mark them on the floor plan of the office space.

4. Based on the simulation results, temporarily install the APs and verify their configuration.

5. After completing the configuration, make a final confirmation on whether the AP can cover all areas.

Select a switch

Select Access Switch

The downlink ports of an access switch connect to terminals, and most access switches have 1G downlink ports. Today's PCs have 1G network interfaces, but if the switch has only 100M ports, auto-negotiation will bring the link down to 100 Mbit/s and the downlink may become a network bottleneck.

Most access switches provide 2 or 4 10G uplink ports. Two uplinks can connect simultaneously to two aggregation or core switches to form a redundant topology. Four uplinks can form two link-aggregation groups, providing a redundant connection to the upstream switch at twice the throughput.

The number of downstream ports is determined by the number of terminals such as clients or printers.

Select aggregation switches and core switches

In large-scale networks, a hierarchical networking structure consisting of access switches, aggregation switches, and core switches is required.

The downlinks of an aggregation switch generally connect to the uplinks of access switches over 10G interfaces. In a three-tier design, the aggregation switch's uplinks connect to the core switch's downlinks; in a two-tier design, the access switch's uplinks connect directly to the core switch's downlinks. Some of these links use 40 Gbit/s or 100 Gbit/s interfaces.

If you are connected to the Internet, routers and firewalls are often the network bottleneck. If there is a lot of communication in the LAN, the switch may become the biggest network bottleneck. If the budget is limited, you can choose a switch with a throughput that meets the minimum requirements.

The number of ports on aggregation and core switches should be designed from the number of access switches and terminals. Chassis-based switches can meet growing port demand by adding line-card modules.

PoE

When using PoE technology to power wireless APs, IP phones, cameras and other devices, it is necessary to design and plan the power supply capacity. For example, a PoE switch can provide 420W of power, and can simultaneously support 24 ports using 802.3af (15.4W/interface) power supply, or simultaneously support 12 ports using 802.3at (30W/interface) power supply, or simultaneously support 6 ports using 802.3bt (60W/interface) power supply.
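The power-budget arithmetic can be checked with a few lines. The per-port wattages are the standard maxima cited above and the 420 W budget is the text's example; note the text's port counts leave some headroom below the budget.

```python
def fits_budget(ports: int, watts_per_port: float, budget_watts: float = 420.0) -> bool:
    """True if the switch's PoE budget covers all ports drawing full power."""
    return ports * watts_per_port <= budget_watts

assert fits_budget(24, 15.4)      # 802.3af: 24 x 15.4 = 369.6 W
assert fits_budget(12, 30.0)      # 802.3at: 12 x 30   = 360 W
assert fits_budget(6, 60.0)       # 802.3bt:  6 x 60   = 360 W
assert not fits_budget(24, 30.0)  # 24 ports of 802.3at would need 720 W
```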

Choosing a Router

The number of router interfaces is chosen according to the number of network segments to be connected. With Ethernet, the physical interfaces need only meet the minimum requirement: you can add ports through a Layer 2 switch, or use VLAN sub-interfaces to ease a shortage of interfaces.

Select Firewall

When configuring a firewall on an Internet gateway, at least two ports must be prepared, namely an uplink port connected to the Internet and a downlink port connected to the intranet. This is the most commonly used and traditional firewall networking method.

To protect the intranet itself, firewalls are now also deployed inside it, fitted with interfaces such as RJ-45 and SFP/SFP+ and providing 8 to 24 network ports.

Network equipment interoperability

Interoperability means that different kinds of network devices can communicate normally once connected to each other. Since network devices implement common standards and protocols such as RFCs or IEEE specifications, devices from different manufacturers are generally interoperable. However, manufacturers also implement proprietary, non-standardized functions that do not work across all devices.

When you need to introduce network devices produced by multiple different manufacturers to form a network, you must consider the interoperability of the devices and use it as an important basis for selecting devices. For functions such as access control lists or virus scanning that can be processed within the network device and do not need to be connected to other devices on the network, there is no need to consider interoperability.

High Availability

MTBF and MTTR

Electrical products including network equipment and computer systems usually use MTBF (mean time between failures) to calculate the probability of failure. This parameter is measured in hours and can be calculated using the following formula.

MTBF = operating time / number of failures.

In actual use, MTBF can also be calculated using prediction or extrapolation. One of the extrapolation methods is data measurement. By recording multiple sample data and observing how many devices fail in a relatively short period of time, the MTBF value can be extrapolated. For example, if 10,000 identical devices are enabled at the same time and run for 100 hours, and 5 devices fail during this period, the MTBF value can be calculated: 10,000 devices × 100 hours ÷ 5 failures = 200,000 hours.

The failure rate can be calculated as the inverse of MTBF.

Failure rate = 1 / MTBF.

MTTR (mean time to repair) refers to the average time it takes to repair a system failure.

The operating probability of the system can be calculated using the following formula.

Probability of normal operation = MTBF /(MTBF + MTTR).

Simply put, the larger the MTBF value is and the smaller the MTTR value is, the higher the system availability is.
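The formulas above fit in a few lines; this sketch reuses the text's extrapolation example, while the 4-hour MTTR in the last line is an assumed value for illustration.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Probability of normal operation = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

# The text's example: 10,000 devices run for 100 hours with 5 failures
mtbf = 10_000 * 100 / 5   # 200,000 hours
failure_rate = 1 / mtbf   # 5e-06 failures per hour

print(mtbf)                             # 200000.0
# With an assumed 4-hour MTTR, availability is very close to 1
print(round(availability(mtbf, 4), 6))  # 0.99998
```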

Conclusion

Of course, to complete equipment selection you must first gather information on the budget, the requirements, the current state of the network, and future expansion, and then select equipment accordingly. You may even prepare several selection plans, refine and break them down step by step, and combine them with the procurement process to finalize the purchase list. With a generous sponsor you get what you pay for, and you cannot go wrong buying the best; with a limited budget, choose the most cost-effective option.
