Some Discussions on the Transmission Network in DCI

Preface

The exploration of DCI technology has become very popular recently, especially since SDN-WAN began drawing broad attention, and articles introducing DCI technology have appeared one after another. This article focuses on the technology and current state of the transmission-network part of the DCI network, and hopes to be of some help to readers.

1. The origin of DCI network

Data centers were originally quite simple: a spare room, a few cabinets, a few high-power air conditioners, a single ordinary mains feed, and a few UPS units made a data center. But such data centers were small and unreliable. From the late 1990s the Internet developed rapidly and demand for data centers skyrocketed, exposing problems this type of facility could not solve: insufficient space, insufficient power, no redundancy, and no SLA guarantee. Users therefore began looking for additional data centers to host their services. The new and old data centers then needed network interconnection, giving rise to the first DCI (Data Center Inter-connect) networks, which include technologies at both the physical-network and logical-network levels.

The earliest DCI traffic ran directly over the Internet. Later, encryption was added for security; dedicated lines were adopted for quality of service; and direct fiber connections were used for bandwidth.

2. Development of DCI Network

DCI networks have developed from Internet interconnection, to dedicated lines of a few Mbit/s, and now to WDM interconnects carrying tens of terabits. This did not actually take long; it was, objectively, a response to the growth of the Internet. Initially, users transmitted services directly across the public network through VPN tunnels. This method is affected by the public-network environment (global bandwidth congestion, inferior routing, line jitter, link resets, firewalls, etc.) and by cost, so it suits small-volume, quality-tolerant, non-real-time, low-security traffic. Later, data-center services received more and more attention, deployments grew, and server counts climbed steadily. Once large numbers of services were deployed, the traffic they sent over the network had an ever greater impact on companies and industries, so network requirements rose, first in bandwidth and link stability. Data-center users therefore began renting operator circuit dedicated lines, and MSTP dedicated lines carried over SDH networks sold well thanks to their high stability, larger bandwidth, and high degree of reuse within operator networks.

Later, as business continued to grow explosively, traffic between data centers began to carry latency and redundancy requirements; financial customers in particular demanded very low latency, so requirements on dedicated lines rose further. Users began to require larger granularity, such as 2.5G and 10G single-link bandwidth, as well as dual-route protection to keep the SLA at four to five nines.

Even so, the momentum of Internet growth was astonishing, and workloads such as log transfer and database synchronization expanded rapidly. Weighing cost, delivery time, and service quality, top companies (especially big-data Internet companies such as Google and FB) began building their own DCI networks by renting bare (dark) fiber directly, bypassing operators' managed services. In the early days of bare fiber, a single signal ran over a single fiber pair; for example, a fiber pair with 10G ZR modules can span 80 kilometers, enough for transmission between typical data centers in the same city. The drawbacks were that fiber consumption, and therefore cost, grew linearly with bandwidth, and the utilization of each fiber was too low (and, for operations, resource and route management of bare fiber has always been a headache). Moreover, 10G per fiber could no longer keep up with business growth, so the DCI network entered the WDM era.

In the WDM era, two approaches appeared in DCI networks: coarse wavelength division (CWDM) and dense wavelength division (DWDM). Initially some users chose 10G CWDM optical modules and CWDM technology on cost grounds, but such a system supports at most 16 waves of 10G, EDFAs cannot amplify most of the CWDM wavelengths, and its unamplified reach is quite limited. As demand for large-scale data transmission kept growing, users had to move to DWDM systems with higher capacity and performance.

Figure: DCI network structure of a pure DWDM system.

The DWDM system is the most important large-bandwidth transmission system in today's communication networks. Weighing traffic volume, cost, and operations, many companies initially inserted 10G colored (DWDM-wavelength) optical modules into switches and connected the resulting colored Ethernet signals directly to passive DWDM equipment. This setup is simple to operate and maintain. At controllable cost it can generally reach 40 waves × 10GE, for a total system bandwidth of 400G, without excessive network-operations overhead; ordinary IP network engineers can maintain it with little extra learning, and it was once widely used by companies that needed it. But the Internet kept booming, and single-wave 10GE soon could not keep up; higher single-wave rates such as 40GE and 100GE were needed. At the time, however, 40GE/100GE colored Ethernet modules that could sit in a switch had not yet reached the market, and once they appeared they remained expensive for a long while. Business could not wait, so another path was needed, and OTN, a heavyweight of telecommunications networks, entered the DCI networks of Internet companies such as Google and FB.

When the Internet industry began using OTN, it mostly started from 10GE. By then, 10G colored optics plus passive DWDM boxes could no longer meet business growth, nor the operational needs of networks at scale, especially long-haul networks, because there was no optical-layer management. In addition, after 100G OTN launched and matured over several years, particularly after several rounds of centralized operator procurement, its cost dropped significantly. For these reasons the 10G-colored-optics-plus-DWDM-box approach was slowly displaced by OTN systems. Thanks to this timing of technology and cost, 40G OTN was essentially skipped in Internet-industry DCI networks, which upgraded directly from 10G to 100G systems. There was some concern that a 100G system widens the blast radius of a failure, but business growth came first, so the line side (the signal facing the transmission fiber) was upgraded directly from 10G to 100G, ensuring the line-side WDM system could meet long-term bandwidth growth. On the client side (the signal facing the switches), with many existing 10G client systems and investments to protect, the design had to remain compatible with 10G-granularity services while allowing future upgrades to client-side 100G. To let 10G and 100G systems transition smoothly during DCI upgrades, a service card appeared with a 10×10G client side and a 1×100G line side, mapping 10×ODU2 into 1×ODU4, and this board has been widely used.

With its rich management overhead, high reliability, diverse protection schemes, centralized professional NMS platform, and large bandwidth, the OTN network has indeed played an extremely important role in the Internet's development, making network operations more professional and specialized and, most importantly, meeting the rapidly growing needs of Internet services.

Figure: typical topology of a point-to-point DCI network using OTN.

By this point, OTN was no longer the exclusive province of telecom networks; the rise of the Internet brought this traditional telecom technology into the DCI industry.

3. Current Operational Routes of DCI Network

After OTN technology was introduced into the DCI network, an entirely new task was added to operations. The traditional data-center network is an IP network, a logical-network technology, while OTN in DCI is a physical-layer technology; making it cooperate smoothly and conveniently with the IP layer remains a long-term challenge for operations.

The goal of OTN-based operations is the same as for every subsystem in the data center: to maximize the effectiveness of the expensive resources invested in the infrastructure and provide the best possible support for upstream businesses. That means improving the stability of the underlying system, enabling efficient maintenance work, assisting the reasonable allocation of resources so that invested resources deliver more value, and planning uninvested resources sensibly.

The operation of OTN mainly involves several aspects: operation data management, asset management, configuration management, alarm management, performance management, and DCN management.

3.1 Operational Data

Statistics are collected on fault data to distinguish between human faults, hardware faults, software faults, and third-party faults. Statistical analysis is performed on the types with higher fault rates, and targeted processing plans are developed. After standardization in the future, this will pave the way for automated fault processing. Based on fault data analysis, the system is optimized from the perspectives of architecture design and equipment selection to reduce the cost of later operation and maintenance work. Fault statistics are collected for OTN from optical amplifiers, boards, modules, combiners, splitters, cross-device fiber jumpers, trunk optical fibers, DCN networks, etc., and data analysis is performed in multiple dimensions, including manufacturer dimensions and third-party dimensions, so that the data can more accurately reflect the current status of the network.
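
The multi-dimensional fault analysis described above can be sketched as a simple aggregation. The record fields and category names below are illustrative assumptions, not a real schema:

```python
from collections import Counter

def fault_stats(records):
    """Aggregate fault records along the dimensions mentioned above:
    fault category, failed component, and responsible vendor/party."""
    by_category = Counter(r["category"] for r in records)
    by_component = Counter(r["component"] for r in records)
    by_vendor = Counter(r["vendor"] for r in records)
    return by_category, by_component, by_vendor

# Hypothetical fault records; field names are made up for illustration.
records = [
    {"category": "hardware",    "component": "optical_amplifier", "vendor": "vendor_a"},
    {"category": "third_party", "component": "trunk_fiber",       "vendor": "fiber_provider_x"},
    {"category": "third_party", "component": "trunk_fiber",       "vendor": "fiber_provider_x"},
    {"category": "software",    "component": "nms",               "vendor": "vendor_a"},
    {"category": "human",       "component": "jumper_misplug",    "vendor": "internal"},
]

by_cat, by_comp, by_vendor = fault_stats(records)
# The dominant fault category drives the targeted remediation plan.
print(by_cat.most_common(1))  # [('third_party', 2)]
```

Ranking the counters per dimension is what lets the team see, for example, that trunk-fiber cuts from one supplier dominate, and plan accordingly.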

Statistics are collected on the change data to distinguish the complexity and impact of the change, and personnel are assigned. Changes are made according to the process of demand analysis, change plan, window setting, user notification, operation execution, and summary review. Finally, different changes can be divided into windows, and even arranged to be executed during the day, so that the allocation of change personnel is more reasonable, reducing work and life pressure and improving the happiness of operation engineers. The final statistical data can be integrated and used as a reference for personnel work efficiency and work ability, while also allowing normal changes to develop in the direction of standardization and automation, reducing various expenses.

Statistics on OTN service distribution help you track network usage and keep control of the network and service distribution across the whole network as traffic grows. At a coarse level, you can know which service uses each channel, such as the external network, internal network, HPC network, or cloud-service network. At a finer level, you can combine a full-traffic system to analyze actual per-service traffic, bill bandwidth costs to the right business departments, help them optimize their traffic, reclaim and adjust under-used channels at any time, and expand heavily used ones.

Stability statistics are the main reference data for the SLA and the sword of Damocles hanging over every operations engineer. Stability statistics for OTN are nuanced because OTN has protection. For example: if a single route is interrupted but total IP-level bandwidth is unaffected, should it count against the SLA? If IP bandwidth is halved but services are unaffected, should it count? If a single channel fails, should it count? If the protection path's latency increases, with no impact on bandwidth but some impact on services, should it count? The usual practice is to inform the business side before construction of risks such as jitter and latency change. For the SLA itself, the affected bandwidth (number of faulty channels × per-channel bandwidth) is divided by the total bandwidth (total number of channels × per-channel bandwidth) and multiplied by the impact time; the resulting value is the SLA calculation basis.
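
The SLA rule just described, affected-bandwidth fraction times impact duration, can be sketched as follows (the channel counts and period are example figures, not from any real network):

```python
def sla_impact_hours(faulty_channels, channel_bw_gbps, total_bw_gbps, impact_hours):
    """Bandwidth-weighted downtime: the fraction of total bandwidth
    affected, multiplied by how long the impact lasted."""
    affected_fraction = (faulty_channels * channel_bw_gbps) / total_bw_gbps
    return affected_fraction * impact_hours

def availability(downtime_hours, period_hours=24 * 365):
    """Availability over a reporting period (default: one year)."""
    return 1.0 - downtime_hours / period_hours

# Example: 4 of 40 10G channels down for 2 hours on a 400G system.
downtime = sla_impact_hours(4, 10, 40 * 10, 2.0)   # 0.1 of bandwidth for 2h -> 0.2 weighted hours
print(f"availability over a year: {availability(downtime):.6f}")
```

Weighting by bandwidth rather than counting any partial outage as full downtime is what makes the protected-path questions above tractable in practice.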

3.2 Asset Management

OTN equipment assets also need lifecycle management (arrival, bring-up and decommissioning, scrapping, and repair), but unlike servers, network switches, and similar devices, OTN equipment has a more complex structure and involves a large number of functional boards, so a model must be designed to enable full asset management. The main IP asset-management platforms in the data center are built around servers and switches and define master/slave device levels. On this basis, OTN also needs hierarchical master/slave management, but with more levels, mainly: network element->subrack->board->module:

  1. The network element is a virtual device, not a physical object. It is the first logical point in the OTN network, used for management, and is the first-level unit in OTN network management. A physical equipment room may hold one or several network elements. A network element contains multiple subracks, such as optical-layer subracks and electrical-layer subracks; an external multiplexer/demultiplexer also counts as a subrack. The subracks within a single network-element site are chained together and numbered. A network element has no asset SN of its own, so it must be aligned with the management platform, and in particular with the purchase list and the later operations platform, to avoid asset-audit mismatches; after all, the network element is a virtual asset.
  2. The largest physical unit of OTN equipment is the subrack, the next level down from the first-level network element, i.e. the second-level unit. A network element has at least one subrack. Subracks come in different models from different manufacturers with different functions, including electrical subracks, optical subracks, and general-purpose subracks. A subrack has a real SN, but that SN cannot be obtained automatically through the NMS and can only be checked on site; once online, a subrack is rarely moved or changed. A subrack houses a variety of boards.
  3. Inside the second-level subrack are specific service slots, numbered digitally, into which the various optical-network service boards are inserted; these boards are the basis for carrying OTN services. Each board's SN can be queried through the NMS, and boards are the third-level unit in OTN asset management. Service boards differ in size, in the number of slots they occupy, and in function, so when a board is assigned to its parent subrack, the asset platform must allow a single board to map to multiple slots, or half a slot, against the subrack's slot numbers.
  4. Optical-module asset management: modules sit on service boards. Every service board should allow optical modules to be attributed to it, but not every OTN board carries modules, so moduleless boards must also be allowed. Each optical module has an SN, and a module inserted in a board should be recorded against the board's port number for easy location.
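
A minimal sketch of the four-level model above (network element -> subrack -> board -> module). The field names are illustrative; real NMS northbound schemas differ per vendor:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class OpticalModule:
    sn: str
    port: int            # recorded against the board port for easy location

@dataclass
class Board:
    sn: str              # queryable through the NMS
    slots: List[int]     # a board may occupy several slots (half-slots need extra modeling)
    modules: List[OpticalModule] = field(default_factory=list)  # may be empty

@dataclass
class Subrack:
    sn: str              # readable only on site, not via the NMS
    number: int          # subracks are chained and numbered within a site
    boards: List[Board] = field(default_factory=list)

@dataclass
class NetworkElement:
    name: str            # virtual: no SN of its own, so align with purchase records
    subracks: List[Subrack] = field(default_factory=list)

ne = NetworkElement("NE-BJ-01", subracks=[
    Subrack("SR123", 1, boards=[
        Board("BD456", slots=[3, 4], modules=[OpticalModule("OM789", port=1)]),
    ]),
])
print(sum(len(b.modules) for sr in ne.subracks for b in sr.boards))  # 1
```

Walking the tree like this is how online collection (via the northbound interface) can be reconciled against offline spot checks.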

All this information can be collected through the northbound interface of the network management platform, and the accuracy of asset information can be managed through online collection and offline verification. In addition, OTN equipment also involves optical attenuators, short jump fibers, etc. These consumable devices can be directly managed as consumables.

3.3 Configuration Management

Configuring a channel requires configuring the service, the optical-layer logical links, and the link's virtual topology; if a channel also has a protection path, both the configuration and its management grow more complicated. A dedicated service table is needed to manage channel routing, with service directions distinguished in the table (for example, by solid and dotted lines). When OTN channels are mapped to IP links, especially with OTN protection, one IP link corresponds to multiple OTN channels, so the management volume and complexity increase and Excel sheets multiply: fully describing one service can involve up to 15 elements. When an engineer wants to manage a given link, he must find the Excel sheet, then find the corresponding object in the vendor's NMS, and only then operate, which requires keeping both sides in sync. Since the OTN NMS data and the engineer's Excel are both manually maintained, inconsistency is easy; any error leaves the recorded service information out of step with reality and can affect services during changes and adjustments. The remedy is to collect the vendor equipment data onto one management platform through the northbound interface and match the IP link information there, so the information adjusts automatically with changes in the live network, guaranteeing centralized management, a single accurate source, and accurate configuration data.

When configuring OTN services, prepare a description of each interface, then collect OTN information through the northbound interface provided by the OTN NMS. Pair the relevant description with the port information collected by the IP device through the northbound interface, and the OTN channel and IP link can be managed on a platform, eliminating the need for manual information updates.
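
The pairing step above can be sketched as a join on the interface description string. The convention of using a shared description as the join key, and the field names, are assumptions for illustration:

```python
def match_channels_to_links(otn_channels, ip_ports):
    """Index IP ports by description, then attach each OTN channel to the
    IP link that carries it; unmatched channels reveal stale descriptions."""
    ports_by_desc = {p["description"]: p for p in ip_ports}
    matched, orphans = [], []
    for ch in otn_channels:
        port = ports_by_desc.get(ch["description"])
        if port is not None:
            matched.append((ch, port))
        else:
            orphans.append(ch)
    return matched, orphans

# Hypothetical data as collected over the two northbound interfaces.
otn_channels = [{"channel_id": "CH-17", "description": "DCI-BJ-SH-link01"}]
ip_ports = [{"device": "sw1", "port": "Eth1/1", "description": "DCI-BJ-SH-link01"}]
matched, orphans = match_channels_to_links(otn_channels, ip_ports)
print(len(matched), len(orphans))  # 1 0
```

Running this continuously against freshly collected data, rather than a hand-edited Excel sheet, is what keeps the OTN-channel-to-IP-link mapping at a single accurate source.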

When using the DCI transmission network, try to avoid using electrical cross-connection service configuration. This method is extremely complex to manage and is not suitable for the DCI network model. It can be avoided from the beginning of DCI design.

3.4 Alarm Management

Because of its complex management overhead, signal monitoring over long distances, and the multiplexing and nesting of different service granularities, a single OTN fault may raise dozens or even hundreds of alarm messages. Although vendors classify alarms into four severity levels, each alarm has a different name; from the engineer's point of view it is still extremely complicated, and experienced staff are needed to pin down the root cause quickly. Traditional OTN fault notification relies mainly on SMS modems or email push, but both fit poorly with the alarm-management platforms already integrated into Internet companies' base systems, and separate development is costly. A more standard northbound interface is therefore needed to collect alarm information, extend functionality while keeping the company's existing platforms, and then push alarms to the operations engineers.

So what operations staff need is for the platform to automatically converge the alarms generated by an OTN fault and then deliver the digest. First set alarm classification on the OTN NMS, then send and filter on the final alarm-management platform. Common practice is for the NMS to push all category-1 and category-2 alarms to the alarm platform, which then pushes three kinds of information to the on-call engineer, organized by time, summary, and recipient scope: single-service interruption alarms, main-optical-path interruption alarms, and (if any) protection-switching alarms. With these three pieces of information a fault can be judged and handled. When configuring delivery, telephone notification can be reserved for major alarms, such as the combined-wave signal loss that only a cut trunk fiber produces.
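
As a sketch, the filtering policy above might look like the routine below. The alarm-type names are invented for illustration; real OTN alarm identifiers vary by vendor:

```python
# Alarms the platform forwards to engineers (already filtered to severity 1/2 by the NMS).
FORWARD_TYPES = {"service_interrupted", "main_optical_path_down", "protection_switch"}
# Alarms serious enough to trigger a phone call, e.g. combined-wave loss from a cut trunk fiber.
CALL_TYPES = {"main_optical_path_down"}

def route_alarm(alarm):
    """Decide how a converged alarm reaches the on-call engineer."""
    if alarm["type"] in CALL_TYPES:
        return "phone_call"
    if alarm["type"] in FORWARD_TYPES:
        return "push_message"
    return "dashboard_only"   # everything else stays on the platform for later review

print(route_alarm({"type": "main_optical_path_down"}))  # phone_call
print(route_alarm({"type": "protection_switch"}))       # push_message
```

Keeping the escalation policy in one small table like this makes it easy to audit which alarms can wake someone up at night.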

NMS northbound interfaces are also in common use for pushing alarm information, for example the XML interfaces currently supported by Huawei, ZTE, and Alcatel-Lucent.
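
Consuming such an XML alarm push might look like the sketch below. The XML layout is a made-up illustration; each vendor's northbound interface defines its own schema:

```python
import xml.etree.ElementTree as ET

# Hypothetical alarm notification as pushed by an NMS northbound interface.
sample = """
<alarm>
  <ne>NE-BJ-01</ne>
  <severity>critical</severity>
  <type>MUT_LOS</type>
  <time>2017-06-01T12:00:00</time>
</alarm>
"""

def parse_alarm(xml_text):
    """Flatten one alarm element into a dict for the alarm platform."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}

alarm = parse_alarm(sample)
print(alarm["severity"])  # critical
```

Once flattened into dicts, alarms from different vendors can be normalized onto one platform before the filtering and escalation step.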

3.5 Performance Management

The stability of the OTN system is highly dependent on the performance data of various aspects of the system, such as the optical power management of the trunk fiber, the optical power management of each channel in the combined signal, and the system OSNR margin management. These contents should be included in the monitoring project of the company's network system so that the system performance can be understood at any time and the performance can be optimized in time to ensure the stability of the network. In addition, long-term fiber performance quality monitoring can also be used to detect changes in fiber routes to prevent some fiber suppliers from changing fiber routes without notification, resulting in blind spots in operation and maintenance and the risk of fiber routing. Of course, this requires a large amount of data for model training in order to detect route changes more accurately.
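
The monitoring items above (trunk optical power, per-channel power, OSNR margin) reduce to threshold checks on collected performance samples. The threshold values below are placeholders; real windows come from the link design (fiber type, amplifier gain, modulation format):

```python
# (min, max) acceptable windows per metric; values are illustrative only.
THRESHOLDS = {
    "trunk_rx_power_dbm": (-24.0, -5.0),
    "channel_power_dbm":  (-18.0, 0.0),
    "osnr_margin_db":     (3.0, float("inf")),  # alarm when margin erodes below 3 dB
}

def check_performance(sample):
    """Return the list of metrics outside their acceptable window."""
    violations = []
    for metric, (lo, hi) in THRESHOLDS.items():
        value = sample.get(metric)
        if value is not None and not (lo <= value <= hi):
            violations.append((metric, value))
    return violations

print(check_performance({"trunk_rx_power_dbm": -26.5, "osnr_margin_db": 5.2}))
```

Trending these samples over time, rather than only alerting on instantaneous breaches, is also what enables the fiber-route-change detection mentioned above.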

3.6 DCN management

The DCN here refers to the management communication network of OTN equipment, which is responsible for the networking structure of managing each OTN network element. The networking of OTN will also affect the scale and complexity of the DCN network. There are two general DCN network methods:

  1. The active and standby gateway network elements are designated within the OTN network, and all other network elements are ordinary ones. The management signals of all ordinary network elements cross the OTS layer through the OSC channel inside the OTN to reach the active and standby gateway network elements, and from there enter the IP network where the NMS sits. This method reduces the number of network elements deployed on the NMS's IP network and lets the OTN system itself solve the management-connectivity problem, but if a trunk fiber is cut, the remote network elements behind it also lose management reachability.
  2. All network elements in the OTN network are configured as gateway network elements. Each gateway network element communicates with the IP network where the NMS is located independently, without using the OSC channel. This ensures that the management communication of the network elements is not affected by the interruption of the trunk optical fiber. The network elements can still be managed remotely and are all on the IP network, which will also reduce the operation and maintenance costs of traditional IP network engineers.

At the start of DCN construction, plan network elements and allocate IP addresses carefully, and in particular isolate the NMS server from other networks as far as possible. Otherwise the network will later accumulate too many meshed links, jitter will be common during maintenance, ordinary network elements will fail to reach gateway network elements, and production-network and DCN addresses will easily end up reused, affecting the production network.
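
The address-planning check implied above can be automated: before allocating network-element addresses, verify that the DCN management prefix does not overlap any production prefix. The prefixes below are examples only:

```python
import ipaddress

def overlapping(dcn_prefix, production_prefixes):
    """Return the production prefixes that overlap the proposed DCN prefix."""
    dcn = ipaddress.ip_network(dcn_prefix)
    return [p for p in production_prefixes
            if dcn.overlaps(ipaddress.ip_network(p))]

prod = ["10.0.0.0/8", "172.16.0.0/12"]
print(overlapping("192.168.100.0/24", prod))  # [] -> safe to use for network elements
print(overlapping("10.200.0.0/16", prod))     # overlaps production 10.0.0.0/8
```

Running such a check at planning time is far cheaper than untangling a reused production address after the NEs are online.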

4. DCI Network Development Direction

When building inter-data center network interconnection, data center owners mainly consider large bandwidth, low latency, high density, fast deployment, easy operation and maintenance, and high reliability. The current mainstream large-bandwidth OTN technology is mainly controlled by several large telecommunications equipment manufacturers (chips are another matter), such as Huawei, ZTE, and Alcatel-Lucent. Their main customers are traditional telecommunications operators, so the OTN product features are mainly designed based on the business characteristics of these operators. Because of this, there are more and more discordant problems in the DCI network applications of OTN in the Internet industry.

The characteristics of OTN equipment are also precisely the problems DCI runs into: rich service overhead, strong OAM capability, scheduling and multiplexing of different bandwidth granularities, line fault tolerance over long distances, low-voltage DC power, and large, low-density chassis.

  1. Rich service-overhead capabilities demand more specialized operations skills, deepen reliance on vendor technical support, and make the technology more closed.
  2. Powerful OAM capabilities with inconsistent standards make cross-vendor interconnection harder and more siloed, and unused functions add transmission operating cost to the DCI network.
  3. The scheduling capabilities of different granularities make the service encapsulation frame structure more complex and contain more nested bytes.
  4. The fault tolerance of long-distance lines makes the FEC algorithm complex, consumes more overhead and takes longer to process.
  5. The 48V-DC power supply of OTN equipment differs from the standard 19-inch 220V-AC (or 240V-DC) cabinets used in most data centers; installation is complicated and requires power modifications in the equipment room.
  6. Traditional OTN equipment frames are large and not suitable for installation in standard cabinets. They also have low capacity density, making subsequent expansion difficult and requiring cabinet relocation or renovation.

At present, our DCI network mainly provides a pipeline for data between data centers. The main business characteristics are: unified, single bandwidth-granularity requirements; large bandwidth; low latency for cross-data-center services (especially multi-active IDC and big-data services); and high demands on network stability. At the same time, because the Internet industry lacks the relevant specialists, DCI operations need to be "simple", "simple", and "simple" (important things said three times; though which network doesn't want that?). The Internet's explosive development has compressed construction and expansion cycles (an operator's OTN expansion typically takes half a year to a year, while the Internet's own DCI expansions must land in 1 to 3 months), so time must be squeezed at every step.

Therefore, OTN provides a usable solution for DCI, but OTN is by no means the most suitable solution for DCI. As DCI networks are booming, more and more suitable solutions are needed to solve various problems from cost to construction and operation and maintenance. These problems are nothing more than the six requirements of DCI networks (large bandwidth, low latency, high density, fast deployment, easy operation and maintenance, and high reliability):

  1. Large bandwidth: unlike operator networks, which must carry many granularity types, the bandwidth granularity of a DCI transmission network is simpler: 10G or 100G today, 200G/400G in the future. With large bandwidth available there is no need to offer other granularities. Since DCI transmission distances are generally not very long, a 400G system based on dual-carrier 200G PM-16QAM can reach roughly 500 kilometers without electrical regeneration (roughly 200 kilometers for PM-64QAM), so DCI metro-backbone transmission is not limited by distance.
  2. Low latency is a business requirement of DCI. Especially with pooled cloud resources and multi-active data centers, latency is counted in microseconds, and the shorter the transmission time the better. We should approach the physical speed-of-light limit by eliminating unnecessary data processing and shortening the signal path. For example, removing the SD-FEC used in 100G OTN can save about 200 microseconds on a single back-to-back connection, and removing cross-stage OTN encapsulation can save tens of microseconds. For key services, a hub-spoke topology can be applied sensibly to guarantee shortest paths; at the IP level, MPLS, QoS, and similar tools can also help keep forwarding latency low.
  3. High density: a single 1U or 2U box can deliver more than a terabit of bandwidth, decoupling the DWDM optical layer from the electrical signal layer, raising equipment port density, and shrinking optical modules. For example, QSFP28 modules greatly increase a single device's 100G access capacity, and CFP2 line-side optical modules raise overall transmission bandwidth; 1U can reach 1.6T or 3.2T. Many related products have appeared worldwide, from companies such as ADVA, Coriant, and Ciena; in China, Huawei has also launched related 902 products, though as of this writing they do not appear to have completed the MIIT network-access test. High density brings high power draw and heat-dissipation problems, so the original OTN side-to-side or bottom-to-top airflow designs must be abandoned; high-density equipment needs the same front-to-back airflow as data-center servers and switches.
  4. Fast deployment. Using standard 19-inch IDC racks and a form factor similar to mainstream servers, with direct AC 220V power supply, eliminates power and cabinet modifications: equipment can be racked as soon as it arrives at the computer room and start carrying services once powered on and configured. Standardized acceptance procedures further speed up deployment.
  5. Easy operation and maintenance. DCI's business model keeps the distance between data centers short, so complex management overhead, OAM and similar functions are unnecessary in this scenario; they reduce transmission efficiency, add processing time, raise the skill bar, and make the system more closed. Carrying Ethernet signals directly eliminates the complex overhead of OTN, so traditional IP network engineers can operate and maintain the DCI system. Combined with new northbound interfaces such as the YANG model, REST APIs and NETCONF, the same interfaces can be used to manage both DCI transmission equipment and IP network equipment, enabling unified, platform-based, centralized network management.
  6. High reliability. Multiple physical routes and protection technologies transparent to the upper layers will continue to play a role in the DCI transmission network. Short of a complete outage, failures at the underlying link level should be imperceptible to services, whether caused by protection switching, link jitter, or increased latency.
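The latency budget in point 2 can be sketched with a little arithmetic. The ~200 microsecond SD-FEC figure and the "tens of microseconds" for OTN encapsulation come from the text above; the per-kilometer fiber delay follows from light travelling at roughly 200,000 km/s in fiber, and the exact OTN value used here is an assumption for illustration.

```python
# Rough one-way latency budget for a DCI link (illustrative sketch).
FIBER_US_PER_KM = 5.0  # light in fiber: ~200,000 km/s -> ~5 us per km

def one_way_latency_us(fiber_km, sd_fec=True, otn_mapping=True):
    """Estimate one-way latency in microseconds for a DCI span."""
    total = fiber_km * FIBER_US_PER_KM   # propagation delay
    if sd_fec:
        total += 200.0                   # SD-FEC processing, back-to-back pair
    if otn_mapping:
        total += 30.0                    # cross-stage OTN encapsulation (assumed value)
    return total

# An 80 km metro DCI span: dropping SD-FEC and OTN mapping saves ~230 us.
print(one_way_latency_us(80))                # 630.0
print(one_way_latency_us(80, False, False))  # 400.0
```

For short metro spans the processing overhead dominates the fiber delay, which is why stripping SD-FEC and OTN stages matters so much in this scenario.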

Based on these characteristics, there are currently two conventional DCI solutions:

  1. Use pure DWDM equipment, with colored optical modules plus DWDM multiplexers/demultiplexers on the switch. With single-channel 10G the cost is extremely low and the product options are rich: 10G colored optical modules have been produced domestically for a long time and are already very cheap. (In fact, 10G DWDM systems were popular a few years ago, but growing bandwidth demands forced them out before 100G colored optical modules existed.) 100G colored optical modules have only just appeared in China and are not yet cheap enough, but they will certainly leave their mark on DCI networks.
  2. Use high-density transmission OTN equipment: 220V AC, 19-inch, 1~2U high, and convenient to deploy. Disabling the SD-FEC function reduces latency, optical-layer route protection improves stability, and a controllable northbound interface makes the equipment easier to extend. However, OTN technology is still retained, so management remains relatively complex.

In addition, the first-tier DCI network builders are mainly working on decoupling the DCI transmission network, including the decoupling of the optical layer at layer 0 and the electrical layer at layer 1, as well as the decoupling of the NMS and hardware devices of traditional manufacturers. The traditional approach is that the electrical processing layer equipment of a certain manufacturer must be coordinated with the optical layer equipment of the same manufacturer, and the hardware equipment must be managed with the manufacturer's proprietary NMS software. This traditional approach has several major drawbacks:

  1. The technology is closed. In theory, the optical and electrical layers can be decoupled from each other, but traditional manufacturers deliberately avoid doing so in order to keep control of the technology.
  2. The cost of a DCI transmission network is concentrated in the electrical signal processing layer. The initial construction cost of the system is low, but when capacity is expanded, manufacturers leverage the uniqueness of their technology to raise prices, which greatly increases the cost of expansion.
  3. Once the optical layer of a DCI transmission network is in service, it is locked to electrical-layer equipment from the same manufacturer. Equipment utilization is therefore low, which runs against the trend of network resource pooling and is even more detrimental to unified optical-layer resource scheduling. A decoupled optical layer is invested in separately at the initial construction stage and is not constrained later when multiple manufacturers share the same optical system. In addition, combining the optical layer's northbound interface with SDN technology allows directed scheduling of optical-layer channel resources, improving business flexibility.
  4. Network devices can connect seamlessly to an Internet company's own network management platform through data structures such as the YANG model, saving investment in management platform development and eliminating the manufacturer's NMS software, thereby improving data collection and network management efficiency.
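Point 4 above can be made concrete with a small sketch of what "seamless connection via YANG-modeled data" looks like in practice: normalizing device data arriving over a northbound interface into one inventory schema shared by transport and IP devices. The field names here are hypothetical and do not come from any real vendor model.

```python
# Sketch: normalize YANG-modeled northbound data (hypothetical field names)
# into a common inventory record usable by one management platform.
def normalize_device(raw: dict) -> dict:
    """Map a raw device document into the platform's shared schema."""
    return {
        "name": raw.get("hostname", "unknown"),
        "layer": raw.get("device-layer", "ip"),   # "optical" or "ip"
        "ports": [
            {"id": p["port-id"], "speed_gbps": p.get("speed-gbps", 0)}
            for p in raw.get("interfaces", [])
        ],
    }

# A DWDM box and an IP switch land in the same schema.
dwdm = normalize_device({
    "hostname": "dci-otn-1",
    "device-layer": "optical",
    "interfaces": [{"port-id": "1/1/1", "speed-gbps": 400}],
})
print(dwdm["layer"], dwdm["ports"][0]["speed_gbps"])  # optical 400
```

The point of the normalization step is that once transmission and IP devices share one schema, a single platform can collect and manage both without vendor NMS software in the loop.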

Therefore, optoelectronic decoupling is a new direction for the development of DCI transmission networks. In the foreseeable future, the optical layer of a DCI transmission network can consist of ROADMs plus SDN driven through north- and southbound interfaces, able to open, schedule and reclaim channels at will. It will be possible to mix electrical-layer devices from multiple manufacturers in one system, and even to mix Ethernet interfaces and OTN interfaces on the same optical system. By then, the efficiency of system expansion and modification will be greatly improved, the optical and electrical layers will be easier to separate, network management logic will be clearer, and costs will drop sharply.

For SDN, the core premise is the centralized management and allocation of network resources. So, what are the DWDM transmission network resources that can be managed on the current DCI transmission network?

Three things: channels, paths, and bandwidth (spectrum). The "optical" side of optical + IP coordination is in fact managed and allocated around these three resources.

Because IP and DWDM channels are decoupled, if the mapping between an IP logical link and a DWDM channel is configured up front and needs to change later, an OXC can perform millisecond-level channel switching that the IP layer never perceives. Managing the OXCs gives centralized control of the transmission channel resources at each site, which in turn supports service-level SDN.
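The IP-link-to-channel mapping an SDN controller would keep can be sketched as a small table with an OXC-style switchover operation, as below. The class and method names are illustrative; on real hardware the switch itself happens in the optical cross-connect in milliseconds, while the IP layer's view of the link never changes.

```python
# Minimal sketch of an SDN controller's IP-link -> DWDM-channel mapping.
class ChannelMap:
    def __init__(self):
        self._map = {}  # ip_link name -> DWDM channel number

    def assign(self, ip_link: str, channel: int) -> None:
        """Record the initial channel carrying an IP logical link."""
        self._map[ip_link] = channel

    def switch(self, ip_link: str, new_channel: int) -> int:
        """Move an IP link onto a new channel (done by the OXC in
        milliseconds on real hardware); return the freed channel."""
        old = self._map[ip_link]
        self._map[ip_link] = new_channel
        return old

    def channel_of(self, ip_link: str) -> int:
        return self._map[ip_link]

cm = ChannelMap()
cm.assign("A-B", 1)              # link A-B initially rides channel 1
freed = cm.switch("A-B", 6)      # reroute to channel 6; channel 1 is freed
print(freed, cm.channel_of("A-B"))  # 1 6
```

Centralizing this table per site is what makes the later bandwidth and path adjustments possible without touching the IP layer.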

Decoupled adjustment of a single channel from IP is only a small part of the picture. If bandwidth is adjusted together with the channel, the varying bandwidth demands of different services at different times can be met, greatly improving utilization of the deployed capacity. So, while the OXC adjusts channels, a multiplexer/demultiplexer combined with flexible grid technology lets a single channel cover a scalable frequency range instead of a fixed central wavelength, allowing its bandwidth to be resized flexibly. When many services share one network topology, this further improves the spectral efficiency of the DWDM system and squeezes full use out of existing resources.
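Flexible grid sizing follows ITU-T G.694.1: slot widths are multiples of 12.5 GHz, with central frequencies on a 6.25 GHz grid around 193.1 THz. A quick sketch of the slot arithmetic:

```python
# Flexible-grid slot sizing per ITU-T G.694.1: a channel occupies
# m x 12.5 GHz of spectrum for the smallest m that covers it.
import math

SLOT_GHZ = 12.5

def slots_needed(channel_ghz: float) -> int:
    """Smallest slot count m such that m * 12.5 GHz covers the channel."""
    return math.ceil(channel_ghz / SLOT_GHZ)

print(slots_needed(50))  # 4 slots -> a fixed-grid-style 100G channel
print(slots_needed(75))  # 6 slots -> e.g. a 400G channel at 75 GHz
```

This is why a flex-grid system can pack a 75 GHz channel next to a 50 GHz one without wasting a full fixed-grid slot between them.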

With the dynamic management capabilities above, path management helps the whole network topology achieve higher stability. In a transmission network each path has its own independent channel resources, so managing and allocating the channels on every path centrally is of great significance: it provides optimal path selection for multi-path services and maximizes channel utilization across all paths, much as ASON divides services into gold, silver and bronze classes to guarantee the stability of the highest class.
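A toy version of that path-selection policy might look like the sketch below: pick the shortest path that still has free channels, and reserve a second, distinct path for "gold" services. The data, class names and policy are illustrative, not ASON's actual algorithm.

```python
# Sketch: class-aware path selection over paths with per-path channel pools.
def pick_paths(paths, service_class):
    """paths: list of (name, length_km, free_channels).
    Returns (working_path, protection_path_or_None)."""
    usable = sorted((p for p in paths if p[2] > 0), key=lambda p: p[1])
    if not usable:
        raise RuntimeError("no channel resources left on any path")
    work = usable[0]
    protect = None
    if service_class == "gold" and len(usable) > 1:
        protect = usable[1]  # next-shortest path with free channels
    return work[0], (protect[0] if protect else None)

# Hypothetical ring: A-D-B has no free channels left.
paths = [("A-B direct", 40, 2), ("A-C-B", 95, 5), ("A-D-B", 120, 0)]
print(pick_paths(paths, "gold"))    # ('A-B direct', 'A-C-B')
print(pick_paths(paths, "bronze"))  # ('A-B direct', None)
```

The design choice is that protection capacity is only consumed for the class that pays for it, which is the essence of the gold/silver/bronze split.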

For example, consider a ring network of three data centers A, B, and C. Service S1 (say, an intranet big-data service) runs from A through B to C and occupies waves 1~5 of the ring, each wave carrying 100G at 50GHz channel spacing; service S2 (an extranet service) runs from A through B to C and occupies waves 6~9, each wave carrying 100G at 50GHz spacing.

Normally, this bandwidth and channel allocation meets demand. But sometimes, say when a new data center comes online and a database must be migrated in a short time, intranet bandwidth demand multiplies: the original 500G (5 x 100G) now requires 2T. The transmission-level channels can then be recalculated and five 400G channels deployed, with each channel's spacing widened from the original 50GHz to 75GHz. With a flex-grid ROADM and multiplexer/demultiplexer the path is opened up end to end, and these five channels occupy 375GHz of spectrum. Once the transmission resources are ready, the centralized management platform adjusts the OXC, and within milliseconds the transmission channels carrying the original five 100G service signals are switched over to the newly prepared five 400G channels. Bandwidth and channels can thus be adjusted to DCI business needs in real time. Of course, the IP device's network interface must support adjustable 100G/400G rates and a tunable optical frequency (wavelength), which will not be a problem.
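The arithmetic of that migration checks out and can be verified in a few lines:

```python
# Check the migration example: 5 x 100G channels at 50 GHz spacing
# are replaced by 5 x 400G channels at 75 GHz spacing.
old_plan = {"channels": 5, "gbps_each": 100, "spacing_ghz": 50}
new_plan = {"channels": 5, "gbps_each": 400, "spacing_ghz": 75}

def totals(plan):
    """Return (total bandwidth in Gbps, total spectrum in GHz)."""
    return (plan["channels"] * plan["gbps_each"],
            plan["channels"] * plan["spacing_ghz"])

print(totals(old_plan))  # (500, 250)  -> 500G over 250 GHz of spectrum
print(totals(new_plan))  # (2000, 375) -> 2T over 375 GHz of spectrum
```

Note that quadrupling the bandwidth costs only 50% more spectrum, which is exactly the efficiency gain the flex-grid approach is after.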

Perhaps in the near future, OTN will also disappear from telecommunications-grade networks, leaving only DWDM.

Author profile: Li Yan, who has been responsible for the construction and operation of transmission networks in the Internet industry for many years, has a thorough understanding of DCI transmission networks, and is currently mainly responsible for the operation of the basic data center network (L1~L4).
