Five-minute technical talk | Semantic communication technology helps build a safe countryside

Part 01

Semantic Communication Technology

The rapid rollout of safe countryside services and the move to high-definition cameras have given users a greater sense of security, but they also bring challenges: massive numbers of connected terminals, steadily rising bit rates, and increasingly complex scenarios. Within the traditional framework, codec optimization trades computational complexity for compression ratio, and the bit-rate savings obtained this way are shrinking toward a bottleneck. At the same time, communication channel capacity is approaching its limit, so it is hard to keep up with the transmission, storage, and analysis demands of rapidly growing video data.

The human brain achieves extremely high compression of images and video. The visual cortex performs functions such as edge detection, shape recognition, and motion recognition, and the inferior temporal cortex recognizes complex objects and faces; in other words, the brain extracts structured semantic information. Traditional image and video communication uses pixels as the unit of representation, which does not match the structural characteristics of natural images, such as symmetry, repetition, and correlation, and therefore makes it hard to improve representation efficiency significantly.

Learning from the brain's visual perception and cognitive mechanisms and exploring AI-based semantic representation models for video can improve representation efficiency to a certain extent. Semantic communication draws on the mechanism behind the brain's compression performance, breaks through the existing theoretical framework, and integrates visual perception and cognition into the communication process, achieving efficient semantic representation and clear, smooth video at extremely low bit rates.

The research goal is to develop semantics-based multimedia communication technology that delivers high-quality, low-bandwidth, low-storage multimedia communication in network-constrained scenarios, and to promote the verification and application of the results in safe countryside deployments, with technical indicators and application scale reaching leading levels at home and abroad. Unlike traditional video compression, which operates on pixels, semantic communication compresses by extracting the semantic information of the image: the encoder produces an efficient and accurate semantic representation under limited resources, and the receiver reconstructs an accurate image from it.

- Semantic communication coding and decoding technology

Semantic communication coding and decoding technology establishes a shared prior knowledge base for the scene and task, and links semantic extraction of the target at the encoder with target generation at the decoder. The encoder uses the prior knowledge to detect the target in each video frame, extracts its semantics, and converts them into a binary sketch image for encoding and transmission. The decoder generates the target from the knowledge base and the sketch image, then fuses it with the background image to reconstruct the video. Joint video semantic coding also yields compact feature representations and efficient feature retrieval, which allows fast retrieval across massive video collections in business scenarios such as security.
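
The encode/decode flow above can be illustrated with a deliberately small toy. This is only a sketch under stated assumptions, not the system described in the talk: the names `KNOWLEDGE_BASE`, `encode`, and `decode` are hypothetical, target detection is replaced by simple background differencing, and the "knowledge base" is reduced to one gray level per class. It shows only the idea that a one-bit-per-pixel sketch is transmitted and the receiver rebuilds the frame from the sketch, the shared knowledge, and a background image.

```python
from typing import List

Frame = List[List[int]]  # grayscale frame, pixel values 0..255

# Hypothetical shared prior knowledge base, known to both ends.
# Assumption: a class is "rendered" as a single gray level.
KNOWLEDGE_BASE = {"person": 200}


def encode(frame: Frame, background: Frame, threshold: int = 30) -> bytes:
    """Build a binary sketch (1 = target pixel) and pack 8 pixels per byte.

    Background differencing stands in for the real target detector.
    """
    bits = [
        1 if abs(pixel - bg) > threshold else 0
        for row, bg_row in zip(frame, background)
        for pixel, bg in zip(row, bg_row)
    ]
    packed = bytearray()
    for i in range(0, len(bits), 8):
        chunk = bits[i:i + 8]
        chunk += [0] * (8 - len(chunk))  # pad the final byte with zeros
        byte = 0
        for bit in chunk:
            byte = (byte << 1) | bit
        packed.append(byte)
    return bytes(packed)


def decode(payload: bytes, background: Frame, label: str) -> Frame:
    """Regenerate the target from the sketch and fuse it with the background."""
    width = len(background[0])
    bits = [(byte >> (7 - k)) & 1 for byte in payload for k in range(8)]
    gray = KNOWLEDGE_BASE[label]
    return [
        [gray if bits[y * width + x] else bg for x, bg in enumerate(row)]
        for y, row in enumerate(background)
    ]
```

In this toy, a 4x4 frame costs 2 bytes on the wire instead of 16 bytes of raw pixels; the appearance detail is supplied by the receiver rather than transmitted.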

Feature retrieval over massive video collections has demanding performance requirements. To keep retrieval both fast and accurate, semantic communication jointly optimizes video encoding and compact feature representation to obtain a more compact feature descriptor. A tree index structure is then built using reinforcement learning to improve retrieval efficiency while preserving accuracy.
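
To make the idea of a tree index over compact descriptors concrete, here is a minimal sketch. It does not reproduce the reinforcement-learning construction mentioned above; it swaps in a classic BK-tree over Hamming distance as a stand-in, and it assumes each descriptor is a short binary code stored as an integer. The class names, the 8-bit codes, and the camera identifiers are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


class BKNode:
    def __init__(self, code: int, video_id: str) -> None:
        self.code = code
        self.video_id = video_id
        self.children: Dict[int, "BKNode"] = {}  # keyed by distance to this node


class BKTree:
    """BK-tree: children are grouped by their distance to the parent."""

    def __init__(self) -> None:
        self.root: Optional[BKNode] = None

    def add(self, code: int, video_id: str) -> None:
        if self.root is None:
            self.root = BKNode(code, video_id)
            return
        node = self.root
        while True:
            d = hamming(code, node.code)
            child = node.children.get(d)
            if child is None:
                node.children[d] = BKNode(code, video_id)
                return
            node = child

    def search(self, query: int, radius: int) -> List[Tuple[int, str]]:
        """Return (distance, video_id) pairs within `radius`, nearest first."""
        results: List[Tuple[int, str]] = []
        stack = [self.root] if self.root is not None else []
        while stack:
            node = stack.pop()
            d = hamming(query, node.code)
            if d <= radius:
                results.append((d, node.video_id))
            # Triangle inequality: only subtrees whose edge distance lies in
            # [d - radius, d + radius] can contain a match.
            for edge, child in node.children.items():
                if d - radius <= edge <= d + radius:
                    stack.append(child)
        return sorted(results)
```

The pruning step is what makes a tree index cheaper than a linear scan: whole subtrees are skipped without computing their distances. A learned index would choose the tree layout differently, but the query pattern is similar.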

- Key technologies for QoE measurement in video semantic communication

Current QoE research optimizes the experience of multimedia content by studying how objective video factors such as resolution, freeze time, frame rate, and bit rate affect the user's subjective experience. These studies focus on the objective characteristics of the video, however, and cannot effectively reflect the impact of semantic information on user experience. This work proposes a QoE evaluation method based on semantic factors and establishes an evaluation-feedback mechanism for semantic communication.

For QoE evaluation in general scenarios of semantic communication systems, the influencing factors are the average key point distance, the key point missing rate, and the average Euclidean distance. These semantic factors are combined with traditional QoS factors such as startup time, buffering ratio, and average media bit rate, as well as objective factors such as video resolution, frame rate, and bit rate.
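
As an illustration of two of the semantic factors named above, the following sketch computes an average key point distance and a key point missing rate. The exact definitions used in the talk are not given, so these are assumptions: key points are matched by name, distance is the Euclidean distance in pixels between a reference key point and its reconstructed counterpart, and a key point counts as missing when it is absent or `None` in the reconstruction. The function name `keypoint_factors` is hypothetical.

```python
import math
from typing import Dict, Optional, Tuple

Point = Tuple[float, float]


def keypoint_factors(
    reference: Dict[str, Point],
    reconstructed: Dict[str, Optional[Point]],
) -> Tuple[float, float]:
    """Return (average key point distance, key point missing rate)."""
    if not reference:
        raise ValueError("reference must contain at least one key point")
    distances = []
    missing = 0
    for name, ref_point in reference.items():
        rec_point = reconstructed.get(name)
        if rec_point is None:  # absent or explicitly undetected
            missing += 1
            continue
        distances.append(math.dist(ref_point, rec_point))
    # Assumption: if every key point is missing, the distance is unbounded.
    average = sum(distances) / len(distances) if distances else float("inf")
    return average, missing / len(reference)
```

A QoE model could take these two values as inputs alongside the QoS and video factors listed above; how they are weighted is not specified in the source.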

Once the QoE of a semantic communication video has been calculated, the result is fed back to adjust and optimize the whole system. The semantic QoE indicators and the feedback adjustment mechanism are designed around the characteristics and processing flow of semantic communication. Semantic factors are added to subjective QoE prediction so that the model's predictions stay close to real user ratings, while the objective QoE indicators are designed at three levels: pixels, parts, and temporal behavior. Feedback adjustment is driven by the QoE results calculated in the cloud and on the client. Key point offsets, frame rate drops, contour distortion, or temporal instability indicate that reconstruction quality is low. In that case the system enables contour constraints, adjusts the transmission bit rate, increases the number of key points, and adjusts the encoding and decoding model until user needs are met.
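
The feedback step can be sketched as a simple rule table. The source lists four symptoms and four adjustments but does not say which adjustment answers which symptom, so the pairing below is an assumption, as are the field names, the thresholds, and the function name `plan_adjustments`.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class QoEReport:
    keypoint_offset: float   # average key point offset, in pixels
    frame_rate: float        # delivered frames per second
    contour_error: float     # contour distortion score, 0..1
    temporal_jitter: float   # temporal instability score, 0..1


def plan_adjustments(
    report: QoEReport,
    target_fps: float = 25.0,        # hypothetical thresholds
    max_offset: float = 4.0,
    max_contour_error: float = 0.2,
    max_jitter: float = 0.1,
) -> List[str]:
    """Map low-quality symptoms to adjustments (assumed one-to-one pairing)."""
    actions: List[str] = []
    if report.contour_error > max_contour_error:
        actions.append("enable_contour_constraints")
    if report.frame_rate < target_fps:
        actions.append("adjust_transmission_bit_rate")
    if report.keypoint_offset > max_offset:
        actions.append("increase_key_points")
    if report.temporal_jitter > max_jitter:
        actions.append("adjust_codec_model")
    return actions
```

An empty list means no symptom crossed its threshold, so the current configuration is kept. A deployed system would likely combine the cloud and client reports before deciding, which this sketch leaves out.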

Part 02

Conclusion

Compared with mainstream H.265 encoding at equivalent subjective quality, video transmission based on semantic communication reduces the average bit rate by more than 80%. In multi-user scenarios, it reduces computation and storage overhead by more than 50% compared with mainstream H.265 codec transmission. To promote the application of multimedia semantic communication in safe countryside deployments, a digital village demonstration application platform was built in Fumin Village, Nantong City, Jiangsu Province. The platform is used to verify multimedia semantic communication in the four major safe countryside scenarios and to evaluate the effect of semantic QoE feedback. By detecting scenes and exploiting the strong semantic consistency of static scenes, it is estimated that cloud storage and bandwidth for safe countryside scenarios can be reduced by more than 60%, a saving of about 750 million yuan per year.
