[51CTO.com original article] On June 21, the WOT2019 Global Artificial Intelligence Technology Summit opened as scheduled at the Beijing Yuecai JW Marriott Hotel. A well-known in-person event for technologists worldwide in 2019, the conference was organized around three core chapters: general technology, application fields, and enterprise empowerment. More than 60 front-line AI experts from around the world shared talks on deep learning, neural networks, computer vision, autonomous driving, machine learning, algorithm models, and knowledge graphs with more than a thousand attendees.
On the afternoon of June 21, three senior experts spoke at the Knowledge Graph Technology Forum of the General Technology Chapter: Swiss Re Data Scientist Wang Guan, Meituan Dianping Senior Algorithm Expert Pan Lu, and He Shizhu, Associate Researcher at the Institute of Automation, Chinese Academy of Sciences. After the forum, 51CTO compiled their talks into this article in the hope that the key points will be useful to readers.
Wang Guan, Data Scientist, Swiss Re
Knowledge graph construction: data, algorithms and architecture
Knowledge graphs have many applications in the insurance industry. Wang Guan listed four major scenarios:
First, intelligent interaction. When filing a claim, customers want to know whether their policy will pay out. Listing the terms of the policy does not answer that question. Instead, information such as insurance products, customer data, conversation records, and medical bills needs to be integrated into a knowledge graph, so that intelligent customer service can respond to customers quickly.
Second, accurate recommendation. With customer data captured in the knowledge graph, matching products can be recommended precisely.
Third, automated claims. Most claims are currently handled manually, and large policies in particular require manual investigation. With knowledge graph technology, conclusions can be drawn from historical data, making automated claims possible.
Fourth, anti-fraud. Knowledge graphs make it easier to trace a fraudster's tracks: common fraud patterns can be found by writing the corresponding query statements.
So how is a knowledge graph for the insurance industry built? It is a complex process that involves knowledge system construction, knowledge acquisition, fusion, storage, reasoning, application, and other stages.
In his talk, Wang Guan focused on algorithms for extracting entities and relationships: the knowledge graph is constructed by extracting entities, and the relationships between them, from unstructured and semi-structured text. He emphasized that word embedding, which converts text into vectors, is very important in these extraction algorithms.
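As a rough illustration of converting text to vectors, the sketch below builds simple co-occurrence count vectors from a tiny invented corpus and compares words by cosine similarity. This is a stand-in technique chosen for brevity, not the trained embedding model discussed in the talk; the corpus, the `window` size, and the function names are all assumptions made for this example.

```python
from collections import Counter
from math import sqrt

# Toy corpus invented for this sketch (not data from the talk).
corpus = [
    "customer files claim for hospital bill",
    "customer files claim for surgery bill",
    "agent sells policy to customer",
    "agent sells product to customer",
]

def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens of it."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            context = tokens[lo:i] + tokens[i + 1:hi]
            vectors.setdefault(word, Counter()).update(context)
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

vectors = cooccurrence_vectors(corpus)
sim_related = cosine(vectors["hospital"], vectors["surgery"])
sim_unrelated = cosine(vectors["hospital"], vectors["policy"])
print(sim_related, sim_unrelated)
```

Here `hospital` and `surgery` appear in identical contexts, so their similarity is close to 1, while `hospital` and `policy` share no context words and score 0. Trained embeddings apply the same idea at a much larger scale and with dense vectors.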
Once word vectors are trained, they can express the semantics of Chinese text well and automatically capture related words.
Entity recognition and relationship extraction are both important tasks in natural language processing, but Wang Guan did not go into detail on them. He said there are many ways to build a knowledge graph, depending on the scenario. Entities are nodes and relationships are edges, and the relationships hidden behind them can be found through shortest-path mining. "Currently, the application of knowledge graphs is mainly concentrated in three aspects, namely visualization/exploration, graph algorithms, and graph databases (relational and NoSQL)."
In his talk he also presented practical architectures for an information extraction tool and a text annotation tool.
[Figure: Information extraction tool architecture]
[Figure: Text annotation tool architecture]
He particularly emphasized that with these designs, human-computer interaction will become more intelligent, and knowledge graphs will serve as data stewards and as tools for the full machine learning process.
Pan Lu, Senior Algorithm Expert, Meituan Dianping
Application and evolution of knowledge graph-based question answering in O2O intelligent interaction scenarios
Pan Lu first reviewed the evolution of human-computer interaction and the types of intelligent interaction. He then stressed that in Meituan's real-world scenarios, whether the goal is information acquisition, resource query, or task-oriented interaction, the question-answering system depends on the knowledge graph. This article covers the part of his talk on question answering in restricted scenarios.
Pan Lu said that traditional KBQA (question answering over a knowledge base, here a knowledge graph) falls into two main technical schools: semantic parsing and information retrieval. Semantic parsing converts the original question into a logical form that a machine can understand. This form is closer to the storage structure of the knowledge graph and can be used to query it directly or indirectly. Information retrieval locates candidate answers directly by extracting useful information from the question. There are two ways to do this. One generates natural language from triples and compares it with the original question; the other encodes each candidate answer together with its surrounding paths and compares the encoding with the original question to find the answer.
Which technical path suits Meituan's restricted scenarios? In food ordering, for example, the range of dishes is limited, and so are the delivery locations and times. Pan Lu said that Meituan covers many domains that are only weakly related to one another, that there is not enough labeled data, and that the system must also support rapid migration to new domains. Is it possible to borrow the idea of information retrieval while still constructing query statements to query the graph? Meituan therefore proposed a combined information retrieval + semantic parsing solution: it determines the subgraph through entity linking, then performs relationship identification and slot identification, and finally generates SPARQL to execute the query.
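The steps of entity linking, relationship identification, slot identification, and query generation can be sketched with simple rules. This is a minimal illustration under stated assumptions, not Meituan's implementation: the lexicons, the `ex:` namespace, the "under N" slot pattern, and all function names are hypothetical, and the SPARQL query is only generated as a string, not executed against a graph.

```python
import re

# Hypothetical mini-lexicons; a real system would derive these from the knowledge graph.
ENTITY_LEXICON = {"kung pao chicken": "ex:KungPaoChicken", "mapo tofu": "ex:MapoTofu"}
RELATION_KEYWORDS = {"price": "ex:price", "cost": "ex:price", "spicy": "ex:spiciness"}

def link_entity(question):
    """Entity linking: return the URI of the first lexicon entry mentioned."""
    q = question.lower()
    for mention, uri in ENTITY_LEXICON.items():
        if mention in q:
            return uri
    return None

def identify_relation(question):
    """Relationship identification: map a trigger keyword to a predicate."""
    q = question.lower()
    for keyword, predicate in RELATION_KEYWORDS.items():
        if keyword in q:
            return predicate
    return None

def identify_slot(question):
    """Slot identification: extract an optional numeric upper bound ("under N")."""
    match = re.search(r"under (\d+)", question.lower())
    return int(match.group(1)) if match else None

def build_sparql(question):
    """Combine the three steps into a SPARQL query string, or None if a step fails."""
    entity = link_entity(question)
    relation = identify_relation(question)
    if entity is None or relation is None:
        return None
    upper_bound = identify_slot(question)
    filter_clause = f" FILTER(?value < {upper_bound})" if upper_bound is not None else ""
    return (
        "PREFIX ex: <http://example.org/>\n"
        f"SELECT ?value WHERE {{ {entity} {relation} ?value .{filter_clause} }}"
    )

query = build_sparql("Is the price of mapo tofu under 30?")
print(query)
```

Because each step is a separate function, any one of them could later be swapped for an unsupervised or supervised model without changing the query-generation step.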
Each step can be cold-started with simple rules, or handled with unsupervised or supervised models. Pan Lu emphasized that in restricted scenarios, the questions Meituan faces are characterized by a limited intent space, limited resources, a limited number of interaction rounds, and limited knowledge extension. On this basis, the KBQA system they proposed has four major capabilities: basic attribute question answering, resource query with constraints, resource information comparison, and dynamic attribute value calculation.
He Shizhu, Associate Researcher, State Key Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
Key technologies for question answering based on knowledge graph
He Shizhu began with the history of information representation. He said that knowledge graphs are the basis for machines to understand the world, that language systems and knowledge graphs are the cornerstones of knowledge application, and that question-answering systems will be the basic form of the next generation of search engines.
He explained that there are usually two types of methods for knowledge question answering. The first is semantic parsing, which has high accuracy but low recall. It can handle complex questions, suits limited domains and limited language expressions, can be built from experience without training machine learning models, and is easier to control and intervene in. The second is natural question answering, which offers a friendlier interactive interface and can combine knowledge-driven and data-driven approaches, but requires high-quality source data and supporting knowledge resources.
In He Shizhu's view, natural-language question answering differs from precise knowledge question answering and from chatbots: precise knowledge question answering mainly answers knowledge questions. The answer must first be accurate; only once it is accurate can the system also meet emotional needs and respond in natural language.
"For question-answering tasks, the key is whether resources and existing models can meet the needs. In fact, there is a serious lack of content, far from enough resources, and a small number of models. The performance of open-domain question-answering systems is far from being usable, but there is still a lot of room for application in limited fields," He Shizhu concluded.
The above content was compiled by 51CTO reporters from the talks at the "Knowledge Graph" sub-forum of the WOT2019 Global Artificial Intelligence Technology Summit.
[51CTO original article, please indicate the original author and source as 51CTO.com when reprinting on partner sites]