融合BERT-WWM和指针网络的旅游知识图谱构建研究
引用本文:徐春,李胜楠.融合BERT-WWM和指针网络的旅游知识图谱构建研究[J].计算机工程与应用,2022,58(12):280-288.
作者姓名:徐春  李胜楠
作者单位:新疆财经大学 信息管理学院,乌鲁木齐 830012
摘    要:针对旅游信息呈现出散乱、无序和关联性不强的问题,提出一种融合BERT-WWM(BERT with whole word masking)和指针网络的实体关系联合抽取模型构建旅游知识图谱。借助BERT-WWM预训练语言模型从爬取的旅游评论中获得含有先验语义知识的句子编码。针对传统的实体关系抽取方法存在错误传播、实体冗余、交互缺失等问题,以及旅游评论中的实体关系存在一词多义、关系重叠等特征,提出直接对三元组建模,利用句子编码抽取头实体,根据关系类别抽取尾实体,并建立级联结构和指针网络解码输出三元组。基于Neo4j图数据库存储三元组构建旅游知识图谱。实验在建立的旅游数据集上进行,融合BERT-WWM与指针网络的实体关系联合抽取模型的准确率、召回率和F1值分别为93.42%、86.59%和89.88%,与现有模型相比三项指标均显示出优越性,验证了该方法进行实体关系联合抽取的有效性。构建的旅游知识图谱实现了旅游景区信息的整合与存储,对进一步促进旅游业发展具有一定的实际参考意义。

关 键 词:BERT-WWM  指针网络  旅游知识图谱  关系重叠  实体关系联合抽取  

Research on Construction of Tourism Knowledge Graph Integrating BERT-WWM and Pointer Network
XU Chun, LI Shengnan. Research on Construction of Tourism Knowledge Graph Integrating BERT-WWM and Pointer Network[J]. Computer Engineering and Applications, 2022, 58(12): 280-288.
Authors:XU Chun  LI Shengnan
Affiliation:School of Information Management, Xinjiang University of Finance and Economics, Urumqi 830012, China
Abstract:To address the scattered, disordered, and weakly connected nature of tourism information, this paper proposes a joint entity-relation extraction model that integrates BERT-WWM (BERT with whole word masking) and a pointer network, and uses it to construct a tourism knowledge graph. The BERT-WWM pre-trained language model encodes sentences from crawled tourism reviews, yielding sentence encodings that carry prior semantic knowledge. Traditional entity-relation extraction methods suffer from error propagation, entity redundancy, and missing interaction, and the entities and relations in tourism reviews exhibit polysemy and overlapping relations. The proposed method therefore models triples directly: it extracts head entities from the sentence encoding, extracts tail entities according to the relation category, and uses a cascade structure with pointer-network decoding to output triples. The extracted triples are stored in the Neo4j graph database to build the tourism knowledge graph. Experiments on a purpose-built tourism dataset show that the model achieves a precision, recall, and F1 score of 93.42%, 86.59%, and 89.88%, respectively, outperforming existing models on all three metrics and verifying the effectiveness of the method for joint entity-relation extraction. The resulting knowledge graph integrates and stores information about tourist attractions and offers a practical reference for further development of the tourism industry.
Keywords:BERT with whole word masking(BERT-WWM)  pointer network  tourism knowledge graph  relationship overlap  joint extraction of entity and relation  
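
The cascade decoding step described in the abstract (head entities first, then tail entities per relation category) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the decoding rule, so the 0.5 threshold, the "nearest end at or after each start" pairing, the function names `decode_spans` and `decode_triples`, the relation labels, and the hand-written probabilities are all assumptions made for illustration. In a real model the start and end probabilities would come from pointer layers on top of the BERT-WWM sentence encoding, with the tail pointers conditioned on the selected head entity.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # inclusive (start, end) token indices
Pointers = Tuple[List[float], List[float]]  # (start_probs, end_probs)


def decode_spans(start_probs: List[float], end_probs: List[float],
                 threshold: float = 0.5) -> List[Span]:
    """Pair every predicted start with the nearest predicted end at or after it."""
    starts = [i for i, p in enumerate(start_probs) if p > threshold]
    ends = [i for i, p in enumerate(end_probs) if p > threshold]
    spans = []
    for s in starts:
        later_ends = [e for e in ends if e >= s]
        if later_ends:
            spans.append((s, later_ends[0]))
    return spans


def decode_triples(tokens: List[str],
                   head_pointers: Pointers,
                   tail_pointers: Dict[Span, Dict[str, Pointers]],
                   threshold: float = 0.5) -> List[Tuple[str, str, str]]:
    """Cascade decoding: find head spans, then tail spans for each relation."""
    def text(span: Span) -> str:
        return " ".join(tokens[span[0]:span[1] + 1])

    triples = []
    for head in decode_spans(*head_pointers, threshold):
        # tail_pointers stands in for the relation-specific tail taggers,
        # which a real model would compute conditioned on this head span.
        for relation, (t_start, t_end) in tail_pointers.get(head, {}).items():
            for tail in decode_spans(t_start, t_end, threshold):
                triples.append((text(head), relation, text(tail)))
    return triples


# Hypothetical example sentence and hand-written pointer probabilities.
tokens = ["Tianchi", "Lake", "lies", "in", "Xinjiang"]
head_pointers = ([0.9, 0.2, 0.1, 0.0, 0.1], [0.1, 0.8, 0.0, 0.0, 0.2])
tail_pointers = {
    (0, 1): {
        "located_in": ([0.0, 0.0, 0.0, 0.0, 0.9], [0.0, 0.0, 0.0, 0.0, 0.7]),
        "ticket_price": ([0.0] * 5, [0.0] * 5),  # no tail found for this relation
    }
}
triples = decode_triples(tokens, head_pointers, tail_pointers)
print(triples)
```

Because each relation category has its own pair of start and end pointers, one head entity can take part in several triples at once, which is how a cascade design of this kind accommodates overlapping relations.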