
融合文本概念化与网络表示的观点检索
Cite this article: 廖祥文, 刘德元, 桂林, 程学旗, 陈国龙. 融合文本概念化与网络表示的观点检索[J]. 软件学报, 2018, 29(10): 2899-2914.
Authors: 廖祥文  刘德元  桂林  程学旗  陈国龙
Affiliations:
廖祥文, 刘德元, 桂林, 陈国龙: College of Mathematics and Computer Science, Fuzhou University, Fuzhou 350116, Fujian, China; Fujian Provincial Key Laboratory of Networking Computing and Intelligent Information Processing (Fuzhou University), Fuzhou 350116, Fujian, China
程学旗: Key Laboratory of Network Data Science and Technology, Chinese Academy of Sciences, Beijing 100190, China
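Funding: National Natural Science Foundation of China (U1605251); Open Fund of the Key Laboratory of Network Data Science and Technology, Chinese Academy of Sciences (CASNDST201606); Director's Fund of the Key Laboratory of Trustworthy Distributed Computing and Service, Ministry of Education (2017KF01); Natural Science Foundation of Fujian Province (2017J01755)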
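Abstract: Opinion retrieval is an active research topic in natural language processing. During retrieval, existing opinion retrieval models often cannot abstract words to the knowledge and concept level according to their context; at the semantic level they ignore the semantic relations between words, and at the opinion level they lack the ability to generalize opinions. This paper therefore proposes an opinion retrieval method that combines text conceptualization with network representation. The method first uses a knowledge graph to conceptualize the user query and the text into the correct concept space, and uses network representation to represent the word nodes of the knowledge graph as low-dimensional vectors. It then derives the query and text vectors from the word vectors and computes the relevance between the user query and the text with the cosine formula. Next, a classification method based on statistical machine learning is introduced to mine the opinions in the text. Finally, features are built from the concept space, the network representation space, and the opinion analysis results, and are supplied to the opinion retrieval model. Experiments show that the proposed retrieval model effectively improves the opinion retrieval performance of several retrieval models. On the two experimental datasets, the opinion retrieval method based on the unified relevance model improves MAP over the baseline by 6.1% and 9.3%, respectively, and the method based on learning to rank improves MAP over the baseline by 2.3% and 14.6%, respectively.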

Keywords: information retrieval  opinion retrieval  knowledge graph  text conceptualization  network representation
Received: 2017/7/20
Revised: 2017/11/8
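
As a rough illustration of the relevance step described in the abstract (query and text vectors derived from word vectors, then compared with the cosine formula), the following Python sketch may help. The abstract does not say how word vectors are combined, so averaging is an assumption here; the function names and the toy two-dimensional embeddings are hypothetical and are not taken from the paper.

```python
import math


def mean_vector(tokens, embeddings):
    """Average the embeddings of the tokens that have one.

    Averaging is an assumption of this sketch; returns None when no
    token has an embedding.
    """
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def relevance(query_tokens, text_tokens, embeddings):
    """Query-text relevance as the cosine of the two averaged vectors."""
    q = mean_vector(query_tokens, embeddings)
    d = mean_vector(text_tokens, embeddings)
    if q is None or d is None:
        return 0.0  # nothing to compare when either side has no known word
    return cosine(q, d)


# Toy embeddings (hypothetical values, for illustration only).
TOY_EMBEDDINGS = {
    "phone": [1.0, 0.0],
    "battery": [0.8, 0.6],
    "weather": [0.0, 1.0],
}

if __name__ == "__main__":
    print(relevance(["phone"], ["battery"], TOY_EMBEDDINGS))
    print(relevance(["phone"], ["weather"], TOY_EMBEDDINGS))
```

In the method the abstract outlines, the word vectors would come from network embedding of the knowledge graph nodes rather than from a hand-written dictionary; the dictionary above only stands in for that lookup.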

Opinion Retrieval Method Combining Text Conceptualization and Network Embedding
LIAO Xiang-Wen, LIU De-Yuan, GUI Lin, CHENG Xue-Qi and CHEN Guo-Long. Opinion Retrieval Method Combining Text Conceptualization and Network Embedding[J]. Journal of Software, 2018, 29(10): 2899-2914.
Authors: LIAO Xiang-Wen  LIU De-Yuan  GUI Lin  CHENG Xue-Qi and CHEN Guo-Long
Affiliation:
LIAO Xiang-Wen, LIU De-Yuan, GUI Lin, CHEN Guo-Long: College of Mathematics and Computer Science, Fuzhou University, Fuzhou 350116, China; Fujian Provincial Key Laboratory of Networking Computing and Intelligent Information Processing (Fuzhou University), Fuzhou 350116, China
CHENG Xue-Qi: Key Laboratory of Network Data Science and Technology (Chinese Academy of Sciences), Beijing 100190, China
Abstract: Opinion retrieval is an active research topic in natural language processing. Most existing approaches to text opinion retrieval cannot abstract words to the knowledge and concept level according to their context; they also ignore the semantic relations between words and lack the ability to generalize opinions. We therefore propose an opinion retrieval method that combines knowledge graph conceptualization with network embedding. The method uses a conceptual knowledge graph to conceptualize queries and texts into the correct concept space, and embeds the word nodes of the knowledge graph into a low-dimensional vector space by network embedding. Query and text vectors are then derived from the word vectors, and the relevance between a query and a text is computed as the cosine similarity of their vectors. Next, a classification method based on statistical machine learning is used to mine the opinions in the texts. Finally, features built from the concept space, the network embedding space, and the opinion mining results are supplied to opinion retrieval models. Experiments show that the proposed approach effectively improves the opinion retrieval performance of several retrieval models. On the two experimental datasets, the method based on the unified relevance model improves MAP over the baseline by 6.1% and 9.3%, respectively, and the method based on learning to rank improves MAP over the baseline by 2.3% and 14.6%, respectively.
Keywords: Information Retrieval  Opinion Retrieval  Knowledge Graph  Text Conceptualization  Network Embedding
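
The results in the abstract are reported in MAP (mean average precision). For readers unfamiliar with the metric, the sketch below shows one common way to compute it in Python; it assumes binary relevance judgments and rankings without duplicate document IDs. The function names and the toy rankings are hypothetical and do not reproduce the paper's evaluation setup or data.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average precision of one ranking against a set of relevant IDs.

    Precision is taken at each rank where a relevant document appears,
    then averaged over the total number of relevant documents.
    """
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """Mean of average precision over (ranking, relevant_ids) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


if __name__ == "__main__":
    # Two toy queries with hypothetical document IDs.
    toy_runs = [
        (["a", "b", "c"], {"a", "c"}),
        (["x", "a"], {"a"}),
    ]
    print(mean_average_precision(toy_runs))
```

Relevant documents that never appear in a ranking still count in the denominator, so a system is penalized for failing to retrieve them.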