
Query extension based on deep semantic information
Cite this article: LIU Gaojun, FANG Xiao, DUAN Jianyong. Query extension based on deep semantic information[J]. Journal of Computer Applications, 2020, 40(11): 3192-3197.
Authors: LIU Gaojun (刘高军)  FANG Xiao (方晓)  DUAN Jianyong (段建勇)
Affiliation: College of Information Science, North China University of Technology, Beijing 100144, China
Funding: Project funded by the CNONIX National Standard Application and Promotion Laboratory; National Natural Science Foundation of China
Abstract: Search engines have become widely used in the Internet era. For unpopular data, however, a user's search terms often cover too narrow a range for the search engine to retrieve the required data; in such cases a query extension system can effectively help the search engine provide reliable service. Building on query extension methods based on global document analysis, a semantic relevance model was proposed that combines a neural network model with corpora containing semantic information, in order to extract deeper semantic information between words. This deep semantic information gives the query extension system more comprehensive and effective feature support for analyzing the extensible relationships between words. Local extensible-word distributions were extracted from semantic data such as a synonym thesaurus and the sememe annotations of the language knowledge base "HowNet", and the deep mining ability of the neural network model was then used to fit the local extensible-word distribution of each word in the corpus space into a global extensible-word distribution. In comparison experiments against query extension methods based on a language model and on the thesaurus, the method based on the semantic relevance model achieved higher query extension efficiency; on unpopular search data in particular, its recall was 11.1 and 5.29 percentage points higher, respectively, than that of the two comparison methods.

Keywords: query extension  semantic relevance  deep learning  global document analysis  language model
Received: 2020-04-17
Revised: 2020-06-26
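
The abstract above describes extracting local extensible-word distributions from a thesaurus and reports recall gains in percentage points. A minimal Python sketch of those two ideas, under stated assumptions, might look like the following. The toy thesaurus groups, the documents, and all function names are hypothetical and are not taken from the paper; the neural network step that fits local distributions into a global one is not reproduced here.

```python
from collections import Counter

# Hypothetical toy thesaurus: each group lists words treated as mutually extensible.
THESAURUS_GROUPS = [
    {"car", "automobile", "vehicle"},
    {"car", "auto"},
    {"doctor", "physician"},
]

# Hypothetical toy collection and relevance judgments for the query "car".
DOCS = {
    "d1": "red car for sale",
    "d2": "used automobile dealer",
    "d3": "vehicle registration office",
    "d4": "family physician",
}
RELEVANT = {"d1", "d2", "d3"}


def local_expansion_distribution(word, groups):
    """Normalized distribution over words that share a thesaurus group with `word`."""
    counts = Counter()
    for group in groups:
        if word in group:
            counts.update(other for other in group if other != word)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def expand_query(terms, groups, top_k=2):
    """Append up to `top_k` most probable candidates for each original term."""
    expanded = list(terms)
    for term in terms:
        dist = local_expansion_distribution(term, groups)
        # Sort by probability (descending), then alphabetically for determinism.
        ranked = sorted(dist.items(), key=lambda kv: (-kv[1], kv[0]))
        for candidate, _ in ranked[:top_k]:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


def retrieve(terms, docs):
    """Return ids of documents containing at least one query term."""
    wanted = set(terms)
    return sorted(doc_id for doc_id, text in docs.items() if wanted & set(text.split()))


def recall(retrieved, relevant):
    """Fraction of relevant documents that were retrieved."""
    relevant = set(relevant)
    return len(set(retrieved) & relevant) / len(relevant) if relevant else 0.0


baseline_recall = recall(retrieve(["car"], DOCS), RELEVANT)
expanded_terms = expand_query(["car"], THESAURUS_GROUPS, top_k=2)
expanded_recall = recall(retrieve(expanded_terms, DOCS), RELEVANT)
# Gain expressed in percentage points, the unit used in the abstract.
gain_points = (expanded_recall - baseline_recall) * 100
```

In this sketch the gain is a difference between two recall values scaled to percentage points, which is how the 11.1 and 5.29 figures in the abstract are expressed; the numbers produced by the toy data have no relation to the paper's results.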
This article is indexed in Wanfang Data (万方数据) and other databases.