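Research on Mongolian-Chinese Query Expansion Methods Based on a Cross-Language Word Embedding Model
Cite this article: MA Lujia, LAI Wen, ZHAO Xiaobing. Research on Mongolian-Chinese Query Expansion Methods Based on a Cross-Language Word Embedding Model[J]. Journal of Chinese Information Processing, 2019, 33(6): 27-34.
Authors: MA Lujia  LAI Wen  ZHAO Xiaobing
Affiliation: National Language Resource Monitoring & Research Center of Minority Languages, Minzu University of China, Beijing 100081
Funding: National Natural Science Foundation of China (61331013)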
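Abstract: Cross-language information retrieval is a retrieval technique in which a query posed in one language retrieves information written in one or more other languages, and it is one of the important research directions in information retrieval. In recent years, cross-language word embeddings have provided good word representations for cross-language information retrieval and have attracted the attention of many researchers. This paper first uses a cross-language word embedding model to map Chinese query words to Mongolian query words. It then proposes three query expansion methods (serial query expansion, serial query expansion with filtering, and cross-validation filtering) to filter and rank the candidate Mongolian query words. Finally, it selects the Mongolian query words that are relevant to the context. Experimental results show that introducing the cross-validation filtering method into the Mongolian-Chinese cross-language information retrieval task greatly improves retrieval results.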

Keywords: query expansion  cross-language word embeddings  information retrieval

Cross-Language Word Embedding Based Query Expansion for Chinese Mongolian Cross-Language Retrieval
MA Lujia,LAI Wen,ZHAO Xiaobing.Cross-Language Word Embedding Based Query Expansion for Chinese Mongolian Cross-Language Retrieval[J].Journal of Chinese Information Processing,2019,33(6):27-34.
Authors:MA Lujia  LAI Wen  ZHAO Xiaobing
Affiliation:National Language Resource Monitoring & Research Center of Minority Languages, Minzu University of China, Beijing 100081, China
Abstract:Cross-language information retrieval retrieves information in one language in response to queries posed in another. This paper uses a cross-language word vector model to map Chinese query words to Mongolian query words. Three methods are proposed, namely Series, Series_opt and Cross_valid. These methods map the Chinese queries and then select and rank the mapped words. Experiments conducted in a real environment show that the proposed methods improve Chinese-Mongolian information retrieval.
Keywords:query expansion  cross-language word vectors  information retrieval  
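
The abstract does not describe the internals of Series, Series_opt, or Cross_valid, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes Chinese and Mongolian words already live in one shared embedding space, maps a Chinese query word to its nearest Mongolian neighbours by cosine similarity, and then applies a round-trip check as one possible reading of "cross-validation filtering". The vectors, the `mn_*` tokens, and the function names `expand_query` and `round_trip_filter` are all hypothetical.

```python
import math

# Toy vectors in an assumed shared cross-lingual space.
# The numbers and the mn_* tokens are made up for illustration only.
ZH_VECTORS = {
    "草原": [0.9, 0.1, 0.0],
    "马": [0.1, 0.9, 0.1],
}
MN_VECTORS = {
    "mn_steppe": [0.85, 0.15, 0.05],
    "mn_pasture": [0.6, 0.2, 0.5],
    "mn_horse": [0.05, 0.95, 0.1],
    "mn_herd": [0.2, 0.8, 0.2],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(vec, table, k):
    """Return the k (word, score) pairs in `table` most similar to `vec`."""
    scored = [(word, cosine(vec, other)) for word, other in table.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def expand_query(zh_word, k=2):
    """Map a Chinese query word to its k nearest Mongolian candidates."""
    return nearest(ZH_VECTORS[zh_word], MN_VECTORS, k)


def round_trip_filter(zh_word, candidates):
    """Keep a candidate only if its nearest Chinese word is the original one.

    Assumption: this round-trip check stands in for the paper's
    cross-validation filtering, whose exact definition is not given here.
    """
    kept = []
    for mn_word, score in candidates:
        back_word, _ = nearest(MN_VECTORS[mn_word], ZH_VECTORS, 1)[0]
        if back_word == zh_word:
            kept.append((mn_word, score))
    return kept


if __name__ == "__main__":
    candidates = expand_query("草原", k=3)
    print([word for word, _ in candidates])
    print([word for word, _ in round_trip_filter("草原", candidates)])
```

With these toy vectors, the third candidate (`mn_herd`) lies closer to "马" than to "草原" and is dropped by the round-trip check, while `mn_steppe` and `mn_pasture` are kept. A real system would load trained cross-lingual embeddings instead of hand-written vectors.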
This article is indexed in VIP and other databases.