     

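A Query Expansion Technique Based on Learning to Rank
Cite this article:XU Bo, LIN Hongfei, LIN Yuan, WANG Jian. A Query Expansion Technique Based on Learning to Rank[J]. Journal of Chinese Information Processing, 2015, 29(3): 155-161
Authors:XU Bo  LIN Hongfei  LIN Yuan  WANG Jian
Affiliation:Dalian University of Technology, Dalian, Liaoning 116024
Funding:National Natural Science Foundation of China; National 863 High-Tech Program; Natural Science Foundation of Liaoning Province; Scientific Research Foundation for Returned Overseas Chinese Scholars (Ministry of Education); Specialized Research Fund for the Doctoral Program of Higher Education; Fundamental Research Funds for the Central Universities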
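Abstract:Query expansion is an important information retrieval technique. Starting from the user's query, it applies some strategy to add relevant expansion terms to the original query so that the query describes the user's information need more accurately. Learning to rank uses machine learning to construct ranking models that order data, and it is a current research focus at the intersection of machine learning and information retrieval. This paper uses pseudo-relevance feedback to bring learning-to-rank algorithms into query expansion: features related to the expansion terms are extracted from the document collection, a ranking model for expansion terms is trained, and the model is used to re-rank the candidate expansion terms of a new query. The re-ranked expansion terms are assigned weights according to their ranking scores and added to the original query for a second round of retrieval, thereby improving retrieval accuracy. Experimental results on TREC datasets show that introducing learning-to-rank algorithms helps improve the retrieval performance of pseudo-relevance feedback.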

Keywords:information retrieval  query expansion  pseudo-relevance feedback  learning to rank

A Query Expansion Method Based on Learning to Rank
XU Bo,LIN Hongfei,LIN Yuan,WANG Jian. A Query Expansion Method Based on Learning to Rank[J]. Journal of Chinese Information Processing, 2015, 29(3): 155-161
Authors:XU Bo  LIN Hongfei  LIN Yuan  WANG Jian
Affiliation:Dalian University of Technology, Dalian, Liaoning 116024, China
Abstract:Query expansion is an important technique for improving retrieval performance. It adds relevant terms to the original query submitted by the user so that the query expresses the user's information need more precisely and completely. Learning to rank is an active machine learning topic in information retrieval that seeks to automatically construct ranking models for determining the degree of relevance between objects. This paper attempts to improve pseudo-relevance feedback by introducing learning-to-rank algorithms to re-rank expansion terms. Term features are obtained from the original query terms and the expansion terms, and a model learned from these features produces a new ranked list of expansion terms. Adding the re-ranked expansion terms to the original query retrieves more relevant documents and improves retrieval accuracy. Experimental results on the TREC dataset show that incorporating ranking algorithms into query expansion can lead to better retrieval performance.
Keywords:information retrieval  query expansion  pseudo-relevance feedback  learning to rank
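
The pipeline outlined in the abstract (take the top-ranked documents as pseudo-relevant, score candidate expansion terms with a ranking model, keep the best ones, and weight them by score) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the three features (log term frequency in the feedback documents, a smoothed IDF, and the fraction of feedback documents where the term co-occurs with a query term), the linear scoring function, and the hand-set `weights` that stand in for a model that would actually be trained on labeled expansion terms.

```python
import math
from collections import Counter


def candidate_terms(query, feedback_docs):
    """Count terms in the pseudo-relevant documents that are not already in the query."""
    query_terms = set(query)
    return Counter(t for doc in feedback_docs for t in doc if t not in query_terms)


def term_features(term, query, feedback_docs, collection):
    """Illustrative (assumed) features for one candidate expansion term."""
    tf = sum(doc.count(term) for doc in feedback_docs)
    df = sum(1 for doc in collection if term in doc)
    idf = math.log((len(collection) + 1) / (df + 1))  # smoothed, never negative
    cooc = sum(
        1 for doc in feedback_docs
        if term in doc and any(q in doc for q in query)
    )
    return [math.log(1 + tf), idf, cooc / len(feedback_docs)]


def score(features, weights):
    """Linear scoring function standing in for a trained ranking model."""
    return sum(f * w for f, w in zip(features, weights))


def expand_query(query, feedback_docs, collection, weights, n_terms=2):
    """Re-rank candidate terms and return a weighted, expanded query."""
    scored = [
        (score(term_features(t, query, feedback_docs, collection), weights), t)
        for t in candidate_terms(query, feedback_docs)
    ]
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    top = scored[:n_terms]
    total = sum(s for s, _ in top) or 1.0
    expanded = {t: 1.0 for t in query}  # original terms keep full weight
    for s, t in top:
        expanded[t] = s / total  # expansion weight proportional to ranking score
    return expanded


# Toy collection of tokenized documents (hypothetical data).
collection = [
    ["jaguar", "car", "speed", "engine"],
    ["jaguar", "car", "engine", "dealer"],
    ["jaguar", "cat", "jungle"],
    ["stock", "market", "news"],
]
query = ["jaguar"]
feedback_docs = collection[:2]  # pretend these were the top-2 initial results
weights = [1.0, 0.5, 1.0]       # assumed; a real system would learn these

expanded = expand_query(query, feedback_docs, collection, weights)
print(expanded)
```

In a full system the weighted query returned by `expand_query` would be submitted to the retrieval engine for the second retrieval pass, and the fixed `weights` would be replaced by a ranking model trained on expansion terms labeled by their effect on retrieval quality.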