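排序学习中数据噪音敏感度分析 (Analysis of Data Noise Sensitivity in Learning to Rank)
Cite this article:牛树梓, 程学旗, 郭嘉丰. 排序学习中数据噪音敏感度分析[J]. 中文信息学报 (Journal of Chinese Information Processing), 2012, 26(5): 53-59.
Authors:牛树梓 (NIU Shuzi), 程学旗 (CHENG Xueqi), 郭嘉丰 (GUO Jiafeng)
Affiliation:Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190
Funding:National Natural Science Foundation of China; key project of the National 863 Program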
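Abstract (translated from the Chinese):Learning to rank is one of the active research topics in information retrieval. To avoid the influence of noise in the training set, current learning-to-rank algorithms pay considerable attention to robustness. Previous work has found that the performance of the same learning-to-rank method shows markedly different noise sensitivity on different data sets. A change in the learned model is the direct cause of performance degradation, and the model is learned from the training set, so the root cause lies in certain properties of the training data. Based on an analysis of concrete learning-to-rank scenarios, this paper concludes that the fundamental factor affecting noise sensitivity is the distribution of document pairs in the training set, and experiments on LETOR3.0 confirm this conclusion.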

Keywords:learning to rank; data quality; noise sensitivity

Noise Sensitivity in Learning to Rank
NIU Shuzi,CHENG Xueqi,GUO Jiafeng.Noise Sensitivity in Learning to Rank[J].Journal of Chinese Information Processing,2012,26(5):53-59.
Authors:NIU Shuzi  CHENG Xueqi  GUO Jiafeng
Affiliation:Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
Abstract:Learning to rank is one of the most active research areas in information retrieval. Much attention has been paid to the robustness of ranking algorithms against noise, which is inevitable in the training set. Previous work observes that the same ranking algorithm shows markedly different noise sensitivity on different data sets. Performance degrades because the learned model changes, and the model is learned from the training set, so the underlying reason for the different sensitivities lies in some property of the training data. Analysis of concrete learning-to-rank scenarios indicates that this property is the distribution of document pairs in the training set. Experimental results on LETOR3.0 suggest that the more dispersed the document pairs of a training set are, the less the model learned from it is influenced by erroneous document pairs, and the less sensitive that training set is to noise.
Keywords:learning to rank; data quality; noise sensitivity
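
The abstract attributes noise sensitivity to erroneous document pairs in the training set. As a rough illustration only, the sketch below shows how noise in per-document relevance labels can turn into erroneous document pairs for a single query. It is not the authors' experimental protocol: the function names (`preference_pairs`, `flip_labels`, `pair_error_rate`), the uniform label-flipping noise model, and the example labels are all assumptions made for this sketch.

```python
import random
from itertools import combinations


def preference_pairs(labels):
    """Map each document pair (i, j) with different labels to +1 if
    document i is more relevant than document j, else -1."""
    pairs = {}
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] != labels[j]:
            pairs[(i, j)] = 1 if labels[i] > labels[j] else -1
    return pairs


def flip_labels(labels, noise_rate, levels, rng):
    """Assumed noise model: with probability noise_rate, replace a label
    with a different relevance level chosen uniformly at random."""
    noisy = []
    for y in labels:
        if rng.random() < noise_rate:
            noisy.append(rng.choice([l for l in levels if l != y]))
        else:
            noisy.append(y)
    return noisy


def pair_error_rate(clean, noisy):
    """Fraction of document pairs (present in either pair set) whose
    preference is missing, spurious, or reversed after adding noise."""
    clean_pairs = preference_pairs(clean)
    noisy_pairs = preference_pairs(noisy)
    keys = set(clean_pairs) | set(noisy_pairs)
    if not keys:
        return 0.0
    wrong = sum(1 for k in keys if clean_pairs.get(k) != noisy_pairs.get(k))
    return wrong / len(keys)


if __name__ == "__main__":
    rng = random.Random(0)
    clean = [2, 1, 0, 0, 1, 0, 0, 2]  # hypothetical labels for one query
    noisy = flip_labels(clean, noise_rate=0.2, levels=[0, 1, 2], rng=rng)
    print(pair_error_rate(clean, noisy))
```

Running the same label noise rate over queries with different label distributions gives different pair error rates, which is one way to see why a fixed amount of label noise need not affect every training set equally.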
This article is indexed in 万方数据 (Wanfang Data) and other databases.