
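Semi-supervised listwise learning-to-rank algorithm for document retrieval
Cite this article:HE Hai-jiang, LONG Yue-jin. Semi-supervised listwise learning-to-rank algorithm for document retrieval[J]. Journal of Computer Applications (计算机应用), 2011, 31(11): 3108-3111.
Authors:HE Hai-jiang (何海江)  LONG Yue-jin (龙跃进)
Affiliation:Department of Computer Science, Changsha University, Changsha 410003, China
Funding:Scientific Research Project of the Hunan Provincial Department of Education
Abstract:To address the shortage of labeled training data, a co-training listwise learning-to-rank algorithm is proposed that mines implicit ranking information from unlabeled data. The algorithm uses two types of listwise rankers, each of which builds a different ranking function from the currently available labeled dataset. As a result, every unlabeled query has two different document permutations. The likelihood loss is used to compute the similarity of these two permutations, and the queries whose permutations have low similarity are labeled, which gives both listwise rankers additional training data. Experimental results on LETOR, a public learning-to-rank dataset, confirm that the co-training ranking algorithm is effective. The effect of the labeling ratio on the algorithm is also discussed.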

Keywords:document retrieval; semi-supervised; learning to rank; likelihood loss; co-training
Received:2011-04-25
Revised:2011-07-13

Semi-supervised learning listwise ranking functions for document retrieval
HE Hai-jiang, LONG Yue-jin. Semi-supervised learning listwise ranking functions for document retrieval[J]. Journal of Computer Applications, 2011, 31(11): 3108-3111.
Authors:HE Hai-jiang  LONG Yue-jin
Affiliation:Department of Computer Science and Technology, Changsha University, Changsha, Hunan 410003, China
Abstract:An iterative co-ranking algorithm is proposed to extend learning to rank from the supervised setting to the semi-supervised setting. The approach employs two listwise rankers, each of which produces a document permutation for an unlabeled query. The likelihood listwise loss is used to measure the difference score between the two learners for a given query. Unlabeled queries with a significant difference score are then chosen to build the new training dataset for the next iteration, and the ideal document permutation given to one listwise ranker is defined by the other learner. Experimental results on the public dataset LETOR show that the proposed method improves the ranking performance of supervised listwise ranking algorithms. The effect of the labeling ratio is also discussed.
Keywords:document retrieval; semi-supervised; rank learning; likelihood loss; co-training
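
The query-selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the code assumes the standard Plackett-Luce (ListMLE-style) form of the likelihood loss, and the symmetric "excess loss" in `difference_score` is a hypothetical way to turn that loss into a disagreement measure. The function names and the toy scores are likewise illustrative.

```python
import math


def likelihood_loss(scores, permutation):
    """Negative Plackett-Luce log-likelihood of `permutation` under `scores`.

    scores[d] is a ranker's score for document d; `permutation` lists
    document indices from best to worst.
    """
    ordered = [scores[d] for d in permutation]
    loss = 0.0
    for i in range(len(ordered)):
        tail = ordered[i:]
        m = max(tail)  # subtract the max for a stable log-sum-exp
        log_norm = m + math.log(sum(math.exp(s - m) for s in tail))
        loss += log_norm - ordered[i]
    return loss


def ranking(scores):
    """Document indices sorted by descending score."""
    return sorted(range(len(scores)), key=lambda d: -scores[d])


def difference_score(scores_a, scores_b):
    """Assumed disagreement measure (not taken from the paper).

    For each ranker, take the extra likelihood loss it incurs on the other
    ranker's permutation compared with its own, then sum both directions.
    The result is 0 when the two permutations are identical.
    """
    perm_a, perm_b = ranking(scores_a), ranking(scores_b)
    excess_a = likelihood_loss(scores_a, perm_b) - likelihood_loss(scores_a, perm_a)
    excess_b = likelihood_loss(scores_b, perm_a) - likelihood_loss(scores_b, perm_b)
    return excess_a + excess_b


def select_queries(unlabeled, k):
    """Return the k query ids on which the two rankers disagree most.

    `unlabeled` maps a query id to a pair (scores_a, scores_b).
    """
    by_disagreement = sorted(
        unlabeled, key=lambda q: -difference_score(*unlabeled[q])
    )
    return by_disagreement[:k]


# Toy data: the rankers agree on q1 and produce reversed orders on q2.
unlabeled = {
    "q1": ([3.0, 2.0, 1.0], [0.9, 0.5, 0.1]),
    "q2": ([3.0, 2.0, 1.0], [0.1, 0.5, 0.9]),
}
selected = select_queries(unlabeled, k=1)
```

In this toy run `selected` contains only `q2`, the query with conflicting permutations. Following the abstract, each selected query would then be added to a ranker's training set with the other ranker's permutation (for example `ranking(scores_b)` for ranker A) as its target ordering before both rankers are retrained.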
This article is indexed in CNKI, Wanfang Data (万方数据), and other databases.