
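Fast Multi-label Learning Algorithm Based on Hashing
Cite this article: Hu Haifeng (胡海峰), Geng Jingjing (耿静静), Feng Qiaoyu (冯巧遇), Sun Yong (孙永), Wu Jiansheng (吴建盛). Fast multi-label learning algorithm based on hashing [J]. 信号处理 (Journal of Signal Processing), 2017, 33(8): 1065-1072.
Authors: Hu Haifeng (胡海峰), Geng Jingjing (耿静静), Feng Qiaoyu (冯巧遇), Sun Yong (孙永), Wu Jiansheng (吴建盛)
Affiliation: School of Telecommunication and Information Engineering, Nanjing University of Posts and Telecommunications
Funding: National Natural Science Foundation of China (61571233)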
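Abstract: Multi-label learning is time-consuming and struggles with large-scale data. To address this, the paper proposes a hashing-based fast multi-label learning algorithm (HFMLL) that combines hashing with multi-label learning. It uses locality-sensitive hashing to quickly obtain the neighboring samples of each sample, and the MinHash algorithm, based on min-wise independent permutations, to quickly find the labels correlated with each label. From the information carried by the neighboring samples and the correlated labels, it applies the maximum a posteriori criterion to predict the label set of a new sample. Experiments show that HFMLL maintains high classification performance while running markedly faster than current multi-label algorithms, and that it can be widely applied to large-scale data sets.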
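
The abstract does not say which locality-sensitive hash family HFMLL uses. As a minimal sketch of the neighbor-lookup step only, the following assumes random-hyperplane (sign) hashing: samples that fall on the same side of every random hyperplane share a bucket, and the bucket serves as the candidate neighbor set. All function names and parameters here are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict

def make_planes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (assumed hash family, not the paper's)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def hash_key(x, planes):
    """One bit per hyperplane: which side of the plane the sample x falls on."""
    return tuple(sum(a * b for a, b in zip(x, p)) > 0 for p in planes)

def build_index(samples, planes):
    """Group sample indices into buckets keyed by their bit signature."""
    buckets = defaultdict(list)
    for i, x in enumerate(samples):
        buckets[hash_key(x, planes)].append(i)
    return buckets

def candidate_neighbors(x, planes, buckets):
    """Return indices of stored samples that share x's bucket."""
    return buckets.get(hash_key(x, planes), [])

if __name__ == "__main__":
    rng = random.Random(1)
    samples = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(50)]
    planes = make_planes(dim=5, n_bits=6)
    buckets = build_index(samples, planes)
    print(candidate_neighbors(samples[7], planes, buckets))
```

A lookup touches only one bucket instead of scanning every stored sample, which is the source of the speed-up the abstract describes. In practice several independent hash tables are usually combined to reduce the chance of missing a true neighbor.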

Keywords: multi-label learning; hashing; fast; label correlation
Received: 2017-01-03

Fast Multi-label Learning based on Hashing
Affiliation:School of Telecommunication and Information Engineering, Nanjing University of Posts and Telecommunications
Abstract: A hashing-based fast multi-label learning algorithm (HFMLL) is proposed to address the problem that many current multi-label learning algorithms are time-consuming and have difficulty handling large-scale data. The method combines hashing with multi-label learning. HFMLL uses Locality Sensitive Hashing (LSH) to find the neighboring instances of each unseen instance, and computes label correlation by estimating the similarity of labels with a min-wise independent permutations locality sensitive hashing (MinHash) scheme. The maximum a posteriori principle is then used to predict the label set of each unseen instance from the statistical information gathered over the related labels of its neighboring instances. Experiments show that HFMLL maintains high classification performance, comparable to that of state-of-the-art multi-label learning methods, while running significantly faster, so it can be widely applied to large-scale data sets.
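
The label-correlation step can be illustrated with a small MinHash sketch. It assumes each label is represented by the set of training-sample ids that carry it, and it approximates min-wise independent permutations with random linear hash functions modulo a prime. Both choices are assumptions made for illustration, since the abstract does not give these details, and all names are hypothetical.

```python
import random

PRIME = (1 << 61) - 1  # Mersenne prime used as the hash modulus

def make_hashes(n_hashes, seed=0):
    """Random (a, b) pairs defining h(x) = (a*x + b) mod PRIME, with a != 0."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(PRIME))
            for _ in range(n_hashes)]

def signature(items, hashes):
    """MinHash signature of a non-empty set of non-negative integer ids."""
    return [min((a * x + b) % PRIME for x in items) for a, b in hashes]

def estimated_jaccard(sig1, sig2):
    """Fraction of matching signature positions estimates Jaccard similarity."""
    return sum(u == v for u, v in zip(sig1, sig2)) / len(sig1)

if __name__ == "__main__":
    # Hypothetical toy data: label name -> ids of samples tagged with it.
    labels = {
        "beach": set(range(0, 100)),
        "sea": set(range(50, 150)),
        "city": set(range(500, 600)),
    }
    hashes = make_hashes(256)
    sigs = {name: signature(ids, hashes) for name, ids in labels.items()}
    print(estimated_jaccard(sigs["beach"], sigs["sea"]))
    print(estimated_jaccard(sigs["beach"], sigs["city"]))
```

Comparing fixed-length signatures costs the same regardless of how many samples carry each label, so related labels can be found without intersecting the full sample sets. More hash functions give a tighter estimate at the cost of longer signatures.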