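A Regularization-Based Semi-Supervised Multi-Label Learning Method
Cite this article:Li Yufeng, Huang Shengjun, Zhou Zhihua. A Regularization-Based Semi-Supervised Multi-Label Learning Method[J]. Journal of Computer Research and Development, 2012, 49(6): 1272-1278.
Authors:Li Yufeng  Huang Shengjun  Zhou Zhihua
Affiliation:National Key Laboratory for Novel Software Technology (Nanjing University), Nanjing 210093
Funding:National Natural Science Foundation of China; Natural Science Foundation of Jiangsu Province; National Basic Research Program of China ("973" Program)
Abstract:Multi-label learning addresses problems in which a single example belongs to multiple classes simultaneously. Traditional multi-label learning usually assumes that the training set contains a large number of labeled examples. In many real-world problems, however, only a small fraction of the abundant training examples are labeled. To better exploit the abundant unlabeled training examples and improve classification performance, an inductive regularization-based semi-supervised multi-label learning method named MASS is proposed. Specifically, in addition to minimizing the empirical risk, MASS introduces two regularizers, one constraining the complexity of the classifier and the other requiring similar examples to have similar structured multi-label outputs, and then obtains a fast solution via alternating optimization. Experimental results on webpage categorization and gene functional analysis verify the effectiveness of MASS.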

Keywords:machine learning  multi-label learning  semi-supervised learning  webpage categorization  gene functional analysis

Regularized Semi-Supervised Multi-Label Learning
Li Yufeng, Huang Shengjun, Zhou Zhihua. Regularized Semi-Supervised Multi-Label Learning[J]. Journal of Computer Research and Development, 2012, 49(6): 1272-1278.
Authors:Li Yufeng  Huang Shengjun  Zhou Zhihua
Affiliation:National Key Laboratory for Novel Software Technology (Nanjing University), Nanjing 210093
Abstract:Multi-label learning deals with examples that are associated with multiple class labels simultaneously. Previous multi-label studies usually assume that large amounts of labeled training examples are available to obtain good performance. However, in many real-world applications, labeled examples are few while large amounts of unlabeled examples are readily available. In order to exploit the abundant unlabeled examples to improve generalization performance, we propose a novel regularized inductive semi-supervised multi-label method named MASS. Specifically, aside from minimizing the empirical risk, MASS employs two regularizers to constrain the final decision function. One characterizes the classifier's complexity while taking label relatedness into account, and the other requires that similar examples share similar structural multi-label outputs. This leads to a large-scale convex optimization problem, and an efficient alternating optimization algorithm is provided that reaches the global optimum at a super-linear convergence rate, owing to the strong convexity of the objective function. Comprehensive experimental results on two real-world tasks, i.e., webpage categorization and gene functional analysis, with varied numbers of labeled examples demonstrate the effectiveness of the proposal.
Keywords:machine learning  multi-label learning  semi-supervised learning  webpage categorization  gene functional analysis
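
The abstract describes an objective with three parts: an empirical risk on the labeled examples, a complexity regularizer, and a regularizer that asks similar examples to have similar multi-label outputs. The abstract does not state the exact MASS formulation, so the sketch below is not the authors' method. It substitutes a simpler, well-known stand-in: a linear model with squared loss, a Frobenius-norm penalty, and a graph-Laplacian smoothness term built over labeled and unlabeled examples together. It ignores label relatedness and uses a closed-form solve rather than the alternating optimization mentioned above. All names (`rbf_knn_graph`, `fit_regularized_multilabel`, `lam_c`, `lam_s`) and the k-NN RBF graph are assumptions made for illustration.

```python
import numpy as np

def rbf_knn_graph(X, k=5, gamma=1.0):
    """Symmetric k-NN similarity graph with RBF weights (an assumed choice)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    S = np.exp(-gamma * d2)
    np.fill_diagonal(S, 0.0)                      # no self-loops
    idx = np.argsort(-S, axis=1)[:, :k]           # k most similar per row
    mask = np.zeros_like(S, dtype=bool)
    mask[np.arange(X.shape[0])[:, None], idx] = True
    mask = mask | mask.T                          # symmetrize the neighborhood
    return np.where(mask, S, 0.0)

def fit_regularized_multilabel(X_lab, Y_lab, X_unl, lam_c=1.0, lam_s=1.0,
                               k=5, gamma=1.0):
    """Minimize over W (features x labels):
         ||X_lab W - Y_lab||_F^2          empirical risk on labeled data
       + lam_c * ||W||_F^2                complexity penalty
       + lam_s * tr(W^T X^T L X W)        similar inputs -> similar outputs
    where X stacks labeled and unlabeled examples and L is the graph Laplacian.
    The objective is strongly convex for lam_c > 0, so the stationarity
    condition A W = X_lab^T Y_lab has a unique solution.
    """
    X = np.vstack([X_lab, X_unl])
    S = rbf_knn_graph(X, k=k, gamma=gamma)
    L = np.diag(S.sum(axis=1)) - S                # unnormalized Laplacian
    d = X.shape[1]
    A = X_lab.T @ X_lab + lam_c * np.eye(d) + lam_s * (X.T @ L @ X)
    return np.linalg.solve(A, X_lab.T @ Y_lab)

def predict(W, X, threshold=0.0):
    """Inductive prediction: +1 / -1 per label for previously unseen examples."""
    return np.where(X @ W > threshold, 1, -1)
```

Because the learned `W` is a function of the input features, it can score examples that were not available during training, which is what "inductive" refers to in the abstract. Setting `lam_s=0` reduces the sketch to ordinary ridge regression on the labeled examples only.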
This article is indexed in CNKI, Wanfang Data, and other databases.