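A Classification Method for Weakly Labeled Documents
Cite this article: LIANG Wei-chao, SONG Bin. A Classification Method for Weakly Labeled Documents[J]. Computer and Modernization (计算机与现代化), 2016, 0(1): 77.
Authors: LIANG Wei-chao (梁伟超), SONG Bin (宋斌)
Abstract: Multi-label learning differs from traditional supervised learning: it is a learning framework proposed for modeling real-world objects that carry multiple meanings. Under this framework, an instance can belong to several labels at the same time. Most existing multi-label learning algorithms assume that the label set of every sample is complete, but the labels of some instances are sometimes missing. To address this problem, this paper proposes a classification method for weakly labeled documents. Based on the assumptions that correlations between labels differ and that similar instances have similar labels, the method constructs an optimization problem to complete the missing labels as far as possible. Experimental results show that the method can effectively improve the generalization performance of the learning system.

Keywords: weak labeling; document classification; multi-label learning; machine learning; data mining
Received: 2016-01-26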

A Text Classification Method for Weak Labeling
LIANG Wei-chao,SONG Bin.A Text Classification Method for Weak Labeling[J].Computer and Modernization,2016,0(1):77.
Authors:LIANG Wei-chao  SONG Bin
Abstract: Multi-label learning differs from traditional supervised learning. It is a framework proposed to represent real-world objects that may carry several semantic meanings at once. Under this framework, an instance may be associated with a set of labels. Most existing multi-label learning algorithms assume that the label set of each example is complete. However, the label sets associated with some examples may be incomplete. To deal with this problem, we propose a text classification method for weak labeling. The method tries to replenish missing labels by constructing an optimization problem based on two assumptions: correlations between different labels differ, and similar instances tend to have similar labels. Experiments show that the proposed method can effectively improve the generalization performance of the learning system.
Keywords:weak labeling  document classification  multi-label learning  machine learning  data mining  
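
The abstract gives no formulas, so the following is only a minimal sketch of how the two stated assumptions (label correlations, and similar instances having similar labels) could be turned into an optimization problem. It is not the authors' formulation. The objective, the cosine similarities, the function name `complete_labels`, and the parameters `alpha`, `beta`, and `w_missing` are all assumptions made for illustration. The sketch minimizes a weighted fit to the observed labels plus two graph-smoothness terms, using projected gradient descent.

```python
import numpy as np


def cosine_similarity(A):
    """Non-negative row-wise cosine similarity with a zeroed diagonal."""
    norms = np.linalg.norm(A, axis=1, keepdims=True)
    unit = A / np.maximum(norms, 1e-12)
    S = np.maximum(unit @ unit.T, 0.0)
    np.fill_diagonal(S, 0.0)
    return S


def laplacian(S):
    """Unnormalized graph Laplacian D - S of a symmetric similarity matrix."""
    return np.diag(S.sum(axis=1)) - S


def complete_labels(X, Y, alpha=0.5, beta=0.5, w_missing=0.1, steps=500):
    """Hypothetical weak-label completion (illustrative, not the paper's method).

    X: (n, d) document features. Y: (n, q) labels, 1 = observed label,
    0 = unobserved (either truly absent or missing).
    Minimizes over F in [0, 1]^(n x q):
        sum_ij W_ij (F_ij - Y_ij)^2
        + alpha * tr(F^T L_inst F)   # similar instances, similar labels
        + beta  * tr(F L_lab F^T)    # correlated labels, similar scores
    where W_ij = 1 for observed labels and w_missing (< 1) otherwise.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    L_inst = laplacian(cosine_similarity(X))    # (n, n)
    L_lab = laplacian(cosine_similarity(Y.T))   # (q, q), from label co-occurrence
    W = np.where(Y > 0, 1.0, w_missing)

    # Step size from an upper bound on the gradient's Lipschitz constant.
    lip = 2.0 * (W.max()
                 + alpha * np.linalg.eigvalsh(L_inst)[-1]
                 + beta * np.linalg.eigvalsh(L_lab)[-1])
    lr = 1.0 / lip

    F = Y.copy()
    for _ in range(steps):
        grad = 2.0 * W * (F - Y) + 2.0 * alpha * (L_inst @ F) + 2.0 * beta * (F @ L_lab)
        F = np.clip(F - lr * grad, 0.0, 1.0)    # project back onto [0, 1]
    return F


if __name__ == "__main__":
    # Toy data: documents 0 and 1 are similar, documents 2 and 3 are similar.
    X = [[1.0, 0.9, 0.0, 0.0],
         [0.9, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.9],
         [0.0, 0.0, 0.9, 1.0]]
    # Document 1 is assumed to be missing the second label that document 0 has.
    Y = [[1, 1, 0],
         [1, 0, 0],
         [0, 0, 1],
         [0, 0, 1]]
    print(np.round(complete_labels(X, Y), 2))
```

In this toy setup the score for document 1 on the second label rises because its near-duplicate, document 0, carries that label and because the second label co-occurs with the first. Scores for documents in the other cluster are not pulled up. Thresholding the returned scores to obtain completed labels is left to the reader, since the abstract does not say how the paper does it.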