
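Multi-label Text Classification Based on Robust Fuzzy Rough Set Model
Cite this article: ZHANG Jing, LI De-yu, WANG Su-ge, LI Hua. Multi-label Text Classification Based on Robust Fuzzy Rough Set Model[J]. Computer Science, 2015, 42(7): 270-275.
Authors: ZHANG Jing  LI De-yu  WANG Su-ge  LI Hua
Affiliations: School of Computer and Information Technology, Shanxi University, Taiyuan 030006, School of Computer and Information Technology, Shanxi University, Taiyuan 030006; Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006, School of Computer and Information Technology, Shanxi University, Taiyuan 030006; Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006, School of Computer and Information Technology, Shanxi University, Taiyuan 030006
Funding: This work was supported by the National Natural Science Foundation of China (61175067,5), the Shanxi Province Science and Technology Research Project (20110321027-02), and the Shanxi Province Research Project for Returned Overseas Scholars (2013-014).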
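Abstract: To address the uncertainty of multi-label data and the presence of noisy data, a new multi-label robust fuzzy rough classification model is proposed. The model extends the k-mean robust statistics fuzzy rough classification model, which handles single-label classification. For each instance to be classified, its membership degree with respect to each label is first computed with a similarity measure; the degree of correlation between the instance and each label is then defined from these memberships; finally, a suitable threshold is assigned to each group of correlation degrees to obtain the set of relevant labels. Experiments on three standard multi-label datasets and one real multi-label text dataset show that, for multi-label text classification, the proposed model achieves higher accuracy than the commonly used ML-kNN and rank-SVM multi-label learning methods on six common multi-label evaluation metrics.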

Keywords: Fuzzy rough set  k-mean robust statistics  Membership degree  Multi-label learning

Multi-label Text Classification Based on Robust Fuzzy Rough Set Model
ZHANG Jing, LI De-yu, WANG Su-ge and LI Hua. Multi-label Text Classification Based on Robust Fuzzy Rough Set Model[J]. Computer Science, 2015, 42(7): 270-275.
Authors:ZHANG Jing  LI De-yu  WANG Su-ge and LI Hua
Affiliation: School of Computer & Information Technology, Shanxi University, Taiyuan 030006, China, School of Computer & Information Technology, Shanxi University, Taiyuan 030006, China; Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006, China, School of Computer & Information Technology, Shanxi University, Taiyuan 030006, China; Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006, China and School of Computer & Information Technology, Shanxi University, Taiyuan 030006, China
Abstract: To address the uncertainty of multi-label data and the presence of noisy data, a novel multi-label robust fuzzy rough classification model is proposed. It extends the k-mean robust statistics fuzzy rough classification model, which was designed for the single-label classification problem. First, for each unlabeled instance, the membership with respect to each label is obtained by similarity measures. Second, the degree of correlation between the instance and each label is defined from these memberships. Finally, an appropriate threshold is applied to separate the correlated labels from the uncorrelated ones. Experimental results on three benchmark multi-label datasets and one real-world multi-label text dataset indicate that the proposed model outperforms ML-kNN and rank-SVM on six popular multi-label evaluation metrics.
Keywords:Fuzzy rough set  k-mean robust statistics  Membership  Multi-label learning
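
The three steps named in the abstract (membership, degree of correlation, threshold) can be illustrated with a minimal Python sketch. This is not the authors' formulation: the abstract does not give the formulas, so the Gaussian similarity, the "mean of the k smallest dissimilarities to samples lacking the label" membership, the normalization by the largest membership, and the single global threshold are all assumptions, and every function and parameter name below is hypothetical.

```python
import math


def gaussian_similarity(a, b, sigma=1.0):
    """Assumed similarity measure: Gaussian kernel on squared Euclidean distance."""
    d2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def label_memberships(x, X_train, Y_train, k=2, sigma=1.0):
    """Step 1 (assumed form): membership of x in each label.

    A classical fuzzy rough lower approximation takes the minimum of
    (1 - similarity) over training samples that lack the label. Here the
    minimum is replaced by the mean of the k smallest values, so a single
    noisy sample cannot drive the membership to zero.
    """
    sims = [gaussian_similarity(x, xi, sigma) for xi in X_train]
    memberships = []
    for j in range(len(Y_train[0])):
        dissim = sorted(1.0 - s for s, y in zip(sims, Y_train) if y[j] == 0)
        if not dissim:
            # Every training sample carries label j, so nothing contradicts it.
            memberships.append(1.0)
        else:
            nearest = dissim[:k]
            memberships.append(sum(nearest) / len(nearest))
    return memberships


def predict_labels(x, X_train, Y_train, k=2, sigma=1.0, threshold=0.5):
    """Steps 2 and 3 (assumed form): scale memberships by the largest one to
    get a degree of correlation per label, then keep labels at or above the
    threshold."""
    m = label_memberships(x, X_train, Y_train, k, sigma)
    top = max(m)
    relevance = [v / top if top > 0 else 0.0 for v in m]
    return [1 if r >= threshold else 0 for r in relevance]


# Toy data: two well-separated clusters, one label each (rows of Y_train are label vectors).
X_train = [[0.0, 0.0], [0.0, 0.1], [5.0, 5.0], [5.0, 5.1]]
Y_train = [[1, 0], [1, 0], [0, 1], [0, 1]]
predicted = predict_labels([0.05, 0.0], X_train, Y_train)
print(predicted)
```

In this sketch a larger `k` averages over more contradicting samples and so makes the membership less sensitive to an isolated mislabeled neighbor, at the cost of a blurrier decision boundary.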