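A Semi-supervised Feature Selection and Clustering Algorithm Based on Multi-view Data
Cite as:Wang Jingqi, Xu Linli. A Semi-supervised Feature Selection and Clustering Algorithm Based on Multi-view Data[J]. Journal of Data Acquisition & Processing (数据采集与处理), 2015, 30(1): 106-116.
Authors:Wang Jingqi  Xu Linli
Affiliation:School of Computer Science and Technology, University of Science and Technology of China
Funding:Supported by the National Natural Science Foundation of China (61375060) and the Fundamental Research Funds for the Central Universities (WK0110000036)
Abstract:Many features in high-dimensional data are irrelevant or redundant, which poses a great challenge to traditional learning algorithms. Feature selection emerged to address this problem. Meanwhile, in many practical problems the data come in multiple views and labels are hard to obtain, so multi-view learning and semi-supervised learning have become hot topics in machine learning. This paper studies how to select a maximum-relevance, minimum-redundancy feature subset from partially labeled multi-view data, and proposes a multi-view semi-supervised feature selection method. To remove redundant and irrelevant features, the method explores the complementary information contained in multi-view data and the redundancy among features within each view, and uses the information in a small amount of labeled data together with the unlabeled data to perform feature selection. Experimental results verify that the algorithm achieves good feature selection and clustering performance.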

Keywords:clustering  semi-supervised  feature selection  multi-view

Semi-supervised Feature Selection and Clustering for Multi-view Data
Wang Jingqi, Xu Linli. Semi-supervised Feature Selection and Clustering for Multi-view Data[J]. Journal of Data Acquisition & Processing, 2015, 30(1): 106-116.
Authors:Wang Jingqi  Xu Linli
Affiliation:School of Computer Science and Technology, University of Science and Technology of China
Abstract:Many features in high-dimensional data are redundant or irrelevant, and feature selection was introduced to address this problem. Meanwhile, many machine learning problems involve examples that naturally comprise multiple views and carry only a limited number of labels, so multi-view learning and semi-supervised learning have become active research topics. The authors therefore investigate how to select relevant features with minimum redundancy from multi-view data with a limited number of labels, and propose a semi-supervised feature selection and clustering framework. To remove redundant and irrelevant features, the framework exploits relations among views and relations among features within each view, and uses the limited labeled data to guide feature selection. The framework is systematically evaluated on multi-view datasets, and the results demonstrate the effectiveness and potential of the proposed method.
Keywords:clustering  semi-supervised  feature selection  multi-view
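
The abstract names a "maximum relevance, minimum redundancy" criterion but does not spell out the algorithm. The sketch below is not the authors' method: it is a generic greedy selection on a single view that uses only labeled rows, with absolute Pearson correlation standing in (as an assumption) for both relevance and redundancy. The function name `greedy_mrmr` and the synthetic data are hypothetical and serve only to illustrate the criterion.

```python
import numpy as np

def greedy_mrmr(X, y, k):
    """Greedily pick up to k feature indices (illustrative only).

    Relevance  = |Pearson correlation| between a feature and the labels.
    Redundancy = mean |Pearson correlation| with features already picked.
    Both measures are assumptions made for this sketch.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_features = X.shape[1]
    if k <= 0 or n_features == 0:
        return []
    # Correlation matrix of all features plus the label as the last column.
    corr = np.abs(np.corrcoef(np.column_stack([X, y]), rowvar=False))
    corr = np.nan_to_num(corr)  # constant columns yield NaN; treat as 0
    relevance = corr[:n_features, -1]
    redundancy = corr[:n_features, :n_features]

    selected = [int(np.argmax(relevance))]  # start with the most relevant
    while len(selected) < min(k, n_features):
        remaining = [j for j in range(n_features) if j not in selected]
        # Score = relevance minus average redundancy with the chosen set.
        scores = [relevance[j] - redundancy[j, selected].mean() for j in remaining]
        selected.append(remaining[int(np.argmax(scores))])
    return selected

# Synthetic single-view data (hypothetical): column 1 is a noisy copy of
# column 0, column 2 carries independent signal, column 3 is pure noise.
rng = np.random.default_rng(0)
a, b, n1, n2 = rng.standard_normal((4, 500))
X = np.column_stack([a, a + 0.5 * n1, b, n2])
y = 2 * a + b
selected = greedy_mrmr(X, y, k=2)
print(selected)
```

A semi-supervised, multi-view method as described in the abstract would additionally draw on unlabeled samples and on relations across views, neither of which this sketch attempts.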
This article is indexed in CNKI, Wanfang Data (万方数据), and other databases.