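Research on Dimension Reduction of Data Streams Based on Kernel Principal Component Analysis
Cite this article:GAO Hongbin, HOU Jie, LI Ruiguang. Research on dimension reduction of data stream based on kernel principal component analysis[J]. Computer Engineering and Applications (计算机工程与应用), 2013, 49(11): 105-109.
Authors:GAO Hongbin (高宏宾)  HOU Jie (侯 杰)  LI Ruiguang (李瑞光)
Affiliation:School of Computer Science, Wuyi University (五邑大学), Jiangmen, Guangdong 529020, China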
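Abstract:The principles and implementation of the data stream dimension reduction algorithms PCA and KPCA are analyzed. Because the linear dimension reduction of PCA cannot reduce dimension effectively on large data sets, and KPCA is inefficient on them, a new dimension reduction strategy, the GKPCA algorithm, is proposed. The algorithm first partitions the data set into groups and runs KPCA on each group, then filters and recombines the data set and applies KPCA again, which simplifies the sample space and reduces both time and space complexity. Experimental analysis shows that GKPCA not only achieves good dimension reduction but also takes little time.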

Keywords:kernel principal component analysis  data stream  dimension reduction

Research on dimension reduction of data stream based on kernel principal component analysis
GAO Hongbin,HOU Jie,LI Ruiguang.Research on dimension reduction of data stream based on kernel principal component analysis[J].Computer Engineering and Applications,2013,49(11):105-109.
Authors:GAO Hongbin  HOU Jie  LI Ruiguang
Affiliation:School of Computer Science and Technology, Wuyi University, Jiangmen, Guangdong 529020, China
Abstract:The theory and implementation of two data stream dimension reduction algorithms, PCA and KPCA, are analyzed. Because linear PCA cannot reduce dimension effectively on large-scale stream data and KPCA is inefficient at that scale, a new dimension reduction technique called GKPCA is proposed. With GKPCA, the data set is first partitioned into groups and KPCA is applied to each group. The data are then filtered and recombined into a new data set, and KPCA is applied again to the new data set. This process proceeds recursively until a reduction threshold is reached, which simplifies the data stream sample space and reduces the time and space complexity of KPCA. Experimental analysis over different data sets shows that GKPCA achieves good dimension reduction with less time consumption.
Keywords:Kernel Principal Component Analysis(KPCA)  data stream  dimension reduction  
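
The abstract describes GKPCA only in outline (group, run KPCA per group, filter, recombine, run KPCA again), so the following is a minimal single-pass sketch of that outline, not the authors' implementation. The RBF kernel, the `gamma` value, the fixed `group_size`, and especially the filter rule (keep the samples with the largest projection norm in each group, controlled by a hypothetical `keep_ratio`) are all assumptions made for illustration; the abstract does not specify them. The recursive repetition until a reduction threshold is reached is also omitted.

```python
import numpy as np


def rbf_kernel(X, Y, gamma):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Y."""
    sq = (X ** 2).sum(1)[:, None] + (Y ** 2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def kpca(X, n_components, gamma):
    """Plain kernel PCA: project the rows of X onto the leading components."""
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one  # centre the data in feature space
    vals, vecs = np.linalg.eigh(Kc)             # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:n_components]
    vals = np.maximum(vals[idx], 1e-12)         # guard against tiny negative values
    return vecs[:, idx] * np.sqrt(vals)         # training projections: sqrt(lambda) * v


def gkpca(X, n_components, group_size, keep_ratio, gamma):
    """Grouped KPCA sketch: KPCA per group, filter, recombine, KPCA again."""
    kept = []
    for start in range(0, X.shape[0], group_size):
        group = X[start:start + group_size]
        Z = kpca(group, min(n_components, group.shape[0]), gamma)
        # ASSUMPTION: the filter keeps the samples with the largest projection
        # norm; the abstract does not say how filtering is actually done.
        score = np.linalg.norm(Z, axis=1)
        n_keep = max(1, int(np.ceil(keep_ratio * group.shape[0])))
        kept.append(group[np.argsort(score)[::-1][:n_keep]])
    reduced = np.vstack(kept)                   # recombined, smaller sample set
    Z = kpca(reduced, min(n_components, reduced.shape[0]), gamma)
    return reduced, Z


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))
    reduced, Z = gkpca(X, n_components=2, group_size=50, keep_ratio=0.5, gamma=0.1)
    print(reduced.shape, Z.shape)
```

The saving claimed in the abstract is plausible under this reading: the eigendecomposition in KPCA costs on the order of n^3 time and n^2 memory for n samples, so running it on groups of size m and then on a filtered subset avoids ever forming the full n-by-n kernel matrix.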