
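Research on Detecting Social Spam with KPCA-SVM Method (用KPCA-SVM的方法检测垃圾标签的研究)
Cite this article: XI Yang (习扬), SU Yi-dan (苏一丹), QIN Xi (覃希). Research on Detecting Social Spam with KPCA-SVM Method[J]. Computer Technology and Development (计算机技术与发展), 2014, 0(5): 65-69
Authors: XI Yang  SU Yi-dan  QIN Xi
Affiliation: College of Computer and Electronic Information, Guangxi University, Nanning 530004, Guangxi, China
Funding: Humanities and Social Sciences Research Project of the Ministry of Education (11YJAZH080)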
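Abstract: When high-dimensional data are processed, the number of samples required grows exponentially while the distances between samples become less informative, which leads to the curse of dimensionality. Text tag data typically suffer from excessive dimensionality, which hampers the detection of spam tags. This paper uses the mathematical model of the support vector machine (SVM) to build a large-scale spam tag detection model for Folksonomy. To reduce the impact of high dimensionality on spam tag detection, and inspired by kernel principal component analysis (KPCA) theory, the idea of dimensionality reduction is introduced into the field of data reduction, and a KPCA-based reduction model for large-scale SVM data sets is proposed. These are finally instantiated as a new spam tag detection method: a large-scale spam tag detection model based on kernel principal component analysis and support vector machine (KPCA-SVM). In spam tag detection, the model shortens testing time without affecting the data characteristics and achieves good detection performance.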

Keywords: dimensionality reduction  kernel principal component analysis  support vector machine  spam tags

Research on Detecting Social Spam with KPCA-SVM Method
XI Yang, SU Yi-dan, QIN Xi. Research on Detecting Social Spam with KPCA-SVM Method[J]. Computer Technology and Development, 2014, 0(5): 65-69
Authors: XI Yang  SU Yi-dan  QIN Xi
Affiliation: College of Computer and Electronic Information, Guangxi University, Nanning 530004, China
Abstract: When high-dimensional data are processed, the number of samples required grows exponentially while the distances between samples become less informative, which leads to the curse of dimensionality. Text tag data usually have very high dimensionality, which hinders the detection of social spam. This paper uses the mathematical model of the support vector machine (SVM) to construct a large-scale social spam detection model for Folksonomy. To reduce the influence of high dimensionality, and inspired by kernel principal component analysis (KPCA) theory, the idea of dimensionality reduction is introduced into data reduction, and a KPCA-based reduction model for large-scale SVM data sets is proposed. Together these form a new social spam detection method: a large-scale social spam detection model based on kernel principal component analysis and support vector machine (KPCA-SVM). In social spam detection, the model shortens testing time without affecting the data characteristics and achieves good detection performance.
Keywords: dimensionality reduction  kernel principal component analysis  support vector machine  social spam
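
The abstract describes a two-stage pipeline: project the high-dimensional tag vectors onto a few kernel principal components, then train an SVM on the reduced features. The abstract gives no implementation details, so the following is only a minimal standard-library sketch of that general idea, not the authors' model. The RBF kernel, `gamma=0.5`, the power-iteration eigen-solver, the subgradient-trained linear SVM, and the six-dimensional toy "tag vectors" are all assumptions made for illustration.

```python
import math

def rbf(a, b, gamma):
    """RBF kernel k(a, b) = exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))

def fit_kpca(X, n_components, gamma, iters=200):
    """Kernel PCA via power iteration with deflation (only sensible for tiny n)."""
    n = len(X)
    K = [[rbf(a, b, gamma) for b in X] for a in X]
    row_mean = [sum(row) / n for row in K]
    all_mean = sum(row_mean) / n
    # Double-centre K, which centres the data in the implicit feature space.
    Kc = [[K[i][j] - row_mean[i] - row_mean[j] + all_mean for j in range(n)]
          for i in range(n)]
    alphas = []
    for c in range(n_components):
        v = [math.sin(i + 1 + c) for i in range(n)]  # deterministic start vector
        eigval = 0.0
        for _ in range(iters):
            w = [sum(Kc[i][j] * v[j] for j in range(n)) for i in range(n)]
            eigval = math.sqrt(sum(t * t for t in w))
            v = [t / eigval for t in w]
        # Deflate: remove the component just found so the next pass finds the next one.
        for i in range(n):
            for j in range(n):
                Kc[i][j] -= eigval * v[i] * v[j]
        # Scale so the corresponding feature-space axis has unit length.
        alphas.append([t / math.sqrt(eigval) for t in v])
    return {"X": X, "gamma": gamma, "row_mean": row_mean,
            "all_mean": all_mean, "alphas": alphas}

def kpca_transform(model, x):
    """Project one sample onto the learned kernel principal components."""
    X, n = model["X"], len(model["X"])
    k = [rbf(x, xi, model["gamma"]) for xi in X]
    k_mean = sum(k) / n
    # Centre the new kernel vector consistently with the training centring.
    kc = [k[j] - k_mean - model["row_mean"][j] + model["all_mean"] for j in range(n)]
    return [sum(a[j] * kc[j] for j in range(n)) for a in model["alphas"]]

def train_linear_svm(Z, y, reg=0.01, lr=0.1, epochs=200):
    """Linear SVM trained by subgradient descent on the regularised hinge loss."""
    w, b = [0.0] * len(Z[0]), 0.0
    for _ in range(epochs):
        for z, t in zip(Z, y):
            margin = t * (sum(wi * zi for wi, zi in zip(w, z)) + b)
            if margin < 1:  # inside the margin: hinge term contributes
                w = [wi - lr * (reg * wi - t * zi) for wi, zi in zip(w, z)]
                b += lr * t
            else:           # outside the margin: only the regulariser acts
                w = [wi - lr * reg * wi for wi in w]
    return w, b

# Hypothetical toy data: 6-dimensional tag vectors, -1 = legitimate, +1 = spam.
X, y = [], []
for i in range(10):
    j = 0.05 * i
    X.append([1.0 + j, 0.0, 1.0, 0.0, j, 0.0]); y.append(-1)
    X.append([0.0, 1.0 + j, 0.0, 1.0, 0.0, 2.0 - j]); y.append(+1)

kpca = fit_kpca(X, n_components=2, gamma=0.5)   # reduce 6 dimensions to 2
Z = [kpca_transform(kpca, x) for x in X]
w, b = train_linear_svm(Z, y)

def predict(x):
    """Classify a raw tag vector: KPCA projection first, then the linear SVM."""
    z = kpca_transform(kpca, x)
    return 1 if sum(wi * zi for wi, zi in zip(w, z)) + b >= 0 else -1
```

Calling `predict` on a raw six-dimensional vector returns `-1` or `+1`; the classifier itself only ever sees the two projected features, which is where any saving in test time would come from. For realistic data sizes a numerical library would replace the hand-written eigen-solver and SVM trainer.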
This article is indexed in VIP (维普) and other databases.