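Research on the Outlier One-Class Support Vector Machine Algorithm
Cite this article: Tian Jiang, Gu Hong. Research on the Outlier One-Class Support Vector Machine Algorithm[J]. Journal of Electronics & Information Technology, 2010, 32(6): 1284-1288. doi: 10.3724/SP.J.1146.2009.00861
Authors: Tian Jiang (田江), Gu Hong (顾宏)
Affiliation: School of Electronic and Information Engineering, Dalian University of Technology, Dalian 116023, China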
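Abstract: The one-class support vector machine (OCSVM) maps data samples into a high-dimensional space and detects outliers with a hyperplane in feature space that keeps the maximum margin from the origin. In practice, the algorithm depends strongly on the choice of the origin, and its detection performance is heavily affected by the distribution of the samples. Recasting the algorithm as a two-class problem overcomes these shortcomings to some extent, but the resulting class imbalance problem is aggravated because outlier samples are, in reality, scarce or absent altogether. This paper proposes an outlier one-class support vector machine algorithm and, on that basis, designs an unsupervised outlier detection method. Two outlier degrees are defined, one based on the distance to the hyperplane and one based on the probabilistic output; the two are merged using different weights, and the feature information of the resulting suspicious outliers is fed into the algorithm. A hyperplane with maximum margin from the suspicious outliers is then constructed in feature space, and the predicted outputs over all samples are analyzed in order to interactively update the two parts of the data. Simulation experiments on UCI datasets show that the proposed method effectively improves the detection rate and lowers the false alarm rate, and that the cross-updating of samples improves the stability of detection.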

Keywords: Outlier mining  One-class support vector machine  Cancer detection
Received: 2009-06-09
Revised: 2009-10-16
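
The weighted merging of the two outlier degrees described in the abstract can be illustrated with a minimal sketch. It is not the authors' formulation: the min-max scaling of the distance degree, the sigmoid used for the probabilistic degree, its parameters `a` and `b`, the default `weight`, and the sample decision values are all assumptions made for illustration. The only premise taken from the abstract is that each sample has a signed output relative to the hyperplane and that the two degrees are combined with weights.

```python
import math

def distance_degree(decision_values):
    """Distance-based degree: negated decision values scaled to [0, 1].

    Assumes a positive decision value means "inside the boundary", so a
    larger degree means a more anomalous sample.
    """
    neg = [-v for v in decision_values]
    lo, hi = min(neg), max(neg)
    if hi == lo:
        return [0.0 for _ in neg]
    return [(v - lo) / (hi - lo) for v in neg]

def probability_degree(decision_values, a=1.0, b=0.0):
    """Probability-style degree from a sigmoid; a and b are hypothetical."""
    return [1.0 / (1.0 + math.exp(a * v + b)) for v in decision_values]

def combined_degree(decision_values, weight=0.5, a=1.0, b=0.0):
    """Weighted sum of the two degrees; weight is a hypothetical setting."""
    d = distance_degree(decision_values)
    p = probability_degree(decision_values, a, b)
    return [weight * di + (1.0 - weight) * pi for di, pi in zip(d, p)]

def pick_suspicious(decision_values, k, weight=0.5):
    """Return the indices of the k samples with the highest combined degree."""
    scores = combined_degree(decision_values, weight)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]

# Hypothetical decision values for six samples (not data from the paper).
values = [1.2, 0.8, -0.5, 0.9, -1.5, 1.0]
suspicious = pick_suspicious(values, k=2)
print(suspicious)
```

With these made-up values, the two samples with negative decision values receive the highest combined degree and are selected as suspicious outliers.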

Outlier One Class Support Vector Machines
Tian Jiang, Gu Hong. Outlier One Class Support Vector Machines[J]. Journal of Electronics & Information Technology, 2010, 32(6): 1284-1288. doi: 10.3724/SP.J.1146.2009.00861
Authors:Tian Jiang  Gu Hong
Affiliation:School of Electronic and Information Engineering, Dalian University of Technology, Dalian 116023, China
Abstract:One-Class Support Vector Machines (OCSVMs) detect outliers by computing a hyperplane in feature space that separates the data from the origin with maximum margin. The choice of the origin as the separation point is arbitrary and affects the decision boundary, and the distribution of the samples also influences performance. Recasting the algorithm as a two-class classification problem overcomes these drawbacks to a certain degree; however, the class imbalance problem is serious, because labeled outliers are rare or even nonexistent. In this paper, a new “Outlier OCSVM” is proposed and a framework is designed for unsupervised outlier detection. Two definitions of outlier degree are presented, scored by the distance from the hyperplane and by the probabilistic output value, respectively. After some suspicious outliers are picked out by combining the two criteria, the adjusted “Outlier OCSVM” is trained, and the two parts of the dataset are updated interactively by comparing the outputs. Experimental results on benchmark datasets show that the method effectively improves the detection rate and reduces the false positive rate, and that it is simple and reliable.
Keywords:Outlier mining  OCSVMs  Cancer detection
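
The interactive update of the two parts of the dataset mentioned in the abstract might be organized roughly as in the sketch below. This is an illustration under stated assumptions, not the paper's algorithm: the trained “Outlier OCSVM” is replaced by a simple nearest-centroid scorer (`score_all`), the number of suspicious outliers is held fixed at its initial size, and the function names, stopping rule, and toy data are all hypothetical.

```python
import math

def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(p[j] for p in points) / n for j in range(len(points[0]))]

def score_all(data, suspicious):
    """Stand-in for the trained classifier: higher score = more normal.

    Scores each sample by its distance to the suspicious centroid minus its
    distance to the normal centroid. Assumes both groups are non-empty.
    """
    normal_c = centroid([x for i, x in enumerate(data) if i not in suspicious])
    outlier_c = centroid([data[i] for i in suspicious])
    return [math.dist(x, outlier_c) - math.dist(x, normal_c) for x in data]

def interactive_update(data, initial_suspicious, max_rounds=10):
    """Alternate between scoring all samples and re-splitting the two parts."""
    suspicious = set(initial_suspicious)
    k = len(suspicious)  # assumption: the suspicious set keeps a fixed size
    for _ in range(max_rounds):
        scores = score_all(data, suspicious)
        ranked = sorted(range(len(data)), key=lambda i: scores[i])
        updated = set(ranked[:k])  # the k lowest-scoring samples
        if updated == suspicious:  # stop once the split no longer changes
            break
        suspicious = updated
    return sorted(suspicious)

# Toy 2-D data (not from the paper): five clustered points and two far away.
data = [(0.0, 0.0), (0.1, 0.2), (-0.2, 0.1), (0.2, -0.1), (0.0, 0.3),
        (5.0, 5.0), (6.0, 5.5)]
# Deliberately imperfect starting guess: index 0 belongs to the cluster.
result = interactive_update(data, initial_suspicious=[0, 5])
print(result)
```

Starting from a partly wrong initial guess, the loop moves the clustered sample back into the normal part and swaps in the second distant point, then stops once the split is stable.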
This article is indexed in Wanfang Data (万方数据) and other databases.