Similar literature
 20 similar articles found (search time: 15 ms)
1.
This paper presents a feature selection method for data classification that combines a model-based variable selection technique with a fast two-stage subset selection algorithm. The relationship between a specified (and complete) set of candidate features and the class label is modeled using a non-linear full regression model that is linear in the parameters. The performance of a sub-model, measured by the sum of squared errors (SSE), is used to score the informativeness of the subset of features involved in that sub-model. The two-stage subset selection algorithm approaches a solution sub-model at which the SSE is locally minimized. The features involved in the solution sub-model are selected as inputs to support vector machines (SVMs) for classification. The memory requirement of this algorithm is independent of the number of training patterns, which makes the method suitable for applications running on mobile devices where physical RAM is very limited. An application was developed for activity recognition, which implements the proposed feature selection algorithm and an SVM training procedure. Experiments are carried out with the application running on a PDA for human activity recognition using accelerometer data. A comparison with an information gain-based feature selection method demonstrates the effectiveness and efficiency of the proposed algorithm.

2.
The focus of this paper is on joint feature re-extraction and classification in cases when the training data set is small. An iterative semi-supervised support vector machine (SVM) algorithm is proposed, in which each iteration consists of both feature re-extraction and classification, and the feature re-extraction is based on the classification results from the previous iteration. Feature extraction is first discussed in the framework of Rayleigh coefficient maximization. The effectiveness of the common spatial pattern (CSP) feature, which is commonly used in electroencephalogram (EEG) data analysis and EEG-based brain-computer interfaces (BCIs), can be explained by Rayleigh coefficient maximization. Two other features are also defined using the Rayleigh coefficient. These features are effective for discriminating two classes with different means or different variances. If features are extracted by Rayleigh coefficient maximization, a large labeled training data set is required in general; otherwise, the extracted features are not reliable. Thus we present an iterative semi-supervised SVM algorithm embedded with feature re-extraction. This iterative algorithm can be used to extract these three features reliably and perform classification simultaneously in cases where the training data set is small. Each iteration is composed of two main steps: (i) the training data set is updated/augmented using unlabeled test data with their predicted labels, and features are re-extracted based on the augmented training data set; (ii) the re-extracted features are classified by a standard SVM. Regarding parameter setting and model selection for our algorithm, we also propose a semi-supervised learning-based method using the Rayleigh coefficient, in which both training data and test data are used. This method is suitable when cross-validation model selection may not work because the training data set is small. Finally, the results of data analysis are presented to demonstrate the validity of our approach. Editor: Olivier Chapelle.
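For orientation, the Rayleigh coefficient referred to in this abstract is usually written in the generic form below. The symbols $w$, $A$, $B$ and $\lambda$ are notation assumed here, not taken from the abstract; for CSP, $A$ and $B$ would typically be built from the class covariance matrices, but the abstract does not spell out the paper's exact choice.

```latex
% Generic Rayleigh coefficient for a projection vector w, with A symmetric
% and B symmetric positive definite (assumed notation).
J(w) = \frac{w^{\top} A\, w}{w^{\top} B\, w}

% Stationary points of J satisfy the generalized eigenvalue problem
A\, w = \lambda\, B\, w ,
% and J(w) = \lambda at such a point, so the maximizer is the eigenvector
% that belongs to the largest generalized eigenvalue.
```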

3.
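Tuo Shouheng (拓守恒). System Simulation Technology (《系统仿真技术》), 2010, 6(3): 202-208, 240
To address the strong randomness and large size of training subsets and the high time and space complexity of the algorithm, an ensemble learning algorithm for support vector machine kernel functions based on quantum-behaved particle swarm optimization (QPSO-SVM) is proposed. The method first clusters the training samples with the K-Means algorithm, then selects a small number of representative samples according to the cluster distribution and trains each individual support vector machine (SVM) with quantum-behaved particle swarm optimization, and finally obtains the ensemble SVM classifier through Bayesian voting. Experiments show that the method considerably improves classification accuracy on nonlinear, highly complex data.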

4.
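Research on optimal model selection for support vector machines   Total citations: 18, self-citations: 0, citations by others: 18
By studying the kernel matrix and exploiting its symmetric positive definiteness, an algorithm for optimal SVM model selection, the OMSA algorithm, is proposed on the basis of kernel alignment. It uses the training samples to find the optimal kernel parameters and the corresponding optimal learning model without going through the standard SVM training and testing procedure, which remedies the heavy reliance on experience and the high computational cost of traditional SVM model selection. Experiments on UCI benchmark data sets and the FERET standard face database show that the kernel parameters and corresponding kernel matrix found by the algorithm are optimal and that the resulting SVM classifier has the lowest error rate. The algorithm offers a feasible method for optimal SVM model selection and is also of reference value for other kernel-based learning methods.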
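The abstract does not give the details of OMSA, so the following is only a minimal sketch of the general idea it builds on: scoring a kernel parameter by kernel-target alignment instead of training and testing an SVM. The RBF kernel, the ±1 label encoding, the candidate grid and all function names are assumptions made for illustration.

```python
import math

def rbf_kernel_matrix(points, gamma):
    """Gram matrix with K[i][j] = exp(-gamma * ||x_i - x_j||^2)."""
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(p, q)))
             for q in points] for p in points]

def alignment(kernel, labels):
    """Kernel-target alignment <K, yy^T>_F / (||K||_F * ||yy^T||_F).

    Labels are assumed to be -1 or +1, so ||yy^T||_F equals len(labels).
    """
    n = len(labels)
    inner = sum(kernel[i][j] * labels[i] * labels[j]
                for i in range(n) for j in range(n))
    k_norm = math.sqrt(sum(v * v for row in kernel for v in row))
    return inner / (k_norm * n)

def best_gamma(points, labels, gammas):
    """Pick the candidate gamma whose Gram matrix aligns best with the labels."""
    return max(gammas, key=lambda g: alignment(rbf_kernel_matrix(points, g), labels))

# Toy usage: two well-separated groups of 2-D points (hypothetical data).
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels = [1, 1, -1, -1]
chosen = best_gamma(points, labels, [0.001, 0.1, 10.0])
```

No SVM is trained anywhere in this sketch; only Gram matrices are evaluated, which is what makes alignment-style selection cheap compared with a cross-validated grid search.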

5.
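A pattern recognition algorithm, the two-layer support vector machine, is proposed to improve the recognition accuracy of surface electromyography (sEMG). The algorithm combines the parallel approach of meta-learning in ensemble learning with the progressive idea of stacking: basic SVM classifiers are arranged in parallel in the first layer, the predictions of the first layer serve as the input of the second layer, and the second layer performs the final classification, so that multi-source features are fused through a multi-layer combination of classifiers. With an arm sEMG data set as test data, the two-layer SVM feeds the EMG signal of each muscle into its own base SVM, and a combiner fuses the signal features of the individual muscles to recognize the EMG signals of the forearm muscle group as a whole, thereby achieving accurate recognition of motion intention. Experimental results show that the algorithm outperforms a single SVM classifier in prediction accuracy, and outperforms ensemble classifiers such as random forest and rotation forest in predictive performance (recognition accuracy, running time, robustness).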

6.
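To address the conflict between detection accuracy and real-time performance in image-based fire detection, a rough-set-based algorithm for fire image feature selection and recognition is proposed. First, a close study of flame image features shows that, driven by combustion energy, the upper edge of a flame is highly irregular and jitters visibly, whereas the lower edge does just the opposite; on this basis, the ratio of the numbers of jitter projections of the upper and lower edges can be used to distinguish flames from interference with more regular edge shapes. Then, six salient flame features are selected to construct training samples; provided that the fire classification ability is unaffected, the feature classification table obtained from experiments is used to perform attribute reduction on the training samples, and the attributes of the reduced information system are used to train a support vector machine model for fire detection. Finally, the method is compared with a traditional SVM fire detection algorithm. Experimental results show that using rough sets as a front end to the SVM classifier, that is, introducing rough-set attribute reduction into the SVM, largely eliminates redundant attributes of the sample set, reduces the dimensionality of the fire image feature space, and reduces the classifier's training and detection data, improving the speed and generalization ability of the algorithm while maintaining recognition accuracy.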

7.
Top Scoring Pair (TSP) and its ensemble counterpart, k-Top Scoring Pair (k-TSP), were recently introduced as competitive options for solving classification problems on microarray data. However, the support vector machine (SVM) that was compared with these approaches is not equipped with a feature or variable selection mechanism, while TSP itself is a kind of variable selection algorithm. Moreover, an ensemble of SVMs should also be considered as a possible competitor to k-TSP. In this work, we conducted a fair comparison between TSP and SVM-recursive feature elimination (SVM-RFE) as the feature selection method for SVM. We also compared k-TSP with two ensemble methods using SVM as their base classifier. Results on ten public-domain microarray data sets indicated that TSP family classifiers serve as good feature selection schemes which may be combined effectively with other classification methods.
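As a rough illustration of why TSP acts as a variable selection step, the sketch below scores every feature pair by how differently the ordering x_i < x_j occurs in the two classes and keeps the best pair. It is a simplified reading of the technique: the tie-breaking rule of the published method is omitted, and the function name, the 0/1 label encoding and the toy data are assumptions.

```python
from itertools import combinations

def top_scoring_pair(samples, labels):
    """Return (score, i, j) for the feature pair whose ordering best splits two classes.

    score = |P(x_i < x_j | class 0) - P(x_i < x_j | class 1)|.
    Both classes are assumed to be non-empty; ties keep the first pair found.
    """
    groups = {c: [s for s, y in zip(samples, labels) if y == c] for c in (0, 1)}
    best = (-1.0, None, None)
    for i, j in combinations(range(len(samples[0])), 2):
        # Fraction of samples in each class where feature i is below feature j.
        p = [sum(s[i] < s[j] for s in groups[c]) / len(groups[c]) for c in (0, 1)]
        score = abs(p[0] - p[1])
        if score > best[0]:
            best = (score, i, j)
    return best

# Toy usage: three hypothetical expression values per sample.
samples = [[1, 2, 5], [2, 3, 4], [3, 1, 6], [4, 2, 5]]
labels = [0, 0, 1, 1]
score, i, j = top_scoring_pair(samples, labels)
```

Only the two selected features are needed at prediction time, which is why the pair can also be handed to another classifier as a two-feature subset.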

8.
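Li Jian (李剑), Jiang Chengshun (江成顺), Dong Liying (董丽英). Computer Engineering (《计算机工程》), 2010, 36(13): 180-182
A classification method for speech and voice-band data signals based on a selective ensemble of support vector machines is proposed. Following the definition of diversity for ensemble algorithms, a dynamic stacking algorithm with a two-layer cascade structure produces the decision output. In the training phase the method accurately selects member classifiers with high recognition accuracy and diversity, and in the testing phase it combines the member classifiers dynamically to ensure the best final classification result. A feature vector combining time-domain and frequency-domain features is constructed and shows good noise robustness. Experimental results show that the method performs well in both classification and computational complexity.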

9.
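SVM color image segmentation method based on automatic selection of training samples   Total citations: 1, self-citations: 0, citations by others: 1
Zhang Rong (张荣), Wang Wenjian (王文剑), Bai Xuefei (白雪飞). Computer Science (《计算机科学》), 2012, 39(11): 267-271
Image segmentation is an important research topic in pattern recognition, image understanding, computer vision, and related fields. Methods based on the support vector machine (SVM) are now widely used for image segmentation, but their training samples are mostly selected manually, which reduces the adaptivity of the segmentation and impairs the classification performance of the SVM. An SVM color image segmentation method based on automatic selection of training samples is proposed. The algorithm first uses the fuzzy C-means (FCM) clustering algorithm to obtain training samples automatically, then extracts color and texture features of the image and uses them as the feature attributes of the training samples to train the SVM model, and finally segments the image with the trained classifier. Experimental results show that the proposed method achieves good segmentation results.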

10.
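Yao Quanzhu (姚全珠), Tian Yuan (田元). Computer Engineering (《计算机工程》), 2008, 34(15): 223-225
Parameter settings in a support vector machine have a non-negligible effect on the classification accuracy of the trained SVM. Selecting SVM parameters can be viewed as a combinatorial optimization over the parameters. The immune algorithm is an effective stochastic global optimization technique that is less prone to getting trapped in local optima, yields high-precision solutions, and converges quickly. This paper uses an artificial immune algorithm for SVM model selection. The algorithm mainly comprises clonal selection, hypermutation, and receptor editing operations. Experiments show that the algorithm effectively improves the accuracy of SVM classification.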

11.
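Distance retrieval and object queries on undirected weighted graphs are an important part of working with such graphs and an important step in solving practical problems. This paper proposes a distance-signature-based approach to distance retrieval and querying; through distance categorization, signature encoding, and compression, it achieves efficient retrieval and querying and reduces storage space. The modeling and the processing of KNN queries are described, and experiments demonstrate the effectiveness of the method.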

12.
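SVM selective ensemble algorithm based on improved discrete binary particle swarm optimization   Total citations: 1, self-citations: 0, citations by others: 1
To address the low classification accuracy of the SVM selective ensemble algorithm based on discrete binary particle swarm optimization (BPSO) and the excessive number of classifiers it selects, an improved discrete binary particle swarm optimization algorithm (IBPSO) is combined with the SVM selective ensemble algorithm to give an IBPSO-based SVM selective ensemble algorithm. With a suitable fitness function and adjustment factor k, repeated simulations show that, for SVM sets generated by bootstrap, the IBPSO-based SVM selective ensemble outperforms the BPSO-based one in both accuracy and number of classifiers, which demonstrates the superiority of the IBPSO algorithm.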

13.
Currently, web spamming is a serious problem for search engines. It not only degrades the quality of search results by intentionally promoting undesirable web pages to users, but also causes the search engine to waste a significant amount of computational and storage resources on processing useless information. In this paper, we present a novel ensemble classifier for web spam detection which combines the clonal selection algorithm for feature selection and under-sampling for data balancing. This web spam detection system is called USCS. The USCS ensemble classifier can automatically sample and select sub-classifiers. First, the system converts the imbalanced training dataset into several balanced datasets using the under-sampling method. Second, the system automatically selects several optimal feature subsets for each sub-classifier using a customized clonal selection algorithm. Third, the system builds several C4.5 decision tree sub-classifiers from these balanced datasets based on their specified features. Finally, these sub-classifiers are used to construct an ensemble decision tree classifier, which is applied to classify the examples in the testing data. Experiments on the WEBSPAM-UK2006 dataset show that our proposed approach, the USCS ensemble web spam classifier, delivers a significant improvement in classification performance over several baseline systems and state-of-the-art approaches.
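The first USCS step, turning one imbalanced training set into several balanced ones, can be sketched as follows. This is only one plausible reading: the abstract does not say how the under-sampling is drawn, so random sampling without replacement, the fixed seed and the function name are assumptions.

```python
import random

def balanced_subsets(majority, minority, n_subsets, seed=0):
    """Build n_subsets balanced training sets by under-sampling the majority class.

    Each subset keeps every minority example and adds an equally sized random
    draw (without replacement) from the majority class.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [rng.sample(majority, len(minority)) + list(minority)
            for _ in range(n_subsets)]

# Toy usage: 20 normal pages and 4 spam pages (hypothetical records).
normal = [("normal", k) for k in range(20)]
spam = [("spam", k) for k in range(4)]
subsets = balanced_subsets(normal, spam, n_subsets=3)
```

Each subset would then be passed to its own sub-classifier, so different sub-classifiers see different parts of the majority class while all of them see every minority example.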

14.
Financial distress prediction (FDP) is of great importance to parties both inside and outside companies. Although much of the literature has analyzed single-classifier FDP methods comprehensively, ensemble methods for FDP emerged only in recent years and need further study. The support vector machine (SVM) shows promising performance in FDP when compared with other single-classifier methods. The contribution of this paper is a new FDP method based on an SVM ensemble, whose candidate single classifiers are trained by SVM algorithms with different kernel functions on different feature subsets of one initial dataset. SVM kernels such as linear, polynomial, RBF and sigmoid, and the filter feature selection/extraction methods of stepwise multi discriminant analysis (MDA), stepwise logistic regression (logit), and principal component analysis (PCA) are applied. The algorithm for selecting the SVM ensemble's base classifiers from the candidates is designed by considering both individual performance and diversity analysis. Weighted majority voting, based on the base classifiers' cross-validation accuracy on the training dataset, is used as the combination mechanism. Experimental results indicate that the SVM ensemble is significantly superior to an individual SVM classifier when the number of base classifiers in the ensemble is properly set. They also show that an RBF SVM based on features selected by stepwise MDA is a good choice for FDP when an individual SVM classifier is applied.
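The combination mechanism named in this abstract, weighted majority voting with cross-validation accuracy as the weight, can be sketched in a few lines. The function name, the label strings and the example accuracies are hypothetical; the abstract does not say whether the weights are normalized or transformed before voting.

```python
from collections import defaultdict

def weighted_majority_vote(predictions, weights):
    """Return the label with the largest total weight.

    predictions[k] is the label predicted by base classifier k and weights[k]
    is its weight, for example its cross-validation accuracy.
    """
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)

# Toy usage: one accurate classifier outvotes two weak ones.
decision = weighted_majority_vote(["distressed", "healthy", "healthy"],
                                  [0.9, 0.4, 0.4])
```

With equal weights this reduces to plain majority voting, so the weights only matter when the base classifiers disagree and differ in reliability.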

15.
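An ear recognition method based on PIDC and binary decision tree SVM is proposed. The PIDC method extracts ear features under the supervision of the between-class probabilistic information distance, which reduces the dimensionality of the extracted features; combining PIDC with binary decision tree SVM classification allows the multi-class probabilistic information distance to supervise both ear feature extraction and classification. Recognition experiments were carried out on 400 human ears, and the results were compared with those of the PCA method. The experiments show that the proposed method reduces classification difficulty and improves the ear recognition rate.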

16.
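Automatic face recognition based on support vector machines   Total citations: 1, self-citations: 0, citations by others: 1
Tian Xue (田雪), Ji Yubo (纪玉波), Yang Xu (杨旭). Computer Engineering (《计算机工程》), 2005, 31(5): 191-193
Features are first extracted from face images with the K-L transform and then recognized with a support vector machine. Because the SVM parameters strongly affect its performance, a genetic algorithm is used to select them. To obtain a high recognition rate with fewer features and thus speed up recognition, the number of effective features to extract is selected at the same time. The algorithm both solves the difficult problem of SVM parameter selection and achieves a high recognition rate with fewer face features. Simulation experiments on the ORL face database yielded a correct recognition rate of 97.5%, which verifies the effectiveness of the algorithm.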

17.
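To address the robustness and accuracy of 3D palmprint feature representation, a 3D palmprint recognition method that fuses the geometric and orientation features of the surface is proposed. Building on the existing surface-type coding for extracting palmprint geometric features, a shape-index-based coding is proposed to express the geometric features of the 3D palmprint jointly, which effectively reduces the loss of accuracy caused by threshold-induced coding errors. In addition, a multi-scale improved competitive code is proposed to express the orientation features of the palmprint. At the decision level, a multi-dictionary collaborative representation framework fuses the geometric and orientation features to complete palmprint recognition. Extensive experiments on a public 3D palmprint data set show that the proposed method achieves the best recognition accuracy while keeping computational complexity low.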

18.
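Structure-optimized DDAG-SVM method for recognizing upper-limb rehabilitation training motions   Total citations: 1, self-citations: 0, citations by others: 1
To address motion recognition, the core of training assessment in upper-limb rehabilitation training systems, a multi-class recognition method, SODDAG-SVM (structure-optimized decision directed acyclic graph-support vector machine), is proposed for the upper-limb rehabilitation training motions of patients in Brunnstrom stages 4-5. First, the multi-class problem is decomposed into a set of binary problems, and a binary classifier is built for each with a support vector machine; the kernel parameters and the feature subset of each binary classifier are optimized with a genetic algorithm and a feature-subset discriminability criterion, respectively. Then the generalization error of each class pair's binary SVM classifier is used to measure how easily that pair can be separated, and an upper triangular matrix of class-pair generalization errors is built from these values. Finally, starting from the root node, the SODDAG-SVM multi-class structure is built node by node: according to each node's generalization error matrix, the SVM classifier of the most easily separated class pair is chosen to form that node. When few instances are to be predicted, only the part of the SODDAG-SVM structure that each instance passes through is built and used for prediction; when many instances are to be predicted, the complete SODDAG-SVM structure is built first and all instances are then passed through it. A sample set of common upper-limb rehabilitation training motions for Brunnstrom stages 4-5 was acquired with body-sensing technology, and SODDAG-SVM motion recognition experiments reached an accuracy of 95.49%, better than both the conventional decision directed acyclic graph (DDAG) and MaxWins methods. The experiments show that the method effectively improves the recognition accuracy of upper-limb rehabilitation training motions.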
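The node-by-node construction described in this abstract can be sketched for the "few instances" case, where only the path taken by one instance is built. This is an interpretation of the abstract, not the authors' code: the binary classifiers are passed in as plain callables rather than trained SVMs, and the names `soddag_predict`, `pairwise` and `error` are hypothetical.

```python
def soddag_predict(x, classes, pairwise, error):
    """Classify x by repeatedly applying the most separable remaining class pair.

    pairwise[(a, b)] is a callable that returns a or b for an input x;
    error[(a, b)] is the estimated generalization error of that classifier.
    Keys use the order in which the classes appear in `classes`.
    """
    remaining = list(classes)
    while len(remaining) > 1:
        pairs = [(a, b) for k, a in enumerate(remaining) for b in remaining[k + 1:]]
        a, b = min(pairs, key=lambda p: error[p])  # lowest error = easiest to separate
        winner = pairwise[(a, b)](x)
        remaining.remove(b if winner == a else a)  # eliminate the losing class
    return remaining[0]

# Toy usage: three classes on a line, each pair decided by the nearer center.
centers = {0: 0.0, 1: 5.0, 2: 10.0}
pairwise = {(a, b): (lambda x, a=a, b=b:
                     a if abs(x - centers[a]) <= abs(x - centers[b]) else b)
            for a in centers for b in centers if a < b}
error = {(0, 1): 0.10, (0, 2): 0.01, (1, 2): 0.20}
predicted = soddag_predict(4.2, [0, 1, 2], pairwise, error)
```

As in a conventional DDAG, an instance passes through one fewer binary classifier than there are classes; the difference sketched here is only the order in which class pairs are tested.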

19.
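Xu Wenxuan (徐文轩), Zhang Li (张莉). Journal of Computer Applications (《计算机应用》), 2015, 35(10): 2808-2812
To identify human gene promoters efficiently, a human gene promoter recognition algorithm based on single-nucleotide statistics and a support vector machine ensemble is proposed. First, single-nucleotide statistics are used to divide a gene data set into two subsets, C-preferring and G-preferring; then DNA rigidity features, word-frequency statistical features, and CpG island features are extracted from each subset; finally, an ensemble of multiple support vector machines (SVMs) learns these three kinds of features, and three ensemble schemes are discussed: single-layer, double-layer, and cascaded SVM ensembles. Experimental results show that the proposed algorithm improves the sensitivity and specificity of human gene promoter recognition; the double-layer SVM ensemble reaches a sensitivity of 79.51%, and the cascaded SVM ensemble a specificity as high as 84.58%.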

20.
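Fast training algorithm for support vector machines based on adaptive step size   Total citations: 1, self-citations: 0, citations by others: 1
Training a support vector machine essentially amounts to solving a convex quadratic programming problem. When the number of training samples is very large, conventional training algorithms lose their ability to learn. To solve this problem and speed up SVM training, the essential characteristics of the SVM are analyzed and a fast SVM training algorithm based on adaptive step size is proposed. It considerably increases training speed without loss of training accuracy. Experiments on UCI benchmark data sets show that the algorithm performs well and to some extent overcomes the slow training of conventional SVMs; on large training sets in particular, it substantially reduces computational complexity and increases training speed.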
