Similar literature
Found 20 similar articles; search time 62 ms
1.
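To address the low classification accuracy, redundant feature subsets, and poor computational efficiency of the traditional support vector machine (SVM) in wrapper-based feature selection, a metaheuristic optimization algorithm is used to optimize the SVM and the feature selection simultaneously. To improve SVM classification performance and the ability to select feature subsets, the spotted hyena optimizer (SHO) is first improved with an adaptive differential evolution (DE) algorithm, chaotic initialization, and a tournament selection strategy, which strengthens its local search capability and raises its search efficiency and solution accuracy. The improved algorithm is then applied to the simultaneous optimization of feature selection and SVM parameter tuning. Finally, feature selection simulation experiments are run on UCI datasets, and classification accuracy, number of selected features, fitness value, and running time are used together to evaluate the optimization performance of the proposed algorithm. The experimental results show that the simultaneous optimization mechanism of the improved algorithm reduces the number of selected features while keeping classification accuracy high; the algorithm is better suited than traditional algorithms to wrapper-based feature selection and has good application value.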
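The abstract names chaotic initialization, tournament selection, and a fitness that trades accuracy against the number of selected features, but gives no formulas. The following is a minimal sketch under stated assumptions: the logistic map as the chaotic sequence, a lower-is-better fitness, and a weight `alpha = 0.99` are all illustrative choices, not taken from the paper.

```python
import random

def logistic_map_population(pop_size, dim, x0=0.7, r=4.0):
    """Generate pop_size candidates in [0, 1]^dim from a logistic-map sequence.

    Assumption: the logistic map stands in for whichever chaotic map the paper uses.
    """
    population, x = [], x0
    for _ in range(pop_size):
        individual = []
        for _ in range(dim):
            x = r * x * (1.0 - x)  # logistic map, chaotic for r = 4
            individual.append(x)
        population.append(individual)
    return population

def tournament_select(population, fitness, k=3, rng=random):
    """Return the candidate with the lowest fitness among k randomly drawn ones."""
    contenders = rng.sample(range(len(population)), k)
    best = min(contenders, key=lambda i: fitness[i])
    return population[best]

def wrapper_fitness(error_rate, n_selected, n_total, alpha=0.99):
    """Weighted cost (hypothetical form): classification error plus fraction of features kept."""
    return alpha * error_rate + (1.0 - alpha) * n_selected / n_total
```

In a wrapper setting, `error_rate` would come from cross-validating an SVM on the candidate feature subset; that step is left out here to keep the sketch dependency-free.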

2.
In many pattern classification applications, data are represented by high-dimensional feature vectors, which induce high computational cost and reduce classification speed in the context of support vector machines (SVMs). To reduce the dimensionality of pattern representation, we develop a discriminative function pruning analysis (DFPA) feature subset selection method in the present study. The basic idea of the DFPA method is to first learn the SVM discriminative function from training data using all available input variables, and then to select a feature subset through pruning analysis. The pruning is implemented using a forward selection procedure combined with a linear least squares estimation algorithm, taking advantage of the linear-in-the-parameters structure of the SVM discriminative function. The strength of the DFPA method is that it combines the advantages of both filter and wrapper methods. First, it retains the simplicity of the filter method by avoiding the training of a large number of SVM classifiers. Second, it inherits the good performance of the wrapper method by taking the SVM classification algorithm into account.

3.
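Tang Shouhong, Zhu Yan, Yang Fan. 《计算机科学》 (Computer Science), 2015, 42(1): 239-243
Web spam not only degrades the quality of information retrieval but also poses a serious challenge to Internet security. A web spam detection method based on a Bagging-SVM ensemble classifier is proposed. In the preprocessing stage, the K-means method is first used to address the class imbalance of the dataset, the CFS feature selection method then selects the optimal feature subset, and finally the feature subset is discretized using information entropy. In the classifier training stage, the Bagging method constructs multiple training sets, and an SVM is trained on each to produce a weak classifier. In the detection stage, the class of a test sample is decided by a vote among the weak classifiers. Experimental results on the WEBSPAM-UK2006 dataset show that the method achieves very good detection performance while using relatively few features.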
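The training and voting stages described in this abstract can be sketched as follows. This is an illustration only: a nearest-centroid classifier is used as a stand-in for the SVM weak learner so that the example needs no external library, and all function names are hypothetical.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]

def train_centroid(samples):
    """Stand-in weak learner (not an SVM): per-class mean of (features, label) pairs."""
    sums, counts = {}, Counter()
    for x, y in samples:
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[y] += 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def predict_centroid(model, x):
    """Predict the label whose centroid is closest to x."""
    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))
    return min(model, key=lambda y: dist2(model[y]))

def bagging_fit(data, n_models=11, seed=0):
    """Train one weak learner per bootstrap replicate of the training set."""
    rng = random.Random(seed)
    return [train_centroid(bootstrap_sample(data, rng)) for _ in range(n_models)]

def bagging_predict(models, x):
    """Majority vote over the weak learners."""
    votes = Counter(predict_centroid(m, x) for m in models)
    return votes.most_common(1)[0][0]
```

Using an odd number of models avoids tied votes in the two-class case.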

4.
This paper presents a hybrid filter-wrapper feature subset selection algorithm based on particle swarm optimization (PSO) for support vector machine (SVM) classification. The filter model is based on mutual information and is a composite measure of feature relevance and redundancy with respect to the feature subset selected. The wrapper model is a modified discrete PSO algorithm. This hybrid algorithm, called maximum relevance minimum redundancy PSO (mr2PSO), is novel in that it uses the mutual information available from the filter model to weight the bit selection probabilities in the discrete PSO. Hence, mr2PSO brings together the efficiency of filters and the greater accuracy of wrappers. The proposed algorithm is tested on several well-known benchmark datasets. Its performance is also compared with a recent hybrid filter-wrapper algorithm based on a genetic algorithm and with a wrapper algorithm based on PSO. The results show that the mr2PSO algorithm is competitive in terms of both classification accuracy and computational performance.
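The filter measure in this abstract combines relevance and redundancy through mutual information. A minimal sketch for discrete features follows; the "relevance minus mean redundancy" form is the usual mRMR criterion and is assumed here, since the abstract does not state the exact composite measure or how it weights the PSO bit probabilities.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """I(X;Y) in bits for two equal-length sequences of discrete values."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), written with raw counts
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )

def mrmr_score(candidate, selected, label):
    """Relevance to the label minus mean redundancy with already selected features."""
    relevance = mutual_information(candidate, label)
    if not selected:
        return relevance
    redundancy = sum(mutual_information(candidate, s) for s in selected) / len(selected)
    return relevance - redundancy
```

A feature identical to one already selected scores zero even when it predicts the label perfectly, which is the behaviour a redundancy penalty is meant to produce.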

5.
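Objective: Line detection is commonly used to identify airport runways in polarimetric synthetic aperture radar (SAR) images, but ground objects with similar linear structure, such as rivers and roads, easily interfere with the result, causing problems such as difficult target localization, blurred targets, and many false alarms. To address this, the paper designs a runway-area detection method for polarimetric SAR images that combines target scattering characteristics with local binary pattern (LBP) feature classification, using LBP features for supervised classification of the polarimetric SAR image to extract the true airport area. Method: The polarimetric SAR image is first threshold-segmented using a differentiated scattering power measure (异化散射功率), and morphological processing then yields candidate runway regions. In parallel, training samples are built for two classes, runway and non-runway; the LBP features of the samples are extracted and accumulated into histograms, and the resulting feature vectors are used to train a binary support vector machine (SVM) classifier with a radial basis function (RBF) kernel. LBP features are then built for the candidate runway regions and fed to the SVM classifier, which identifies the runways and yields the true runway regions. Results: Experiments used seven polarimetric SAR images acquired by the UAVSAR (uninhabited aerial vehicle synthetic aperture radar) system, with two algorithms that identify runways from geometric features chosen for comparison. All three methods detected the true runways in the seven scenes, but the proposed method produced a total of one false alarm and one missed detection across the seven images, whereas the two comparison algorithms produced 2 and 11 false alarms and 8 and 1 missed detections, respectively. Conclusion: The proposed method not only detects runway areas effectively but also gives better detection results, with lower computational cost, lower false-alarm and missed-detection rates, and higher efficiency.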
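The LBP histogram that feeds the SVM in this abstract can be illustrated with the basic 3x3, 8-neighbour operator. This is a sketch under assumptions: the paper may use a different radius, neighbour count, or a uniform-pattern variant, and the bit ordering below is an arbitrary choice.

```python
def lbp_code(image, r, c):
    """8-neighbour LBP code of pixel (r, c): a neighbour >= centre sets its bit."""
    centre = image[r][c]
    # Clockwise from the top-left neighbour; the ordering is a convention.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[r + dr][c + dc] >= centre:
            code |= 1 << bit
    return code

def lbp_histogram(image):
    """Normalised 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist
```

The 256-element histogram of a candidate region would then serve as the feature vector passed to the classifier.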

6.
The support vector machine (SVM) was initially designed for binary classification. To extend SVM to the multi-class scenario, a number of classification models have been proposed, such as the one by Crammer and Singer (2001). However, the number of variables in Crammer and Singer's dual problem is the product of the number of samples (l) and the number of classes (k), which leads to high computational complexity. This paper presents a simplified multi-class SVM (SimMSVM) that reduces the size of the resulting dual problem from l × k to l by introducing a relaxed classification error bound. The experimental results demonstrate that the proposed SimMSVM approach can greatly speed up the training process while maintaining competitive classification accuracy.

7.
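To address the strong impact that a large number of features has on the real-time performance and accuracy of face recognition, a face recognition method with combined ReliefF-SVM RFE feature selection is proposed. Features are extracted with the discrete cosine transform, and ReliefF performs an initial selection on the face image feature set to reduce the dimensionality of the feature space; an improved SVM RFE (Support Vector Machine Recursive Feature Elimination) then selects the optimal features. This resolves the long running time caused by the repeated training that SVM RFE needs when the number of features is large. The optimal subset is chosen from the resulting feature ranking list by leave-one-out cross-validation, and an SVM then performs the classification. Experiments on the UMIST face database show a recognition rate of 98.84% with 52 features and a recognition time of only 0.037 s.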
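The initial selection step can be illustrated with the basic two-class Relief weighting, which rewards features that differ on the nearest sample of the other class and agree on the nearest sample of the same class. This is a simplified sketch: ReliefF as named in the abstract extends this idea to k neighbours and multiple classes, and the function below is not the paper's implementation.

```python
def relief_weights(X, y):
    """Basic Relief feature weights (nearest hit / nearest miss).

    Assumes numeric features and at least two samples per class.
    """
    n, d = len(X), len(X[0])
    # Per-feature range, used to normalise differences to [0, 1].
    spans = [(max(row[j] for row in X) - min(row[j] for row in X)) or 1.0
             for j in range(d)]

    def diff(a, b, j):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        return sum(diff(a, b, j) for j in range(d))

    weights = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        hit = min(hits, key=lambda k: dist(X[i], X[k]))
        miss = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            weights[j] += (diff(X[i], X[miss], j) - diff(X[i], X[hit], j)) / n
    return weights
```

Features with the largest weights would be kept for the subsequent, more expensive recursive elimination stage.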

8.
The problem of feature definition in the design of a pattern recognition system where the number of available training samples is small but the number of potential features is excessively large has not received adequate attention. Most existing feature extraction and feature selection procedures are not feasible, for computational reasons, when the number of features exceeds, say, 100, and are not even applicable when the number of features exceeds the number of patterns. The feature definition procedure we propose partitions a large set of highly correlated features into subsets, or clusters, through hierarchical clustering. Almost any feature selection or extraction procedure, including the constrained maximum variance approach introduced here, can then be applied to each subset to obtain a single representative feature. The original set of correlated features is thus reduced to a small set of nearly uncorrelated features. The utility of this procedure is demonstrated on a speaker-identification database consisting of 20 subjects, 156 features, and 180 samples.
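Partitioning correlated features into clusters can be sketched as follows. The choices here are assumptions made for illustration: absolute Pearson correlation as the similarity, single linkage, and a fixed threshold of 0.9; the abstract does not specify the linkage or stopping rule.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va and vb else 0.0

def cluster_features(columns, threshold=0.9):
    """Single-linkage agglomeration: merge two clusters while any cross pair has |r| >= threshold."""
    clusters = [[j] for j in range(len(columns))]
    merged = True
    while merged:
        merged = False
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                if any(abs(pearson(columns[i], columns[j])) >= threshold
                       for i in clusters[a] for j in clusters[b]):
                    clusters[a] += clusters.pop(b)
                    merged = True
                    break
            if merged:
                break
    return clusters
```

Each returned cluster is a list of column indices; a representative feature per cluster would then be chosen or constructed by whatever selection or extraction procedure is preferred.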

9.
10.
The traditional support vector machine (SVM) solution suffers from O(n^2) time complexity, which makes it impractical for very large datasets. To reduce this computational cost, several data reduction methods have been proposed in previous studies. However, such methods are not effective at extracting informative patterns. In this paper, a two-stage informative pattern extraction approach is proposed. The first stage is data cleaning based on bootstrap sampling: a set of weak SVM classifiers is constructed on the sampled datasets, and training data correctly classified by all the weak classifiers are removed because they carry little useful information for training. To further extract the more informative training data, two informative pattern extraction algorithms are proposed in the second stage. As most training data are eliminated and only the more informative samples remain, the final SVM training time is reduced significantly. The contributions of this paper are three-fold. (1) A parallelized method based on bootstrap sampling is proposed to clean the initial training data, eliminating a large number of training samples that carry little information. (2) Two algorithms are presented to extract the more informative training data effectively. Both are based on maximum information entropy, computed from the empirical misclassification probability of each sample estimated in the first stage, so the training time falls further as the training set shrinks. (3) Empirical studies on four large datasets show the effectiveness of the approach in reducing the training data size and the computational cost, compared with state-of-the-art algorithms including PEGASOS, LIBLINEAR SVM, and RSVM. Meanwhile, the generalization performance of the approach is comparable with that of the baseline methods.
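One plausible reading of the two stages is sketched below: stage one drops samples that every weak classifier got right, and stage two ranks the remainder by the binary entropy of their empirical misclassification probability. The ranking rule is an assumption for illustration; the abstract does not give the exact form of either extraction algorithm.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a Bernoulli variable with success probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

def misclassification_rates(votes_correct):
    """votes_correct[i][m] is True if weak classifier m classified sample i correctly."""
    return [1.0 - sum(row) / len(row) for row in votes_correct]

def clean_and_rank(votes_correct):
    """Return indices of retained samples, most informative (highest entropy) first."""
    rates = misclassification_rates(votes_correct)
    # Stage 1: discard samples that all weak classifiers classified correctly.
    kept = [i for i, p in enumerate(rates) if p > 0.0]
    # Stage 2 (hypothetical rule): rank by entropy of the misclassification probability.
    return sorted(kept, key=lambda i: binary_entropy(rates[i]), reverse=True)
```

Samples on which the weak classifiers disagree most (probability near 0.5) come first, which matches the intuition that they lie near the decision boundary.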

11.
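Sample-reduction support vector machine   Cited by: 1 (self-citations: 0, citations by others: 1)
A support vector machine (SVM) constructs the optimal separating hyperplane using only the support vectors near the class boundary, yet solving the SVM requires the whole training set. When the training set is large, solving the SVM consumes a large amount of memory and the optimization is very slow. To address this, a method called sample reduction is proposed for finding candidate support vectors. Because most support vectors lie near the class boundary, the tolerance rough set technique (相容粗糙集) is used to select the samples in the boundary region as candidate support vectors, and the selected samples then serve as the training set for solving the SVM. Experimental results confirm the effectiveness of the method; for large databases in particular, it substantially reduces storage space and execution time.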

12.
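In network fault diagnosis, a large number of irrelevant or redundant features lowers diagnostic accuracy, so the initial features must be selected. Wrapper-mode feature selection incurs a heavy computational load from the classification algorithm. To reduce it, this paper proposes a fault feature selection method based on support-vector-based binary particle swarm optimization (SVB-BPSO). With an SVM as the classifier, the algorithm first trains an SVM on all samples to obtain the set of support vectors (SVs) and uses only the SV set in the wrapped classification training. It then uses the average distance between support vectors of different classes as the SVM parameter for training. Finally, based on the classification results, BPSO performs a global search in feature space to select the optimal feature set. Experiments on the DARPA dataset show that the proposed method reduces the computational cost of wrapper-mode feature selection while achieving high classification accuracy and a clear dimensionality reduction.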
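The BPSO search over feature masks can be illustrated with the standard binary PSO update, in which each bit is set with probability given by the sigmoid of its velocity. The inertia and acceleration constants below are common illustrative defaults, not values from the paper, and the SV-based fitness evaluation is omitted.

```python
import math
import random

def sigmoid(v):
    """Logistic function mapping a velocity to a bit-selection probability."""
    return 1.0 / (1.0 + math.exp(-v))

def bpso_step(position, velocity, personal_best, global_best, rng,
              w=0.7, c1=1.5, c2=1.5, v_max=4.0):
    """One binary-PSO update of a single particle; returns (new_position, new_velocity)."""
    new_position, new_velocity = [], []
    for x, v, p, g in zip(position, velocity, personal_best, global_best):
        v = w * v + c1 * rng.random() * (p - x) + c2 * rng.random() * (g - x)
        v = max(-v_max, min(v_max, v))  # clamp so the probability never saturates
        new_velocity.append(v)
        new_position.append(1 if rng.random() < sigmoid(v) else 0)
    return new_position, new_velocity
```

A bit value of 1 means the corresponding feature is included in the subset passed to the classifier.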

13.
Selecting relevant features for support vector machine (SVM) classifiers is important for a variety of reasons, such as generalization performance, computational efficiency, and feature interpretability. Traditional SVM approaches to feature selection typically extract features and learn SVM parameters independently. Performing these two steps independently might result in a loss of information related to the classification process. This paper proposes a convex energy-based framework to jointly perform feature selection and SVM parameter learning for linear and non-linear kernels. Experiments on various databases show a significant reduction in the number of features used while maintaining classification performance.

14.
We discuss a Lagrangian-relaxation-based heuristic for dealing with feature selection in the support vector machine (SVM) framework for binary classification. In particular, we embed into our objective function a weighted combination of the L1 and L0 norms of the normal to the separating hyperplane. We arrive at a mixed binary linear programming problem that is suitable for a Lagrangian relaxation approach. Based on a property of the optimal multiplier setting, we apply a consolidated nonsmooth optimization ascent algorithm to solve the resulting Lagrangian dual. In the proposed approach we obtain, at every ascent step, both a lower bound on the optimal solution and a feasible solution at low computational cost. We present the results of our numerical experiments on some benchmark datasets.

15.
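A new feature extraction method and its application in pattern recognition   Cited by: 2 (self-citations: 0, citations by others: 2)
Liu Zongli, Cao Jie, Hao Yuanhong. 《计算机应用》 (Journal of Computer Applications), 2009, 29(4): 1032-1035
Kernel canonical correlation analysis (KCCA) is a supervised machine learning method that can extract nonlinear features effectively. However, the computational complexity of standard KCCA grows as the number of training samples increases. To address this drawback, an improved KCCA method is proposed. First, a geometric feature selection method selects a subset of the training samples and maps it into a reproducing kernel Hilbert space (RKHS). An algorithm that improves feature extraction efficiency is then designed: it selects eigenvalues of the samples according to their contribution to classification and computes the corresponding eigenvectors. Finally, the improved KCCA is combined with a support vector data description (SVDD) multi-class classifier for classification and recognition. Experimental results on the ORL face image database show that, compared with traditional KCCA, the improved method speeds up face recognition and reduces system storage without affecting the recognition rate.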

16.
A new hidden Markov model (HMM) based feature generation scheme is proposed for face recognition (FR) in this paper. In this scheme, the HMM method is used to model classes of face images. A set of Fisher scores is calculated through partial derivative analysis of the parameters estimated in each HMM. These Fisher scores are further combined with some traditional features, such as log-likelihood and appearance-based features, to form feature vectors that exploit the strengths of both local and holistic features of the human face. Linear discriminant analysis (LDA) is then applied to analyze these feature vectors for FR. Performance improvements are observed over the stand-alone HMM method and the Fisher face method, which uses appearance-based feature vectors. A further study reveals that, by reducing the number of models involved in the training and testing stages of LDA, the proposed feature generation scheme can maintain very high discriminative power at much lower computational complexity compared with the traditional HMM-based FR system. Experimental results on a publicly available face database are provided to demonstrate the viability of this scheme.

17.
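To address the long training time and degraded generalization of the support vector machine (SVM) on large-scale datasets, an SVM acceleration algorithm based on boundary sample selection is proposed. First, unsupervised K-means clustering is performed. Then, within each cluster, the K-nearest-neighbor algorithm is applied according to the cluster's mixing degree and support degree to remove non-boundary samples, yielding the final class-boundary samples, which are used to train the SVM model. Experimental results on standard datasets show that the algorithm significantly reduces model training time while retaining the classification generalization ability of the traditional SVM.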
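The idea of keeping only class-boundary samples can be illustrated with a plain k-nearest-neighbor test: a sample counts as a boundary sample if any of its k nearest neighbors carries a different label. This sketch omits the K-means stage and the cluster mixing-degree and support-degree criteria named in the abstract, whose definitions are not given there.

```python
def knn_indices(X, i, k):
    """Indices of the k samples closest to X[i] (squared Euclidean distance, ties by index)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(X[i], X[j])), j)
        for j in range(len(X)) if j != i
    )
    return [j for _, j in dists[:k]]

def boundary_samples(X, y, k=3):
    """Indices of samples with at least one differently labelled sample among their k neighbors."""
    return [
        i for i in range(len(X))
        if any(y[j] != y[i] for j in knn_indices(X, i, k))
    ]
```

Only the returned samples would be passed to SVM training; interior samples are dropped on the expectation that they rarely become support vectors.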

18.
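A robust feature selection and classification algorithm based on partial least squares regression (RFSC-PLSR) is proposed to address redundancy and multicollinearity among features in feature selection. First, a sample class-consistency coefficient based on neighborhood estimation is defined. Then, conservative samples with a stable local class distribution structure are screened using different k-nearest-neighbor (kNN) operations and used to build a partial least squares regression model for robust feature selection. Finally, from the perspective of global structure, a partial least squares classification model is built using the class-consistency coefficient and the selected feature subset of all samples. Numerical experiments on five datasets of different dimensionality from the UCI repository show that, compared with four typical classifiers, namely support vector machine (SVM), naive Bayes (NB), BP neural network (BPNN), and logistic regression (LR), RFSC-PLSR is strongly competitive in classification accuracy, robustness, and computational efficiency across low-, medium-, and high-dimensional settings.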

19.
In this paper, we develop a prediction model based on the support vector machine (SVM) with a hybrid feature selection method to predict the trend of stock markets. The proposed hybrid feature selection method, named F-score and Supported Sequential Forward Search (F_SSFS), combines the advantages of filter methods and wrapper methods to select the optimal feature subset from the original feature set. To evaluate the prediction accuracy of this SVM-based model combined with F_SSFS, we compare its performance with a back-propagation neural network (BPNN) along with three commonly used feature selection methods, namely information gain, symmetrical uncertainty, and correlation-based feature selection, via a paired t-test. A grid search using 5-fold cross-validation is used to find the best parameter value of the SVM kernel function. In this study, we show that SVM outperforms BPNN on the problem of stock trend prediction. In addition, our experimental results show that the proposed SVM-based model combined with F_SSFS has the highest accuracy and generalization performance in comparison with the other three feature selection methods. With these results, we claim that SVM combined with F_SSFS can serve as a promising addition to the existing stock trend prediction methods.
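The filter half of F_SSFS ranks features by an F-score. The abstract does not define it, so the sketch below assumes the commonly used two-class form: squared distances of the class means from the overall mean, divided by the sum of the within-class sample variances.

```python
def f_score(pos, neg):
    """Two-class F-score of one feature (assumed definition, see lead-in).

    pos, neg: values of the feature in the positive and negative class;
    each list needs at least two values.
    """
    n_pos, n_neg = len(pos), len(neg)
    mean_all = (sum(pos) + sum(neg)) / (n_pos + n_neg)
    mean_pos, mean_neg = sum(pos) / n_pos, sum(neg) / n_neg
    between = (mean_pos - mean_all) ** 2 + (mean_neg - mean_all) ** 2
    within = (sum((v - mean_pos) ** 2 for v in pos) / (n_pos - 1)
              + sum((v - mean_neg) ** 2 for v in neg) / (n_neg - 1))
    return between / within if within else float("inf")
```

A larger score indicates a feature whose class means are far apart relative to its within-class spread; the wrapper stage would then search forward over the top-ranked features.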

20.
Most existing content-based video retrieval (CBVR) systems can now support automatic low-level feature extraction, but they still have limited effectiveness from a user's perspective because of the semantic gap. Automatic video concept detection via semantic classification is one promising solution to bridge the semantic gap. To speed up SVM video classifier training in a high-dimensional heterogeneous feature space, a novel multimodal boosting algorithm is proposed that incorporates feature hierarchy and boosting to reduce both the training cost and the size of the training sample set significantly. To avoid the inter-level error transmission problem, a novel hierarchical boosting scheme is proposed that incorporates concept ontology and multitask learning to boost hierarchical video classifier training by exploiting the strong correlations between the video concepts. To bridge the semantic gap between the available video concepts and the users' real needs, a novel hyperbolic visualization framework is seamlessly incorporated to enable intuitive query specification and evaluation by giving users a good global view of large-scale video collections. Our experiments in one specific domain, surgery education videos, have also provided very convincing results.
