Similar literature
 20 similar documents found
1.
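A two-stage SVM-based feature selection method for intrusion detection   Total citations: 2 (self-citations: 0, by others: 2)
To address the problem of optimal feature selection in intrusion detection, a two-stage feature selection method based on support vector machines (SVM) is proposed. The method uses a feature evaluation value, based on the ratio of detection rate to false alarm rate, as the criterion for screening features. First, the Fisher score and information gain from the filter model are used to filter out noisy and irrelevant features respectively, reducing the feature dimension. Then, starting from the intersection of the feature subsets that survive filtering, the sequential backward search algorithm from the wrapper model is combined with an SVM to select the optimal feature subset. Simulation results show that the feature subset selected this way has better classification performance and effectively reduces the system's model-building time and test time.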
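The two stages described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `fisher_score`, `filter_top_k`, and `backward_search` are hypothetical, the Fisher score uses one common multi-class definition, and the `evaluate` callback stands in for the paper's SVM-based criterion (the ratio of detection rate to false alarm rate), which is not reproduced here.

```python
from statistics import mean, pvariance


def fisher_score(column, labels):
    """Between-class scatter divided by within-class scatter for one feature."""
    overall = mean(column)
    between = within = 0.0
    for c in set(labels):
        vals = [x for x, y in zip(column, labels) if y == c]
        between += len(vals) * (mean(vals) - overall) ** 2
        within += len(vals) * pvariance(vals)
    return between / within if within > 0 else float("inf")


def filter_top_k(columns, labels, k):
    """Filter stage: indices of the k features with the highest Fisher score."""
    order = sorted(range(len(columns)),
                   key=lambda j: fisher_score(columns[j], labels),
                   reverse=True)
    return order[:k]


def backward_search(features, evaluate):
    """Wrapper stage: sequential backward search.

    At each step the single removal with the best score is tried; the search
    stops as soon as that removal would lower the score. `evaluate` is an
    assumed callback mapping a list of feature indices to a score
    (higher is better), e.g. a cross-validated classifier metric.
    """
    current = list(features)
    best = evaluate(current)
    while len(current) > 1:
        trials = [[g for g in current if g != f] for f in current]
        scores = [evaluate(t) for t in trials]
        top = max(range(len(trials)), key=lambda i: scores[i])
        if scores[top] < best:
            break
        current, best = trials[top], scores[top]
    return current, best
```

In use, the output of `filter_top_k` would be passed as `features` to `backward_search`, so the expensive wrapper evaluation only runs on features that survived the cheap filter.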

2.
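Feature subset search is a key difficulty in classification tasks in data mining. Commonly used filter methods ignore the correlations between genes; moreover, existing solutions are not designed specifically for small-sample data and therefore behave unstably in feature selection. To address these problems, a new hybrid wrapper-filter algorithm is proposed on the basis of instance learning, together with a classifier algorithm that has a wrapper evaluation scheme, cooperative subset search (CSS). Several high-dimensional, small-sample cancer datasets are used as data sources to test the proposed evaluation scheme experimentally. The results show that the method outperforms other methods in accuracy and stability.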

3.
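To address object detection in complex scenes and the feature selection problem within object detection, this paper applies binary particle swarm optimization (BPSO) to feature selection and, combined with support vector machines (SVM), proposes a novel automatic object detection algorithm based on BPSO-SVM feature selection. The algorithm converts object detection into an object recognition problem and adopts the wrapper feature selection model with an SVM as the classifier: the classifier is trained on samples and, guided by the classification results, BPSO performs a global search of the feature space to select the optimal feature set for classification. BPSO-SVM feature selection reduces the feature dimension and significantly improves classifier performance. Experimental results show that the algorithm not only effectively improves detection accuracy in complex scenes under changes in object pose, scale, and illumination and under partial occlusion, but also greatly shortens detection time.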

4.
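A feature subset selection method based on particle swarm optimization and support vector machines   Total citations: 9 (self-citations: 0, by others: 9)
Qiao Liyan, Peng Xiyuan, Peng Yu. Acta Electronica Sinica (《电子学报》), 2006, 34(3): 496-498
In pattern classification systems, large numbers of irrelevant or redundant features often degrade classifier performance, so feature selection is needed. This paper proposes a feature subset selection method based on a wrapper model that combines discrete (binary) particle swarm optimization (BPSO) with support vector machines (SVM). A population of candidate feature subsets is first generated at random; BPSO then optimizes the features, with the 10-fold cross-validation results of the SVM guiding the search; finally, the subset with the best fitness is chosen to train the SVM. Experimental results on two UCI machine learning datasets (outdoor images and ionosphere) demonstrate the effectiveness of the proposed algorithm.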
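A wrapper of this kind can be sketched with a standard binary PSO, in which each particle is a 0/1 mask over the features and a sigmoid of the velocity gives the probability of setting each bit. The sketch below is an assumption-laden illustration rather than the paper's code: `bpso_select`, its default parameters (`w`, `c1`, `c2`, `v_max`), and the `fitness` callback are all hypothetical. In the paper's setting, `fitness` would be the 10-fold cross-validation accuracy of an SVM trained on the masked features; here it is left to the caller.

```python
import math
import random


def bpso_select(n_features, fitness, n_particles=10, n_iters=30, seed=0,
                w=0.7, c1=1.5, c2=1.5, v_max=4.0):
    """Binary PSO over feature masks; returns (best_mask, best_fitness).

    `fitness` is an assumed callback: list of 0/1 flags -> score (higher is better).
    """
    rng = random.Random(seed)
    # Random initial population of feature masks, zero initial velocities.
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                v = (w * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-v_max, min(v_max, v))  # clamp velocity
                # Sigmoid transfer: velocity -> probability that the bit is 1.
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit
```

Because every fitness call retrains a classifier in a real wrapper, the population size and iteration count dominate the running time; the small defaults here are only for illustration.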

5.
The effectiveness of a data mining system lies in the representation of pattern vectors. For many bioinformatic applications, data are represented as vectors of extremely high dimension. This motivates research on feature selection. In the literature, there are plenty of reports on feature selection methods. In terms of training data types, they are divided into unsupervised and supervised categories. In terms of selection methods, they fall into filter and wrapper categories. This paper provides a brief overview of state-of-the-art feature selection methods in all these categories. Sample applications of these methods to genomic signal processing are highlighted. The paper also describes a notion of self-supervision. A special method called vector index adaptive SVM (VIA-SVM) is described for selecting features under the self-supervision scenario. Furthermore, the paper makes use of a more powerful symmetric doubly supervised formulation, for which VIA-SVM is particularly useful. Based on several subcellular localization experiments and microarray time course experiments, the VIA-SVM algorithm, when combined with some filter-type metrics, appears to deliver a substantial dimension reduction (one order of magnitude) with only slight degradation in accuracy.

6.
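Feature evaluation and selection are important steps in machine learning and pattern recognition. To obtain sparse feature subsets, a margin-loss evaluation strategy is combined with L1-norm regularization to obtain an effective feature selection method (MLFWL-L1), which is applied to an RBF SVM classifier. In experiments on UCI datasets, comparison with Simba and ReliefF shows that the proposed algorithm is an effective feature selection method.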

7.
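Liu Yun, Xiao Xue, Huang Rongcheng. Information Technology (《信息技术》), 2020, (5): 28-31, 36
Feature selection is a preliminary step in handling high-dimensional data in machine learning and data mining: by eliminating redundant or irrelevant features, it identifies the most important and relevant features in a dataset, improving classification accuracy and reducing computational complexity. This paper proposes a hybrid Monte Carlo tree search feature selection algorithm (HMCTS). First, an initial feature subset is generated iteratively with Monte Carlo tree search, and the ReliefF algorithm filters it, keeping the top k features as the candidate feature subset. Then, the classification accuracy of a KNN classifier is used to evaluate the candidate features, and the simulation result is backpropagated to all selected nodes on the iteration path. Finally, the high-accuracy candidate features are chosen as the best feature subset. Simulation results show that, compared with the HPSO-LS and MOTiFS algorithms, HMCTS scales well and achieves high classification accuracy.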

8.
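Feature selection for radar emitter signals based on the resemblance coefficient   Total citations: 10 (self-citations: 0, by others: 10)
A new feature selection method based on the resemblance coefficient (RC) is proposed. The definition of RC and an RC-based class separability criterion are given, a feature selection algorithm for radar emitter signals based on RC and a quantum genetic algorithm is described, and a neural network classifier is designed. The method is compared, in feature selection and classification experiments, with sequential forward selection based on a distance criterion (SFSDC) and with Lü Tiejun's method (GADC). The results show that the proposed method does not require the dimension of the optimal feature subset to be specified in advance and can reliably and effectively select the best feature subset. It not only greatly reduces the dimension of the feature vector and simplifies classifier design, but also achieves a higher correct recognition rate and recognition efficiency than the original feature set, SFSDC, and GADC.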

9.
In this paper, we investigate feature extraction and feature selection methods as well as classification methods for an automatic facial expression recognition (FER) system. The FER system is fully automatic and consists of the following modules: face detection, facial detection, feature extraction, selection of optimal features, and classification. Face detection is based on the AdaBoost algorithm and is followed by the extraction of the frame with the maximum intensity of emotion using the inter-frame mutual information criterion. The selected frames are then processed to generate characteristic features using different methods, including Gabor filters, the log-Gabor filter, the local binary pattern (LBP) operator, higher-order local autocorrelation (HLAC), and a recently proposed method called HLAC-like features (HLACLF). The most informative features are selected using both wrapper and filter feature selection methods. Experiments on several facial expression databases compare the different methods.

10.
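Wu Qi, Chen Xingshu, Zhu Kai, Wang Chunhui. Acta Electronica Sinica (《电子学报》), 2012, 40(11): 2320-2323
To address the failure of earlier topic description methods to take topic context fully into account, a context-based topic description method built on ODP (the Open Directory Project) is proposed. A new feature selection algorithm is used to determine topic features, and the context of the classification topic tree is used to optimize the topic description method and so improve the performance of topical crawling. Experiments show that the feature selection algorithm extracts topic features effectively and, while preserving accuracy, reduces the feature dimension as far as possible to improve computational efficiency. The topic description algorithm also takes full account of topic context relations and performs well in both accuracy and total information.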

11.
Bayesian class discovery in microarray datasets   Total citations: 1 (self-citations: 0, by others: 1)
A novel approach to class discovery in gene expression datasets is presented. In the context of clinical diagnosis, the central goal of class discovery algorithms is to simultaneously find putative (sub-)types of diseases and to identify informative subsets of genes with disease-type-specific expression profiles. Contrary to many other approaches in the literature, the method presented implements a wrapper strategy for feature selection, in the sense that the features are directly selected by optimizing the discriminative power of the partitioning algorithm used. The usual combinatorial problems associated with wrapper approaches are overcome by a Bayesian inference mechanism. On the technical side, we present an efficient optimization algorithm with a guaranteed local convergence property. The only free parameter of the optimization method is selected by a resampling-based stability analysis. Experiments with Leukemia and Lymphoma datasets demonstrate that our method is able to correctly infer partitions and corresponding subsets of genes, both of which are relevant in a biological sense. Moreover, the frequently observed problem of ambiguities caused by different but equally high-scoring partitions is successfully overcome by the proposed model selection method.

12.
In this paper, we propose a new feature subset evaluation method for feature selection in object tracking. Because a feature that is useless by itself can become a good one when used together with other features, we propose to evaluate feature subsets as a whole for object tracking, instead of scoring each feature individually, and to find the most distinguishable subset for tracking. We use a special tree to formalize the feature subset space. Conditional entropy is then used to evaluate feature subsets, and a simple but efficient greedy search algorithm is developed to search this tree and obtain the optimal k-feature subset quickly. Furthermore, our online k-feature subset selection method is integrated into a particle filter for robust tracking. Extensive experiments demonstrate that the k-feature subset selected by our method is more discriminative and can thus improve tracking performance considerably.
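The idea of scoring a subset as a whole with conditional entropy can be illustrated on discrete features. The following is a flat greedy sketch under stated assumptions: it does not build the paper's subset tree, the names `conditional_entropy` and `greedy_k_subset` are hypothetical, and entropies are estimated from raw counts. A lower H(Y | X_subset) means the subset leaves less uncertainty about the label.

```python
from collections import Counter
from math import log2


def conditional_entropy(rows, labels, subset):
    """Estimate H(Y | X_subset) in bits from counts over discrete features."""
    n = len(rows)
    keys = [tuple(r[j] for j in subset) for r in rows]
    marginal = Counter(keys)
    joint = Counter(zip(keys, labels))
    return -sum(c / n * log2(c / marginal[key])
                for (key, _), c in joint.items())


def greedy_k_subset(rows, labels, k):
    """Grow a subset one feature at a time, always adding the feature that
    gives the lowest conditional entropy for the subset as a whole."""
    subset = []
    candidates = list(range(len(rows[0])))
    for _ in range(min(k, len(candidates))):
        best = min(candidates,
                   key=lambda j: conditional_entropy(rows, labels, subset + [j]))
        subset.append(best)
        candidates.remove(best)
    return subset
```

An XOR-style label shows why joint evaluation matters: with `rows = [(0, 0), (0, 1), (1, 0), (1, 1)]` and `labels = [0, 1, 1, 0]`, each feature alone leaves one full bit of uncertainty, while the pair together leaves none.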

13.
Minimum probability of error image retrieval   Total citations: 3 (self-citations: 0, by others: 3)
We address the design of optimal architectures for image retrieval from large databases. Minimum probability of error (MPE) is adopted as the optimality criterion, and retrieval is formulated as a problem of statistical classification. The probability of retrieval error is lower- and upper-bounded by functions of the Bayes and density estimation errors, and the impact of the components of the retrieval architecture (namely, the feature transformation and density estimation) on these bounds is characterized. This characterization suggests interpreting the search for the MPE feature set as the search for the minimum of the convex hull of a collection of curves of probability of error versus feature space dimension. A new algorithm for MPE feature design, based on a dictionary of empirical feature sets and the wrapper model for feature selection, is proposed. It is shown that, unlike traditional feature selection techniques, this algorithm scales to problems containing large numbers of classes. Experimental evaluation reveals that the MPE architecture is at least as good as popular empirical solutions on the narrow domains where these perform best, but significantly outperforms them outside these domains.

14.
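Sparse multinomial logistic regression (SMLR), a generalized linear model, is widely used in multi-class classification tasks. SMLR introduces a Laplace prior into multinomial logistic regression (MLR) to make the solution sparse, which lets the classifier embed feature selection in the classification process. To let the classifier handle nonlinear data, this paper kernelizes SMLR with the kernel trick to obtain kernel sparse multinomial logistic regression (KSMLR). KSMLR maps nonlinear feature data through a kernel function into a high-dimensional or even infinite-dimensional feature space, where the features can be fully expressed and ultimately classified effectively. In addition, the paper uses a multiple kernel learning algorithm based on centered alignment: different kernel functions map the data into different dimensions, and centered-alignment similarity is used to choose the multiple kernel learning weights flexibly, giving the classifier better generalization ability. Experimental results show that the proposed sparse multinomial logistic regression algorithm based on centered-alignment multiple kernel learning outperforms current conventional classification algorithms in classification accuracy.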
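Why a Laplace prior produces sparsity can be seen from the usual MAP formulation, sketched below. The notation ($W$ with rows $w_k$, $K$ classes, $n$ training pairs $(x_i, y_i)$, prior scale $\lambda$, kernel $\kappa$) is assumed for this illustration and is not taken from the paper; the kernelized form shown is one common way to kernelize the model and may differ from the paper's exact construction.

```latex
% Softmax likelihood and an independent Laplace prior on every weight
p(y = k \mid x, W) = \frac{\exp(w_k^{\top} x)}{\sum_{j=1}^{K} \exp(w_j^{\top} x)},
\qquad
p(W) \propto \exp\!\left(-\lambda \lVert W \rVert_1\right)

% Taking the log of likelihood times prior gives an L1-penalized objective
\hat{W}_{\mathrm{MAP}}
  = \arg\max_{W} \left[ \sum_{i=1}^{n} \log p(y_i \mid x_i, W)
  - \lambda \lVert W \rVert_1 \right]

% Assumed kernelized variant: replace each input by its kernel evaluations
x_i \;\longmapsto\; \phi(x_i)
  = \bigl[\kappa(x_i, x_1), \ldots, \kappa(x_i, x_n)\bigr]^{\top}
```

The $\ell_1$ term drives many entries of $W$ exactly to zero, so the features (or, in the kernelized form, the training points) with zero weight in every class drop out of the classifier.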

15.
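Zhou Tao, Lu Huiling, Zhang Feifei. Journal of Optoelectronics·Laser (《光电子.激光》), 2020, 31(12): 1288-1298
For generating the optimal feature subset and optimizing classifier parameters in high-dimensional feature selection, a two-stage optimized high-dimensional feature selection algorithm based on Bayesian rough sets (BRS), the genetic algorithm (GA), and the cuckoo search algorithm (CS) is proposed. The algorithm first analyzes the shape, gray-level, and texture features of 3000 lung tumor CT images and extracts 104 feature components that jointly quantify the ROI. It then performs a two-stage optimization: (1) attribute importance is analyzed from the viewpoint of the global relative gain function and combined, as a weighted sum, with the attribute reduction length and a gene-encoding weight function to construct the fitness function; the genetic operations of selection, crossover, and mutation then generate the optimal feature subset, reducing the feature dimension without reducing classification accuracy; (2) CS performs a global search for the support vector machine (SVM) parameters. Finally, experiments verify the feasibility and effectiveness of the algorithm. The results show that the algorithm effectively improves the ability to distinguish benign from malignant lung tumors and reduces the algorithm's time complexity.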

16.
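A handwritten Arabic numeral recognition method based on an improved hybrid learning model   Total citations: 1 (self-citations: 0, by others: 1)
In handwritten Arabic numeral recognition problems with a high-dimensional feature space, redundant features often unexpectedly increase the complexity with which the learning model characterizes the problem space, reducing recognition efficiency and accuracy. This paper proposes a support vector machine tree hybrid learning model that selects features according to the sensitivity of the decision boundary to each feature: features are chosen according to how sensitive the classification surface at the current internal node is to the features of the examples in the sub-sample space, and the SVM at each child node is trained on the newly constructed sub-sample set. Simulation results on the handwritten Arabic numeral recognition problem from the UCI machine learning repository show that, compared with other algorithms, the proposed method improves or maintains high recognition accuracy while reducing the problem space, thereby simplifying the internal nodes and overall structure of the hybrid learning model.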

17.
In this paper, we present a deep neural network model to enhance intrusion detection performance. A deep learning architecture combining a convolutional neural network and long short-term memory learns spatial-temporal features of network flows automatically. Flow features are extracted from raw network traffic captures, flows are grouped, and N consecutive flow records are transformed into a two-dimensional array, like an image. These two-dimensional feature arrays are normalized and forwarded to the deep learning model. This transformation of flow information makes deep learning computationally efficient. Overall, the convolutional neural network learns spatial features, and the long short-term memory learns temporal features from a sequence of raw network data packets. To maximize the detection performance of the deep neural network and to reach the highest statistical metric values, we apply the tree-structured Parzen estimator to seek the optimum parameters in the parameter hyper-plane. Furthermore, we investigate the impact of flow status interval, flow window size, convolution filter size, and number of long short-term memory units on detection performance in terms of statistical metric values. The presented flow-based intrusion detection method outperforms other publicly available methods, detecting abnormal traffic with 99.09% accuracy and a 0.0227 false alarm rate.
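The step of turning N consecutive flow records into a normalized two-dimensional array can be sketched as below. Several details are assumptions made only for illustration, since the abstract does not give them: the windows slide with stride 1, normalization is per-column min-max within each window, and the names `window_flows` and `min_max_normalize` are hypothetical.

```python
def window_flows(records, n):
    """Stack each run of n consecutive flow records into an n x d matrix.

    `records` is a list of equal-length numeric feature vectors, one per flow.
    Assumes a stride of 1 between windows.
    """
    return [records[i:i + n] for i in range(len(records) - n + 1)]


def min_max_normalize(matrix):
    """Scale every column of one window to [0, 1]; constant columns become 0."""
    cols = list(zip(*matrix))
    lo = [min(c) for c in cols]
    span = [max(c) - min(c) for c in cols]
    return [[(v - l) / s if s else 0.0 for v, l, s in zip(row, lo, span)]
            for row in matrix]
```

Each normalized window then plays the role of a single-channel image: its rows are time steps and its columns are flow features, which is the shape a convolutional layer followed by a recurrent layer would consume.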

18.
Li ZHANG, Cong WANG. Journal on Communications (《通信学报》), 2018, 39(5): 111-122
Feature selection has played an important role in machine learning and artificial intelligence in the past decades. Many existing feature selection algorithms choose some redundant and irrelevant features, which leads to overestimation of some features. Moreover, more features significantly slow down machine learning and lead to classification over-fitting. Therefore, a new nonlinear feature selection algorithm based on forward search is proposed. The algorithm uses mutual information theory to find the optimal subset associated with multi-task labels and reduces the computational complexity. Experimental results on nine UCI datasets with four different classifiers show that the feature set selected by the proposed algorithm is superior to the original feature set and to the feature sets selected by other feature selection algorithms.
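A forward search driven by mutual information can be sketched generically as follows. This is not the paper's criterion, which the abstract does not spell out: the sketch swaps in a simple relevance-minus-mean-redundancy score (in the spirit of mRMR) for single-task labels, works on discrete features only, and uses the hypothetical names `mutual_information` and `forward_select`.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Estimate I(X; Y) in bits from counts over two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(c / n * log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def forward_select(columns, labels, k):
    """Greedy forward search: repeatedly add the feature with the highest
    relevance to the labels minus mean redundancy with features already chosen.

    `columns` is a list of feature columns (one sequence per feature).
    """
    selected = []
    remaining = list(range(len(columns)))

    def gain(j):
        relevance = mutual_information(columns[j], labels)
        if not selected:
            return relevance
        redundancy = sum(mutual_information(columns[j], columns[s])
                         for s in selected) / len(selected)
        return relevance - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=gain)
        selected.append(best)
        remaining.remove(best)
    return selected
```

Forward search evaluates on the order of k times the number of features candidate additions, rather than every possible subset, which is where the reduction in computational cost comes from.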

19.
Given several related tasks, multi-task feature selection determines the importance of features by mining the correlations between them. Many efforts have already been made on supervised multi-task feature selection. However, in real-world applications, it is noticeably time-consuming and impractical to collect sufficient labeled training data for each task. In this paper, we propose a novel feature selection algorithm that integrates semi-supervised learning and multi-task learning into a joint framework. Both labeled and unlabeled samples are fully utilized for each task, and the information shared between different tasks is simultaneously explored to facilitate decision making. Since the proposed objective function is non-smooth and difficult to solve, we also design an efficient iterative algorithm to optimize it. Experimental results on different applications demonstrate the effectiveness of our algorithm.

20.
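As a nonlinear dimensionality reduction algorithm, the Gaussian process latent variable model (GPLVM) is well suited to small-sample, high-dimensional data and has therefore been widely applied in pattern recognition, computer vision, and other fields. On this basis, a method for SAR image target feature extraction and automatic recognition based on an improved GPLVM is proposed, in which the improved GPLVM performs feature extraction and Gaussian process classification performs target recognition. The conventional GPLVM optimizes the likelihood function with the conjugate gradient method. To avoid its drawbacks, namely gradient estimates that are easily disturbed by noise and a strong dependence on step size, a GPLVM based on the immune clonal selection algorithm is proposed, whose fast convergence to the global optimum improves algorithm performance. Experimental results show that the algorithm not only reduces the feature dimension but also improves recognition accuracy, verifying its effectiveness for SAR image target recognition.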
