Similar literature
 Found 18 similar articles; search took 156 ms
1.
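To address the class imbalance problem that arises in network traffic classification, this paper proposes a feature selection algorithm based on weighted symmetric uncertainty (WSU) and the approximate Markov blanket (AMB). First, a feature metric biased toward small classes is defined from the class distribution information, so that features strongly correlated with small classes are more likely to be selected. Second, taking full account of both feature-class and feature-feature correlations, weighted symmetric uncertainty and the approximate Markov blanket are used to remove irrelevant and redundant features. Finally, a correlation-based feature evaluation function and a sequential search algorithm further reduce the feature dimensionality and determine the optimal feature subset. Experiments show that the algorithm effectively improves classification performance on small classes while maintaining overall classification accuracy.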
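
The symmetric uncertainty measure that this abstract builds on can be sketched as follows. This is the standard, unweighted definition SU(X, Y) = 2·I(X; Y) / (H(X) + H(Y)) for discrete variables; the class-dependent weighting that turns it into WSU is not specified in the abstract and is not reproduced here. The function names are illustrative only.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """Standard (unweighted) SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)).

    Returns a value in [0, 1]: 0 for independent variables,
    1 when each variable fully determines the other.
    """
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        # Both variables are constant, so there is nothing to share.
        return 0.0
    joint = entropy(list(zip(x, y)))
    mutual_information = hx + hy - joint
    return 2.0 * mutual_information / (hx + hy)
```

Normalizing by H(X) + H(Y) compensates for the bias of plain mutual information toward features with many distinct values, which is why SU is a common relevance score in filter-style feature selection.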

2.
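吕子敬, 韩顺利, 张志辉, 刘磊. 《红外》 (Infrared), 2016, 37(1): 40-44
Large-scale infrared spectral datasets contain many irrelevant and redundant features. To address this problem, a dynamically weighted infrared spectrum feature selection algorithm (Dynamic Weight Infrared Spectrum Feature Selection Algorithm, MBDWFS) is proposed. The algorithm combines the symmetric uncertainty measure with the approximate Markov blanket to remove irrelevant and redundant features from the original spectral dataset, yielding a smaller, optimal feature subset. Simulation experiments comparing it with three classic algorithms, FCBF, ID3, and ReliefF, show that MBDWFS outperforms the other three in overall classification performance and works better when applied to material analysis based on infrared spectra.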
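
A minimal sketch of how symmetric uncertainty and the approximate Markov blanket can be combined to drop irrelevant and redundant features. It follows the FCBF-style criterion (a kept feature F_j is treated as an approximate Markov blanket of F_i when SU(F_j, C) ≥ SU(F_i, C) and SU(F_i, F_j) ≥ SU(F_i, C)), not the MBDWFS algorithm itself, whose dynamic weighting is not described in the abstract. The names `su`, `select_features`, and `threshold` are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def su(x, y):
    """Symmetric uncertainty of two discrete sequences."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0
    return 2.0 * (hx + hy - entropy(list(zip(x, y)))) / (hx + hy)


def select_features(features, labels, threshold=0.0):
    """features: dict mapping a feature name to its list of discrete values.

    Returns the names that survive relevance and redundancy filtering.
    """
    relevance = {name: su(col, labels) for name, col in features.items()}
    # Step 1: discard irrelevant features, rank the rest by relevance.
    ranked = sorted(
        (name for name in relevance if relevance[name] > threshold),
        key=lambda name: relevance[name],
        reverse=True,
    )
    kept = []
    for name in ranked:
        # Step 2: every kept feature is at least as relevant as `name`,
        # so it is an approximate Markov blanket of `name` whenever it
        # is correlated with `name` at least as strongly as the class is.
        redundant = any(
            su(features[k], features[name]) >= relevance[name] for k in kept
        )
        if not redundant:
            kept.append(name)
    return kept
```

For example, given a feature `a`, an exact copy `b`, and a feature `c` that is independent of the class, only `a` is kept: `c` fails the relevance threshold and `b` is covered by `a`.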

3.
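张俐, 陈小波. 《电子与信息学报》 (Journal of Electronics & Information Technology), 2022, 43(10): 3028-3034
Feature selection is an essential step in the data preprocessing stage in fields such as machine learning, natural language processing, and data mining. In some information-theoretic feature selection algorithms, choosing different parameters amounts to choosing different feature selection algorithms. How to determine dynamic, non-prior weights and avoid preset prior parameters has therefore become an urgent problem. This paper proposes a dynamically weighted maximum relevance and maximum independence (WMRI) feature selection algorithm. First, the algorithm computes the averages of the new classification information and of the retained class information. Second, it uses the standard deviation to dynamically adjust the parameter weights of these two kinds of classification information. Finally, WMRI is compared with five other feature selection algorithms on three classifiers and ten different datasets, using the classification accuracy metric (fmi) for validation. Experimental results show that WMRI improves the quality of the feature subset and raises classification accuracy.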

4.
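张俐, 陈小波. 《电子与信息学报》 (Journal of Electronics & Information Technology), 2021, 43(10): 3028-3034
Feature selection is an essential step in the data preprocessing stage in fields such as machine learning, natural language processing, and data mining. In some information-theoretic feature selection algorithms, choosing different parameters amounts to choosing different feature selection algorithms. How to determine dynamic, non-prior weights and avoid preset prior parameters has therefore become an urgent problem. This paper proposes a dynamically weighted maximum relevance and maximum independence (WMRI) feature selection algorithm. First, the algorithm computes the averages of the new classification information and of the retained class information. Second, it uses the standard deviation to dynamically adjust the parameter weights of these two kinds of classification information. Finally, WMRI is compared with five other feature selection algorithms on three classifiers and ten different datasets, using the classification accuracy metric (fmi) for validation. Experimental results show that WMRI improves the quality of the feature subset and raises classification accuracy.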

5.
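A combined feature reduction algorithm based on a weighted between-class to within-class distance ratio and principal component analysis is proposed and applied to selecting higher-order cyclic statistic features of low probability of intercept (LPI) radar signals. The algorithm first weights the traditional between-class to within-class distance ratio with entropy. This overcomes the drawback that the traditional ratio characterizes a feature's class-discriminating ability only at a macro level and does not reflect the distribution information between classes. The weighted ratio is used to select the more discriminative features as a coarse feature subset. Principal component analysis (PCA) is then applied to decorrelate the coarse subset, finally yielding an uncorrelated feature subset of lower dimensionality. Simulation results show that the proposed feature reduction algorithm outperforms MFDR and NMIFS.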
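
The coarse selection step can be illustrated with the traditional, unweighted between-class to within-class ratio for a single feature. This is a sketch under stated assumptions: the scatter-based form of the ratio is one common definition, the entropy weighting and the PCA step from the abstract are not reproduced, and `distance_ratio` and `coarse_select` are hypothetical names.

```python
from collections import defaultdict


def distance_ratio(values, labels):
    """Between-class scatter divided by within-class scatter for one feature."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    overall_mean = sum(values) / len(values)
    between = 0.0
    within = 0.0
    for members in groups.values():
        class_mean = sum(members) / len(members)
        # Spread of the class means around the overall mean.
        between += len(members) * (class_mean - overall_mean) ** 2
        # Spread of the samples around their own class mean.
        within += sum((v - class_mean) ** 2 for v in members)
    return float("inf") if within == 0 else between / within


def coarse_select(features, labels, top_k):
    """Keep the top_k feature names with the largest ratio."""
    scores = {name: distance_ratio(col, labels) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

A feature whose class means are far apart relative to the spread inside each class scores high; a feature whose classes overlap scores near zero.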

6.
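LWF chain graph structure learning aims to discover the parents, children, neighbors, and spouses of every node in a chain graph. However, the latest LWF chain graph structure learning algorithms learn the global network structure from the local structure of each node (that is, its Markov blanket), obtained with the Growing-Shrinking (GS) approach. The conditional independence tests in such algorithms condition on the entire Markov blanket, so to keep the tests reliable the number of samples must be exponential in the size of the Markov blanket, which makes the algorithms data-inefficient. To address this problem, this paper proposes a constraint-based local-to-global LWF chain graph structure learning algorithm. The algorithm lowers the sample-size requirement by iteratively learning the adjacency set and the spouse set; meanwhile, a backward strategy used while learning the adjacency set ensures the correctness of the conditional independence tests. The basic idea is as follows. First, the Markov blanket of every node in the network is learned, with Markov blanket learning split into learning the adjacency set and learning the spouse set. Then the Markov blanket information is used to recover the network skeleton, and, based on the properties of the directed edges of the chain graph's complexes, conditional independence tests determine those directed edges, thereby recovering the chain graph structure. Theoretical analysis proves the correctness of the algorithm, and experiments on simulated and benchmark datasets verify its effectiveness.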

7.
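To address the lack of prior knowledge in speaker segmentation and clustering algorithms, a feature-level fusion algorithm for speaker segmentation and clustering is proposed. It exploits the complementarity between the method based on the information bottleneck (IB) criterion and the method based on the hidden Markov model (HMM)/Gaussian mixture model (GMM). The algorithm applies a logarithmic transform and dimensionality reduction to the output of the IB-based algorithm. It then trains speaker GMMs separately on the transformed features and on conventional Mel-frequency cepstral coefficient (MFCC) features, and fuses the speaker-class scores by weighting in the score domain. Speaker segmentation and clustering based on the HMM/GMM model is then performed on the fused scores. Experiments show that the fused features provide the system with more prior information and reduce the mismatch rate by 1.2% compared with the traditional method.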

8.
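Image denoising based on a complex wavelet local contextual hidden Markov model   Total citations: 14, self-citations: 1, citations by others: 13
刘芳, 刘文学, 焦李成. 《电子学报》 (Acta Electronica Sinica), 2005, 33(7): 1284-1287
Multiresolution signal and image models can capture the statistical structure of smooth and singular regions in an image. However, models based on the orthogonal wavelet transform are affected by shift variance, which reduces their accuracy and real-time performance. This paper extends the local contextual hidden Markov model (LCHMM) to the complex wavelet domain and proposes a complex-wavelet-based local contextual hidden Markov model, C-LCHMM (Local Contextual Hidden Markov Model Based On Complex Wavelet). The model is approximately shift-invariant and has high resolution, captures the statistical characteristics of the neighborhoods of wavelet coefficients, and has low computational complexity. Simulation experiments show that C-LCHMM denoises images better than typical denoising algorithms.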

9.
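High-dimensional data contain thousands of features. So many features make the search space too large, increase computational cost, and reduce the accuracy of classification and prediction. To improve the efficiency of feature selection, this paper proposes a particle swarm feature selection algorithm with symmetric uncertainty and a population dimensionality reduction mechanism. The algorithm uses an initialization method based on the symmetric uncertainty index to lower the computational cost of feature selection, and a population dimensionality reduction mechanism based on non-dominated sorting to lessen the influence of redundant features during evolution. Experiments on five public high-dimensional biomedical datasets show that the algorithm achieves better classification accuracy and a smaller optimal feature subset on high-dimensional feature selection problems, and has some advantage in running time.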

10.
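An adaptive feature-weighted Gibbs random field image segmentation method   Total citations: 1, self-citations: 0, citations by others: 1
Existing segmentation algorithms rarely account for both the differing discriminative power of feature components and the correlation between neighboring pixels. To address this, a feature-weighted remote sensing image segmentation method incorporating a Gibbs random field is proposed. The method first computes the discriminative power of each feature component from training samples and determines the feature-component weights for each land-cover class. It then performs an initial segmentation of the image with weighted minimum distance classification and uses a Gibbs random field to describe the spatial correlation of pixels. Finally, it combines the label field described by the Gibbs random field with the feature field described by weighted minimum distance classification to obtain the final segmentation. Experimental results show that the Gibbs random field effectively describes spatial correlation and that the weights determined by discriminative power strengthen the more discriminative feature components.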

11.
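To improve the classification accuracy of the minority class in imbalanced datasets, an oversampling algorithm based on feature selection is proposed. The algorithm takes into account that different feature columns contribute differently to classification performance. It first performs feature selection on the training set to pick out a set of feature columns and then synthesizes minority-class samples according to the selected columns. The features of each synthesized minority sample consist of two parts: the features corresponding to the selected feature columns, and the features synthesized according to the SMOTE principle. Experimental comparison with SMOTE shows that the feature-selection-based oversampling algorithm outperforms SMOTE, effectively reduces data imbalance, and improves the classification accuracy of the minority class.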
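
The SMOTE principle referred to above can be sketched as follows: each synthetic sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbors. This is plain SMOTE with a single interpolation factor per sample, not the feature-selection variant proposed in the abstract, whose treatment of the selected columns is not detailed there. The function name `smote` and its parameters are illustrative, and the sketch assumes at least two minority samples.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic samples from a list of minority vectors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Interpolate between the base sample and the chosen neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic
```

Because every synthetic point lies between two real minority samples, the new samples stay inside the region already occupied by the minority class rather than duplicating existing points.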

12.
K-nearest neighbor (KNN) has yielded excellent performance in emotion recognition based on physiological signals. But there are still some issues: the majority vote by the nearest neighbors alone is too simple to deal with a complex (for example, skewed) class distribution; treating all features as contributing equally to the similarity degrades classification accuracy; and samples on the boundaries between classes are easily misclassified when k is large. Therefore, we propose an improved KNN algorithm called WB-KNN, which takes into account the weights (of both features and classification) and the boundaries between classes. Firstly, a novel weighting method based on the distance and farthest neighbors, named WDF, is proposed to weight the classification. It improves voting accuracy by making nearer neighbors contribute more to the classification and by using the farthest neighbors to reduce the weight of the non-target class. Secondly, feature weights are introduced into the distance formula, so that significant features contribute more to the similarity than noisy or irrelevant features. Thirdly, a voting classifier that combines different classifiers is adopted to overcome the weakness of KNN on the boundaries between classes. The results of the WB-KNN algorithm are encouraging compared with traditional KNN and other classification algorithms on a physiological dataset with a skewed class distribution. Classification accuracy for 29 participants reaches 94.2192% for the recognition of four emotions.
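
Two of the ideas in this abstract, feature weights inside the distance formula and neighbors that vote with distance-dependent weights, can be sketched generically as below. This is not WB-KNN: the WDF scheme with farthest neighbors and the voting classifier are not specified in enough detail to reproduce, so plain inverse-distance voting is substituted. The name `weighted_knn_predict` and the `eps` parameter are assumptions.

```python
from collections import defaultdict
from math import sqrt


def weighted_knn_predict(train, query, feature_weights, k=3, eps=1e-9):
    """train: list of (feature_vector, label) pairs. Returns the predicted label."""

    def dist(a, b):
        # Feature-weighted Euclidean distance: a weight of 0 ignores a feature.
        return sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(feature_weights, a, b)))

    neighbours = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = defaultdict(float)
    for vector, label in neighbours:
        # Nearer neighbours contribute more; eps avoids division by zero.
        votes[label] += 1.0 / (dist(vector, query) + eps)
    return max(votes, key=votes.get)
```

With four training points whose first feature separates the classes and whose second feature is noise, setting the weights to `[1.0, 0.0]` lets the informative feature decide, while `[0.0, 1.0]` makes the prediction follow the noisy feature instead.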

13.
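Hyperspectral images have many spectral bands, and adjacent bands are highly correlated, which makes classification difficult. To address this, a hyperspectral image classification algorithm based on adaptive differential evolution feature selection is proposed. First, the population vector set is initialized, and an adaptive differential evolution algorithm searches the features to adaptively generate feature subsets. Then, ReliefF is used to remove duplicate features according to the feature ranking, building a feature list for all features. Finally, a fuzzy k-nearest neighbor classifier computes the classification accuracy of each vector, and a wrapper model evaluates the feature subsets. Experiments on the Indiana dataset and the KSC dataset verify the effectiveness and reliability of the algorithm and show that it achieves higher overall classification accuracy and a better Kappa coefficient than several other feature selection algorithms.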

14.
Vibrations produced by the use of industrial machine tools can contain valuable information about the state of wear of tool cutting edges. However, extracting this information automatically is quite difficult. It has been observed that certain structures present in the vibration patterns are correlated with dullness. We present an approach to extracting features present in these structures using self-organizing feature maps (SOFMs). We have modified the SOFM algorithm in order to improve its generalization abilities and to allow it to better serve as a preprocessor for a hidden Markov model (HMM) classifier. We also discuss the challenge of determining which classes exist in the machining application and introduce an algorithm for automatic clustering of time-sequence patterns using the HMM. We show the success of this algorithm in finding clusters that are beneficial to the machine-monitoring application.

15.
Feature selection (FS) is the process of selecting the more informative features. It is one of the important steps in knowledge discovery. The problem is that not all features are important: some may be redundant, and others may be irrelevant and noisy. Conventional supervised FS methods evaluate various feature subsets using an evaluation function or metric to select only those features that are related to the decision classes of the data under consideration. However, for many data mining applications, decision class labels are often unknown or incomplete, which indicates the significance of unsupervised feature selection, where decision class labels are not provided. In this paper, we propose a new unsupervised quick reduct (QR) algorithm using rough set theory. The quality of the reduced data is measured by the classification performance, which is evaluated using the WEKA classifier tool. The method is compared with existing supervised methods, and the result demonstrates the efficiency of the proposed algorithm.

16.
A new SVM based emotional classification of image   Total citations: 1, self-citations: 0, citations by others: 1
This paper presents how a high-level emotional representation of art paintings can be inferred from perceptual-level features suited to particular classes (dynamic vs. static classification). The key points are feature selection and classification. Based on the strong relationship between the notable lines of an image and human sensations, a novel feature vector, WLDLV (Weighted Line Direction-Length Vector), is proposed, which includes both the orientation and the length information of lines in an image. Classification is performed by an SVM (Support Vector Machine), and images are classified as dynamic or static. Experimental results demonstrate the effectiveness and superiority of the algorithm.

17.
Classification of characteristic neural spike shapes in multi-unit recordings is performed in real time using a reduced feature set. A model of uncorrelated signal-related noise is used to reduce the feature set by choosing a subset of aperiodic samples which is effective for discrimination between signals by a nearest-mean algorithm. Initial signal classes are determined by an unsupervised clustering algorithm applied to the reduced features of the learning set events. Classification is carried out in real time using a distance measure derived for the reduced feature set. Examples of separation and correlation of multiunit activity from cat and frog visual systems are described.

18.
This paper presents a method for classification of structural brain magnetic resonance (MR) images, by using a combination of deformation-based morphometry and machine learning methods. A morphological representation of the anatomy of interest is first obtained using a high-dimensional mass-preserving template warping method, which results in tissue density maps that constitute local tissue volumetric measurements. Regions that display strong correlations between tissue volume and classification (clinical) variables are extracted using a watershed segmentation algorithm, taking into account the regional smoothness of the correlation map, which is estimated by a cross-validation strategy to achieve robustness to outliers. A volume increment algorithm is then applied to these regions to extract regional volumetric features, from which a feature selection technique using support vector machine (SVM)-based criteria is used to select the most discriminative features, according to their effect on the upper bound of the leave-one-out generalization error. Finally, SVM-based classification is applied using the best set of features, and it is tested using a leave-one-out cross-validation strategy. The results on MR brain images of healthy controls and schizophrenia patients demonstrate not only high classification accuracy (91.8% for female subjects and 90.8% for male subjects), but also good stability with respect to the number of features selected and the size of the SVM kernel used.
