Similar literature
20 similar documents found (search time: 31 ms)
1.
胡学伟, 蒋芸, 李志磊, 沈健, 华锋亮. 《计算机应用》 (Journal of Computer Applications), 2015, 35(11): 3116-3121
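To address the problem that fuzzy rough set models induced by fuzzy equivalence relations cannot accurately reflect decision problems described by numerical attributes within fuzzy concepts, a fuzzy rough set model based on the neighborhood relation, NR-FRS, is proposed. The relevant definitions of the model are given; on the basis of a discussion of the model's properties, reasoning is carried out on the fuzzified neighborhood approximation space, and attribute dependency under feature subspaces is analyzed. Finally, a feature selection algorithm is proposed on the basis of NR-FRS: it constructs a feature subset whose fuzzy positive region gain exceeds a specified threshold, thereby eliminating redundant features and retaining attributes with strong classification ability. Classification experiments are conducted on UCI standard datasets using a support vector machine with a radial basis function (RBF) kernel as the classifier. The experimental results show that, compared with the fast forward feature selection method based on neighborhood rough sets and with kernel principal component analysis (KPCA), the number of features in the subset obtained by the NR-FRS feature selection algorithm varies more smoothly and stably with the parameters. In addition, the average classification accuracy improves by up to 5.2% and varies more steadily with the feature selection parameter.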
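
The greedy "keep adding features while the positive region gain exceeds a threshold" idea can be illustrated with a minimal sketch. This is not the NR-FRS algorithm itself, whose fuzzy definitions are not given in the abstract; it assumes the crisp neighborhood rough set dependency as a stand-in, and the function names, `delta` and `min_gain` are hypothetical.

```python
import numpy as np

def neighborhood_dependency(X, y, features, delta):
    """Fraction of samples whose delta-neighborhood on `features` is class-pure."""
    Z = X[:, features]
    # Pairwise Euclidean distances restricted to the chosen features.
    dist = np.sqrt(((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2))
    neighbors = dist <= delta
    same_class = y[:, None] == y[None, :]
    # A sample is in the positive region if every neighbor shares its label.
    consistent = np.all(~neighbors | same_class, axis=1)
    return float(consistent.mean())

def forward_select(X, y, delta=0.2, min_gain=0.01):
    """Greedy forward selection; stops when the dependency gain is <= min_gain."""
    selected, best = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        scores = [neighborhood_dependency(X, y, selected + [f], delta)
                  for f in remaining]
        i = int(np.argmax(scores))
        if scores[i] - best <= min_gain:
            break
        best = scores[i]
        selected.append(remaining.pop(i))
    return selected, best
```

The pairwise distance matrix makes this quadratic in the number of samples, which is acceptable for a sketch but not for large datasets.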

2.
In this paper, we present a new system for the classification of electrocardiogram (ECG) beats using a fast least squares support vector machine (LSSVM). Five feature extraction methods are comparatively examined in the 15-dimensional feature space. The dimension of each feature set is reduced by using dynamic programming based on divergence analysis. After preprocessing of the ECG data, six types of ECG beats obtained from the MIT-BIH database are classified with an accuracy of 95.2% by the proposed fast LSSVM algorithm together with the discrete cosine transform. Experimental results show that the fast LSSVM is not only faster than the standard LSSVM algorithm but also gives better classification performance than the standard backpropagation multilayer perceptron network.

3.
Case generation using rough sets with fuzzy representation (total citations: 1; self-citations: 0; citations by others: 1)
We propose a rough-fuzzy hybridization scheme for case generation. Fuzzy set theory is used for linguistic representation of patterns, thereby producing a fuzzy granulation of the feature space. Rough set theory is used to obtain dependency rules which model informative regions in the granulated feature space. The fuzzy membership functions corresponding to the informative regions are stored as cases along with the strength values. Case retrieval is made using a similarity measure based on these membership functions. Unlike the existing case selection methods, the cases here are cluster granules and not sample points. Also, each case involves a reduced number of relevant features. This makes the algorithm suitable for mining data sets that are large in both dimension and size, because of its low time requirement in case generation as well as retrieval. Superiority of the algorithm in terms of classification accuracy and case generation and retrieval times is demonstrated on some real-life data sets.

4.
Specific patterns of the electrocardiogram (ECG), along with other biometrics, have recently been used to recognize a person. Most ECG-based human identification methods rely on reduced features derived from ECG characteristic points and on supervised classification. However, detecting characteristic points is an arduous procedure, particularly at low signal-to-noise ratios, and a supervised classifier requires retraining when a new person is added to the group. In the present study, we propose a novel unsupervised ECG-based identification method based on phase space reconstruction of one-lead or three-lead ECG, which avoids the need to detect characteristic points. Identification is performed by inspecting a similarity or dissimilarity measure between ECG phase space portraits. Our results in a 100-subject group showed that one-lead ECG reached an identification accuracy of 93% and three-lead ECG reached 99%.
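
Phase space reconstruction is commonly done by time-delay embedding, and a short sketch may make the idea concrete. The abstract does not state the embedding parameters or the portrait similarity measure, so the delay `tau`, the dimension `dim` and the histogram-based dissimilarity below are assumptions for illustration only. Signals are assumed to be scaled into [`lo`, `hi`].

```python
import numpy as np

def delay_embed(signal, dim=3, tau=4):
    """Return an (n, dim) array whose rows are (x[t], x[t+tau], ..., x[t+(dim-1)tau])."""
    x = np.asarray(signal, dtype=float)
    n = len(x) - (dim - 1) * tau
    if n <= 0:
        raise ValueError("signal too short for this dim and tau")
    return np.stack([x[i * tau: i * tau + n] for i in range(dim)], axis=1)

def occupancy(portrait, bins=8, lo=-1.0, hi=1.0):
    """Normalized occupancy histogram of a portrait (points assumed within [lo, hi])."""
    hist, _ = np.histogramdd(portrait, bins=bins,
                             range=[(lo, hi)] * portrait.shape[1])
    return hist / hist.sum()

def dissimilarity(p, q, **kw):
    """Half the L1 distance between occupancy histograms; 0 means identical."""
    return 0.5 * float(np.abs(occupancy(p, **kw) - occupancy(q, **kw)).sum())
```

Under these assumptions, a probe recording would be assigned to the enrolled subject whose portrait has the smallest dissimilarity, with no classifier retraining when a subject is added.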

5.
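In the neighborhood rough set model, as the size of the information granule grows, the neighborhood classifier (NC), which is based on the majority voting principle, tends to misjudge the class of unknown samples. To alleviate this problem, a classification method based on neighborhood collaborative representation, the neighborhood collaborative classifier (NCC), is proposed, building on the idea of collaborative representation-based classification (CRC). NCC first uses the neighborhood rough set model to perform feature selection for the classification learning task, then finds the neighborhood space of the unknown sample under the selected features, and finally uses collaborative representation in place of the majority voting principle within the neighborhood space, taking the class with the minimum reconstruction error with respect to the unknown sample as the predicted class label. Experimental results on four UCI datasets show that: 1) compared with NC, the proposed NCC achieves satisfactory classification results with large information granules; 2) compared with CRC, the proposed NCC greatly reduces the size of the dictionary while maintaining good classification accuracy, thereby improving classification efficiency.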
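
The last two steps described above (neighborhood lookup, then collaborative representation instead of voting) can be sketched as follows. This is an illustrative reading of the abstract, not the authors' implementation: it assumes the features have already been selected, a Euclidean neighborhood of radius `delta`, a ridge-regularized representation with a hypothetical parameter `lam`, and the plain class-wise residual as the reconstruction error.

```python
import numpy as np

def ncc_predict(X_train, y_train, x, delta, lam=0.01):
    """Predict the label of x from the training samples inside its delta-neighborhood."""
    dist = np.linalg.norm(X_train - x, axis=1)
    mask = dist <= delta
    if not mask.any():
        # Empty neighborhood: fall back to the nearest sample(s).
        mask = dist == dist.min()
    D = X_train[mask].T            # dictionary: one column per neighbor
    labels = y_train[mask]
    # Collaborative representation: minimize ||x - D a||^2 + lam * ||a||^2.
    alpha = np.linalg.solve(D.T @ D + lam * np.eye(D.shape[1]), D.T @ x)
    best_label, best_err = None, np.inf
    for c in np.unique(labels):
        idx = labels == c
        # Reconstruct x using only the coefficients of class c.
        err = np.linalg.norm(x - D[:, idx] @ alpha[idx])
        if err < best_err:
            best_label, best_err = c, err
    return best_label
```

Because the dictionary holds only the neighbors of `x` rather than the whole training set, the linear system stays small, which is consistent with the efficiency gain over CRC reported above.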

6.
The degree of malignancy in brain glioma is assessed based on magnetic resonance imaging (MRI) findings and clinical data before surgery. These data contain irrelevant features, while uncertainties and missing values also exist. Rough set theory can deal with vagueness and uncertainty in data analysis, and can efficiently remove redundant information. In this paper, a rough set method is applied to predict the degree of malignancy. As feature selection can improve the classification accuracy effectively, rough set feature selection algorithms are employed to select features. The selected feature subsets are used to generate decision rules for the classification task. A rough set attribute reduction algorithm that employs a search method based on particle swarm optimization (PSO) is proposed in this paper and compared with other rough set reduction algorithms. Experimental results show that reducts found by the proposed algorithm are more efficient and can generate decision rules with better classification performance. The rough set rule-based method can achieve higher classification accuracy than other intelligent analysis methods such as neural networks, decision trees and a fuzzy rule extraction algorithm based on Fuzzy Min-Max Neural Networks (FRE-FMMNN). Moreover, the decision rules induced by the rough set rule induction algorithm can reveal regular and interpretable patterns of the relations between glioma MRI features and the degree of malignancy, which are helpful for medical experts.

7.
孙林, 赵婧, 徐久成, 王欣雅. 《计算机应用》 (Journal of Computer Applications), 2022, 42(5): 1355-1366
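To address the problems that the classical monarch butterfly optimization (MBO) algorithm cannot handle continuous data well and that the rough set model is insufficiently capable of processing large-scale, high-dimensional, complex data, a feature selection algorithm based on the neighborhood rough set (NRS) and MBO is proposed. First, local disturbance and a group division strategy are combined with the MBO algorithm, and a transfer mechanism is constructed to form a binary MBO (BMBO) algorithm. Second, a mutation operator is introduced to enhance the exploration ability of the algorithm, yielding a BMBO algorithm based on the mutation operator (BMBOM). Then, a fitness function is constructed from the neighborhood degree of NRS, and the fitness values of the initialized feature subsets are evaluated and ranked. Finally, the BMBOM algorithm searches for the optimal feature subset through repeated iteration, and a metaheuristic feature selection algorithm is designed. The optimization performance of BMBOM is evaluated on benchmark functions, and the classification ability of the proposed feature selection algorithm is evaluated on UCI datasets. The experimental results show that, on five benchmark functions, BMBOM is clearly better than MBO and particle swarm optimization (PSO) in terms of best value, worst value, mean, and standard deviation. On the UCI datasets, compared with rough-set-based optimization feature selection algorithms, feature selection algorithms combining rough sets with optimization algorithms, feature selection algorithms combining NRS with optimization algorithms, and a feature selection algorithm based on binary grey wolf optimization, the proposed algorithm performs well on three indicators (classification accuracy, number of selected features, and fitness value) and can select an optimal feature subset with few features and high classification accuracy.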

8.
孟军, 李锐, 郝涵. 《计算机科学》 (Computer Science), 2015, 42(6): 37-40, 66
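In research on feature selection and classification of gene microarray data, rough set theory is an effective tool for eliminating redundant genes. However, the traditional rough set model cannot handle continuous numerical data well, and discretization may cause information loss. Therefore, an attribute reduction algorithm based on an intersection neighborhood rough set model is proposed: the distance neighborhood of the traditional rough set is extended to an intersection neighborhood, and approximations are defined in a set-based manner to construct the rough set model. Experiments on cancer datasets show that the rough set model based on set approximation and intersection neighborhoods achieves good classification results, and GO term analysis of the selected genes further demonstrates the effectiveness of the model.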

9.
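To address the problem that the classification efficiency of the KNN algorithm gradually decreases as the training set size and feature dimensionality increase, a text classification algorithm based on Canopy and rough sets, CRS-KNN (Canopy Rough Set-KNN), is proposed. The algorithm first clusters the text data to be processed with Canopy, then applies rough set theory to partition each resulting cluster into upper and lower approximations. Data in the lower approximation region need no further classification, whereas data in the boundary region, obtained as the difference between the upper and lower approximations, have their final class determined by the KNN algorithm. Experimental results show that the algorithm reduces the amount of data that KNN must process and improves classification efficiency. Compared with the traditional KNN algorithm and with clustering-based improved KNN text classification algorithms, precision, recall, and F1 score are all improved to some extent.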
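
The split into a lower approximation and a boundary region can be illustrated with the textbook rough set definitions. The abstract does not say how CRS-KNN computes the approximations, so this sketch assumes clusters act as the information granules and that reference labels are available; `rough_regions` and its arguments are hypothetical names.

```python
def rough_regions(clusters, labels, target):
    """Return (lower approximation, boundary region) of class `target`.

    clusters: list of lists of sample ids (the granules)
    labels:   dict mapping sample id -> class label
    """
    lower, upper = set(), set()
    for cluster in clusters:
        members = set(cluster)
        hits = {i for i in members if labels[i] == target}
        if members and hits == members:
            lower |= members       # granule lies entirely inside the class
        if hits:
            upper |= members       # granule overlaps the class
    return lower, upper - lower
```

Only the samples in the returned boundary set would then be passed to KNN, which is where the reduction in computation comes from.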

10.
王蓉, 刘遵仁, 纪俊. 《计算机科学》 (Computer Science), 2018, 45(7): 197-201, 229
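As an extension of the classical Pawlak rough set, the neighborhood rough set can effectively handle numerical data. However, because the concept of neighborhood granulation is introduced, computation in the neighborhood real-valued space is far more expensive than in the classical discrete space. For neighborhood rough set algorithms, finding an attribute reduct of a dataset effectively and quickly is therefore of great significance. To this end, addressing the shortcomings of the attribute significance definitions in existing algorithms, an improved voting-based attribute significance is first proposed, and then a fast attribute reduction algorithm based on it is proposed. Experiments show that, compared with existing algorithms, this algorithm obtains an attribute reduct more quickly while preserving classification accuracy.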

11.
Various methods for ensemble selection and classifier combination have been designed to optimize the performance of classifier ensembles. However, the use of a large number of features in the training data can degrade the classification performance of machine learning algorithms. The objective of this paper is to present a novel feature elimination (FE) based ensemble learning method that extends an existing machine learning environment. Standard 12-lead ECG recordings are used to diagnose arrhythmia by classifying subjects as normal or abnormal. The advantage of the proposed approach is that it reduces the size of the feature space by applying various feature elimination methods. The decisions obtained from these methods are combined to form a fused dataset. The idea behind this work is thus to discover a reduced feature space such that a classifier built on this small dataset performs no worse than a classifier built on the original dataset. A random subspace ensemble classifier is used with the PART tree as the base classifier. The proposed approach has been implemented and evaluated on the UCI ECG signal data. Classification performance is evaluated using mean absolute error, root mean squared error, relative absolute error, F-measure, classification accuracy, receiver operating characteristic, and area under the curve. The proposed approach achieves an overall classification accuracy of 91.11% on an unseen test dataset and performs well with ensemble sizes of 15 and 20.

12.
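To enrich the textual semantic information of the vector space model, a method for constructing triple dependency relation features is proposed and applied to the text sentiment classification task. Starting from a complete dependency parse tree, the method first formulates rules, based on the characteristics of Chinese grammar, to merge and delete redundant nodes in the original tree; the retained dependency tree is then converted into triple relations which, after generalization, serve as feature terms of the vector space model. To verify the effectiveness of this feature representation, feature vector spaces are constructed from different combinations of syntactic features, simple dependency relation features, and lexicon scores added on top of unigrams. The triple dependency feature vectors and the constructed combined feature vectors are each used with support vector machines and deep belief networks. The results show that the triple dependency text representation achieves higher classification accuracy than the other feature combinations, which further indicates that triple dependency features express textual semantic information more fully.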

13.
This paper presents the process of building a Sentiment Analysis Framework for Serbian (SAFOS). We created a hybrid method that uses a sentiment lexicon and Serbian WordNet (SWN) synsets assigned sentiment polarity scores in the process of feature selection. As stemming for morphologically rich languages (MRLs) may lose or distort the sentiment meaning of words, we decided to expand the sentiment lexicon, as well as the lexicon generated using SWN, by adding morphological forms of emotional terms and phrases. This was done using Serbian Morphological Electronic Dictionaries. A new feature reduction method for document-level sentiment polarity classification using maximum entropy modeling is proposed. It is based on mapping a large number of related feature candidates (sentiment words, phrases and their inflectional forms) to a few concepts and using them as features. Testing was performed using 10-fold cross-validation and on test sets containing news and movie reviews. The results of all experiments show that sentiment feature mapping for feature set reduction achieves better results than the basic set of features. For both test sets, the best classification accuracy scores were achieved for the combination of unigram and bigram features reduced by sentiment feature mapping (accuracy of 78.3% for movie reviews and 79.2% for the news test set). In 10-fold cross-validation, the best average accuracy score of 95.6% was obtained using unigrams as features, reduced by the mapping procedure.

14.
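To improve the accuracy and efficiency of real-time remote sensing image classification, a real-time classification algorithm for remote sensing image sets based on ant colony optimization and independent feature sets is proposed. First, wavelet-domain features and color features of the remote sensing images are extracted and assembled into feature vectors. Then, the ant colony optimization algorithm is used to optimize the feature space, independently selecting a salient feature set for each class and thereby reducing the dimensionality of each feature subspace. Finally, an extreme learning machine classifier is trained independently for each class to classify the remote sensing image set. Simulation experiments on a public remote sensing image dataset show that the algorithm achieves high classification accuracy and high computational efficiency.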

15.
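To address the conflict between detection accuracy and real-time performance in image-based fire detection, a rough-set-based algorithm for fire image feature selection and recognition is proposed. First, an in-depth study of flame image features shows that, driven by combustion energy, the upper edge of a flame is highly irregular and exhibits pronounced oscillation, whereas the lower edge behaves in the opposite way. On this basis, the ratio of the numbers of jitter projections of the upper and lower edges can be used to distinguish flames from interfering objects with more regular edge shapes. Then, six salient flame features are selected to construct training samples; provided that fire classification ability is unaffected, the feature classification table obtained from experiments is used to perform attribute reduction on the training samples, and the attributes of the reduced information system are used to train a support vector machine model for fire detection. Finally, the method is compared with the traditional support vector machine fire detection algorithm. The experimental results show that using rough sets as a front end to the support vector machine classifier, that is, introducing rough set attribute reduction into the support vector machine, greatly eliminates redundant attributes in the sample set, lowers the dimensionality of the fire image feature space, and reduces the data needed for classifier training and detection, improving the speed and generalization ability of the algorithm while maintaining recognition accuracy.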

16.
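Dimensionality reduction techniques and methods for improving classification accuracy and efficiency in text classification are studied, and a new text classification algorithm based on matrix projection operations, the Matrix Projection (MP) classification algorithm, is proposed. The matrix operation projects the three-dimensional space representing text features in the training examples onto a two-dimensional space, yielding normalized vectors and effectively achieving both dimensionality reduction and accurate computation of feature term weights. Comparative experiments against several other text classification algorithms show that the MP algorithm clearly improves both classification accuracy and time performance, with macro-averaged F1 values of 92.29% and 96.03% on two datasets.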

17.
We propose a new feature selection strategy based on rough sets and particle swarm optimization (PSO). Rough sets have been used as a feature selection method with much success, but current hill-climbing rough set approaches to feature selection are inadequate at finding optimal reductions, as no perfect heuristic can guarantee optimality. On the other hand, complete searches are not feasible for even medium-sized datasets. Stochastic approaches therefore provide a promising feature selection mechanism. Like genetic algorithms (GAs), PSO is a new evolutionary computation technique, in which each potential solution is seen as a particle with a certain velocity flying through the problem space. Particle swarms find optimal regions of the complex search space through the interaction of individuals in the population. PSO is attractive for feature selection in that particle swarms will discover the best feature combinations as they fly within the subset space. Compared with GAs, PSO does not need complex operators such as crossover and mutation; it requires only primitive and simple mathematical operators, and is computationally inexpensive in terms of both memory and runtime. Experiments on UCI data compare the proposed algorithm with a GA-based approach and other deterministic rough set reduction algorithms. The results show that PSO is efficient for rough set-based feature selection.

18.
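Most traditional classification algorithms assume that all classes have the same misclassification cost, which causes classification performance to drop sharply when the sample data are imbalanced. For the imbalanced data classification problem, an improved neural network algorithm that combines a neural network with a denoising autoencoder is proposed. A feature corruption layer is inserted between the input layer and the hidden layer of the network, so that some redundant feature values are dropped and the imbalance of the dataset is reduced; after the model is trained to obtain the optimal parameters, feature classification is performed to obtain the result. Experiments on three imbalanced datasets from the UCI standard datasets show that the algorithm clearly improves classification accuracy on small datasets, but on larger datasets its classification performance is lower than that of some classifiers. The overall classification performance of the algorithm is better than that of the other classifiers.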

19.
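ECG, as a living biometric feature for identity recognition, has attracted wide international attention. To address the drawback that ECG identification methods based on analytic features require high fiducial point detection accuracy, an ECG identification method that requires only R-wave peak detection is proposed. By setting appropriately targeted thresholds, the method combines the PCA feature method with the wavelet fusion feature method. Experimental results show that the method outperforms the PCA feature, waveform feature, and wavelet feature methods; it reduces both the complexity of fiducial point detection and the errors caused by inaccurate detection while achieving a high recognition rate, making it a real-time and efficient algorithm.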

20.

Arrhythmia is a type of heart disease that produces an inefficient and irregular heartbeat. It is diagnosed through an electrocardiogram (ECG) procedure. Several studies have focused on the speed and accuracy of the learning algorithm by applying pattern recognition and artificial intelligence to classification. In this work, a novel classification algorithm is proposed based on an ELM (Extreme Learning Machine) with a Recurrent Neural Network (RNN), using morphological filtering. The popular, publicly available ECG arrhythmia database (MIT-BIH arrhythmia DB) is used to demonstrate the performance of the proposed algorithm, and its accuracy is compared with that of existing work of a similar type. The comparative study shows that the proposed model is much faster than models based on RBFN (radial basis function network), BPNN (back propagation neural network) and Support Vector Machine. In experiments on the MIT-BIH database, the ELM with RNN achieves an accuracy of 96.41%, a sensitivity of 93.62% and a specificity of 92.66%. The classification methodology follows four main steps: heartbeat detection, ECG feature extraction, feature selection and construction of the proposed classifier.

