Similar literature
 20 similar documents found; search time 453 ms
1.
The Artificial Immune Recognition System (AIRS), a prominent classification algorithm in the field of Artificial Immune Systems, has performed effectively on the problems to which it has been applied. AIRS was previously applied to several medical classification problems, including Breast Cancer, Cleveland Heart Disease, and Diabetes, with very satisfactory results, which established it as an efficient artificial intelligence technique in the medical field. In this study, the resource allocation mechanism of AIRS was replaced with one determined by fuzzy logic. The resulting system, named Fuzzy-AIRS, was used as a classifier in the diagnosis of Breast Cancer and Liver Disorders, both of great importance in medicine. The Breast Cancer and BUPA Liver Disorders datasets, taken from the University of California at Irvine (UCI) Machine Learning Repository, were classified using 10-fold cross-validation. The classification accuracies were evaluated against the classifiers reported on the UCI web site and against other systems applied to the same problems. The results were also compared with AIRS in terms of classification accuracy, number of resources, and classification time. Fuzzy-AIRS reached a classification accuracy of 98.51% on the Breast Cancer dataset and 83.36% on the Liver Disorders dataset. For both datasets, this was the highest classification accuracy among the results listed on the UCI web site. In addition, Fuzzy-AIRS gained an important advantage over AIRS in classification time: in the experiments, its classification time was reduced by about 70% relative to AIRS on both datasets.
By reducing classification time while obtaining high classification accuracy on the datasets studied, Fuzzy-AIRS showed that it can be used as an effective classifier for medical problems.
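The 10-fold cross-validation protocol mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the shuffling seed, and the `fit`/`predict` callables are assumptions introduced here, and the sketch assumes there are at least `k` samples.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at i,
    # so fold sizes differ by at most one sample.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

def cross_validated_accuracy(X, y, fit, predict, k=10, seed=0):
    """Average test accuracy over k folds.

    `fit(X_train, y_train)` returns a model and `predict(model, x)` returns
    a label; both are hypothetical callables supplied by the caller.
    """
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), k, seed):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        hits = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)
```

Each sample is used for testing exactly once, so the averaged accuracy reflects performance on data the model did not see during training.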

2.
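The electrocardiogram (ECG) reflects the health of the human heart and is an important basis for the clinical diagnosis of cardiovascular diseases. As the number of ECG recordings grows rapidly, the need for computer-aided ECG analysis is becoming more urgent, and automatic ECG classification, an indispensable technique for such analysis, has significant medical value. Because ECG signals are very weak and susceptible to interference, traditional ECG classification algorithms tend to perform well on test sets but poorly in actual clinical use. To address this, this paper studies a one-dimensional convolutional ResNet architecture based on a multi-lead two-dimensional structure. Data augmentation techniques such as shifting the starting point and adding noise are used to increase the diversity of the training samples, and the Focal Loss function is used to optimize the ECG classification model for individual patients. The model was evaluated in classification experiments on 20,000 complete 8-lead ECG recordings covering 34 classes of ECG abnormality, achieving an F1 score of 0.91, an accuracy of 93.96%, and a recall of 87.89%. The experimental results show that the model has strong deep-feature extraction and classification capability and confirm its effectiveness for the automatic classification of ECG abnormalities.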
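For readers unfamiliar with Focal Loss, the following is a minimal sketch of its standard multi-class form for a single sample, which down-weights examples the model already classifies confidently. The abstract does not state the focusing parameter or any class weighting, so `gamma=2.0` and the absence of a class-weight term are assumptions made here for illustration.

```python
import math

def focal_loss(probs, target, gamma=2.0):
    """Focal loss for one sample: -(1 - p_t)^gamma * log(p_t).

    `probs` is the predicted class-probability vector and `target` the index
    of the true class. `gamma` is an assumed value, not taken from the paper.
    With gamma = 0 this reduces to ordinary cross-entropy.
    """
    p_t = min(max(probs[target], 1e-12), 1.0)  # clamp to avoid log(0)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)
```

With `gamma > 0`, a sample predicted with probability 0.9 for its true class contributes far less to the loss than under cross-entropy, which shifts training effort toward hard or rare classes.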

3.
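To address the problems of unrepresentative samples and poor classification performance in existing under-sampling algorithms, a weighted random forest algorithm based on clustering under-sampling (CUS-WRF) is proposed. The K-means algorithm is used to cluster the majority-class samples, and Euclidean distance is introduced as the weighting basis for allocating the number of samples drawn during under-sampling, so that the sampled majority-class samples and the minority-class samples form a balanced sample set. CART decision trees serve as the base classifiers within a weighted random forest framework, and the accuracy of each tree on test samples is used as its weight in the final vote, which effectively improves overall classification performance. Experiments on eight KEEL datasets show that, compared with four other random-forest-based algorithms for imbalanced data, CUS-WRF has better classification performance and stability.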
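The accuracy-weighted voting step described above can be sketched as follows. This is an illustrative reading only: the function name is hypothetical, and the clustering-based under-sampling and the training of the CART trees are assumed to have happened elsewhere.

```python
from collections import defaultdict

def weighted_vote(predictions, weights):
    """Return the label with the largest total weight.

    `predictions[i]` is the label predicted by tree i and `weights[i]` is
    that tree's weight (here assumed to be its measured accuracy).
    """
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)
```

For example, one tree with weight 0.9 voting for the minority class outweighs two trees with weight 0.4 each voting for the majority class, whereas an unweighted majority vote would pick the majority class.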

4.
In this work, we investigate the real-world task of recognizing biological concepts in DNA sequences. Promoters in strings that represent nucleotides (one of A, G, T, or C) are recognized using a novel approach based on feature selection (FS) and the Artificial Immune Recognition System (AIRS) with a fuzzy resource allocation mechanism (Fuzzy-AIRS), which was first proposed by us. The aim of this study is to improve the prediction accuracy for Escherichia coli promoter gene sequences using a novel system based on FS and Fuzzy-AIRS. The E. coli promoter gene sequences dataset has 57 attributes and 106 samples, comprising 53 promoters and 53 non-promoters. The proposed system consists of two parts. First, the dimension of the E. coli promoter gene sequences dataset is reduced from 57 attributes to 4 attributes by the FS process. Second, the Fuzzy-AIRS classifier is run to predict the E. coli promoter gene sequences. The robustness of the proposed method is examined using prediction accuracy, sensitivity and specificity analysis, k-fold cross-validation, and the confusion matrix. Whereas the Fuzzy-AIRS classifier alone obtained 50% prediction accuracy using 10-fold cross-validation, the proposed system obtained 90% prediction accuracy under the same conditions. These results indicate that the proposed system is successful at recognizing promoters in strings that represent nucleotides.

5.
Abstract: The aim of this research was to compare classifier algorithms, namely the C4.5 decision tree classifier, the least squares support vector machine (LS-SVM), and the artificial immune recognition system (AIRS), for diagnosing macular and optic nerve diseases from pattern electroretinography signals. The pattern electroretinography signals were obtained with electrophysiological testing devices from 106 subjects with optic nerve and macular diseases. To assess the test performance of the classifier algorithms, classification accuracy, receiver operating characteristic curves, sensitivity and specificity values, the confusion matrix, and 10-fold cross-validation were used. Using 10-fold cross-validation, the classification accuracies obtained were 85.9%, 100%, and 81.82% for the C4.5 decision tree classifier, the LS-SVM classifier, and the AIRS classifier, respectively. The results show that the LS-SVM classifier is a robust and effective classifier for the determination of macular and optic nerve diseases.
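The accuracy, sensitivity, and specificity values used in evaluations like this one follow directly from a binary confusion matrix. The sketch below shows the standard definitions; the function name is hypothetical, and any counts passed to it are the caller's own, not results from the paper.

```python
def binary_metrics(tp, fn, fp, tn):
    """Compute standard metrics from binary confusion-matrix counts.

    tp/fn: diseased subjects classified correctly/incorrectly.
    tn/fp: healthy subjects classified correctly/incorrectly.
    Assumes each class has at least one subject.
    """
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }
```

Reporting sensitivity and specificity alongside accuracy matters in diagnosis, because a classifier can reach high accuracy while missing most cases of the rarer class.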

6.

In pattern recognition and machine learning, data preprocessing algorithms have been used increasingly in recent years to achieve high classification performance. In particular, preprocessing before classification has become unavoidable for medical datasets with nonlinear and imbalanced data distributions. In this study, a new data preprocessing method is proposed for classifying the Parkinson, hepatitis, Pima Indians, single photon emission computed tomography (SPECT) heart, and thoracic surgery medical datasets, all of which have nonlinear and imbalanced data distributions. These datasets were taken from the UCI Machine Learning Repository. The proposed method consists of three steps. In the first step, the cluster centers of each attribute are calculated using the k-means, fuzzy c-means, and mean shift clustering algorithms. In the second step, the absolute differences between the data in each attribute and the cluster centers are calculated, and these differences are then averaged for each attribute. In the final step, the weighting coefficients are calculated by dividing the mean difference by the cluster centers, and weighting is performed by multiplying the attribute values in the dataset by the resulting coefficients. Three attribute weighting methods are proposed: (1) similarity-based attribute weighting in k-means clustering, (2) similarity-based attribute weighting in fuzzy c-means clustering, and (3) similarity-based attribute weighting in mean shift clustering. The aim of these methods is to draw the data in each class together and to reduce the within-class variance.
Reducing the variance within each class gathers the data of each class together and, at the same time, increases the discrimination between classes. For comparison with other methods in the literature, random subsampling was used to handle the imbalanced datasets. After attribute weighting, four classification algorithms (linear discriminant analysis, k-nearest neighbor, support vector machine, and random forest) were used to classify the imbalanced medical datasets. The performance of the proposed models was evaluated using classification accuracy, precision, recall, area under the ROC curve, κ value, and F-measure. Three schemes were used for training and testing the classifier models: a 50-50% train-test holdout, a 60-40% train-test holdout, and tenfold cross-validation. The experimental results show that the proposed attribute weighting methods achieve higher classification performance than the random subsampling method on the imbalanced medical datasets.
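The second and third steps of the weighting method can be sketched as below. The abstract leaves details open, so this is only one possible reading: it assumes a single, nonzero cluster center per attribute that has already been computed by some clustering algorithm, and the function names are hypothetical.

```python
def similarity_weights(rows, centers):
    """Weight per attribute: mean |value - center| divided by the center.

    `rows` is a list of samples (each a list of attribute values) and
    `centers[j]` is the assumed single cluster center of attribute j.
    """
    n = len(rows)
    weights = []
    for j, center in enumerate(centers):
        mean_diff = sum(abs(row[j] - center) for row in rows) / n
        weights.append(mean_diff / center)
    return weights

def apply_weights(rows, weights):
    """Multiply every attribute value by its attribute's weight."""
    return [[value * w for value, w in zip(row, weights)] for row in rows]
```

For instance, with samples `[[1, 10], [3, 30]]` and centers `[2, 20]`, both attributes receive the weight 0.5, which rescales the second attribute onto a range comparable to the first.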


7.
Various methods for ensemble selection and classifier combination have been designed to optimize the performance of classifier ensembles. However, a large number of features in the training data can degrade the classification performance of machine learning algorithms. The objective of this paper is to present a novel feature elimination (FE) based ensemble learning method that extends an existing machine learning environment. Standard 12-lead ECG signal recordings are used to diagnose arrhythmia by classifying subjects as normal or abnormal. The advantage of the proposed approach is that it reduces the size of the feature space by applying several feature elimination methods, whose decisions are then fused into a single dataset. The idea behind this work is to find a reduced feature space such that a classifier built on the reduced dataset performs no worse than a classifier built on the original dataset. A random subspace ensemble classifier is used with the PART tree as the base classifier. The proposed approach was implemented and evaluated on the UCI ECG signal data. Classification performance was evaluated using mean absolute error, root mean squared error, relative absolute error, F-measure, classification accuracy, receiver operating characteristics, and area under the curve. The proposed approach achieved an overall classification accuracy of 91.11% on an unseen test dataset, and it performs well with ensemble sizes of 15 and 20.

8.
This paper surveys the major works related to an artificial immune system based classifier proposed in the 2000s, namely the artificial immune recognition system (AIRS) algorithm. The survey reveals that most work on AIRS has been dedicated to applying the algorithm to real-world problems rather than to its theoretical development. Based on this finding, we propose an improved version of the AIRS algorithm, which we call AIRS3. AIRS3 takes into account an important parameter that was ignored by the original algorithm, namely the number of training antigens represented by each memory cell at the end of learning (numRepAg). Experiments with AIRS3 on datasets taken from the UCI Machine Learning Repository show that taking the numRepAg information into account improves the classification accuracy of AIRS.

9.
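Cardiovascular disease is one of the leading causes of death today. This paper recognizes ECG signals with an improved residual network and combines the improved residual network with dilated convolution, so that feature extraction captures as much global information as possible while keeping local information intact. K-fold cross-validation is used for training, validation, and testing on the MIT-BIH arrhythmia dataset. First, a convolutional layer aggregates the input image; next, the improved network extracts features; finally, a Softmax classifier performs the classification. On the MIT-BIH arrhythmia database, without any additional hand-crafted features or data augmentation, the proposed model achieves an accuracy of 97.20%, a sensitivity of 92.85%, a specificity of 98.29%, a precision of 93.16%, and an F1 score of 93.00%. This work can provide technical support for ECG signal detection and recognition in medical institutions and thereby reduce the workload of specialist physicians.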

10.
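Research on the application of association rule mining algorithms to classification   Cited by: 1 (self-citations: 0, citations by others: 1)
A medical data classification method based on an association rule mining algorithm is proposed. The theoretical basis of association rules, association rule mining algorithms, and their application to medical data mining are introduced, and the algorithms are applied to breast cancer data. In the classification experiments, the model achieved a high classification accuracy, which indicates that data mining has broad application prospects in computer-aided medical diagnosis.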

11.
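Research on association rule mining methods for medical images   Cited by: 8 (self-citations: 0, citations by others: 8)
A medical image analyzer based on an association rule mining algorithm is proposed. The CA algorithm for discretizing quantitative attributes, association rule mining algorithms, and their application to medical image data mining are introduced, and the algorithms are applied to breast cancer image data. The experimental results show that the model achieves a high classification accuracy.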

12.
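Existing software defect prediction methods face problems such as class imbalance and high-dimensional data, and solving these problems effectively has become an active research topic in the field. To address class imbalance and low prediction accuracy in software defect prediction, this paper proposes DP_HSRS, a software defect prediction algorithm based on hybrid sampling and Random_Stacking. DP_HSRS first balances the imbalanced data with a hybrid sampling algorithm and then applies the Random_Stacking algorithm to the balanced dataset to predict software defects. Random_Stacking is an improvement on the traditional Stacking algorithm: it builds multiple Stacking classifiers by combining several classical classification algorithms with the Bagging mechanism, lets these Stacking classifiers vote to form an ensemble classifier, and uses that ensemble classifier to predict software defects. Experiments on the NASA MDP datasets show that DP_HSRS outperforms existing algorithms and delivers better defect prediction performance.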

13.
Artificial Immune Systems (AIS) are a type of intelligent algorithm inspired by the principles and processes of the human immune system. In the last decade, applications of AIS have been studied in various fields. In change and anomaly detection, the negative selection algorithms of AIS have been applied successfully. However, negative selection algorithms are not appropriate for multi-class classification problems, because they have no mechanism for minimizing the danger of overfitting and oversearching. In this paper, we propose a new algorithm that overcomes this drawback and extends the application area of negative selection algorithms to multi-class classification. The proposed algorithm is named the Artificial Negative Selection Classifier (ANSC). We investigate the tolerance of ANSC to noise and introduce into ANSC a method for reducing the effect of noise. Its accuracy and data reduction are compared with those of the Artificial Immune Recognition System (AIRS), a well-known and effective AIS classifier. The results show that our algorithm is useful for classification problems and for reducing the effect of noise.

14.
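Most traditional data classification algorithms are designed for balanced datasets, and their performance drops on imbalanced data, whereas practice has shown that ensemble selection can effectively improve classification performance on imbalanced datasets. This paper therefore approaches imbalanced class learning from the perspective of ensemble selection and proposes a new ensemble pruning method for improving the performance of ensemble classifiers on imbalanced data. A classifier library is built with Bagging, and the positive-class (minority-class) instances are used directly as the pruning set. Using the MBM metric and the pruning set, an optimal or near-optimal sub-ensemble is selected from the classifier library as the target classifier for predicting unlabeled instances. Experiments on 12 UCI datasets show that, compared with EasyEnsemble, Bagging, and C4.5, the method not only substantially improves the recall of the ensemble classifier on the positive class but also improves overall accuracy.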

15.
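To address the low classification accuracy of a single neural network (NN) and the long time the RUSBoost algorithm takes to improve the accuracy of NN classifiers, a classification optimization algorithm combining RUSBoost with the product-moment correlation coefficient is proposed. First, the RUSBoost algorithm is used to generate m training sets. Then, the Pearson product-moment coefficient is used to compute the degree of correlation among the attributes of each training set, redundant attributes are eliminated, and the target training sets are generated. Finally, the new training subsets are used to train neural network classifiers, and the classifier with the highest accuracy is selected as the final classification model. Four benchmark datasets were used in the experiments to verify the effectiveness of the algorithm. The results show that, compared with traditional algorithms, the proposed algorithm improves accuracy by up to 8.26% and reduces training time by up to 62.27%.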
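The redundancy-elimination step can be sketched with the Pearson product-moment coefficient as follows. The abstract does not say how redundancy is decided, so the greedy keep-or-drop rule, the `threshold=0.95` cutoff, and the function names are assumptions made for illustration.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant attribute has no defined correlation
    return cov / (sx * sy)

def drop_redundant(columns, threshold=0.95):
    """Greedily keep attributes that are not strongly correlated.

    `columns` maps attribute name to its list of values. An attribute is
    kept only if |r| with every previously kept attribute is below
    `threshold` (an assumed cutoff, not taken from the paper).
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept
```

Removing attributes that are near-linear copies of others shrinks the input dimension, which is one plausible source of the training-time reduction the abstract reports.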

16.
This paper describes feature extraction methods that use higher order statistics (HOS) of wavelet packet decomposition (WPD) coefficients for automatic heartbeat recognition. The method consists of three stages. First, the wavelet packet coefficients (WPC) are calculated for each type of ECG beat. Then, higher order statistics of the WPC are derived. Finally, the resulting feature set is used as input to a classifier based on the k-NN algorithm. The ECG records used in this study are taken from the MIT-BIH arrhythmia database, and all heartbeats in the database are grouped into five main heartbeat classes. The proposed system achieves an average sensitivity of 90%, an average selectivity of 92%, and an average specificity of 98%. The results show that HOS of WPC are highly discriminative features for classifying different arrhythmic ECG beats.
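As an illustration of the second stage, the sketch below computes two common higher order statistics, skewness and kurtosis, from a list of coefficients. The abstract does not say which statistics the authors used, so this choice is an assumption, and the wavelet packet coefficients are assumed to have been computed elsewhere.

```python
def higher_order_stats(coeffs):
    """Return (skewness, kurtosis) of a coefficient sequence.

    Uses population central moments: skewness = m3 / m2^1.5 and
    kurtosis = m4 / m2^2 (not excess kurtosis).
    """
    n = len(coeffs)
    mean = sum(coeffs) / n
    m2 = sum((c - mean) ** 2 for c in coeffs) / n
    m3 = sum((c - mean) ** 3 for c in coeffs) / n
    m4 = sum((c - mean) ** 4 for c in coeffs) / n
    if m2 == 0:
        return 0.0, 0.0  # constant input: moments ratios are undefined
    return m3 / m2 ** 1.5, m4 / m2 ** 2
```

Applying such a function to each sub-band of a beat's wavelet packet decomposition would yield a short, fixed-length feature vector of the kind a k-NN classifier can consume.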

17.
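Machine learning classification strategies for imbalanced datasets   Cited by: 1 (self-citations: 0, citations by others: 1)
Because of the inherent characteristics of imbalanced datasets, classification results are often dominated by the classes with more samples, which degrades classification performance. In recent years, a series of strategies for improving the accuracy of machine learning classifiers on imbalanced datasets has been proposed, with the aim of learning the underlying patterns in class-imbalanced data and extracting their potential value. These strategies tackle the difficulty of classifying imbalanced datasets mainly at the data level and at the level of classification model improvement. This paper reviews machine learning classification strategies for imbalanced datasets from these two perspectives, analyzes and discusses evaluation metrics for classifiers on imbalanced datasets, summarizes the open problems in imbalanced dataset classification, and outlines directions for future research. In particular, the studies discussed focus mainly on the difficulties of binary classification under extreme class imbalance.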

18.
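Liu Shu, 《计算机应用》 (Journal of Computer Applications), 2009, 29(6): 1582-1589
To address the lack of an efficient classifier generation mechanism and of an overfitting suppression mechanism in negative selection algorithms, a negative selection algorithm for multi-class pattern classification, CS-NSA, is proposed. A clonal selection mechanism is introduced so that classifiers learn adaptively according to their classification performance and stimulation level. To counter overfitting in multi-class pattern classification, a pruning mechanism for the detector set is introduced, which strengthens the generalization ability of the detectors. Comparative experiments show that CS-NSA achieves a higher correct recognition rate than the well-known artificial immune classifier AIRS.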

19.
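To address the problems of existing rough set attribute reduction methods, including information loss when handling continuous data, inconsistent information introduced by granulation strategies, and the difficulty of parameter tuning, a non-monotonic heuristic attribute reduction algorithm based on class discriminability is proposed for continuous data. First, the universe is partitioned according to the label of each sample: samples with the same label form a cluster, and the inter-class discriminability and intra-class discriminability of each cluster are defined. Second, following the reduction principle of maximizing inter-class discriminability and minimizing intra-class discriminability, a new attribute significance criterion is defined to determine the optimal reduct, thereby improving the performance of the subsequent classifier. The algorithm was compared with six other attribute reduction algorithms on eleven UCI datasets. The results show that, compared with the six algorithms, the proposed algorithm reduces the average dimension of the reduct by 1.16 and improves average classification accuracy by 3.42%, demonstrating better reduction performance.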

20.
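The artificial immune recognition system (AIRS) is a fairly effective classifier inspired by the biological immune system, but it has drawbacks: the number of memory cells can become very large and its classification accuracy is limited, particularly when the data are incomplete. To solve this problem, an immune classification algorithm for incomplete data (ICAU) is proposed. The algorithm introduces a semi-supervised learning mechanism and the idea of decision-making by fused classifier voting, using multiple AIRS classifiers that help one another learn during training in order to improve the classification accuracy of AIRS on incomplete data. Experiments on UCI datasets confirm the effectiveness of the ICAU algorithm.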

