Similar literature
 Found 20 similar documents; search took 46 ms
1.
This paper presents a novel method for the diagnosis of heart disease. The proposed method is a hybrid that combines fuzzy weighted pre-processing with the artificial immune recognition system (AIRS). AIRS has shown effective performance on several problems, including machine learning benchmark problems and medical classification problems such as breast cancer, diabetes, and liver disorders. The robustness of the proposed method is examined using classification accuracy, k-fold cross-validation, and the confusion matrix. The obtained classification accuracy is 96.30%, which is very promising compared with previously reported classification techniques.
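
The k-fold cross-validation used for evaluation above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' evaluation code: the function name `k_fold_indices` and the use of contiguous, unshuffled folds are assumptions made for the example.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous, near-equal folds.

    Illustrative helper (hypothetical name). No shuffling is done here;
    in practice the sample order is usually shuffled first.
    """
    # The first (n_samples % k) folds get one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Example: 10 samples split into 3 folds.
folds = list(k_fold_indices(10, 3))
for train_idx, test_idx in folds:
    print(len(train_idx), test_idx)
```

Each sample appears in exactly one test fold, so averaging the per-fold accuracies gives a single estimate that uses every sample for testing once.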

2.
This paper presents a novel method for the diagnosis of hepatitis disease. The proposed method is a hybrid that combines feature selection (FS) with the artificial immune recognition system (AIRS) using a fuzzy resource allocation mechanism. AIRS has shown effective performance on several problems, including machine learning benchmark problems and medical classification problems such as breast cancer, diabetes, and liver disorders. By hybridizing FS and AIRS with the fuzzy resource allocation mechanism, a method is obtained that solves this diagnosis problem via classification. The robustness of the method with regard to sampling variations is examined using cross-validation. We used the hepatitis disease dataset from the UCI machine learning repository. Using 10-fold cross-validation, we obtained a classification accuracy of 92.59%, which is the highest reported so far and is very promising compared with other classification applications in the literature for this problem. The sensitivity and specificity values for the hepatitis disease dataset were 100% and 85%, respectively.
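
For readers unfamiliar with the metrics reported above, the sketch below shows how accuracy, sensitivity, and specificity follow from the four counts of a binary confusion matrix. The counts are hypothetical and chosen only for illustration; they are not taken from the paper, and which class counts as "positive" is an assumption.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp/fn: positive cases classified correctly / incorrectly.
    tn/fp: negative cases classified correctly / incorrectly.
    """
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    return accuracy, sensitivity, specificity


# Hypothetical counts for illustration only.
acc, sens, spec = binary_metrics(tp=8, fn=2, tn=9, fp=1)
print(acc, sens, spec)
```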

3.
In this paper, we propose a new feature selection method, called class dependency based feature selection, for reducing the dimensionality of a macular disease dataset derived from pattern electroretinography (PERG) signals. To diagnose macular disease, we use class dependency based feature selection for feature selection, fuzzy weighted pre-processing for weighting, and a decision tree classifier for decision making. The proposed system consists of three parts. First, the macular disease dataset is reduced from 63 features to 9 using class dependency based feature selection, which we developed. Second, the 9-feature dataset is weighted using fuzzy weighted pre-processing. Finally, a decision tree classifier is applied to the PERG signals to distinguish healthy eyes from diseased eyes (macular disease). The combination of class dependency based feature selection, fuzzy weighted pre-processing, and the decision tree classifier reached classification accuracies of 96.22%, 96.27%, and 96.30% using 5-, 10-, and 15-fold cross-validation, respectively. The results confirm that a medical decision making system based on these three components has potential for detecting macular disease, and suggest that the proposed method could form the basis of a new intelligent diagnosis assistance system.

4.
Proper interpretation of thyroid gland functional data is an important issue in the diagnosis of thyroid disease. The primary role of the thyroid gland is to help regulate the body's metabolism, which it does through the thyroid hormone it produces. Production of too little thyroid hormone (hypothyroidism) or too much thyroid hormone (hyperthyroidism) defines the type of thyroid disease. Artificial immune systems (AISs) are a new but effective branch of artificial intelligence. Among the systems proposed in this field so far, the artificial immune recognition system (AIRS), proposed by A. Watkins, has shown effective and intriguing performance on the problems to which it has been applied. This study aims to diagnose thyroid disease with a new hybrid machine learning method that includes this classification system. By hybridizing AIRS with a newly developed fuzzy weighted pre-processing step, a method is obtained that solves this diagnosis problem via classification. The robustness of the method with regard to sampling variations is examined using cross-validation. We used the thyroid disease dataset from the UCI machine learning repository. Using 10-fold cross-validation, we obtained a classification accuracy of 85%, which is the highest reported so far.

5.
Abstract: The artificial immune recognition system (AIRS) has been shown to be an efficient approach to tackling a variety of problems such as machine learning benchmark problems and medical classification problems. In this study, the resource allocation mechanism of AIRS was replaced with a new one based on fuzzy logic. The new system, named Fuzzy-AIRS, was used as a classifier in the classification of three well-known medical data sets, the Wisconsin breast cancer data set (WBCD), the Pima Indians diabetes data set and the ECG arrhythmia data set. The performance of the Fuzzy-AIRS algorithm was tested for classification accuracy, sensitivity and specificity values, confusion matrix, computation time and receiver operating characteristic curves. Also, the AIRS and Fuzzy-AIRS algorithms were compared with respect to the amount of resources required in the execution of the algorithm. The highest classification accuracy obtained from applying the AIRS and Fuzzy-AIRS algorithms using 10-fold cross-validation was, respectively, 98.53% and 99.00% for classification of WBCD; 79.22% and 84.42% for classification of the Pima Indians diabetes data set; and 100% and 92.86% for classification of the ECG arrhythmia data set. Hence, these results show that Fuzzy-AIRS can be used as an effective classifier for medical problems.

6.
The use of machine learning methods in disease diagnosis has been increasing steadily. In this study, heart disease, a very common and important disease, was diagnosed with such a machine learning system. In this system, a new weighting scheme based on the k-nearest neighbour (k-NN) method was used as a pre-processing step before the main classifier. The classifier was the artificial immune recognition system (AIRS) with a fuzzy resource allocation mechanism. The dataset used in the study was taken from the UCI Machine Learning Database. The classification accuracy of our system was 87%, which is very promising compared with other classification applications in the literature for this problem.

7.
In this paper, we propose a new feature selection method, called kernel F-score feature selection (KFFS), for use as a pre-processing step in the classification of medical datasets. KFFS consists of two phases. In the first phase, the input spaces (features) of the medical datasets are transformed to a kernel space by means of Linear (Lin) or Radial Basis Function (RBF) kernel functions, which maps the datasets into a high-dimensional feature space. In the second phase, the F-score values of the features in this high-dimensional space are calculated using the F-score formula, and the mean of the calculated F-scores is computed. If the F-score of a feature is greater than this mean value, the feature is selected; otherwise, it is removed from the feature space. In this way, KFFS removes irrelevant or redundant features from the high-dimensional input feature space. The kernel functions are used to transform a non-linearly separable medical dataset into a linearly separable feature space. To test the performance of KFFS, we used the heart disease dataset, the SPECT (Single Photon Emission Computed Tomography) images dataset, and the Escherichia coli Promoter Gene Sequence dataset from the UCI (University of California, Irvine) machine learning database. The Least Square Support Vector Machine (LS-SVM) and the Levenberg–Marquardt Artificial Neural Network were used as classification algorithms. The results show that KFFS produced very promising results compared with F-score feature selection.
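
The second phase described above (keep a feature only if its F-score exceeds the mean F-score) can be sketched as below. The abstract does not give the F-score formula, so this sketch assumes the commonly used two-class F-score (between-class separation of the feature means divided by the sum of the within-class sample variances). The kernel transformation of the first phase is omitted, and the function names and toy data are hypothetical.

```python
import numpy as np


def f_scores(X, y):
    """Two-class F-score for each column of X; y holds 0/1 class labels.

    Assumed formula (not stated in the abstract):
    ((mean_pos - mean_all)^2 + (mean_neg - mean_all)^2) / (var_pos + var_neg)
    Features with zero variance in both classes would divide by zero.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    pos, neg = X[y == 1], X[y == 0]
    mean_all = X.mean(axis=0)
    numerator = (pos.mean(axis=0) - mean_all) ** 2 + (neg.mean(axis=0) - mean_all) ** 2
    denominator = pos.var(axis=0, ddof=1) + neg.var(axis=0, ddof=1)
    return numerator / denominator


def select_above_mean(scores):
    """Indices of features whose score is strictly greater than the mean score."""
    scores = np.asarray(scores)
    return np.flatnonzero(scores > scores.mean())


# Toy data (hypothetical): feature 0 separates the classes, feature 1 does not.
X = np.array([[1.0, 5.0], [1.2, 3.0], [0.8, 4.0],
              [3.0, 4.5], [3.2, 3.5], [2.8, 4.0]])
y = np.array([0, 0, 0, 1, 1, 1])
scores = f_scores(X, y)
selected = select_above_mean(scores)
print(scores, selected)
```

Using the mean score as the threshold avoids a hand-tuned cut-off, at the cost of always discarding some features, even when all of them are informative.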

8.
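To reduce the number of memory cells in the artificial immune recognition system (AIRS) and to improve its classification accuracy, an artificial immune recognition system based on memory cell pruning and nonlinear resource allocation (PNAIRS) is proposed. PNAIRS discretizes sample attributes to compress the training space, prunes memory cells to eliminate low-fitness cells, and uses nonlinear resource allocation to optimize the classifier. PNAIRS was tested on classification of six UCI datasets. Compared with the results of other classification algorithms, PNAIRS has a smaller memory cell population and higher classification accuracy, and the algorithm runs fast. This indicates that PNAIRS is a well-performing classification algorithm with potential application value.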

9.
This paper surveys the major works related to an artificial immune system based classifier that was proposed in the 2000s, namely the artificial immune recognition system (AIRS) algorithm. The survey reveals that most work on AIRS has been dedicated to applying the algorithm to real-world problems rather than to theoretical developments of the algorithm. Based on this finding, we propose an improved version of the AIRS algorithm, which we dub AIRS3. AIRS3 takes into account an important parameter that was ignored by the original algorithm, namely the number of training antigens represented by each memory cell at the end of learning (numRepAg). Experiments with AIRS3 on data sets taken from the UCI machine learning repository show that taking the numRepAg information into account enhances the classification accuracy of AIRS.

10.
In medical information systems, the data that describe patient health records are often time stamped. These data are subject to complexities such as missing data, observations at irregular time intervals, and large attribute sets. Because of these complexities, mining clinical time-series data remains a challenging area of research. This paper proposes a bio-statistical mining framework, named statistical tolerance rough set induced decision tree (STRiD), which handles these complexities and builds an effective classification model. The constructed model is used to develop a clinical decision support system (CDSS) that assists the physician in clinical diagnosis. The STRiD framework provides three functionalities: temporal pre-processing, attribute selection, and classification. In temporal pre-processing, an enhanced fuzzy-inference based double exponential smoothing method is presented to impute the missing values and to derive the temporal patterns for each attribute. In attribute selection, relevant attributes are selected using the tolerance rough set. A classification model is then constructed from the selected attributes using a temporal pattern induced decision tree classifier. For experimentation, this work uses clinical time-series datasets of hepatitis and thrombosis patients. The constructed classification model demonstrates the effectiveness of the proposed framework, with a classification accuracy of 91.5% for hepatitis and 90.65% for thrombosis.
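
To illustrate the idea of imputing missing values with double exponential smoothing, the sketch below uses plain Holt double exponential smoothing and fills each gap with the one-step-ahead forecast. It is not the enhanced fuzzy-inference based variant described in the abstract. The function name, the smoothing constants, the requirement that the first two values be observed, and the assumption of equally spaced samples are all choices made for this example.

```python
def impute_with_holt(series, alpha=0.5, beta=0.5):
    """Fill None entries with one-step-ahead Holt forecasts.

    Plain double exponential smoothing (level + trend); assumes equally
    spaced samples and that the first two values are observed.
    """
    if len(series) < 2 or series[0] is None or series[1] is None:
        raise ValueError("the first two values must be observed")
    level, trend = series[1], series[1] - series[0]
    filled = [series[0], series[1]]
    for x in series[2:]:
        forecast = level + trend
        if x is None:
            x = forecast  # impute the gap with the forecast
        new_level = alpha * x + (1 - alpha) * forecast
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
        filled.append(x)
    return filled


# Hypothetical series with two missing observations.
filled = impute_with_holt([1.0, 2.0, None, 4.0, None])
print(filled)
```

Because an imputed value equals the forecast, it leaves the level on the current trend line and does not change the trend estimate, so gaps do not distort later forecasts.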

11.
This paper presents a novel method for the differential diagnosis of erythemato-squamous disease. The proposed method is based on fuzzy weighted pre-processing, k-NN (nearest neighbor) based weighted pre-processing, and a decision tree classifier, and consists of three parts. In the first part, we used a decision tree classifier to diagnose erythemato-squamous disease. In the second part, fuzzy weighted pre-processing, a new method that we have improved, was applied to the inputs of the erythemato-squamous disease dataset, and the weighted inputs were then classified using the decision tree classifier. In the third part, k-NN based weighted pre-processing, also a new method that we have improved, was applied to the inputs of the erythemato-squamous disease dataset, and the weighted inputs were then classified using the decision tree classifier. The decision tree classifier alone, the decision tree classifier with fuzzy weighted pre-processing, and the decision tree classifier with k-NN based weighted pre-processing reached classification accuracies of 86.18%, 97.57%, and 99.00%, respectively, using 20-fold cross-validation.

12.
Feature selection is one of the most important techniques for data preprocessing in classification problems. In this paper, fuzzy grids-based association rules mining, an effective data mining technique, is used for feature selection in a misuse detection application in computer networks. The main idea of this algorithm is to find the relationships between items in large datasets, so that it detects correlations between the inputs of the system and then eliminates the redundant inputs. To classify the attacks, a fuzzy ARTMAP neural network is employed whose training parameters are optimized by a gravitational search algorithm. The performance of the proposed system is compared with that of other machine learning methods in the same application. Experimental results show that the proposed system, when the optimum “feature subset size-adjustment” parameter is chosen, performs better in terms of detection rate, false alarm rate, and cost per example in classification problems. In addition, employing the reduced-size feature set results in a reduction of more than 8.4 percent in computational complexity.

13.
The use of artificial intelligence methods in medical analysis is increasing, mainly because classification and detection systems have become much more effective at helping medical experts with diagnosis. In this paper, we investigate the performance of an artificial immune system (AIS) based fuzzy k-NN algorithm for determining heart valve disorders from Doppler heart sounds. The proposed methodology is composed of three stages. The first stage is pre-processing. The second stage is feature extraction, in which wavelet transforms and the short-time Fourier transform are used; wavelet entropy is then applied to the resulting features. In the classification stage, the AIS based fuzzy k-NN algorithm is used. To compute the correct classification rate of the proposed methodology, a comparative study is carried out using a data set containing 215 samples. The proposed method is validated using sensitivity and specificity: a sensitivity of 95.9% and a specificity of 96% were obtained.

14.
Abstract: The aim of this research was to compare classifier algorithms, including the C4.5 decision tree classifier, the least squares support vector machine (LS-SVM) and the artificial immune recognition system (AIRS), for diagnosing macular and optic nerve diseases from pattern electroretinography signals. The pattern electroretinography signals were obtained by electrophysiological testing devices from 106 subjects with optic nerve and macular diseases. To show the test performance of the classifier algorithms, the classification accuracy, receiver operating characteristic curves, sensitivity and specificity values, confusion matrix and 10-fold cross-validation were used. The classification results obtained using 10-fold cross-validation are 85.9%, 100% and 81.82% for the C4.5 decision tree classifier, the LS-SVM classifier and the AIRS classifier, respectively. It is shown that the LS-SVM classifier is a robust and effective classifier system for the determination of macular and optic nerve diseases.

15.
Hepatitis is a disease that occurs at all ages. Hepatitis alone is not lethal, but its early diagnosis and treatment are crucial because it triggers other diseases. In this study, a new hybrid medical decision support system based on rough set (RS) and extreme learning machine (ELM) is proposed for the diagnosis of hepatitis disease. RS-ELM consists of two stages. In the first stage, redundant features are removed from the data set using the RS approach. In the second stage, classification is performed by ELM using the remaining features. The hepatitis data set from the UCI machine learning repository was used to test the proposed hybrid model. A major part of the data set (48.3%) includes missing values. Because removing missing values from the data set leads to data loss, feature selection was done in the first stage without deleting missing values. In the second stage, classification was performed by ELM after missing values were removed from the reduced-feature data sets of different dimensions. The results show that the highest classification accuracy, 100.00%, was achieved with RS-ELM, and that the RS-ELM model is considerably more successful than the other methods in the literature. In addition, the most significant features for the diagnosis of hepatitis were determined. The proposed method is expected to be useful in similar medical applications.

16.
The Artificial Immune Recognition System (AIRS) classification algorithm, which has an important place among classification algorithms in the field of Artificial Immune Systems, has shown effective and intriguing performance on the problems to which it has been applied. AIRS was previously applied to several medical classification problems, including Breast Cancer, Cleveland Heart Disease, and Diabetes, and obtained very satisfactory results, proving to be an efficient artificial intelligence technique in the medical field. In this study, the resource allocation mechanism of AIRS was replaced with a new one determined by fuzzy logic. This system, named Fuzzy-AIRS, was used as a classifier in the diagnosis of Breast Cancer and Liver Disorders, which are of great importance in medicine. The Breast Cancer and BUPA Liver Disorders datasets, taken from the University of California at Irvine (UCI) Machine Learning Repository, were classified using 10-fold cross-validation. The classification accuracies reached were evaluated by comparing them with the classifiers reported on the UCI web site and with other systems applied to the same problems. The classification performance was also compared with that of AIRS with regard to classification accuracy, number of resources, and classification time. Fuzzy-AIRS reached a classification accuracy of 98.51% for breast cancer and 83.36% for the Liver Disorders dataset. For both datasets, Fuzzy-AIRS obtained the highest classification accuracy according to the UCI web site. In addition, Fuzzy-AIRS gained an important advantage over AIRS in classification time: in the experiments, the classification time of Fuzzy-AIRS was reduced by about 70% relative to AIRS for both datasets. By reducing classification time as well as obtaining high classification accuracies on the datasets used, the Fuzzy-AIRS classifier proved that it can be used as an effective classifier for medical problems.

17.
Improving the accuracy of machine learning algorithms is vital in designing high-performance computer-aided diagnosis (CADx) systems. Research has shown that the performance of a base classifier can be enhanced by ensemble classification strategies. In this study, we construct rotation forest (RF) ensemble classifiers of 30 machine learning algorithms and evaluate their classification performance on Parkinson's, diabetes and heart disease datasets from the literature. In the experiments, the feature dimension of the three datasets is first reduced using the correlation-based feature selection (CFS) algorithm. Second, the classification performance of the 30 machine learning algorithms is calculated for the three datasets. Third, 30 classifier ensembles are constructed with the RF algorithm to assess the performance of the respective classifiers on the same disease data. All experiments are carried out with a leave-one-out validation strategy, and the performance of the 60 algorithms is evaluated using three metrics: classification accuracy (ACC), kappa error (KE) and area under the receiver operating characteristic (ROC) curve (AUC). The base classifiers achieved average accuracies of 72.15%, 77.52% and 84.43% for the diabetes, heart and Parkinson's datasets, respectively. The RF classifier ensembles produced average accuracies of 74.47%, 80.49% and 87.13% for the respective diseases. RF, a newly proposed classifier ensemble algorithm, might be used to improve the accuracy of miscellaneous machine learning algorithms in designing advanced CADx systems.

18.

In the fields of pattern recognition and machine learning, the use of data preprocessing algorithms has been increasing in recent years to achieve high classification performance. In particular, it has become inevitable to apply data preprocessing before classification algorithms when classifying medical datasets with nonlinear and imbalanced data distributions. In this study, a new data preprocessing method is proposed for the classification of the Parkinson, hepatitis, Pima Indians, single photon emission computed tomography (SPECT) heart, and thoracic surgery medical datasets, all of which have nonlinear and imbalanced data distributions. These datasets were taken from the UCI machine learning repository. The proposed data preprocessing method consists of three steps. In the first step, the cluster centers of each attribute are calculated using the k-means, fuzzy c-means, and mean shift clustering algorithms. In the second step, the absolute differences between the data in each attribute and the cluster centers are calculated, and then the average of these differences is calculated for each attribute. In the final step, the weighting coefficients are calculated by dividing the mean value of the differences by the cluster centers, and weighting is then performed by multiplying the attribute values in the dataset by the obtained weighting coefficients. Three different attribute weighting methods are proposed: (1) similarity-based attribute weighting in k-means clustering, (2) similarity-based attribute weighting in fuzzy c-means clustering, and (3) similarity-based attribute weighting in mean shift clustering. With the proposed attribute weighting methods, we aimed to gather the data in each class together and to reduce the within-class variance. By reducing the variance in each class, we brought the data in each class together and, at the same time, further increased the discrimination between the classes. For comparison with other methods in the literature, random subsampling was used to handle imbalanced dataset classification. After the attribute weighting process, four classification algorithms (linear discriminant analysis, k-nearest neighbor classifier, support vector machine, and random forest classifier) were used to classify the imbalanced medical datasets. To evaluate the performance of the proposed models, the classification accuracy, precision, recall, area under the ROC curve, κ value, and F-measure were used. In training and testing the classifier models, three different methods were used: the 50–50% train–test holdout, the 60–40% train–test holdout, and tenfold cross-validation. The experimental results show that the proposed attribute weighting methods obtained higher classification performance than the random subsampling method in classifying the imbalanced medical datasets.
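
The three weighting steps described above can be read in more than one way, so the sketch below shows only one possible interpretation and should not be taken as the authors' implementation. It assumes a single cluster center per attribute (for k-means with k = 1 this is simply the attribute mean), takes the weight of an attribute to be its mean absolute difference from that center divided by the center, and requires the centers to be nonzero. The function name `weight_attributes` and the toy data are hypothetical.

```python
import numpy as np


def weight_attributes(X, centers):
    """Weight each attribute (column) of X using one center per attribute.

    Assumed reading of the three steps:
      1. centers[j] is the cluster center of attribute j (given as input);
      2. mean absolute difference between attribute j and its center;
      3. weight_j = mean_abs_diff_j / centers[j]; weighted X = X * weight.
    """
    X = np.asarray(X, dtype=float)
    centers = np.asarray(centers, dtype=float)
    mean_abs_diff = np.abs(X - centers).mean(axis=0)  # step 2
    weights = mean_abs_diff / centers                 # step 3 (assumed reading)
    return X * weights, weights


# Toy data (hypothetical). With one cluster per attribute, the k-means
# center is the attribute mean, so no clustering library is needed here.
X = np.array([[1.0, 10.0],
              [3.0, 30.0]])
centers = X.mean(axis=0)
weighted, weights = weight_attributes(X, centers)
print(weights)
print(weighted)
```

In this toy example the two attributes differ in scale by a factor of ten but receive the same weight, because the weight depends on spread relative to the center rather than on absolute magnitude.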


19.
20.
“Dimensionality” is one of the major problems affecting the quality of the learning process in most machine learning and data mining tasks. Training a classification model on high-dimensional datasets may lead to “overfitting” of the learned model to the training data. Overfitting reduces the generalization of the model and therefore causes poor classification accuracy on new test instances. Another disadvantage of high dimensionality is the high CPU time required for learning and testing the model. Applying feature selection to the dataset before the learning process is essential to improve the performance of the classification task. In this study, a new hybrid method that combines the artificial bee colony optimization technique with the differential evolution algorithm is proposed for feature selection in classification tasks. The developed hybrid method is evaluated using fifteen datasets from the UCI Repository that are commonly used in classification problems. For a complete evaluation, the proposed hybrid feature selection method is compared with the artificial bee colony optimization and differential evolution based feature selection methods, as well as with three of the most popular feature selection techniques: information gain, chi-square, and correlation feature selection. In addition, the performance of the proposed method is compared with studies in the literature that use the same datasets. The experimental results show that the developed hybrid method is able to select good features for classification tasks, improving the run-time performance and accuracy of the classifier. The proposed hybrid method may also be applied to other search and optimization problems, as its performance for feature selection is better than that of pure artificial bee colony optimization and differential evolution.
