Similar literature
 Found 20 similar documents; search took 15 ms
1.
The purpose of this study was to develop an automatic classifier to increase the accuracy of the forced oscillation technique (FOT) for diagnosing early respiratory abnormalities in smoking patients. The data consisted of FOT parameters obtained from 56 volunteers, 28 healthy and 28 smokers with low tobacco consumption. Many supervised learning techniques were investigated, including logistic linear classifiers, k nearest neighbor (KNN), neural networks and support vector machines (SVM). To evaluate performance, the ROC curve of the most accurate parameter was established as baseline. To determine the best input features and classifier parameters, we used genetic algorithms and a 10-fold cross-validation using the average area under the ROC curve (AUC). In the first experiment, the original FOT parameters were used as input. We observed a significant improvement in accuracy (KNN = 0.89 and SVM = 0.87) compared with the baseline (0.77). The second experiment performed a feature selection on the original FOT parameters. This selection did not cause any significant improvement in accuracy, but it was useful in identifying more adequate FOT parameters. In the third experiment, we performed a feature selection on the cross products of the FOT parameters. This selection resulted in a further increase in AUC (KNN = SVM = 0.91), which allows for high diagnostic accuracy. In conclusion, machine learning classifiers can help identify early smoking-induced respiratory alterations. The use of FOT cross products and the search for the best features and classifier parameters can markedly improve the performance of machine learning classifiers.  相似文献   
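The evaluation protocol in this abstract (a KNN classifier scored by the average AUC over 10 cross-validation folds) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the stratified fold assignment, the use of the neighbour vote share as the score, and all function names are assumptions, and the genetic-algorithm feature search is not reproduced.

```python
import math
import random


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def knn_score(train_x, train_y, x, k):
    """Fraction of positive labels among the k nearest training points."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], x))
    return sum(train_y[i] for i in order[:k]) / k


def cross_validated_auc(xs, ys, k=3, folds=10, seed=0):
    """Average AUC over stratified folds (stratification is an assumption here)."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(ys) if y == 1]
    neg = [i for i, y in enumerate(ys) if y == 0]
    rng.shuffle(pos)
    rng.shuffle(neg)
    fold_aucs = []
    for f in range(folds):
        test_pos, test_neg = pos[f::folds], neg[f::folds]
        if not test_pos or not test_neg:
            continue  # AUC is undefined when a fold lacks one of the classes
        test = test_pos + test_neg
        held_out = set(test)
        train = [i for i in range(len(xs)) if i not in held_out]
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        scores = [knn_score(train_x, train_y, xs[i], k) for i in test]
        fold_aucs.append(auc(scores, [ys[i] for i in test]))
    return sum(fold_aucs) / len(fold_aucs)
```

Each subject would be passed as a tuple of FOT parameters in `xs`, with `ys` set to 1 for smokers and 0 for healthy volunteers.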

2.
In this paper, a support vector machine, a classifier motivated by statistical learning theory, is combined with a new multiclass approach based on a directed acyclic graph to classify four types of electrocardiogram signals. The Directed Acyclic Graph Support Vector Machine (DAGSVM) was selected to obtain a more accurate classifier at lower computational cost. Empirical mode decomposition followed by singular value decomposition is used to compute the feature vector matrix. Fivefold cross-validation and particle swarm optimization are used to select the SVM model parameters and improve the performance of the DAGSVM. The proposed algorithm is compared with two other classifiers, K-Nearest Neighbor (KNN) and Artificial Neural Network (ANN). The DAGSVM yielded an average accuracy of 98.96%, against 95.83% and 96.66% for the KNN and the ANN, respectively. The results confirm the superiority of the DAGSVM approach over the other classifiers.
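The directed-acyclic-graph idea can be illustrated with a short sketch of the decision walk. Each pairwise test eliminates one candidate class, so a k-class problem needs only k - 1 pairwise evaluations at prediction time. The pairwise rule below is a hypothetical nearest-centroid stand-in for trained binary SVMs, and the class labels and centroids are invented for illustration.

```python
def dag_predict(classes, prefers_first, x):
    """Walk a decision DAG: each pairwise test rules out one candidate class.

    `prefers_first(a, b, x)` stands in for a trained binary classifier that
    returns True when x looks more like class `a` than class `b`.
    """
    candidates = list(classes)
    while len(candidates) > 1:
        first, last = candidates[0], candidates[-1]
        if prefers_first(first, last, x):
            candidates.pop()     # `last` is ruled out
        else:
            candidates.pop(0)    # `first` is ruled out
    return candidates[0]


# Hypothetical 1-D class centroids, used only to make the sketch runnable.
TOY_CENTROIDS = {"c0": 0.0, "c1": 1.0, "c2": 2.0, "c3": 3.0}


def toy_rule(a, b, x):
    """Toy pairwise rule: prefer the class whose centroid is closer to x."""
    return abs(x - TOY_CENTROIDS[a]) <= abs(x - TOY_CENTROIDS[b])
```

Calling `dag_predict(list(TOY_CENTROIDS), toy_rule, 1.9)` evaluates three pairwise tests for the four classes, whereas one-vs-one voting would evaluate all six pairs.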

3.
Heart failure is now widespread throughout the world. Heart disease affects approximately 48% of the population. The disease is expensive and difficult to cure. This paper presents machine learning models to predict heart failure. The fundamental concept is to compare the accuracy of various Machine Learning (ML) algorithms and boosting algorithms in order to improve prediction accuracy. Supervised algorithms such as K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Decision Trees (DT), Random Forest (RF), and Logistic Regression (LR) are considered to achieve the best results. Boosting algorithms such as Extreme Gradient Boosting (XGBoost) and CatBoost are also used to improve the prediction using Artificial Neural Networks (ANN). This research also focuses on data visualization to identify patterns, trends, and outliers in a massive data set. Python and scikit-learn are used for ML. TensorFlow and Keras, along with Python, are used for ANN model training. The DT and RF algorithms achieved the highest accuracy among the classifiers, 95%. KNN obtained the second-highest accuracy, 93.33%. XGBoost had a satisfactory accuracy of 91.67%; SVM, CatBoost, and ANN each had an accuracy of 90%; and LR had an accuracy of 88.33%.

4.
In this paper, the possibility of predicting salt concentrations in soils from measured reflectance spectra is studied using partial least squares regression (PLSR) and artificial neural networks (ANN). The performance of these two adaptive methods is compared in order to examine linear and non-linear relationships between soil reflectance and salt concentration. Experiment-, field- and image-scale data sets were prepared, consisting of soil EC measurements (dependent variable) and their corresponding reflectance spectra (independent variables). For each data set, PLSR and ANN predictive models of soil salinity were developed based on soil reflectance data. The predictive accuracies of the PLSR and ANN models were assessed against independent validation data sets not included in the calibration or training phase. The results of the PLSR analyses suggest that an accurate to good prediction of EC can be made based on models developed from experiment-scale data (R2 > 0.81 and RPD (ratio of prediction to deviation) > 2.1) for soil samples salinized by bischofite and epsomite minerals. For field-scale data sets, the PLSR predictive models provided approximate quantitative EC estimations (R2 = 0.8 and RPD = 2.2) for grids 1 and 6 and poor estimations for grids 2, 3, 4 and 5. The salinity predictions from image-scale data sets by PLSR models were very reliable to good (R2 between 0.86 and 0.94 and RPD values between 2.6 and 4.1) except for sub-image 2 (R2 = 0.61 and RPD = 1.2). The ANN models from the experiment-scale data set revealed similar network performances for training, validation and test data sets, indicating good network generalization for samples salinized by bischofite and epsomite minerals. The RPD and the R2 between reference measurements and ANN outputs of these models suggest an accurate to good prediction of soil salinity (R2 > 0.92 and RPD > 2.3). For the field-scale data set, prediction accuracy is relatively poor (0.69 > R2 > 0.42). The ANN predictive models estimating soil salinity from image-scale data sets indicate a good prediction (R2 > 0.86 and RPD > 2.5) except for sub-image 2 (R2 = 0.6 and RPD = 1.2). The results of this study show that both methods have great potential for estimating and mapping soil salinity. Performance indexes from both methods suggest large similarity between the two approaches, with advantages for PLSR. This indicates that the relation between soil salinity and soil reflectance can be approximated by a linear function.
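The two performance indexes quoted in this abstract can be computed as in the sketch below. The abstract does not give the formulas, so the definitions are assumptions: R2 is taken as the coefficient of determination, and RPD as the sample standard deviation of the reference values divided by the root mean square error of prediction, a definition commonly used in soil spectroscopy.

```python
import math
import statistics


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = statistics.fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rpd(observed, predicted):
    """Sample standard deviation of reference values over prediction RMSE (assumed definition)."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    rmse = math.sqrt(statistics.fmean(squared_errors))
    return statistics.stdev(observed) / rmse
```

Here `observed` would hold the measured EC values of an independent validation set and `predicted` the corresponding PLSR or ANN outputs.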

5.
Automatic emotion recognition from speech signals is one of the important research areas, which adds value to machine intelligence. Pitch, duration, energy and Mel-frequency cepstral coefficients (MFCC) are the widely used features in the field of speech emotion recognition. A single classifier or a combination of classifiers is used to recognize emotions from the input features. The present work investigates the performance of the features of Autoregressive (AR) parameters, which include gain and reflection coefficients, in addition to the traditional linear prediction coefficients (LPC), to recognize emotions from speech signals. The classification performance of the features of AR parameters is studied using discriminant, k-nearest neighbor (KNN), Gaussian mixture model (GMM), back propagation artificial neural network (ANN) and support vector machine (SVM) classifiers and we find that the features of reflection coefficients recognize emotions better than the LPC. To improve the emotion recognition accuracy, we propose a class-specific multiple classifiers scheme, which is designed by multiple parallel classifiers, each of which is optimized to a class. Each classifier for an emotional class is built by a feature identified from a pool of features and a classifier identified from a pool of classifiers that optimize the recognition of the particular emotion. The outputs of the classifiers are combined by a decision level fusion technique. The experimental results show that the proposed scheme improves the emotion recognition accuracy. Further improvement in recognition accuracy is obtained when the scheme is built by including MFCC features in the pool of features.  相似文献   

6.
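SVM and KNN text classification algorithms are examined in depth through experiments. Based on the KNN and SVM algorithms, an SVM.KNN algorithm is proposed. The algorithm combines the KNN and SVM classifiers and improves classifier performance through feedback on, and correction of, the predicted classification probabilities. The practical effect of the SVM.KNN algorithm was tested, and its performance verified, on the CWT100G Chinese web page classification test system.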

7.
Land use classification is an important part of many remote sensing applications. Much research has gone into applying statistical and neural network classifiers to remote-sensing images. This research involves the study and implementation of a new pattern recognition technique introduced within the framework of statistical learning theory, Support Vector Machines (SVMs), and its application to remote-sensing image classification. Standard classifiers such as the Artificial Neural Network (ANN) need a number of training samples that increases exponentially with the dimension of the input feature space. With a limited number of training samples, the classification rate thus decreases as the dimensionality increases. SVMs are independent of the dimensionality of the feature space, as the main idea behind this classification technique is to separate the classes with a surface that maximizes the margin between them, using boundary pixels to create the decision surface. Results from SVMs are compared with traditional Maximum Likelihood Classification (MLC) and an ANN classifier. The findings suggest that the ANN and SVM classifiers perform better than the traditional MLC. The SVM and the ANN show comparable results. However, accuracy depends on factors such as the number of hidden nodes (in the case of the ANN) and the kernel parameters (in the case of the SVM). The training time taken by the SVM is several orders of magnitude less.

8.
Activity recognition in monitored environments where the occupants are elderly or disabled is currently a popular research topic, with current systems implementing ubiquitous sensing or video surveillance techniques. Using disaggregated data from smart meters could be a viable alternative to what is often perceived as intrusive recognition technology. Disaggregation methods have proven to perform exceptionally well when trained with large quantities of data, but gathering and labelling this data is, in itself, an intrusive process that requires significant effort and could compromise the practicality of such promising systems. Here we show that by synthesising labelled training data, using a domain specific algorithm, an innovative water meter disaggregation system that uses Artificial Neural Networks (ANN), Support Vector Machine (SVM) and K-Nearest Neighbour (KNN) classifiers can be trained in minutes rather than hours. We show that by artificially synthesising labelled data accuracies of 83%, 79% and 85% with the SVM, ANN and KNN classifiers, respectively can be achieved. Though these values are marginally lower than 89%, 83% and 89% achieved with no synthesis, the measure of accuracy masks the underlying imbalance of representative classes in the data set.  相似文献   

9.
Background: Detection and monitoring of respiratory-related illness is an important aspect of pulmonary medicine. Acoustic signals extracted from the human body are used to detect respiratory pathology accurately. Objectives: The aim of this study is to develop a prototype telemedicine tool to detect respiratory pathology using computerized respiratory sound analysis. Methods: Around 120 subjects (40 normal, 40 with continuous lung sounds (20 wheeze and 20 rhonchi), and 40 with discontinuous lung sounds (20 fine crackles and 20 coarse crackles)) were included in this study. The respiratory sounds were segmented into respiratory cycles using a fuzzy inference system, and the S-transform was then applied to these respiratory cycles. From the S-transform matrix, statistical features were extracted. The extracted features were statistically significant with p < 0.05. To classify the respiratory pathology, KNN, SVM and ELM classifiers were implemented using the statistical features obtained from the data. Results: The validation showed that the training classification rate of the ELM classifier with RBF kernel was higher than that of the SVM and KNN classifiers. Training time was also lower for the ELM than for the SVM and KNN classifiers. The overall mean classification rate for the ELM classifier was 98.52%. Conclusion: The telemedicine software tool was developed using the ELM classifier. The tool performed extraordinarily well in detecting respiratory pathology and is well validated.

10.
This paper presents a thorough study of gender classification methodologies performing on neutral, expressive and partially occluded faces, when they are used in all possible arrangements of training and testing roles. A comprehensive comparison of two representation approaches (global and local), three types of features (grey levels, PCA and LBP), three classifiers (1-NN, PCA + LDA and SVM) and two performance measures (CCR and d′) is provided over single- and cross-database experiments. Experiments revealed some interesting findings, which were supported by three non-parametric statistical tests: when training and test sets contain different types of faces, local models using the 1-NN rule outperform global approaches, even those using SVM classifiers; however, with the same type of faces, even if the acquisition conditions are diverse, the statistical tests could not reject the null hypothesis of equal performance of global SVMs and local 1-NNs.  相似文献   

11.
Computer-aided Diagnosis (CADx) technology can substantially aid in early detection and diagnosis of breast cancers. However, the overall performance of a CADx system is tied, to a large extent, to the accuracy with which the tumors can be segmented in a mammogram. This implies that the segmentation of mammograms is a critical step in the diagnosis of benign and malignant tumors. In this paper, we develop an enhanced mammography CADx system with an emphasis on the segmentation step. In particular, we present two hybrid algorithms based upon region-based, contour-based and clustering segmentation techniques to recognize benign and malignant breast tumors. In the first algorithm, in order to obtain the most accurate final segmented tumor, the initial segmented image, that is required for the level set, is provided by one of spatial fuzzy clustering (SFC), improved region growing (RG), or cellular neural network (CNN). In the second algorithm, all of the parameters which control the level set are obtained from a dynamic training procedure by the combination of both genetic algorithms (GA) and artificial neural network (ANN) or memetic algorithm (MA) and ANN. After segmenting tumors using one of the hybrid proposed methods, intensity, shape and texture features are extracted from tumors, and the appropriate features are then selected by another GA algorithm. Finally, to classify tumors as benign or malignant, different classifiers such as ANN, random forest, naïve Bayes, support vector machine (SVM), and K-nearest neighbor (KNN) are used. Experimental results confirm the efficiency of the proposed methods in terms of sensitivity, specificity, accuracy and area under ROC curve (AUC) for the classification of breast tumors. It was concluded that RG and GA in adaptive RG-LS method produce more accurate primary boundary of tumors and appropriate parameters for the level set technique in segmentation and subsequently in classification.  相似文献   

12.
Leaf area index (LAI) is a commonly required parameter when modelling land surface fluxes. Satellite based imagers, such as the 300 m full resolution (FR) Medium Spectral Resolution Imaging Spectrometer (MERIS), offer the potential for timely LAI mapping. The availability of multiple MERIS LAI algorithms prompts the need for an evaluation of their performance, especially over a range of land use conditions. Four current methods for deriving LAI from MERIS FR data were compared to estimates from in-situ measurements over a 3 km × 3 km region near Ottawa, Canada. The LAI of deciduous dominant forest stands and corn, soybean and pasture fields was measured in-situ using digital hemispherical photography and processed using the CANEYE software. MERIS LAI estimates were derived using the MERIS Top of Atmosphere (TOA) algorithm, MERIS Top of Canopy (TOC) algorithm, the Canada Centre for Remote Sensing (CCRS) Empirical algorithm and the University of Toronto (UofT) GLOBCARBON algorithm. Results show that TOA and TOC LAI estimates were nearly identical (R2 > 0.98) with underestimation of LAI when it is larger than 4 and overestimation when smaller than 2 over the study region. The UofT and CCRS LAI estimates had root mean square errors over 1.4 units with large (∼ 25%) relative residuals over forests and consistent underestimates over corn fields. Both algorithms were correlated (R2 > 0.8) possibly due to their use of the same spectral bands derived vegetation index for retrieving LAI. LAI time series from TOA, TOC and CCRS algorithms showed smooth growth trajectories however similar errors were found when the values were compared with the in-situ LAI. In summary, none of the MERIS LAI algorithms currently meet performance requirements from the Global Climate Observing System.  相似文献   

13.
Three ocean colour algorithms, OC4v6, Carder and OC5 were tested for retrieving Chlorophyll-a (Chla) in coastal areas of the Bay of Bengal and open ocean areas of the Arabian Sea. Firstly, the algorithms were run using ~ 80 in situ Remote Sensing Reflectance, (Rrs(λ)) data collected from coastal areas during eight cruises from January 2000 to March 2002 and the output was compared to in situ Chla. Secondly, the algorithms were run with ~ 20 SeaWiFS Rrs(λ) and the results were compared with coincident in situ Chla. In both cases, OC5 exhibited the lowest log10-RMS, bias, had a slope close to 1 and this algorithm appears to be the most accurate for both coastal and open ocean areas. Thirdly the error in the algorithms was regressed against Total Suspended Material (TSM) and Coloured Dissolved Organic Material (CDOM) data to assess the co-variance with these parameters. The OC5 error did not co-vary with TSM and CDOM. OC4v6 tended to over-estimate Chla > 2 mg m−3 and the error in OC4v6 co-varied with TSM. OC4v6 was more accurate than the Carder algorithm, which over-estimated Chla at concentrations > 1 mg m−3 and under-estimated Chla at values < 0.5 mg m−3. The error in Carder Chla also co-varied with TSM. The algorithms were inter-compared using > 5500 SeaWiFS Rrs(λ) data from coastal to offshore transects in the Northern Bay of Bengal. There was good agreement between OC4v6 and OC5 in open ocean waters and in coastal areas up to 2 mg m−3. There was a strong divergence between Carder and OC5 in open ocean and coastal waters. OC4v6 and Carder tended to over-estimate Chla in coastal areas by a factor of 2 to 3 when TSM > 25 g m−3. We strongly recommend the use of OC5 for coastal and open ocean waters of the Bay of Bengal and Arabian Sea. 
A Chla time series was generated using OC5 from 2000 to 2003, which showed that concentrations at the mouths of the Ganges reach a maxima (~ 5 mg m−3) in October and November and were 0.08 mg m−3 further offshore increasing to 0.2 mg m−3 during December. Similarly in early spring from February to March, Chla was 0.08 to 0.2 mg m−3 on the east coast of the Bay.  相似文献   

14.
Financial distress prediction for business institutions is a long-standing topic aimed at reducing losses to society. Case-based reasoning (CBR) is an easily understandable methodology for problem solving. Support vector machine (SVM) is a recently developed technology with high classification performance. A combining-classifiers system can take advantage of various single techniques to produce high performance. In this research, we develop a new combining-classifiers system for financial distress prediction, in which four independent CBR systems with k-nearest neighbor (KNN) algorithms are employed as the classifiers to be combined, and SVM is used as the algorithm that combines them. The new combining-classifiers system is named Multiple CBR systems by SVM (Multi-CBR–SVM). The four CBR systems are founded, respectively, on similarity measures based on the Euclidean distance metric, the Manhattan distance metric, the Grey coefficient metric, and the Outranking relation metric. The outputs of the independent CBR systems are passed as inputs to the SVM to carry out the combination. How to implement the combining-classifiers system with collected data is illustrated in detail. In the experiment, 83 pairs of healthy and distressed sample companies from the Shanghai and Shenzhen Stock Exchanges were collected, grid search was used to obtain optimal parameters, leave-one-out cross-validation (LOO-CV) was used for assessment in parameter optimization, and predictive performance on hold-out data repeated 30 times was used to compare Multi-CBR–SVM, its components, and statistical models. Empirical results indicate that Multi-CBR–SVM is feasible and valid for predicting business failure of listed companies in China.
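The first stage of such a combining-classifiers system can be sketched as below: several KNN case retrievers that differ only in their distance metric each emit one output, and the vector of outputs is what a second-stage combiner would consume. Only the Euclidean and Manhattan metrics are shown. The Grey coefficient and Outranking relation metrics and the SVM combiner are not reproduced, and the function names and the vote-share output are assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_vote(cases, labels, query, k, distance):
    """Share of distressed cases (label 1) among the k most similar stored cases."""
    ranked = sorted(range(len(cases)), key=lambda i: distance(cases[i], query))
    return sum(labels[i] for i in ranked[:k]) / k


def base_outputs(cases, labels, query, k=3):
    """One output per distance metric; a second-stage combiner would take this vector as input."""
    return [knn_vote(cases, labels, query, k, d) for d in (euclidean, manhattan)]
```

In this sketch each stored case is a tuple of financial ratios, and `labels` marks a company as distressed (1) or healthy (0).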

15.
16.
Land-cover mapping is an important research topic with broad applicability in the remote-sensing domain. Machine learning algorithms such as Maximum Likelihood Classifier (MLC), Support Vector Machine (SVM), Artificial Neural Network (ANN), and Random Forest (RF) have been playing an important role in this field for many years, although deep neural networks are experiencing a resurgence of interest. In this article, we demonstrate early efforts to apply deep learning-based classification methods to large-scale land-cover mapping. Based on the Stacked Autoencoder (SAE), one of the deep learning models, we built a classification framework for large-scale remote-sensing image processing. We adjusted and optimized the model parameters based on our test samples. We compared the performance of the SAE-based approach with traditional classification algorithms including RF, SVM, and ANN with multiple performance analytics. Results show that the SAE classifier trained with an entire set of African training samples achieves an overall classification accuracy of 78.99% when assessed by test samples collected independently of training samples, which is higher than the accuracies achieved by the other three classifiers (76.03%, 77.74%, and 77.86% of RF, SVM, and ANN, respectively) based on the same set of test samples. We also demonstrated the advantages of SAE in prediction time and land-cover mapping results in this study.  相似文献   

17.
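In text classification, both the KNN and SVM algorithms achieve high classification accuracy, but each has inherent weaknesses: KNN becomes computationally expensive when the number of training samples is large, while SVM is overly sensitive to noisy data and cannot accurately classify data points lying near the separating hyperplane. To address this, a hybrid classification algorithm based on variable precision rough set theory is proposed, which exploits the strengths of both algorithms while overcoming their weaknesses. Experiments show that the hybrid algorithm effectively improves both computational complexity and classification accuracy.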

18.
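LSVM algorithm for facial expression classification in video sequences   Total citations: 1, self-citations: 0, citations by others: 1
To improve the accuracy of expression recognition based on video sequences, a local SVM classification mechanism is proposed on the basis of the KNN-SVM algorithm and applied to expression classification in video sequences. For a geometric feature sample to be classified, the k nearest neighbors of the sample are first found in the training set; then, based on the similarity information between these k neighbors and the sample, a locally optimal SVM decision hyperplane is reconstructed and used to classify the sample. Comparative experiments on the Cohn-Kanade database show that the classifier effectively improves the accuracy of expression classification.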

19.
Biometric authentication is the process that allows an individual to be identified based on a set of unique biological features. In this study, we present different experiments that use cardiac sound signals (phonocardiogram, “PCG”) as a biometric authentication trait. We applied different feature extraction approaches and different classification techniques to use the PCG as a biometric trait. In all experiments, data acquisition is based on collecting cardiac sounds from the HSCT-11 and PASCAL CHSC2011 datasets, while preprocessing de-noises the cardiac sounds using multiresolution-decomposition and multiresolution-reconstruction (MDR-MRR). The de-noised signal is then segmented using frame-windowing and Shannon energy (SE) methods. For feature extraction, Cepstral (Cp) domain features (based on mel-frequency) and time-scale (T-S) domain features (based on the Wavelet Transform) are extracted from the de-noised signal after segmentation. The features extracted from the Cp domain and the T-S domain are fed to four different classifiers: artificial neural network (ANN), support vector machine (SVM), random forest (RF) and K-nearest neighbor (KNN). Classification performance is assessed using k-fold cross-validation. The computational complexity of the feature extraction domains is expressed using Big-O notation. The T-S features are superior to PCG heart signals in terms of classification accuracy. The experimental results give the highest classification accuracy with the lowest computational complexity for RF in the Cp domain and for SVM and ANN in the T-S domain.
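The Shannon energy step used for segmentation can be sketched as a framewise envelope. The abstract gives no formula, so this assumes the average Shannon energy commonly used for heart sounds, computed on a peak-normalised signal as the mean of -x^2 * log(x^2) over each frame. The frame length, hop size, and function name are hypothetical.

```python
import math


def shannon_energy_envelope(signal, frame_len, hop):
    """Average Shannon energy per frame of a peak-normalised signal (assumed definition)."""
    peak = max(abs(s) for s in signal)
    if peak == 0.0:
        return [0.0] * max(0, (len(signal) - frame_len) // hop + 1)
    norm = [s / peak for s in signal]
    envelope = []
    for start in range(0, len(norm) - frame_len + 1, hop):
        frame = norm[start:start + frame_len]
        # Zero samples contribute nothing: x^2 * log(x^2) tends to 0 as x tends to 0.
        energy = -sum(v * v * math.log(v * v) for v in frame if v != 0.0)
        envelope.append(energy / frame_len)
    return envelope
```

Because the weighting emphasises medium-intensity samples over very low and very high ones, peaks in the envelope are typically used to locate heart sound boundaries; any threshold for doing so would have to be chosen per dataset.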

20.
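Fault diagnosis based on support vector machine ensembles   Total citations: 3, self-citations: 2, citations by others: 3
To improve the accuracy of fault diagnosis, a genetic-algorithm-based ensemble learning method for support vector machines is proposed, the corresponding genetic operators are defined, and strategies for constructing the classifiers in the ensemble are discussed. Simulation results for diagnosing rotor imbalance faults in a steam turbine show that ensemble learning methods generally outperform a single support vector machine, and that the proposed method outperforms traditional ensemble methods such as Bagging and Boosting: the resulting ensemble contains fewer classifiers, and combining several classifier construction strategies increases classifier diversity. The method extends readily to other learning algorithms such as neural networks and decision trees.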

