Similar literature
Found 20 similar documents (search time: 15 ms)
1.
In this article, we present a semisupervised support vector machine that uses a self-training approach. We then construct an ensemble of semisupervised SVM classifiers to address the problem of pixel classification of remote sensing images. Semisupervised support vector machines (S3VMs) apply the margin maximization principle to both labeled and unlabeled samples. The ensemble of SVM classifiers recognizes the conceptual similarity between component classifiers from the same data source. The effectiveness of the proposed technique is first demonstrated on two numeric remote sensing datasets described in terms of feature vectors and then on identifying different land cover regions in remote sensing imagery. Experimental results on these datasets show that this learning scheme can increase accuracy. The performance of the ensemble is compared with that of one of its component classifiers and a conventional SVM in terms of accuracy and quantitative cluster validity indices.
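
The self-training idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' S3VM: a nearest-centroid rule stands in for the SVM, the difference of squared distances stands in for the margin, and the names `self_train` and `margin_threshold` are hypothetical. Both classes are assumed to have at least one labeled sample.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def self_train(labeled, unlabeled, margin_threshold=2.0, max_rounds=10):
    """Self-training with a nearest-centroid stand-in classifier (labels 0/1).

    labeled:   list of (features, label) pairs
    unlabeled: list of feature vectors
    Each round, unlabeled points whose distance gap between the two class
    centroids is at least margin_threshold are pseudo-labeled and moved
    into the labeled set; the classifier is then refit.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        c0 = centroid([x for x, y in labeled if y == 0])
        c1 = centroid([x for x, y in labeled if y == 1])
        confident, rest = [], []
        for x in pool:
            d0, d1 = sq_dist(x, c0), sq_dist(x, c1)
            if abs(d0 - d1) >= margin_threshold:
                confident.append((x, 0 if d0 < d1 else 1))
            else:
                rest.append(x)
        if not confident:  # nothing new was learned, so stop
            break
        labeled.extend(confident)
        pool = rest
    return labeled, pool


seed_set = [([0.0, 0.0], 0), ([4.0, 4.0], 1)]
pixels = [[0.5, 0.2], [3.8, 4.1], [2.0, 2.0]]
grown, leftover = self_train(seed_set, pixels)
print(len(grown), leftover)
```

Points that never clear the threshold stay unlabeled, which is the usual safeguard against reinforcing early mistakes.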

2.
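Research on a tumor classification and detection algorithm based on principal component analysis   Cited by: 1 (self-citations: 0, citations by others: 1)
Tumor diagnosis based on gene expression profiles is expected to become a fast and effective diagnostic method in clinical medicine. However, gene expression data are characterized by very high dimensionality, very small sample sizes, and heavy noise, which makes extracting the informative genes related to tumors a challenging task. After analyzing the methods currently used for tumor classification and detection, this paper proposes a hybrid feature extraction method that combines gene feature scoring with principal component analysis. Experiments show that the method extracts classification feature information effectively and substantially reduces the dimensionality of gene expression data while maintaining high tumor recognition accuracy, which greatly improves classifier performance. Two tumor-related gene expression datasets were used to verify the effectiveness of the hybrid feature extraction method. Classification experiments with support vector machines show that the proposed hybrid method not only achieves high cross-validated recognition accuracy but also yields classification results that can be visualized. The cross-validated recognition accuracy reaches 95.16% on the colon cancer tissue sample set and 100% on the acute leukemia tissue sample set.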

3.
In this paper, we propose an active learning technique for solving multiclass problems with support vector machine (SVM) classifiers. The technique is based on both uncertainty and diversity criteria. The uncertainty criterion is implemented by analyzing the one-dimensional output space of the SVM classifier. A simple histogram thresholding algorithm is used to find the low-density region in the SVM output space and thereby identify the most uncertain samples. The diversity criterion then uses the kernel k-means clustering algorithm to select uncorrelated informative samples among the selected uncertain samples. To assess the effectiveness of the proposed method, we compared it with other batch-mode active learning techniques from the literature using one toy data set and three real data sets. Experimental results confirmed that the proposed technique provides a very good tradeoff among robustness to biased initial training samples, classification accuracy, computational complexity, and the number of new labeled samples necessary to reach convergence.

4.
Gender recognition plays an important role in applications such as human–computer interaction, surveillance, and security. Nonlinear support vector machines (SVMs) were investigated for the identification of gender using the Face Recognition Technology (FERET) face image database. It was shown that SVM classifiers outperform the traditional pattern classifiers (linear, quadratic, Fisher linear discriminant, and nearest neighbour). In this context, this paper aims to improve SVM classification accuracy in the gender classification system and proposes new models for better performance. We evaluated different SVM learning algorithms; the SVM with a radial basis function kernel and a 5% outlier fraction outperformed the other SVM classifiers. We examined the effectiveness of different feature selection methods; AdaBoost performs better than the other methods in selecting the most discriminating features. We proposed two classification methods that focus on training subsets of the training images. Method 1 combines the outcomes of different classifiers based on different image subsets, whereas method 2 clusters the training data and builds a classifier for each cluster. Experimental results showed that both methods increased the classification accuracy.

5.
Text categorization continues to be one of the most researched NLP problems because of the ever-increasing number of electronic documents and digital libraries. In this paper, we present a new text categorization method that combines the distributional clustering of words with a learning logic technique, called Lsquare, for constructing text classifiers. The high dimensionality of document text hinders categorization, and feature clustering has proven to be an ideal alternative to feature selection for reducing the dimensionality. We therefore use a distributional clustering method (IB) to generate an efficient representation of documents and apply Lsquare to train text classifiers. The method was extensively tested and evaluated. The proposed method achieves higher or comparable classification accuracy and $F_1$ results compared with SVM under identical experimental settings with a small number of training documents on three benchmark data sets: WebKB, 20Newsgroup, and Reuters-21578. The results show that the method is a good choice for applications with a limited amount of labeled training data. We also demonstrate the effect of changing the training size on the classification performance of the learners.

6.
Fault detection and diagnosis (FDD) in chemical process systems is an important tool for effective process monitoring to ensure the safety of a process. Multi-scale classification offers various advantages for monitoring chemical processes, which are generally driven by events in different time and frequency domains. However, there are issues when dealing with highly interrelated, complex, and noisy databases of large dimensionality. Therefore, a new method for the FDD framework is proposed based on wavelet analysis, kernel Fisher discriminant analysis (KFDA), and support vector machine (SVM) classifiers. The main objective of this work was to combine the advantages of these tools to enhance diagnosis performance on a chemical process system. Initially, a discrete wavelet transform (DWT) was applied to extract the dynamics of the process at different scales. The wavelet coefficients obtained during the analysis were reconstructed using the inverse discrete wavelet transform (IDWT), and the reconstructed signals were then fed into the KFDA to produce discriminant vectors. Finally, the discriminant vectors were used as inputs to the SVM classifiers, which classified the feature sets extracted by the proposed method. The performance of the proposed multi-scale KFDA-SVM method for fault classification and diagnosis was analysed and compared using a simulated Tennessee Eastman process as a benchmark. The proposed multi-scale KFDA-SVM framework achieved an average classification accuracy of 96.79% on the faults in the Tennessee Eastman process, compared with 84.94% for the multi-scale KFDA-GMM and 95.78% for the established independent component analysis-SVM method.
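
The DWT/IDWT step described in this abstract can be sketched with a single-level Haar transform. The abstract does not name the wavelet, so Haar is an assumption chosen because it fits in a few lines; `haar_dwt` and `haar_idwt` are hypothetical helper names, and the signal length is assumed to be even.

```python
from math import sqrt


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT (signal length must be even).

    Returns (approximation, detail) coefficient lists.
    """
    s = sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuild the signal from both coefficient lists."""
    s = sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


measurements = [4.0, 6.0, 10.0, 12.0]
approx, detail = haar_dwt(measurements)

# Reconstructing from the approximation alone keeps the coarse-scale trend.
coarse = haar_idwt(approx, [0.0] * len(detail))
# Reconstructing from the detail alone keeps the fine-scale fluctuations.
fine = haar_idwt([0.0] * len(approx), detail)
print(coarse, fine)
```

Reconstructing each scale separately gives one signal per scale with the original length, which is the form a downstream discriminant analysis would need.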

7.
Accurately distinguishing moving objects, especially in security surveillance, attracts great attention from researchers and practitioners. In this context, the present study proposes an improved method for extracting features from the micro-Doppler spectrogram that applies spatial statistics to classify moving human subjects and minimizes the spectrogram analysis required. In the proposed approach, spatial features are extracted from the whole spectrogram image and then passed to support vector machine (SVM) classifiers for multiclass classification. The method was tested for prediction accuracy and validated on five distinct but closely related human activities (which commonly arise at security observation sites) reported in the available literature. The approach achieves high multiclass accuracy: 96.7% under cross-validation and 93.33% when predicting the test data. To predict the data classes accurately, the spectrogram was also post-processed with spatial methods before feature definition to enhance the micro-Doppler signatures.

8.
Land-cover classification based on multi-temporal satellite images for scenarios where parts of the data are missing due to, for example, clouds, snow, or sensor failure has received little attention in the remote-sensing literature. The goal of this article is to introduce support vector machine (SVM) methods capable of handling missing data in land-cover classification. The novelty of this article lies in combining the powerful SVM regularization framework with a recent statistical theory of missing data, resulting in a new method where an SVM is trained for each missing data pattern, and a given incomplete test vector is classified by selecting the corresponding SVM model. The SVM classifiers are evaluated on Landsat Enhanced Thematic Mapper Plus (ETM+) images covering a scene of Norwegian mountain vegetation. The results show that the proposed SVM-based classifier improves the classification accuracy by 5–10% compared with single image classification. The proposed SVM classifier also outperforms recent non-parametric k-nearest neighbours (k-NN) and Parzen window density-based classifiers for incomplete data by about 3%. Moreover, since the resulting SVM classifier may easily be implemented using existing SVM libraries, we consider the new method to be an attractive choice for classification of incomplete data in remote sensing.
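
The "one model per missing-data pattern" idea in this abstract can be sketched as a dictionary of classifiers keyed by which features are observed. This is an illustration under stated assumptions, not the article's method: a nearest-class-mean model stands in for the SVM, missing values are encoded as `None`, each pattern's model is trained on every training row that has the required features, and all names and the toy land-cover labels are hypothetical.

```python
def observed_pattern(x):
    """Tuple of booleans, True where a feature is present (not None)."""
    return tuple(v is not None for v in x)


def project(x, pattern):
    """Keep only the features that the pattern marks as observed."""
    return [v for v, keep in zip(x, pattern) if keep]


class NearestMeanModel:
    """Stand-in for an SVM: predicts the class with the closest mean vector."""

    def fit(self, X, y):
        sums, counts = {}, {}
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
            counts[label] = counts.get(label, 0) + 1
        self.means = {c: [s / counts[c] for s in acc] for c, acc in sums.items()}
        return self

    def predict(self, x):
        def dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, self.means[c]))
        return min(self.means, key=dist)


def train_per_pattern(X, y, patterns):
    """Train one model per missing-data pattern on the usable training rows."""
    models = {}
    for pattern in patterns:
        rows = [(project(x, pattern), label) for x, label in zip(X, y)
                if all(v is not None for v in project(x, pattern))]
        models[pattern] = NearestMeanModel().fit([r for r, _ in rows],
                                                 [l for _, l in rows])
    return models


def classify(models, x):
    """Select the model that matches the test vector's own missing pattern."""
    pattern = observed_pattern(x)
    return models[pattern].predict(project(x, pattern))


# Two features, e.g. one band from each of two acquisition dates (toy values).
X = [[1.0, 10.0], [2.0, 12.0], [8.0, 30.0], [9.0, 28.0]]
y = ["grass", "grass", "forest", "forest"]
models = train_per_pattern(X, y, [(True, True), (True, False), (False, True)])
print(classify(models, [1.5, None]), classify(models, [None, 29.0]))
```

Because each model is an ordinary classifier on a fixed feature subset, the approach needs no imputation at prediction time.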

9.

Feature subset selection (FSS) plays an essential role in data mining, particularly in the analysis of high-dimensional medical data, where it supports early detection by supplying essential features with high accuracy. Recent feature selection models use optimization algorithms to extract features with particular properties and reach the highest possible accuracy. Many optimization algorithms, such as the genetic algorithm, have algorithm-specific parameters that must be adjusted for good results, and tuning these parameter values for the feature selection procedure is difficult. In this paper, a new wrapper-based feature selection approach called binary teaching learning based optimization (BTLBO) is introduced. BTLBO is a sophisticated meta-heuristic method that does not involve any algorithm-specific parameters; it requires only standard process parameters, such as the population size and the number of iterations, to extract a set of selected features from the data. Because finding the best possible feature set is demanding, a method that is independent of method-controlling parameters is preferable. This paper introduces a new modified binary teaching–learning-based optimization (NMBTLBO) technique for feature subset selection that uses the accuracy of a support vector machine (SVM) on binary classification as the fitness function. NMBTLBO makes two changes to the binary teaching-learning-based optimization algorithm: a new updating procedure, and a new method for selecting the primary teacher in the teacher phase. The proposed NMBTLBO technique was used to classify rheumatic disease datasets collected from the Baghdad Teaching Hospital Outpatient Rheumatology Clinic during 2016–2018.
Compared with the original BTLBO algorithm, the improved NMBTLBO algorithm achieved a substantial difference in accuracy. Validation was carried out by testing the accuracy of four classification methods: K-nearest neighbors, decision trees, support vector machines, and K-means. The results showed that the classification accuracy of all four methods increased with NMBTLBO feature selection compared with BTLBO. The SVM classifier reached 89% accuracy with BTLBO-SVM and 95% with NMBTLBO-SVM. Decision trees reached 94% with BTLBO-SVM and 95% with NMBTLBO-SVM feature selection. The analysis indicates that the new method (NMBTLBO) enhances classification accuracy.

10.
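To address the imbalanced dataset distributions encountered in practical applications, an oversampling method that incorporates feature-boundary data information is proposed. Noise points are first removed from the dataset. Then, based on the multi-class nearest-neighbor sets of the minority-class samples, the geometric distribution information of the feature boundary is fused to identify the minority-class samples that help define the optimal nonlinear classification boundary, and new samples are generated by combining these samples with the clusters they belong to. After the imbalanced datasets are processed with several oversampling techniques, support vector machines are used for classification. Comparative experiments show that the proposed method effectively improves classification accuracy on imbalanced data, which verifies the effectiveness of the algorithm.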
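
For orientation, the generic interpolation step shared by SMOTE-style oversamplers can be sketched as below. This is not the boundary-aware method of the abstract (it performs no noise removal and no boundary selection); it only shows how a synthetic minority sample can be placed between a minority point and one of its nearest minority neighbors. The function name and parameters are hypothetical, and at least two minority samples are assumed.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority samples by linear interpolation.

    For each new sample: pick a random minority point, pick one of its k
    nearest minority neighbors, and place the new point at a random
    position on the segment between the two.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbors = sorted((p for p in minority if p is not base),
                           key=lambda p: sq_dist(base, p))[:k]
        other = rng.choice(neighbors)
        t = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + t * (o - b) for b, o in zip(base, other)])
    return synthetic


minority_class = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_points = smote_like_oversample(minority_class, n_new=5)
print(len(new_points))
```

A boundary-aware variant would restrict `base` to the minority samples judged useful for the decision boundary before interpolating.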

11.
J. Li, X. Tang, J. Liu, J. Huang, Y. Wang. Pattern Recognition, 2008, 41(6): 1975-1984
Microarray experiments are now performed in many laboratories, resulting in the rapid accumulation of microarray data in public repositories. One of the major challenges in analyzing microarray data is how to extract and select efficient features from it for accurate cancer classification. Here we introduce a new feature extraction and selection method based on informative gene pairs that change significantly between different tissue samples. Experimental results on five public microarray data sets demonstrate that the feature subset selected by the proposed method performs well and achieves higher classification accuracy with several classifiers. We perform an extensive experimental comparison of the features selected by the proposed method and features selected by other methods using different evaluation methods and classifiers. The results confirm that the proposed method performs as well as other methods on the acute lymphoblastic-acute myeloid leukemia, adenocarcinoma, and breast cancer data sets using fewer informative genes, and leads to a significant improvement in classification accuracy on the colon cancer and diffuse large B cell lymphoma data sets.

12.
Land use classification is an important part of many remote sensing applications. A lot of research has gone into the application of statistical and neural network classifiers to remote-sensing images. This research involves the study and implementation of a new pattern recognition technique introduced within the framework of statistical learning theory, called Support Vector Machines (SVMs), and its application to remote-sensing image classification. Standard classifiers such as the Artificial Neural Network (ANN) need a number of training samples that increases exponentially with the dimension of the input feature space. With a limited number of training samples, the classification rate thus decreases as the dimensionality increases. SVMs are independent of the dimensionality of the feature space, because the main idea behind this classification technique is to separate the classes with a surface that maximizes the margin between them, using boundary pixels to create the decision surface. Results from SVMs are compared with traditional Maximum Likelihood Classification (MLC) and an ANN classifier. The findings suggest that the ANN and SVM classifiers perform better than the traditional MLC. The SVM and the ANN show comparable results. However, accuracy depends on factors such as the number of hidden nodes (in the case of ANN) and the kernel parameters (in the case of SVM). The training time taken by the SVM is several orders of magnitude less.

13.
This paper presents an effective mutual information-based feature selection approach for the EMG-based motion classification task. The wavelet packet transform (WPT) is used to decompose the four-class motion EMG signals into successive, non-overlapping sub-bands. The energy of each sub-band is used to construct the initial full feature set. To reduce the computational complexity, mutual information (MI) theory is used to obtain a reduced feature set without compromising classification accuracy. Comparison experiments against widely used feature reduction methods such as principal component analysis (PCA), sequential forward selection (SFS), and backward elimination (BE) demonstrate its superiority in terms of computation time and classification accuracy. The proposed feature extraction and reduction strategy is a filter-based algorithm that is independent of the classifier design. Because classification performance varies with the classifier, we compare fuzzy least squares support vector machines (LS-SVMs) with the conventional, widely used neural network classifier. Further experiments show that the combination of MI-based feature selection and SVM techniques outperforms other commonly used combinations, for example, PCA and NN. The experimental results show that the diverse motions can be identified with high accuracy by the combination of MI-based feature selection and SVM techniques.

Compared with the combination of PCA-based feature selection and the classical neural network classifier, the superior performance of the proposed classification scheme illustrates the potential of SVM techniques combined with WPT and MI in EMG motion classification.
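
Filter-style MI feature selection, as described in this abstract, ranks each feature by its mutual information with the class label. The sketch below assumes the sub-band energies have already been discretized into bins (the abstract does not say how MI is estimated), and the feature names and toy values are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(feature, labels):
    """Mutual information I(feature; label) in bits for discrete values."""
    n = len(feature)
    p_f = Counter(feature)
    p_l = Counter(labels)
    joint = Counter(zip(feature, labels))
    return sum((c / n) * log2((c / n) / ((p_f[f] / n) * (p_l[l] / n)))
               for (f, l), c in joint.items())


def rank_features(columns, labels):
    """Return feature names sorted by decreasing MI with the labels."""
    scores = {name: mutual_information(col, labels) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True), scores


# Toy example: binned sub-band energies ("lo"/"hi") for four recordings.
motions = ["grasp", "grasp", "open", "open"]
binned = {
    "band_1_energy": ["lo", "lo", "hi", "hi"],  # tracks the motion exactly
    "band_2_energy": ["lo", "hi", "lo", "hi"],  # unrelated to the motion
}
order, scores = rank_features(binned, motions)
print(order, scores)
```

Since the ranking never consults a classifier, the same reduced feature set can be handed to an SVM or a neural network, which is what makes the approach a filter method.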


14.
Brain tumor grade identification currently relies on invasive techniques: clinicians depend on biopsy and spinal tap. This work develops a non-invasive method for identifying the tumor grade (low/high) from magnetic resonance images. The process involves preprocessing, image segmentation, tumor isolation, feature extraction, feature selection, and classification. The segmentation techniques, feature extraction methods, automatic feature selection (SFLA), and constructed classifiers (support vector machines, learning vector quantization, and Naive Bayes) are analyzed on the basis of accuracy, efficiency, and elapsed time. This analysis supports determining the tumor grade accurately from MR images instead of depending on magnetic resonance spectroscopy and biopsy. Fuzzy c-means segmentation outperformed the other segmentation techniques, shape- and size-based textural features helped separate the tumor grades, and the Naive Bayes classifier performed best in terms of efficiency, error, and elapsed time when compared with SVM and LVQ. The study was carried out with 200 images, comprising a training set (164 images) and a testing set (36 images). The results show that the system is robust and accurate (91%), takes less time to identify the grade, and offers an alternative to biopsy and MRS in the brain tumor grade identification procedure.

15.
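To address the low accuracy, high overhead, and limited applicability of traditional network traffic classification methods, a semi-supervised network traffic classification method based on the support vector machine (SVM) is proposed. During SVM training, the method uses incremental learning to determine the support vectors dynamically from the initial and newly added sample sets, which avoids unnecessary retraining and mitigates the loss of accuracy and the long classification time that the original classifier suffers when new samples appear. An improved semi-supervised Tri-training method is used to train the classifiers collaboratively: a large number of unlabeled samples and a small number of labeled samples are used together to refine the classifiers repeatedly, which reduces noisy data from the auxiliary classifiers and overcomes the strict requirements that traditional co-validation places on the classification algorithm and the sample types. Experimental results show that the method significantly improves the accuracy and efficiency of network traffic classification.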

16.
The support vector machine (SVM) has become a dominant classification technique in pedestrian detection systems. In such systems, classifiers are used to detect pedestrians in input frames. The performance of an SVM classifier is mainly influenced by two factors: the selected features and the parameters of the kernel function. These two factors are highly related, so it is desirable to analyze them simultaneously, which is usually not the case in previous work. In this paper, we propose an evolutionary method that simultaneously optimizes the feature set and the parameters of the SVM classifier. Specifically, adaptive genetic operators were designed to suit both feature selection and parameter tuning. The proposed method is used to train an SVM classifier for pedestrian detection. Experiments in real city traffic scenes show that the proposed approach leads to higher detection accuracy and shorter detection time.

17.
Several studies have demonstrated the superior performance of ensemble classification algorithms, whereby multiple member classifiers are combined into one aggregated and powerful classification model, over single models. In this paper, two rotation-based ensemble classifiers are proposed as modeling techniques for customer churn prediction. In Rotation Forests, feature extraction is applied to feature subsets in order to rotate the input data for training base classifiers, while RotBoost combines Rotation Forest with AdaBoost. In an experimental validation based on data sets from four real-life customer churn prediction projects, Rotation Forest and RotBoost are compared to a set of well-known benchmark classifiers. Moreover, variations of Rotation Forest and RotBoost are compared, implementing three alternative feature extraction algorithms: principal component analysis (PCA), independent component analysis (ICA), and sparse random projections (SRP). The performance of rotation-based ensemble classifiers is found to depend upon (i) the performance criterion used to measure classification performance, and (ii) the implemented feature extraction algorithm. In terms of accuracy, RotBoost outperforms Rotation Forest, but none of the considered variations offers a clear advantage over the benchmark algorithms. However, in terms of AUC and top-decile lift, results clearly demonstrate the competitive performance of Rotation Forests compared to the benchmark algorithms. Moreover, ICA-based Rotation Forests outperform all other considered classifiers and are therefore recommended as a well-suited alternative classification technique for the prediction of customer churn that allows for improved marketing decision making.
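
Top-decile lift, one of the criteria named in this abstract, is commonly defined as the churn rate among the 10% of customers with the highest predicted scores divided by the overall churn rate. A minimal sketch under that definition follows; the function name and toy data are hypothetical.

```python
def top_decile_lift(scores, churned):
    """Churn rate in the top-scored 10% divided by the overall churn rate.

    scores:  predicted churn scores (higher means more likely to churn)
    churned: 0/1 outcomes aligned with scores
    """
    n = len(scores)
    top_n = max(1, n // 10)  # size of the top decile
    ranked = sorted(zip(scores, churned), key=lambda pair: pair[0], reverse=True)
    top_rate = sum(c for _, c in ranked[:top_n]) / top_n
    base_rate = sum(churned) / n
    return top_rate / base_rate


# 20 customers, 4 churners overall; both customers in the top decile churn.
scores = [1.0 - 0.05 * i for i in range(20)]
churned = [1, 1, 0, 1, 0, 0, 0, 1] + [0] * 12
print(top_decile_lift(scores, churned))
```

A lift of 1 means the model ranks churners no better than random targeting, which is why the measure can disagree with plain accuracy on imbalanced churn data.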

18.
Feature selection and feature weighting are useful techniques for improving the classification accuracy of the K-nearest-neighbor (K-NN) rule. The term feature selection refers to algorithms that select the best subset of the input feature set. In feature weighting, each feature is multiplied by a weight value proportional to the ability of the feature to distinguish pattern classes. In this paper, a novel hybrid approach is proposed for simultaneous feature selection and feature weighting of the K-NN rule based on the Tabu Search (TS) heuristic. The proposed TS heuristic in combination with the K-NN classifier is compared with several classifiers on various available data sets. The results indicate a significant improvement in classification accuracy. The proposed TS heuristic is also compared with various feature selection algorithms. The experiments show that the proposed hybrid TS heuristic is superior to both simple TS and sequential search algorithms. We also present results for the classification of prostate cancer using multispectral images, an important problem in biomedicine.
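
The relationship between feature weighting and feature selection in the K-NN rule can be sketched as follows: each feature is multiplied by its weight before the distance is computed, so a weight of zero removes the feature entirely. The Tabu Search that finds the weights is not shown, and the function name, weights, and toy data are hypothetical.

```python
from collections import Counter


def weighted_knn_predict(train_X, train_y, x, weights, k=1):
    """K-NN vote using Euclidean distance on weight-scaled features."""
    def dist(a):
        return sum((w * (p - q)) ** 2 for w, p, q in zip(weights, a, x))

    nearest = sorted(range(len(train_X)), key=lambda i: dist(train_X[i]))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


# The first feature separates the classes; the second is large-scale noise.
train_X = [[0.0, 100.0], [1.0, -50.0], [10.0, 90.0], [11.0, -60.0]]
train_y = ["a", "a", "b", "b"]
query = [2.0, -60.0]

print(weighted_knn_predict(train_X, train_y, query, weights=[1.0, 1.0]))
print(weighted_knn_predict(train_X, train_y, query, weights=[1.0, 0.0]))
```

With equal weights the noisy second feature dominates the distance; zeroing its weight, which amounts to deselecting it, lets the informative feature decide.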

19.
This paper addresses the automatic classification of power quality events using the Wavelet Packet Transform (WPT) and Support Vector Machines (SVM). The features of the disturbance signals are extracted using the WPT and passed to the SVM for classification. Recent literature on power quality establishes that support vector machine methods generally outperform traditional statistical and neural methods in classification problems involving power disturbance signals. However, two vital issues, the determination of the most appropriate feature subset and model selection, could, if suitably addressed, further improve their performance in terms of classification accuracy and computation time. This paper addresses these issues through a classification system that uses two optimization techniques: genetic algorithms and simulated annealing. The system detects the best discriminative features and estimates the best SVM kernel parameters fully automatically. The proposed method is compared with conventional parameter optimization methods from the literature, such as grid search, and with neural classifiers such as the Probabilistic Neural Network (PNN) and the fuzzy k-nearest neighbor classifier (FkNN); the comparison shows that the proposed method is reliable, as it produces consistently better results.

20.
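Objective: In hyperspectral classification, the large number of bands, the noise present in the images, and the uneven distribution of samples across land-cover classes make it difficult to balance classification accuracy against training efficiency, and accuracy on small samples is low. A hyperspectral image classification method based on cascaded multiple classifiers is therefore proposed. Method: First, principal component analysis is used to combine the highly correlated high-dimensional features into uncorrelated low-dimensional features, which speeds up texture feature extraction by the Gabor filters. Gabor filters are then used to extract the texture information of the image at each scale and orientation; each filter produces one feature map. In each feature map, a d×d neighborhood centered on the sample to be classified is taken, and the mean and variance of the data in that neighborhood are computed as the spatial information of the sample. The spatial and spectral information are then fused to reduce the influence of illumination and noise. Finally, the joint spectral-spatial features are fed into the cascaded multi-classifier, which outputs the average of the predicted class probability distributions for each sample. Results: Experiments used three datasets, Indian Pines, Pavia University, and Salinas. The method was compared with classical algorithms such as the support vector machine and the convolutional neural network, using overall accuracy, average accuracy, and the Kappa coefficient as evaluation criteria. The overall accuracy of the proposed method reached 97.24%, 99.57%, and 99.46% on the three datasets, respectively. This is 13.2%, 4.8%, and 5.68% higher than the support vector machine with a radial basis function (RBF) kernel; 2.18%, 0.36%, and 0.83% higher than the RBF-SVM (radial basis function-support vector machine) with joint spectral-spatial features; and 3.27%, 3.2%, and 0.3% higher than the convolutional neural network. The Kappa coefficients were 0.9686, 0.9943, and 0.9956, respectively, which are also improvements. Conclusion: The experimental results show that the proposed method achieves good classification performance on hyperspectral images, trains efficiently, does not depend on a GPU, and retains high classification accuracy on small samples.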
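
The three evaluation criteria named in this abstract (overall accuracy, average accuracy, and the Kappa coefficient) can all be computed from a confusion matrix. The sketch below uses the standard definitions; the matrix values are made up for illustration and are not results from the paper.

```python
def overall_accuracy(cm):
    """Fraction of all samples on the diagonal (rows: true, columns: predicted)."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def average_accuracy(cm):
    """Mean of the per-class accuracies (recall of each true class)."""
    return sum(cm[i][i] / sum(cm[i]) for i in range(len(cm))) / len(cm)


def kappa(cm):
    """Cohen's kappa: agreement beyond what chance alone would produce."""
    total = sum(sum(row) for row in cm)
    p_observed = overall_accuracy(cm)
    p_chance = sum(
        sum(cm[i]) * sum(row[i] for row in cm) for i in range(len(cm))
    ) / total ** 2
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical two-class confusion matrix.
cm = [[45, 5],
      [10, 40]]
print(overall_accuracy(cm), average_accuracy(cm), kappa(cm))
```

Kappa discounts the agreement expected from the class proportions alone, which is why it is reported alongside overall accuracy when classes are unevenly distributed.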
