Similar literature
1.
This paper presents the results of handwritten digit recognition on well-known image databases using state-of-the-art feature extraction and classification techniques. The tested databases are CENPARMI, CEDAR, and MNIST. On the test data set of each database, 80 recognition accuracies are given by combining eight classifiers with ten feature vectors. The features include the chaincode feature, the gradient feature, the profile structure feature, and peripheral direction contributivity. The gradient feature is extracted from either a binary or a gray-scale image. The classifiers include the k-nearest neighbor classifier, three neural classifiers, a learning vector quantization classifier, a discriminative learning quadratic discriminant function (DLQDF) classifier, and two support vector classifiers (SVCs). All the classifiers and feature vectors give high recognition accuracies. In relative terms, the chaincode feature and the gradient feature show an advantage over the other features, and the profile structure feature is efficient as a complementary feature. The SVC with RBF kernel (SVC-rbf) gives the highest accuracy in most cases but is extremely expensive in storage and computation. Among the non-SV classifiers, the polynomial classifier and DLQDF give the highest accuracies. The results of the non-SV classifiers are competitive with the best ones previously reported on the same databases.

2.
Ensemble design techniques based on training set resampling are successfully used to reduce the classification errors of the base classifiers. Boosting is one such technique: each training set is obtained by drawing samples with replacement from the available training set according to a weighted distribution, which is modified for each new classifier added to the ensemble. The weighted resampling results in a set of classifiers, each accurate in a different part of the input space, determined mainly by the sample weights. In this study, a dynamic integration of boosting-based ensembles is proposed to take into account the heterogeneity of the input sets. An evidence-theoretic framework is developed for this purpose, which takes into account the weights and distances of the neighboring training samples in both training and testing boosting-based ensembles. The effectiveness of the proposed technique is compared to the AdaBoost algorithm using three different base classifiers.
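The weighted resampling described in this abstract can be illustrated with a minimal sketch. It uses the standard discrete AdaBoost weight update (the baseline the abstract compares against), not the proposed evidence-theoretic integration, which the abstract does not specify. The function names `adaboost_reweight` and `weighted_resample` are hypothetical and chosen only for illustration.

```python
import math
import random

def adaboost_reweight(weights, misclassified):
    """One discrete-AdaBoost update; `weights` is assumed to sum to 1.

    Returns the renormalized weights and the classifier's vote weight alpha.
    """
    # Weighted error of the classifier that was just trained.
    eps = sum(w for w, miss in zip(weights, misclassified) if miss)
    if not 0.0 < eps < 1.0:
        raise ValueError("weighted error must lie strictly between 0 and 1")
    alpha = 0.5 * math.log((1.0 - eps) / eps)
    # Up-weight misclassified samples, down-weight correct ones, renormalize.
    raw = [w * math.exp(alpha if miss else -alpha)
           for w, miss in zip(weights, misclassified)]
    total = sum(raw)
    return [w / total for w in raw], alpha

def weighted_resample(samples, weights, rng):
    """Draw len(samples) items with replacement, proportionally to weights."""
    return rng.choices(samples, weights=weights, k=len(samples))
```

After one update the misclassified samples together carry half of the total weight, so the next resampled training set concentrates on the region where the previous classifier failed.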

3.
Timely identification and treatment of medical conditions could facilitate faster recovery and better health. Existing systems address this issue using custom-built sensors, which are invasive and difficult to generalize. A low-complexity, scalable process is proposed to detect and identify medical conditions from 2D skeletal movements in video feed data. A minimal set of features relevant for distinguishing medical conditions (AMF, PVF, and GDF) is derived from skeletal data on frames sampled across the entire action. The AMF (angular motion features) capture the angular motion of limbs during a specific action. The PVF (positional variation features) represent the relative position of joints. The GDF (global displacement features) identify the direction of overall skeletal movement. The discriminative capability of these features is illustrated by their variance across time for different actions. The classification of medical conditions is approached in two stages. In the first stage, a low-complexity binary LSTM classifier is trained to distinguish visual medical conditions from general human actions. In the second stage, a multi-class LSTM classifier is trained to identify the exact medical condition from a given set of visually interpretable medical conditions. The proposed features are extracted from the 2D skeletal data of NTU RGB+D and then used to train the binary and multi-class LSTM classifiers. The binary and multi-class classifiers achieved average F1 scores of 77% and 73%, respectively, while the overall system produced an average F1 score of 69% and a weighted average F1 score of 80%. The multi-class classifier uses 10 to 100 times fewer parameters than existing 2D CNN-based models while producing similar levels of accuracy.

4.
We describe a multi-purpose image classifier that can be applied to a wide variety of image classification tasks without modification or fine-tuning, and yet provides classification accuracy comparable to state-of-the-art task-specific image classifiers. The proposed image classifier first extracts a large set of 1025 image features, including polynomial decompositions, high contrast features, pixel statistics, and textures. These features are computed on the raw image, transforms of the image, and transforms of transforms of the image. The feature values are then used to classify test images into a set of pre-defined image classes. This classifier was tested on several different problems, including biological image classification and face recognition. Although we cannot make a claim of universality, our experimental results show that this classifier performs as well as or better than classifiers developed specifically for these image classification tasks. Our classifier's high performance on a variety of classification problems is attributed to (i) a large set of features extracted from images; and (ii) an effective feature selection and weighting algorithm sensitive to specific image classification problems. The algorithms are available for free download from openmicroscopy.org.

5.
Ensembles that combine the decisions of classifiers generated by using perturbed versions of the training set where the classes of the training examples are randomly switched can produce a significant error reduction, provided that large numbers of units and high class switching rates are used. The classifiers generated by this procedure have statistically uncorrelated errors in the training set. Hence, the ensembles they form exhibit a similar dependence of the training error on ensemble size, independently of the classification problem. In particular, for binary classification problems, the classification performance of the ensemble on the training data can be analysed in terms of a Bernoulli process. Experiments on several UCI datasets demonstrate the improvements in classification accuracy that can be obtained using these class-switching ensembles.
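The Bernoulli-process view of binary majority voting can be sketched as follows. This is an illustration under the assumption that every unit is correct on a given training example independently and with the same probability `p`; it is not necessarily the exact analysis used in the paper, and the function name `majority_vote_accuracy` is hypothetical.

```python
from math import comb

def majority_vote_accuracy(p, n_units):
    """Probability that a strict majority of n_units independent voters,
    each correct with probability p, is correct (binomial tail sum).

    With an even n_units a tie counts as an error in this sketch.
    """
    need = n_units // 2 + 1  # smallest number of correct votes that wins
    return sum(comb(n_units, k) * p ** k * (1.0 - p) ** (n_units - k)
               for k in range(need, n_units + 1))

# Three units at p = 0.6: 3 * 0.6**2 * 0.4 + 0.6**3 = 0.648
```

For any `p` above 0.5 the value grows with the number of units, which is consistent with the abstract's remark that large ensembles are needed.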

6.
Fisher kernels combine the strengths of discriminative and generative classifiers by mapping variable-length sequences to a new fixed-length feature space, called the Fisher score space. The mapping is based on a single generative model and the classifier is intrinsically binary. We propose a multi-class classification strategy that applies a multi-class classification on each Fisher score space and combines the decisions of the multi-class classifiers. We experimentally show that the Fisher scores of one class provide discriminative information for the other classes as well. We compare several multi-class classification strategies for Fisher scores generated from the hidden Markov models of sign sequences. The proposed multi-class classification strategy increases the classification accuracy in comparison with state-of-the-art strategies based on combining binary classifiers. To reduce the computational complexity of the Fisher score extraction and the training phases, we also propose a score space selection method and show that similar or even higher accuracies can be obtained by using only a subset of the score spaces. Based on the proposed score space selection method, a signer adaptation technique is also presented that does not require any re-training.

7.
Various methods for ensemble selection and classifier combination have been designed to optimize the performance of ensembles of classifiers. However, the use of a large number of features in the training data can degrade the classification performance of machine learning algorithms. The objective of this paper is to present a novel feature elimination (FE) based ensemble learning method, which is an extension to an existing machine learning environment. Standard 12-lead ECG signal recordings are used to diagnose arrhythmia by classifying subjects as normal or abnormal. The advantage of the proposed approach is that it reduces the size of the feature space by using various feature elimination methods. The decisions obtained from these methods are coalesced to form a fused data set. The idea behind this work is thus to discover a reduced feature space such that a classifier built on this small data set performs no worse than a classifier built from the original data set. A random subspace based ensemble classifier is used with the PART tree as the base classifier. The proposed approach has been implemented and evaluated on the UCI ECG signal data. Classification performance is evaluated using measures such as mean absolute error, root mean squared error, relative absolute error, F-measure, classification accuracy, receiver operating characteristics, and area under the curve. The proposed approach achieves an overall classification accuracy of 91.11% on an unseen test data set, and it performs well with ensemble sizes of 15 and 20.

8.
Information Fusion, 2003, 4(2): 87-100
A popular method for creating an accurate classifier from a set of training data is to build several classifiers and then combine their predictions. Ensembles of simple Bayesian classifiers have traditionally not been a focus of research. One way to generate an ensemble of accurate and diverse simple Bayesian classifiers is to use different feature subsets generated with the random subspace method. In this case, the ensemble consists of multiple classifiers constructed by randomly selecting feature subsets, that is, classifiers constructed in randomly chosen subspaces. In this paper, we present an algorithm for building ensembles of simple Bayesian classifiers in random subspaces. The EFS_SBC algorithm includes a hill-climbing-based refinement cycle, which tries to improve the accuracy and diversity of the base classifiers built on random feature subsets. We conduct a number of experiments on a collection of 21 real-world and synthetic data sets, comparing the EFS_SBC ensembles with the single simple Bayes and with the boosted simple Bayes. In many cases the EFS_SBC ensembles have higher accuracy than both the single simple Bayesian classifier and the boosted Bayesian ensemble. We find that ensembles produced with a focus on diversity have lower generalization error, and that the importance of diversity in building the ensembles differs between data sets. We propose several methods for integrating simple Bayesian classifiers in the ensembles. In a number of cases the techniques for dynamic integration of classifiers have significantly better classification accuracy than their simple static analogues. We suggest that one reason is that dynamic integration makes better use of the ensemble coverage than static integration.

9.
Recent research in fault classification has shown the importance of accurately selecting the features to be used as inputs to the diagnostic model. In this work, a multi-objective genetic algorithm (MOGA) is considered for the feature selection phase. Then, two different techniques for using the selected features to develop the fault classification model are compared: a single classifier based on the feature subset with the best classification performance, and an ensemble of classifiers working on different feature subsets. The motivation for developing ensembles of classifiers is that they can achieve higher accuracies than single classifiers. An important condition for an ensemble to be effective is diversity in the predictions of the base classifiers that constitute it, i.e. their capability of erring on different sub-regions of the pattern space. In order to show the benefits of having diverse base classifiers in the ensemble, two different ensembles have been developed: in the first, the base classifiers are constructed on feature subsets found by MOGAs aimed at maximizing the fault classification performance and minimizing the number of features in the subsets; in the second, diversity among classifiers is added to the MOGA search as a third objective function to maximize. In both cases, a voting technique is used to combine the predictions of the base classifiers into the ensemble output. For verification, numerical experiments are conducted on a case of multiple-fault classification in rotating machinery, and the results achieved by the two ensembles are compared with those obtained by a single optimal classifier.

10.
Multiple classifier systems (MCS) are attracting increasing interest in the fields of pattern recognition and machine learning. Recently, MCS have also been introduced in the remote sensing field, where the importance of classifier diversity for image classification problems has not been examined. In this article, Satellite Pour l'Observation de la Terre (SPOT) IV panchromatic and multispectral satellite images are classified into six land cover classes using five base classifiers: a contextual classifier, a k-nearest neighbour classifier, a Mahalanobis classifier, a maximum likelihood classifier, and a minimum distance classifier. The five base classifiers are trained with the same feature sets throughout the experiments, and the a posteriori probability, derived from the confusion matrices of these base classifiers, is applied to five Bayesian decision rules (product rule, sum rule, maximum rule, minimum rule, and median rule) to construct different combinations of classifier ensembles. The performance of these classifier ensembles is evaluated in terms of overall accuracy and kappa statistics. Three statistical tests, McNemar's test, Cochran's Q test, and Looney's F-test, are used to examine the diversity of the classification results of the base classifiers compared to the results of the classifier ensembles. The experimental comparison reveals that (a) significant diversity amongst the base classifiers cannot enhance the performance of classifier ensembles; (b) accuracy improvement of classifier ensembles can only be found by using base classifiers with similar and low accuracy; (c) increasing the number of base classifiers cannot improve the overall accuracy of the MCS; and (d) none of the Bayesian decision rules outperforms the others.
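The five fixed combination rules named in this abstract can be sketched as follows. The sketch takes the per-classifier posterior probabilities as given inputs; how the paper derives them from confusion matrices is not reproduced here. The names `RULES` and `combine` are hypothetical.

```python
from math import prod
from statistics import median

# Each rule fuses the posteriors that all classifiers assign to one class.
RULES = {
    "product": prod,
    "sum": sum,
    "maximum": max,
    "minimum": min,
    "median": median,
}

def combine(posteriors, rule):
    """posteriors[i][c] is classifier i's posterior for class c.

    Returns the index of the class with the largest fused score.
    """
    fuse = RULES[rule]
    n_classes = len(posteriors[0])
    scores = [fuse([p[c] for p in posteriors]) for c in range(n_classes)]
    return max(range(n_classes), key=scores.__getitem__)
```

With three classifiers reporting `[0.6, 0.4]`, `[0.55, 0.45]`, and `[0.05, 0.95]`, the median rule picks class 0 while the other four rules pick class 1, which shows that the rules can disagree when one classifier is very confident.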

11.
In this work, a novel technique for building ensembles of classifiers for spectrogram classification is presented. We propose a simple approach for classifying signals from a large database of plant echoes. These echoes are highly complex stochastic signals, but their spectrograms contain enough information to extract a good set of features for training the proposed ensemble of classifiers. The proposed ensemble of classifiers is a modified version of a recent feature-transform-based ensemble method, the Input Decimated Ensemble. In the proposed variant, different subsets of randomly extracted training patterns are used to create a set of different Neighborhood Preserving Embedding subspace projections. These feature transformations are applied to the whole dataset, and a set of decision trees is trained using these transformed spaces. Finally, the scores of this set of classifiers are combined by the sum rule. Experiments carried out on a previously proposed dataset show the superiority of this method with respect to other approaches. The proposed approach outperforms the combination of principal component analysis and support vector machine (SVM) previously proposed for the tested dataset. Moreover, we show that the fusion of the proposed ensemble and the SVM-based system outperforms both stand-alone methods.

12.
In general, the analysis of microarray data requires two steps: feature selection and classification. From a variety of feature selection methods and classifiers, it is difficult to find optimal ensembles composed of any feature-classifier pairs. This paper proposes a novel method based on the evolutionary algorithm (EA) to form sophisticated ensembles of features and classifiers that can be used to obtain high classification performance. In spite of the exponential number of possible ensembles of individual feature-classifier pairs, an EA can produce the best ensemble in a reasonable amount of time. The chromosome is encoded with real values to decide the weight for each feature-classifier pair in an ensemble. Experimental results with two well-known microarray datasets in terms of time and classification rate indicate that the proposed method produces ensembles that are superior to individual classifiers, as well as other ensembles optimized by random and greedy strategies.

13.
Spectral features of images, such as Gabor filters and the wavelet transform, can be used for texture image classification. That is, a classifier is trained on labeled texture features to classify the unlabeled texture features of images into pre-defined classes. The aim of this paper is twofold. First, it investigates the classification performance of Gabor filters, the wavelet transform, and their combination as the texture feature representation of scenery images (such as mountain, castle, etc.). A k-nearest neighbor (k-NN) classifier and a support vector machine (SVM) are also compared. Second, three k-NN classifiers and three SVMs are each combined, with each of the three combined classifiers using one of the above three texture feature representations, to see whether combining multiple classifiers can outperform a single classifier in scenery image classification. The results show that a single SVM using Gabor filters provides higher classification accuracy than the other two spectral features and than the combined three k-NN classifiers and three SVMs.

14.
We present attribute bagging (AB), a technique for improving the accuracy and stability of classifier ensembles induced using random subsets of features. AB is a wrapper method that can be used with any learning algorithm. It establishes an appropriate attribute subset size and then randomly selects subsets of features, creating projections of the training set on which the ensemble classifiers are built. The induced classifiers are then used for voting. This article compares the performance of our AB method with bagging and other algorithms on a hand-pose recognition dataset. It is shown that AB gives consistently better results than bagging, both in accuracy and stability. The performance of ensemble voting in bagging and the AB method as a function of the attribute subset size and the number of voters for both weighted and unweighted voting is tested and discussed. We also demonstrate that ranking the attribute subsets by their classification accuracy and voting using only the best subsets further improves the resulting performance of the ensemble.
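The core loop of attribute bagging (random feature subsets, one classifier per projection, unweighted voting) can be sketched as follows. The subset size is passed in directly here; the search for an appropriate size and the ranking of subsets described in the abstract are omitted. The names `attribute_bagging`, `vote`, and the toy base learner `one_nn` are hypothetical and stand in for whatever learning algorithm is wrapped.

```python
import random
from collections import Counter

def one_nn(X, y):
    """Toy base learner (an assumption for this sketch): 1-nearest neighbour."""
    def predict(row):
        dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for x in X]
        return y[dists.index(min(dists))]
    return predict

def attribute_bagging(X, y, train_fn, subset_size, n_voters, rng):
    """Train n_voters models, each on a random projection of the features."""
    n_features = len(X[0])
    members = []
    for _ in range(n_voters):
        idx = sorted(rng.sample(range(n_features), subset_size))
        projected = [[row[j] for j in idx] for row in X]
        members.append((idx, train_fn(projected, y)))
    return members

def vote(members, row):
    """Unweighted majority vote over the ensemble members."""
    ballots = [model([row[j] for j in idx]) for idx, model in members]
    return Counter(ballots).most_common(1)[0][0]
```

Because `train_fn` is just a callable that returns a predictor, any learner can be substituted, which reflects the wrapper character of the method.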

15.
This paper proposes two psychophysiological-data-driven classification frameworks with stable generalization ability for assessing operator functional states (OFS) in safety-critical human-machine systems. Recursive feature elimination (RFE) and the least squares support vector machine (LSSVM) are combined and used for binary and multiclass feature selection. Besides typical binary LSSVM classifiers for two-class OFS assessment, two multiclass classifiers based on multiclass LSSVM-RFE and the decision directed acyclic graph (DDAG) scheme are developed, one for recognizing the high mental workload and fatigued states and the other for differentiating overloaded and baseline states from normal states. Feature selection results reveal that different dimensions of OFS can be characterized by specific sets of psychophysiological features. Performance comparison studies show that reasonably high and stable classification accuracy can be achieved with both classification frameworks if the RFE procedure is properly implemented and used.

16.
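The classification results of few-shot learning depend on the model's ability to represent sample features. To further mine the semantic information expressed by images, a few-shot learning method based on a multi-level metric network is proposed. The feature vector of the input image is fed into an embedding module for feature extraction. The feature descriptors obtained from the second and third convolutional layers are each used in an image-to-class metric to obtain image relation scores, while the feature vector obtained from the fourth convolutional layer is passed through a fully connected layer and used in an image-to-image metric to obtain an image membership probability. The two image relation scores and the one image membership probability are then fused by weighting, with the weights chosen through cross-validation, and the classification result is output. Experimental results show that on the miniImageNet dataset the method achieves a 5-way 1-shot accuracy of 56.77% and a 5-way 5-shot accuracy of 75.83%. On the CUB dataset, the 5-way 1-shot and 5-way 5-shot accuracies rise to 55.34% and 76.32%, respectively. On the Omniglot dataset, accuracy also improves to some extent over traditional methods. The method can therefore effectively mine the semantic information expressed in images and significantly improve the accuracy of few-shot image classification.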

17.
Chen Songfeng (陈松峰), Fan Ming (范明). Computer Science (《计算机科学》), 2010, 37(8): 236-239, 256
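A new method, PCABoost, is proposed for building an ensemble classifier from Bayesian base classifiers. When creating training samples, the method randomly partitions the feature set into K subsets, applies PCA to obtain the principal components of each subset, forms a new feature space from them, and maps all the training data into this new feature space to serve as a new training set. Different transformations generate different feature spaces and thus several diverse training sets. On each new training set, AdaBoost is used to build a group of progressively boosted Bayesian classifiers (that is, a classifier group), which yields several diverse classifier groups. Within each classifier group a prediction is produced by weighted voting, and the predictions of the groups are then combined by voting to produce the classification result of the ensemble, giving a two-level ensemble classifier. Experiments were carried out on 30 data sets randomly selected from the UCI standard data sets. The results show that the algorithm not only significantly improves the classification performance of Bayesian classifiers but also achieves higher classification accuracy than ensemble methods such as Rotation Forest and AdaBoost on most of the data sets.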

18.
This paper presents a novel application of advanced machine learning techniques to Mars terrain image classification. Fuzzy-rough feature selection (FRFS) is adapted and then employed in conjunction with Support Vector Machines (SVMs) to construct image classifiers. These techniques are integrated to address problems in space engineering where the images are large-scale, belong to many classes, and have diverse representational properties. The use of the adapted FRFS allows the induction of low-dimensionality feature sets from feature patterns of a much higher dimensionality. To evaluate the proposed work, image classifiers based on K-Nearest Neighbours (KNNs) and decision trees (DTREEs), as well as feature selection based on information gain rank (IGR), are also investigated as possible alternatives to the underlying machine learning techniques adopted. The results of systematic comparative studies demonstrate that, in general, feature selection improves the performance of classifiers intended for use in high-dimensional domains. In particular, the proposed approach helps to increase classification accuracy while enhancing classification efficiency by requiring considerably fewer features. This is evident in that the resultant SVM-based classifiers which utilise FRFS-selected features generally outperform KNN- and DTREE-based classifiers and those which use IGR-returned features. The work is therefore shown to be of great potential for on-board or ground-based image classification in future Mars rover missions.

19.
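Feature selection helps to increase the random diversity among the members of an ensemble classifier and thereby improves generalization accuracy. Two feature-selection-based algorithms for constructing ensemble classifiers, the Random Subspace method and the Rotation Forest method, are studied, and the relationship between the way each algorithm selects features and the degree of random diversity is analyzed and discussed. Noise is introduced into UCI data sets to compare the classification accuracy of the two methods in noisy environments. The experimental results show that as noise increases and feature correlation decreases, both the base learning algorithm and the noise level affect ensemble performance; once the noise reaches a certain level, the performance of the ensemble and that of a single classifier converge.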

20.
Image-to-class (I2C) distance is a novel measure for image classification and has successfully handled datasets with large intra-class variances. However, due to the lack of a training phase, the performance of this distance is easily affected by irrelevant local features that may hurt the classification accuracy. Besides, the success of the I2C distance relies heavily on a large number of local features in the training set, which makes classifying test images computationally expensive. On the other hand, a small number of local features in the training set may result in poor performance. In this paper, we propose a distance learning method to improve the classification accuracy of the I2C distance, as well as two strategies for accelerating its NN search. We first propose a large margin optimization framework to learn the I2C distance function, which is modeled as a weighted combination of the distances from every local feature in an image to its nearest neighbor (NN) in a candidate class. We learn the weights associated with the local features in the training set by constraining the optimization such that the I2C distance from an image to the class it belongs to is less than that to any other class. We evaluate the proposed method on several publicly available image datasets and show that the performance of the I2C distance for classification can be significantly improved by learning a weighted I2C distance function. To reduce the computational cost, we also propose two methods, based on spatial division and hubness score, to accelerate the NN search; they largely reduce the on-line testing time while preserving or even improving the classification accuracy.
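The weighted I2C distance described in this abstract can be sketched as a sum, over the local features of an image, of the weighted distance to each feature's nearest neighbour in the candidate class. The sketch assumes squared Euclidean distance and a brute-force NN search, and it takes the weights as given; the large margin learning of the weights and the accelerated search are not shown. The names `sq_dist`, `i2c_distance`, and `classify` are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance (an assumed choice of metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def i2c_distance(image_feats, class_feats, class_weights=None):
    """Sum of (weighted) distances from each local feature of the image
    to its nearest neighbour among the class's training features.

    class_weights[j] is the weight of training feature j; None means 1.0.
    """
    total = 0.0
    for f in image_feats:
        j = min(range(len(class_feats)),
                key=lambda k: sq_dist(f, class_feats[k]))
        w = 1.0 if class_weights is None else class_weights[j]
        total += w * sq_dist(f, class_feats[j])
    return total

def classify(image_feats, classes, weights=None):
    """Return the label of the class with the smallest I2C distance."""
    weights = weights or {}
    return min(classes, key=lambda c: i2c_distance(
        image_feats, classes[c], weights.get(c)))
```

With all weights equal to 1 this reduces to the unweighted I2C distance; changing the weights of a class's training features can change which class is nearest, which is the effect the learned weights exploit.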
