Similar Documents
20 similar documents found.
1.
The weighted KNN (k-nearest neighbor) method uses only the class information provided by the k nearest training samples and ignores the contribution of the test samples, which often leads to misclassifications. To address this shortcoming, a semi-supervised KNN classification method is proposed. The method performs well on both sequential and non-sequential samples. In the classification decision it also takes the contribution of the c nearest test samples into account, which improves classification accuracy. On the Cohn-Kanade face database, the recognition rate for image sequences improved by 5.95%; on the CMU-AMP face database, the recognition rate for non-sequential images improved by 7.98%. The experimental results show that the method is efficient and classifies well.
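The abstract does not specify the weighting scheme, so the following is only a minimal sketch of the stated idea: an inverse-distance-weighted vote over the k nearest training samples, augmented by a vote over the c nearest test samples that have already been assigned labels. The function names, the inverse-distance weights, and the self-training-style use of predicted test labels are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def weighted_knn_vote(x, X, y, k, eps=1e-9):
    """Inverse-distance-weighted vote of the k nearest samples in (X, y)."""
    d = np.linalg.norm(X - x, axis=1)
    idx = np.argsort(d)[:k]
    votes = {}
    for i in idx:
        votes[y[i]] = votes.get(y[i], 0.0) + 1.0 / (d[i] + eps)
    return votes

def semi_supervised_knn(x, X_train, y_train, X_test_labeled, y_test_pred, k=5, c=3):
    """Combine votes from the k nearest training samples with votes from
    the c nearest test samples that already carry (predicted) labels,
    a self-training-style assumption about the paper's mechanism."""
    votes = weighted_knn_vote(x, X_train, y_train, k)
    if len(X_test_labeled) >= c:
        for cls, w in weighted_knn_vote(x, X_test_labeled, y_test_pred, c).items():
            votes[cls] = votes.get(cls, 0.0) + w
    return max(votes, key=votes.get)
```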

2.
The k nearest neighbor (k-NN) classifier has been a widely used nonparametric technique in pattern recognition because of its simplicity and good performance. To decide the class of a new prototype, the k-NN classifier performs an exhaustive comparison between the prototype to classify and the prototypes in the training set T. However, when T is large, this exhaustive comparison is expensive. For this reason, many fast k-NN classifiers have been developed; some of them are based on a tree structure, created during a preprocessing phase from the prototypes in T. In a search phase, the tree is then traversed to find the nearest neighbor, and the speed-up comes from avoiding the exploration of parts of the tree through pruning rules that are usually based on the triangle inequality. However, in soft sciences such as medicine, geology, and sociology, the prototypes are usually described by both numerical and categorical attributes (mixed data), and sometimes the comparison function used to compute the similarity between prototypes does not satisfy the metric properties. Therefore, this work proposes an approximate fast k-most-similar-neighbor classifier, based on a tree structure (Tree k-MSN), for mixed data and similarity functions that do not satisfy the metric properties. Experiments with synthetic and real data are presented.
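The paper's tree construction and pruning rules cannot be reconstructed from the abstract, so the sketch below only illustrates the kind of non-metric comparison function involved: a HEOM-style dissimilarity for mixed numerical/categorical data (my choice of function, not necessarily the authors'), used here in the exhaustive k-most-similar-neighbor search that the Tree k-MSN structure is designed to avoid.

```python
import numpy as np

def heom_dissimilarity(a, b, numeric_idx, ranges):
    """HEOM-style dissimilarity for mixed data: range-normalized absolute
    difference on numeric attributes, 0/1 overlap on categorical ones.
    Variants of such functions (e.g. with missing-value rules) need not
    satisfy the triangle inequality."""
    total = 0.0
    for j, (u, v) in enumerate(zip(a, b)):
        if j in numeric_idx:
            total += (abs(u - v) / ranges[j]) ** 2
        else:
            total += 0.0 if u == v else 1.0
    return total ** 0.5

def k_msn(query, prototypes, k=3, **kw):
    """Exhaustive k most similar neighbors; the paper replaces this
    linear scan with a tree traversal plus pruning."""
    d = [heom_dissimilarity(query, p, **kw) for p in prototypes]
    return np.argsort(d)[:k]
```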

3.
An efficient method for face recognition that is robust under illumination variations is proposed. The method obtains illumination invariants from the illumination-reflectance model and employs local matching for classification. Different filters were tested for extracting the reflectance part of the image, which is illumination invariant, and the maximum filter is suggested as the best choice for this purpose. A set of adaptively weighted classifiers vote on different sub-images of each input image, and a decision is made based on their votes; image entropy and mutual information are used as weight factors. The method needs no prior information about face shape or illumination and can be applied to each image separately. Unlike most available methods, it does not need multiple training images to obtain the illumination invariants. Support vector machines and k-nearest neighbors are used as classifiers. Several experiments were performed on the Yale B, Extended Yale B and CMU-PIE databases. The recognition results show that the proposed method is suitable for efficient face recognition under illumination variations.
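As a rough illustration of the reflectance-extraction step, the sketch below estimates the illumination component with a local maximum filter and divides it out, following the multiplicative illumination-reflectance model I = L * R; the filter size and the epsilon guard are assumptions rather than the paper's settings.

```python
import numpy as np
from scipy.ndimage import maximum_filter

def reflectance_max_filter(image, size=9, eps=1e-6):
    """Estimate the slowly varying illumination L with a local maximum
    filter, then recover the (approximately illumination-invariant)
    reflectance R = I / L."""
    img = image.astype(np.float64)
    illumination = maximum_filter(img, size=size)
    return img / (illumination + eps)
```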

4.
Nearest neighbor (NN) classification assumes locally constant class-conditional probabilities and suffers from bias in high dimensions with a small sample set. In this paper, we propose a novel cam weighted distance to ameliorate the curse of dimensionality. Unlike existing neighborhood-based methods, which only analyze a small space emanating from the query sample, the proposed nearest neighbor classification using the cam weighted distance (CamNN) optimizes the distance measure based on an analysis of the inter-prototype relationship. Our motivation comes from the observation that the prototypes are not isolated: prototypes with different surroundings should have different effects in the classification. The proposed cam weighted distance is orientation and scale adaptive, exploiting the relevant inter-prototype information to achieve better classification performance. Experiments show that CamNN significantly outperforms one-nearest-neighbor classification (1-NN) and k-nearest-neighbor classification (k-NN) on most benchmarks, while its computational complexity is comparable with that of 1-NN classification.
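The cam weighted distance itself is defined in the paper and cannot be reproduced from this abstract. As a much simpler stand-in that shows the general mechanism of a prototype-dependent distance, the sketch below scales each prototype's distance by that prototype's distance to its nearest other prototype; the paper's cam weighting is additionally orientation adaptive, which is omitted here.

```python
import numpy as np

def prototype_scales(X):
    """Per-prototype scale: distance to the nearest other prototype.
    A crude stand-in for the cam distance's scale adaptivity, based only
    on the inter-prototype relationships mentioned in the abstract."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(D, np.inf)
    return D.min(axis=1)

def adaptive_nn(x, X, y, scales):
    """1-NN with prototype-dependent distances d(x, p_i) / s_i."""
    d = np.linalg.norm(X - x, axis=1) / scales
    return y[np.argmin(d)]
```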

5.
k-nearest neighbor (k-NN) classification is a well-known decision rule that is widely used in pattern classification. However, the traditional implementation of this method is computationally expensive. In this paper we develop two effective techniques, namely template condensing and preprocessing, to significantly speed up k-NN classification while maintaining the level of accuracy. Our template condensing technique aims at "sparsifying" dense homogeneous clusters of prototypes of any single class; it is implemented by iteratively eliminating patterns that exhibit high attractive capacities. Our preprocessing technique filters out a large portion of the prototypes that are unlikely to match the unknown pattern, which again accelerates the classification considerably, especially when the dimensionality of the feature space is high. One of our case studies shows that incorporating these two techniques into the k-NN rule achieves a seven-fold speed-up without sacrificing accuracy.
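The paper's "attractive capacity" criterion is not given in the abstract, so the sketch below shows the classical condensing idea that template condensing builds on: Hart-style condensed nearest neighbor, which keeps only the prototypes needed to classify the remaining training samples correctly.

```python
import numpy as np

def condense(X, y):
    """Hart-style condensing (a classical relative of the paper's
    template condensing): grow a store of prototypes, adding a sample
    whenever the current store misclassifies it with 1-NN."""
    S_idx = [0]
    changed = True
    while changed:
        changed = False
        for i in range(len(X)):
            if i in S_idx:
                continue
            d = np.linalg.norm(X[S_idx] - X[i], axis=1)
            if y[S_idx][np.argmin(d)] != y[i]:
                S_idx.append(i)
                changed = True
    return X[S_idx], y[S_idx]
```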

6.
The k-nearest neighbors classifier is one of the most widely used classification methods thanks to several attractive features, such as good generalization and easy implementation. Although simple, it is usually able to match, and even beat, more sophisticated and complex methods. However, no successful method has been reported so far for applying boosting to k-NN. As boosting methods have proved very effective in improving the generalization capabilities of many classification algorithms, an appropriate application of boosting to k-nearest neighbors is of great interest.

Ensemble methods rely on the instability of the classifiers to improve their performance; as k-NN is fairly stable with respect to resampling, such methods fail to improve the k-NN classifier. On the other hand, k-NN is very sensitive to input selection, so ensembles based on subspace methods are able to improve the performance of single k-NN classifiers. In this paper we exploit this sensitivity of k-NN to the input space to develop two methods for boosting k-NN. Both approaches modify the view of the data that each classifier receives so that the accurate classification of difficult instances is favored.

The two approaches are compared with the classifier alone and with bagging and random subspace methods, showing a marked and significant improvement in generalization error. The comparison is performed on a large test set of 45 problems from the UCI Machine Learning Repository. A further study on noise tolerance shows that the proposed methods are less affected by class-label noise than the standard methods.
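The paper's two boosting variants are not specified in the abstract, but the random subspace baseline they are compared against can be written in a few lines with scikit-learn (assuming a recent version, >= 1.2, where BaggingClassifier takes an `estimator` argument): each ensemble member sees a random subset of the features, exploiting exactly the input sensitivity of k-NN described above.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

# Random-subspace ensemble of k-NN: each member sees a random subset of
# the features (no resampling of instances, to which k-NN is insensitive).
X, y = load_iris(return_X_y=True)
ens = BaggingClassifier(
    estimator=KNeighborsClassifier(n_neighbors=3),
    n_estimators=25,
    max_features=0.5,   # each k-NN sees half of the input features
    bootstrap=False,    # keep all instances; vary only the feature view
    random_state=0,
)
print(cross_val_score(ens, X, y, cv=5).mean())
```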

7.
Data weighting is of paramount importance to classification performance in pattern recognition applications. In this paper, the output labels of datasets are encoded as binary codes (numbers), which provides a novel data weighting method called binary encoded output based data weighting (BEOBDW). In the proposed method, the output labels of the datasets are first encoded with binary codes, yielding two encoded output labels; the data points are then weighted using the relationships between the dataset features and these two encoded output labels. To assess the generality of the method, five datasets were used: chain link (2 classes), two spiral (2 classes), iris (3 classes), wine (3 classes), and dermatology (6 classes). After applying BEOBDW to the five datasets, the k-NN (k-nearest neighbor) classifier was used to classify the weighted datasets. A set of experiments on real-world datasets demonstrated that the proposed data weighting method is very efficient and has robust discrimination ability in the classification of datasets. BEOBDW could confidently be used before many classification algorithms.

8.
Though the k-nearest neighbor (k-NN) pattern classifier is an effective learning algorithm, it can result in large model sizes. To compensate, a number of variant algorithms have been developed that condense the model size of the k-NN classifier at the expense of accuracy. To increase the accuracy of these condensed models, we present a direct boosting algorithm for the k-NN classifier that creates an ensemble of models with locally modified distance weighting. An empirical study on 10 standard databases from the UCI repository shows that this new boosted k-NN algorithm increases generalization accuracy on the majority of the datasets and never performs worse than standard k-NN.

9.
The k-nearest neighbour estimation method is one of the main tools used in multi-source forest inventories. It is a powerful non-parametric method whose estimates are easy to compute and relatively accurate. One downside is that it lacks an uncertainty measure for predicted values and for areas of arbitrary size. We present a method to estimate the prediction uncertainty based on a variogram model, deriving the necessary formulae for the k-NN method. An application to multi-source forest inventory data is illustrated, and the results are compared at pixel level with the conventional RMSE method. We find that the variogram-model-based method, which is analytic, is competitive with the RMSE method.
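The paper's uncertainty formula is not reproduced in the abstract; the sketch below only shows the classical empirical semivariogram from which a variogram model would be fitted, for a scalar forest attribute z observed at 2-D locations. This is the standard geostatistical estimator, not the paper's full k-NN uncertainty method.

```python
import numpy as np

def empirical_semivariogram(coords, z, bins):
    """Classical estimator: gamma(h) = 1/(2|N(h)|) * sum (z_i - z_j)^2
    over pairs whose separation distance falls in the lag bin around h."""
    n = len(z)
    i, j = np.triu_indices(n, k=1)           # all unordered pairs
    h = np.linalg.norm(coords[i] - coords[j], axis=1)
    sq = 0.5 * (z[i] - z[j]) ** 2
    which = np.digitize(h, bins)
    return [sq[which == b].mean() if np.any(which == b) else np.nan
            for b in range(1, len(bins))]
```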

10.
A modified k-nearest neighbour (k-NN) classifier is proposed for supervised remote-sensing classification of hyperspectral data. To compare its performance in terms of classification accuracy and computational cost, the standard k-NN and a back-propagation neural network classifier were used. A classification accuracy of 91.2% was achieved by the proposed classifier on the data set used. The results suggest that the accuracy achieved with this classifier is significantly better than that of k-NN and comparable to that of a back-propagation neural network, and the comparison of computational cost also supports the effectiveness of the modified k-NN classifier for hyperspectral data classification. A fuzzy-entropy-based filter approach was used for feature selection to compare the performance of the modified and standard k-NN classifiers on a reduced data set; with the selected features, the modified k-NN classifier again achieved a significant increase in classification accuracy over the standard k-NN classifier.

11.
The k-nearest neighbors classifier is a widely used classification method that has proven very effective in supervised learning tasks. In this paper, a fuzzy rough set method for prototype selection, focused on optimizing the behavior of this classifier, is presented. Hybridization with an evolutionary feature selection method is considered to further improve its performance, yielding a competent data reduction algorithm for the 1-nearest-neighbor classifier. The hybridization is performed in the training phase by using the solution of each preprocessing technique as the starting condition of the other, within a cycle. The results of the experimental study, contrasted through nonparametric statistical tests, show that the new hybrid approach obtains very promising results with respect to classification accuracy and the reduction of the training set size.

12.
Although many more complex learning algorithms exist, k-nearest neighbor is still one of the most successful classifiers in real-world applications. One of the ways of scaling up the k-nearest neighbors classifier to deal with large datasets is instance selection. Due to the constantly growing amount of data in almost any pattern recognition task, we need more efficient instance selection algorithms, which must achieve larger reductions while maintaining the accuracy of the selected subset.

13.
We propose a case-based reasoning (CBR) model that uses preference-theory functions for similarity measurement between cases. As it is hard to select the right preference function for every feature and set its parameters appropriately, a genetic algorithm is used to choose the preference functions, or more precisely, to set the parameters of each preference function, as well as the attribute weights. The proposed model is compared to the well-known k-nearest neighbour (k-NN) model based on the Euclidean distance measure. It has been evaluated on three different benchmark datasets, with accuracy measured by a 10-fold cross-validation test. The experimental results show that the proposed approach can, in some cases, outperform the traditional k-NN classifier.
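As a concrete example of the preference-theory functions such a model can use, the sketch below implements the PROMETHEE-style linear preference function with an indifference threshold q and a preference threshold p, and a weighted case similarity built from it. Aggregating per-feature preferences as one minus preference is my assumption; the paper's genetic algorithm would tune q, p, and the weights.

```python
def linear_preference(d, q, p):
    """PROMETHEE V-shape-with-indifference function: no preference for
    feature differences below q, full preference above p, linear between."""
    d = abs(d)
    if d <= q:
        return 0.0
    if d >= p:
        return 1.0
    return (d - q) / (p - q)

def case_similarity(a, b, params, weights):
    """Weighted similarity between cases a and b; params holds one (q, p)
    pair per feature, weights the per-feature attribute weights."""
    total = sum(w * (1.0 - linear_preference(x - y, q, p))
                for x, y, (q, p), w in zip(a, b, params, weights))
    return total / sum(weights)
```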

14.
A novel classification method based on multiple-point statistics (MPS) is proposed in this article. The method is a modified version of the spatially weighted k-nearest neighbour (k-NN) classifier, which accounts for spatial correlation through weights applied to neighbouring pixels. The MPS characterizes the spatial correlation between multiple points of land-cover classes by learning local patterns from a training image. This rich spatial information is then converted into multiple-point probabilities and incorporated into the k-NN classifier. Experiments were conducted in two study areas: the proposed method was tested on a WorldView-2 sub-scene of the Sichuan mountainous area and an IKONOS image of the Beijing urban area. The multiple-point weighted k-NN method (MPk-NN) was compared to several alternatives, including the traditional k-NN and two previously published spatially weighted k-NN schemes: the inverse-distance-weighted k-NN and the geostatistically weighted k-NN. Bayesian and support vector machine (SVM) classifiers, as well as versions of these classifiers weighted with spatial context using the Markov random field (MRF) model, were also included to provide a benchmark for the MPk-NN method. The proposed approach increased classification accuracy significantly relative to the alternatives and is thus recommended for identifying land-cover types with complex and diverse spatial distributions.

15.
16.
The use of machine learning tools in biological data analysis is increasing gradually, mainly because the effectiveness of classification and recognition systems has improved a great deal in helping medical experts with diagnosis. In this paper, we investigate the performance of an artificial immune system based k-nearest neighbors algorithm, with and without cross-validation, on a class of imbalanced problems from the bioinformatics field. We use an unsupervised artificial immune system algorithm to reduce the training data and the k-nearest neighbors algorithm for classification. The conducted experiments showed the effectiveness of the proposed scheme. By selecting the E. coli database, we could compare our classification accuracy with other methods presented in the literature. The proposed hybrid system produced much more accurate results than Horton and Nakai's proposal [P. Horton, K. Nakai, A probabilistic classification system for predicting the cellular localization sites of proteins, in: Proceedings of the 4th International Conference on Intelligent Systems for Molecular Biology, AAAI Press, St. Louis, 1996, pp. 109–115; P. Horton, K. Nakai, Better prediction of protein cellular localization sites with the k-nearest neighbors classifier, in: Proceedings of Intelligent Systems in Molecular Biology, Halkidiki, Greece, 1997, pp. 368–383]. Besides the accuracy improvement, an important aspect of the proposed methodology is its complexity: as the artificial immune system provides data reduction, the training complexity of the proposed system is considerably lower than that of the k-nearest neighbors classifier.

17.
Instance-based learning (IBL), also called memory-based reasoning (MBR), is a commonly used non-parametric learning algorithm, and k-nearest neighbor (k-NN) learning is its most popular realization. Due to its usability and adaptability, k-NN has been successfully applied to a wide range of applications. In practice, however, two important model parameters can only be set empirically: the number of neighbors (k) and the weights given to those neighbors. In this paper, we propose structured ways to set these parameters based on locally linear reconstruction (LLR). We then employ sequential minimal optimization (SMO) to solve the quadratic programming step involved in LLR for classification, reducing the computational complexity. Experimental results from 11 classification and eight regression tasks were promising enough to merit further investigation: not only did LLR outperform the conventional weight allocation methods without much additional computational cost, but LLR was also found to be robust to the choice of k.
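The core of LLR is solving for weights that best reconstruct the query from its k nearest neighbors. The sketch below solves the equality-constrained (weights sum to one) least-squares version in closed form, as in locally linear embedding; the paper's classification variant additionally imposes constraints that lead to the quadratic program solved with SMO, which is omitted here.

```python
import numpy as np

def llr_weights(x, neighbors, reg=1e-3):
    """Solve min_w ||x - sum_i w_i n_i||^2  s.t.  sum_i w_i = 1.
    Closed form: solve G w = 1 with the local Gram matrix G, then
    normalize. A small ridge term keeps G well conditioned."""
    Z = neighbors - x                      # shift so the query is the origin
    G = Z @ Z.T                            # local Gram matrix, k x k
    G += reg * np.trace(G) / len(G) * np.eye(len(G))
    w = np.linalg.solve(G, np.ones(len(G)))
    return w / w.sum()

def llr_knn_predict(x, X, y, k=5):
    """Classify by summing reconstruction weights per class."""
    idx = np.argsort(np.linalg.norm(X - x, axis=1))[:k]
    w = llr_weights(x, X[idx])
    classes = np.unique(y)
    scores = [(w * (y[idx] == c)).sum() for c in classes]
    return classes[int(np.argmax(scores))]
```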

18.
An investigation is conducted of two well-known similarity-based learning approaches to text categorization: the k-nearest neighbors (kNN) classifier and the Rocchio classifier. After identifying the weaknesses and strengths of each technique, a new classifier called the kNN model-based classifier (kNN Model) is proposed, combining the strengths of both kNN and Rocchio. A text categorization prototype that implements kNN Model along with kNN and Rocchio is described. An experimental evaluation of the different methods is carried out on two common document corpora: the 20-newsgroups collection and the ModApte version of the Reuters-21578 collection of news stories. The experimental results show that the proposed kNN model-based method outperforms the kNN and Rocchio classifiers and is therefore a good alternative to them in some application areas. This work was partly supported by the European Commission project ICONS, project no. IST-2001-32429.
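For reference, the Rocchio side of the comparison reduces to a nearest-centroid rule over document vectors (typically TF-IDF). A minimal sketch, assuming cosine similarity and one centroid per class, without the kNN Model's local generalization of the training data:

```python
import numpy as np

def rocchio_fit(X, y):
    """One centroid per class: the mean of that class's document vectors."""
    classes = np.unique(y)
    C = np.array([X[y == c].mean(axis=0) for c in classes])
    return classes, C

def rocchio_predict(x, classes, C, eps=1e-12):
    """Assign the class whose centroid has the highest cosine similarity."""
    sims = (C @ x) / (np.linalg.norm(C, axis=1) * np.linalg.norm(x) + eps)
    return classes[int(np.argmax(sims))]
```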

19.
This study presents the application of fuzzy c-means (FCM) clustering-based feature weighting (FCMFW) to the detection of Parkinson's disease (PD). For the classification of the PD dataset taken from the University of California, Irvine machine learning database, the values of existing traditional and non-standard measures for distinguishing healthy people from people with PD by detecting dysphonia were applied to the input of FCMFW. The main aims of the FCM clustering algorithm are to transform a linearly non-separable dataset into a linearly separable one and to increase the separability between classes. The weighted PD dataset is then presented to a k-nearest neighbour (k-NN) classifier. Various values of k in the k-NN classifier were used and compared with each other, the effect of k on the classification of the Parkinson's disease dataset was investigated, and the best value of k was found. The experimental results demonstrate that the combination of the proposed FCMFW weighting method and the k-NN classifier obtains very promising results on the classification of PD.
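FCM-based feature weighting comes in several variants, and the exact FCMFW rule is not given in the abstract. The sketch below is a compact fuzzy c-means in NumPy plus one plausible reading of center-based weighting, scaling each feature by the ratio of its grand mean to the mean of the cluster centers on that feature; treat the weighting rule as an assumption rather than the paper's exact method.

```python
import numpy as np

def fcm(X, c=2, m=2.0, iters=100, seed=0):
    """Basic fuzzy c-means: alternate membership and center updates."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(iters):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        U = 1.0 / (d ** (2 / (m - 1)))      # u_ik proportional to d^(-2/(m-1))
        U /= U.sum(axis=1, keepdims=True)
    return centers, U

def fcm_feature_weights(X, c=2):
    """Assumed weighting: ratio of each feature's grand mean to the mean
    of the FCM cluster centers on that feature, which compresses the
    within-class spread before k-NN classification."""
    centers, _ = fcm(X, c=c)
    return X.mean(axis=0) / (centers.mean(axis=0) + 1e-12)
```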

20.