Similar Documents
20 similar documents retrieved.
1.
《Applied Soft Computing》2008,8(2):985-995
The fuzzy min–max (FMM) network is a supervised neural network classifier that forms hyperboxes for classification and prediction. In this paper, we propose modifications to FMM in an attempt to improve its classification performance when a small number of large hyperboxes are formed in the network. Given a new input pattern, in addition to measuring the fuzzy membership of the input pattern to the hyperboxes formed in FMM, a Euclidean distance measure is introduced for predicting the target class associated with the new input pattern. A rule extraction algorithm is also embedded into the modified FMM network. A confidence factor is calculated for each FMM hyperbox, and a user-defined threshold is used to prune the hyperboxes with low confidence factors. Fuzzy if-then rules are then extracted from the pruned network. The benefits of the proposed modifications are twofold: to improve the performance of FMM when large hyperboxes are formed in the network, and to facilitate the extraction of a compact rule set from FMM to justify its predictions. To assess the effectiveness of the modified FMM, experiments on two benchmark pattern classification problems are conducted, and the results are compared with those of different methods published in the literature. In addition, a fault detection and classification problem with a set of real sensor measurements collected from a power generation plant is evaluated using the modified FMM. The results obtained are analyzed and explained, and the implications of the modified FMM network as a useful fault detection and classification tool in real environments are discussed.
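A minimal sketch of the idea described in this abstract, assuming the standard Simpson-style hyperbox membership function; the Euclidean fall-back shown here (distance to hyperbox centroids when several hyperboxes give near-full membership) and the names `hyperbox_membership`, `predict` and `full_threshold` are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np

def hyperbox_membership(x, v, w, gamma=1.0):
    """Simpson-style fuzzy membership of pattern x in a hyperbox with
    min point v and max point w (all 1-D arrays of equal length)."""
    below = np.maximum(0, 1 - np.maximum(0, gamma * np.minimum(1, v - x)))
    above = np.maximum(0, 1 - np.maximum(0, gamma * np.minimum(1, x - w)))
    return np.mean((below + above) / 2.0)

def predict(x, boxes, full_threshold=0.99):
    """boxes: list of (v, w, label). If several hyperboxes give (near) full
    membership, fall back on the Euclidean distance from x to each
    candidate's centroid, as an illustration of the modification above."""
    memberships = np.array([hyperbox_membership(x, v, w) for v, w, _ in boxes])
    best = np.max(memberships)
    candidates = [i for i, m in enumerate(memberships)
                  if m >= min(best, full_threshold)]
    if len(candidates) > 1:
        dists = [np.linalg.norm(x - (boxes[i][0] + boxes[i][1]) / 2.0)
                 for i in candidates]
        winner = candidates[int(np.argmin(dists))]
    else:
        winner = candidates[0]
    return boxes[winner][2]
```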

2.
In this paper, we propose a novel supervised dimension reduction algorithm based on the K-nearest neighbor (KNN) classifier. The proposed algorithm reduces the dimension of the data in order to improve the accuracy of KNN classification. This heuristic algorithm proposes independent dimensions that decrease the Euclidean distance between a sample and its K nearest within-class neighbors and increase the Euclidean distance between that sample and its M nearest between-class neighbors. It is a linear dimension reduction algorithm that produces a mapping matrix for projecting data into a low-dimensional space. The dimension reduction step is followed by a KNN classifier, so the method is applicable to high-dimensional multiclass classification. Experiments with artificial data such as Helix and Twin-peaks show the ability of the algorithm for data visualization. The algorithm is compared with state-of-the-art algorithms on eight multiclass data sets from the UCI collection, and simulation results show that it outperforms the existing algorithms. Visual place classification is an important problem for intelligent mobile robots, which not only involves high-dimensional data but also requires solving a multiclass classification problem. A proper dimension reduction method is usually needed to decrease the computation and memory complexity of algorithms in large environments, so our method is well suited to this problem. We extract color histograms of omnidirectional camera images as primary features, reduce the features to a low-dimensional space and apply a KNN classifier. Results of experiments on five real data sets show the superiority of the proposed algorithm over the others.
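The abstract describes the mapping only at a high level, so the following is a hedged sketch of one way to obtain a linear projection that pulls samples toward their within-class neighbors and pushes them away from between-class neighbors, solved here as a generalized eigenproblem over neighbor scatter matrices; the paper's own heuristic may construct the mapping matrix differently, and all names below are illustrative.

```python
import numpy as np
from scipy.linalg import eigh

def knn_supervised_projection(X, y, k=5, m=5, dim=2):
    """Build within-class (k neighbors) and between-class (m neighbors)
    scatter matrices, then pick directions maximizing between-class spread
    relative to within-class spread. Project new data with X @ A."""
    n, d = X.shape
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for i in range(n):
        dist = np.linalg.norm(X - X[i], axis=1)
        same = np.where(y == y[i])[0]
        same = same[same != i]
        diff = np.where(y != y[i])[0]
        for j in same[np.argsort(dist[same])][:k]:
            v = (X[i] - X[j])[:, None]
            Sw += v @ v.T
        for j in diff[np.argsort(dist[diff])][:m]:
            v = (X[i] - X[j])[:, None]
            Sb += v @ v.T
    vals, vecs = eigh(Sb, Sw + 1e-6 * np.eye(d))  # generalized eigenproblem
    A = vecs[:, np.argsort(vals)[::-1][:dim]]
    return A
```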

3.
This investigation proposes a fuzzy min-max hyperbox classifier to solve M-class classification problems. In the proposed classifier, a supervised learning method is implemented to generate min-max hyperboxes for the training patterns in each class, so that the generated fuzzy min-max hyperbox classifier achieves a perfect classification rate on the training set. However, 100% correct classification of the training set generally leads to overfitting. To address this drawback, a procedure is employed to decrease the complexity of the input decision boundaries so that the generated fuzzy hyperbox classifier has good generalization performance. Finally, two benchmark data sets are considered to demonstrate the good performance of the proposed approach for solving this classification problem.

4.
The graph is a powerful representation formalism that has been widely employed in machine learning and data mining. In this paper, we present a graph-based classification method built on the construction of a special graph, referred to as the K-associated graph, which is capable of representing similarity relationships among data cases and the degree of overlap between classes. The main properties of K-associated graphs as well as the classification algorithm are described. Experimental evaluation indicates that the proposed technique captures the topological structure of the training data and leads to good classification results, particularly for noisy data. In comparison with other well-known classification techniques, the proposed approach shows the following interesting features: (1) a new measure, called purity, is introduced not only to characterize the degree of overlap among classes in the input data set, but also to construct the optimal K-associated graph for classification; (2) nonlinear classification with automatic local adaptation to the input data: in contrast to the K-nearest neighbor classifier, which uses a fixed K, the proposed algorithm automatically considers different values of K in order to best fit the corresponding overlap of classes in different data subspaces, revealing both the local and global structure of the input data; (3) the proposed classification algorithm is nonparametric, implying high efficiency and no need for model selection in practical applications.

5.
Traditional fast k-nearest neighbor search algorithms based on pyramid structures require either a large amount of extra memory or long search times. This paper proposes a fast k-nearest neighbor search algorithm based on the wavelet transform, which exploits the important information hidden in the transform coefficients to reduce the computational complexity. The study indicates that the Haar wavelet transform yields two kinds of important pyramids. Two elimination criteria derived from the transform coefficients are used to reject impossible candidates. Experimental results on texture classification verify the effectiveness of the proposed algorithm.
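The paper's two elimination criteria are derived from its pyramid structures; the sketch below only illustrates the underlying lower-bound idea, namely that because the orthonormal Haar transform preserves Euclidean distance, the distance between low-pass coefficient vectors is a lower bound on the full distance, so candidates whose bound already exceeds the current k-th best distance can be rejected cheaply. Function names are assumptions.

```python
import numpy as np

def haar_level1(x):
    """One level of the orthonormal Haar transform (len(x) must be even)."""
    x = x.reshape(-1, 2)
    avg = (x[:, 0] + x[:, 1]) / np.sqrt(2.0)
    diff = (x[:, 0] - x[:, 1]) / np.sqrt(2.0)
    return avg, diff

def knn_with_rejection(query, candidates, k=1):
    """k-NN search with a coefficient-based rejection test: the low-pass
    coefficient distance lower-bounds the full Euclidean distance, so a
    candidate whose bound is already worse than the current k-th best
    can be skipped without computing the full distance."""
    q_avg, _ = haar_level1(query)
    best = []  # sorted list of (distance, index), at most k entries
    for i, c in enumerate(candidates):
        c_avg, _ = haar_level1(c)
        lower = np.linalg.norm(q_avg - c_avg)
        if len(best) == k and lower >= best[-1][0]:
            continue  # rejected by the elimination criterion
        d = np.linalg.norm(query - c)
        best = sorted(best + [(d, i)])[:k]
    return [i for _, i in best]
```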

6.
The use of machine learning tools in biological data analysis is increasing gradually, mainly because the effectiveness of classification and recognition systems has improved a great deal in helping medical experts with diagnosis. In this paper, we investigate the performance of an artificial immune system based k-nearest neighbors algorithm, with and without cross-validation, on a class of imbalanced problems from the bioinformatics field. We use an unsupervised artificial immune system algorithm to reduce the dimension of the training data and the k-nearest neighbors algorithm for classification. The conducted experiments show the effectiveness of the proposed scheme. By selecting the E. coli database, we could compare our classification accuracy with other methods presented in the literature. The proposed hybrid system produced much more accurate results than Horton and Nakai's proposal [P. Horton, K. Nakai, A probabilistic classification system for predicting the cellular localization sites of proteins, in: Proceedings of the 4th International Conference on Intelligent Systems for Molecular Biology, AAAI Press, St. Louis, 1996, pp. 109–115; P. Horton, K. Nakai, Better prediction of protein cellular localization sites with the k-nearest neighbors classifier, in: Proceedings of Intelligent Systems in Molecular Biology, Halkidiki, Greece, 1997, pp. 368–383]. Besides the accuracy improvement, another important aspect of the proposed methodology is its complexity: since the artificial immune system provides data reduction, the training complexity of the proposed system is considerably lower than that of the k-nearest neighbors classifier.

7.
Feature selection is a useful pre-processing technique for solving classification problems. The challenge of the feature selection problem lies in applying evolutionary algorithms capable of handling the huge number of features typically involved. In general, the given classification data may contain useless, redundant or misleading features. To increase classification accuracy, the primary objective is to remove irrelevant features from the feature space and to correctly identify the relevant ones. Binary particle swarm optimization (BPSO) has been applied successfully to feature selection problems. In this paper, two kinds of chaotic maps, namely logistic maps and tent maps, are embedded in BPSO and used to determine its inertia weight. We propose chaotic binary particle swarm optimization (CBPSO) to implement feature selection, in which the K-nearest neighbor (K-NN) method with leave-one-out cross-validation (LOOCV) serves as the classifier for evaluating classification accuracy. The proposed feature selection method shows promising results with respect to the number of selected feature subsets, and its classification accuracy is superior to that of other methods from the literature.
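As an illustration of the chaotic inertia weight described above, here is a hedged sketch of one CBPSO update step using the logistic map; the exact scaling of the inertia weight, the acceleration constants and the function names are assumptions. In the paper, each particle's fitness would be the LOOCV accuracy of a K-NN classifier on the features its binary position selects.

```python
import numpy as np

rng = np.random.default_rng(0)

def logistic_map(z):
    """Logistic chaotic map, used here to generate the inertia weight."""
    return 4.0 * z * (1.0 - z)

def cbpso_step(pos, vel, pbest, gbest, z, c1=2.0, c2=2.0):
    """One binary PSO update: the inertia weight is the current logistic-map
    value instead of a fixed constant, and the velocity is squashed by a
    sigmoid to decide each bit of the feature mask (1 = feature selected)."""
    w = logistic_map(z)                        # chaotic inertia weight
    r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    prob = 1.0 / (1.0 + np.exp(-vel))          # sigmoid transfer function
    pos = (rng.random(pos.shape) < prob).astype(int)
    return pos, vel, w
```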

8.
Prototype classifiers are a type of pattern classifier in which a number of prototypes are designed for each class so that they act as representatives of the patterns of that class. Prototype classifiers are considered among the simplest and best-performing approaches to classification problems. However, they need careful positioning of the prototypes to capture the distribution of each class region and/or to define the class boundaries. Standard methods, such as learning vector quantization (LVQ), are sensitive to the initial choice of the number and locations of the prototypes and of the learning rate. In this article, a new prototype classification method is proposed, namely self-generating prototypes (SGP). The main advantage of this method is that both the number of prototypes and their locations are learned from the training set without much human intervention. The proposed method is compared with other prototype classifiers such as LVQ, the self-generating neural tree (SGNT) and K-nearest neighbor (K-NN), as well as with Gaussian mixture model (GMM) classifiers. In our experiments, SGP achieved the best performance on many measures, such as training speed and test (classification) speed. Concerning the number of prototypes and test classification accuracy, it was considerably better than the other methods and about equal, on average, to the GMM classifiers. We also applied the SGP method to the well-known STATLOG benchmark, where it beat all 21 other methods (prototype and non-prototype) in classification accuracy.

9.
In this paper, a hybrid online learning model that combines the fuzzy min–max (FMM) neural network and the Classification and Regression Tree (CART) for motor fault detection and diagnosis tasks is described. The hybrid model, known as FMM-CART, incorporates the advantages of both FMM and CART for undertaking data classification (with FMM) and rule extraction (with CART). In particular, the CART model is enhanced with an importance-predictor-based feature selection measure. To evaluate the effectiveness of the proposed online FMM-CART model, a series of experiments using publicly available data sets containing motor bearing faults is first conducted. The results (primarily prediction accuracy and model complexity) are analyzed and compared with those reported in the literature. Then, an experimental study on detecting an imbalanced voltage supply of an induction motor using a laboratory-scale test rig is performed. In addition to producing accurate results, a set of rules in the form of a decision tree is extracted from FMM-CART to provide explanations for its predictions. The results positively demonstrate the usefulness of FMM-CART with online learning capabilities in tackling real-world motor fault detection and diagnosis tasks.

10.
Penalized likelihood is a general approach in which an objective function is defined, consisting of the log-likelihood of the data minus a term penalizing non-smooth solutions. This objective function is then maximized, yielding a solution that achieves a trade-off between the faithfulness and the smoothness of the fit. Most work on this topic has focused on the regression problem, and there has been little work on the classification problem. In this paper we propose a new classification method using the concept of penalized likelihood (for the two-class case). By proposing a novel penalty term based on the K nearest neighbors, simple analytical derivations lead to an algorithm that is proved to converge to the global optimum. Moreover, this algorithm is very simple to implement and typically converges in two or three iterations. We also introduce two variants of the method, one that distance-weights the K-nearest-neighbor contributions and one that tackles the imbalanced-class situation. We performed extensive experiments comparing the proposed method with several well-known classification methods. These experiments reveal that the proposed method achieves one of the top ranks in classification performance, with a fairly small computation time.
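A hedged sketch of the generic form of such an objective for the two-class case; the paper's actual K-nearest-neighbor penalty is not reproduced here, and the notation ($p_i$ for the estimated class-1 probability of sample $i$, $N_K(i)$ for its K-neighborhood, $\lambda$ for the trade-off weight) is assumed for illustration.

```latex
% Generic penalized-likelihood objective: Bernoulli log-likelihood minus a
% smoothness penalty that discourages neighboring samples from receiving
% very different probability estimates.
\[
  J(\mathbf{p}) \;=\; \sum_{i=1}^{n}\Big[y_i \log p_i + (1-y_i)\log(1-p_i)\Big]
  \;-\; \lambda \sum_{i=1}^{n}\;\sum_{j \in N_K(i)} \big(p_i - p_j\big)^2 .
\]
```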

11.
Instance-based learning (IBL), also called memory-based reasoning (MBR), is a commonly used non-parametric learning approach, and k-nearest neighbor (k-NN) learning is its most popular realization. Due to its usability and adaptability, k-NN has been successfully applied to a wide range of applications. In practice, however, important model parameters are usually set only empirically: the number of neighbors (k) and the weights given to those neighbors. In this paper, we propose structured ways to set these parameters based on locally linear reconstruction (LLR). We then employ sequential minimal optimization (SMO) to solve the quadratic programming step involved in LLR for classification, in order to reduce the computational complexity. Experimental results from 11 classification and eight regression tasks were promising enough to merit further investigation: not only did LLR outperform conventional weight allocation methods without much additional computational cost, but it was also found to be robust to changes in k.
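A minimal sketch of the locally linear reconstruction idea: find neighbor weights that best reconstruct the query, subject to the weights summing to one. The paper additionally enforces non-negativity through a quadratic program solved with SMO; that constraint is omitted here, and the regularization constant and function name are assumptions.

```python
import numpy as np

def llr_weights(x, neighbors, reg=1e-3):
    """Locally linear reconstruction weights for query x given its neighbors
    (rows of `neighbors`): minimize ||x - sum_j w_j * neighbor_j||^2
    subject to sum_j w_j = 1, via the local Gram matrix."""
    Z = neighbors - x                         # shift neighbors to the query
    G = Z @ Z.T                               # local Gram matrix
    G += reg * np.trace(G) * np.eye(len(neighbors)) / len(neighbors)
    w = np.linalg.solve(G, np.ones(len(neighbors)))
    return w / w.sum()                        # enforce the sum-to-one constraint

# The class label is then predicted by a weight-averaged vote over the neighbors.
```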

12.
The weighted KNN (k-nearest neighbor) method uses only the class information provided by the k nearest training samples and ignores the contribution of the test samples, which often leads to misclassifications. To address this shortcoming, a semi-supervised KNN classification method is proposed. The method performs well on both sequential and non-sequential samples. When making the classification decision, it also takes into account the contributions of the c nearest test samples, thereby improving classification accuracy. On the Cohn-Kanade face database, the recognition rate for sequential images improved by 5.95%, and on the CMU-AMP face database, the recognition rate for non-sequential images improved by 7.98%. The experimental results show that the method is efficient and achieves good classification performance.
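A heavily hedged sketch of the decision rule suggested by this abstract: a distance-weighted vote from the k nearest labelled training samples is combined with a vote from the c nearest test samples, using provisional labels for the latter from an initial weighted-KNN pass. The paper's exact weighting scheme and handling of sequential samples are not specified here; all names are illustrative and labels are assumed to be integers 0..n_classes-1.

```python
import numpy as np

def semi_supervised_knn(x, X_train, y_train, X_test_other, y_test_pred,
                        k=5, c=3, n_classes=2):
    """Combine inverse-distance-weighted votes from the k nearest training
    samples with votes from the c nearest test samples (whose labels
    y_test_pred come from a preliminary weighted-KNN pass)."""
    votes = np.zeros(n_classes)
    d_tr = np.linalg.norm(X_train - x, axis=1)
    for i in np.argsort(d_tr)[:k]:
        votes[y_train[i]] += 1.0 / (d_tr[i] + 1e-12)
    d_te = np.linalg.norm(X_test_other - x, axis=1)
    for i in np.argsort(d_te)[:c]:
        votes[y_test_pred[i]] += 1.0 / (d_te[i] + 1e-12)
    return int(np.argmax(votes))
```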

13.
Feature selection and feature weighting are useful techniques for improving the classification accuracy of the K-nearest-neighbor (K-NN) rule. The term feature selection refers to algorithms that select the best subset of the input feature set. In feature weighting, each feature is multiplied by a weight value proportional to the ability of the feature to distinguish pattern classes. In this paper, a novel hybrid approach based on the Tabu Search (TS) heuristic is proposed for simultaneous feature selection and feature weighting for the K-NN rule. The proposed TS heuristic in combination with the K-NN classifier is compared with several classifiers on various available data sets. The results indicate a significant improvement in classification accuracy. The proposed TS heuristic is also compared with various feature selection algorithms; the experiments performed reveal that the proposed hybrid TS heuristic is superior to both simple TS and sequential search algorithms. We also present results for the classification of prostate cancer using multispectral images, an important problem in biomedicine.
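A hedged sketch of a plain Tabu Search over binary feature masks, intended only to illustrate the selection side of the hybrid approach; `evaluate` stands for a user-supplied fitness such as K-NN accuracy on the selected features, the tabu-list length and iteration budget are assumptions, and the paper's simultaneous feature weighting would extend the encoding beyond binary masks.

```python
import numpy as np

def tabu_feature_search(evaluate, n_features, iters=100, tabu_len=10, rng=None):
    """Tabu Search over 0/1 feature masks: score every single-bit flip of the
    current mask, forbid recently flipped bits unless they beat the best
    solution found so far (aspiration), and take the best admissible move."""
    rng = rng or np.random.default_rng(0)
    current = rng.integers(0, 2, n_features)
    best, best_score = current.copy(), evaluate(current)
    tabu = []
    for _ in range(iters):
        moves = []
        for j in range(n_features):
            cand = current.copy()
            cand[j] ^= 1                       # flip one feature in/out
            score = evaluate(cand)
            if j not in tabu or score > best_score:   # aspiration criterion
                moves.append((score, j, cand))
        if not moves:
            continue
        score, j, current = max(moves, key=lambda t: t[0])
        tabu = (tabu + [j])[-tabu_len:]        # keep only recent moves tabu
        if score > best_score:
            best, best_score = current.copy(), score
    return best, best_score
```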

14.
We present a model-based approach for human gait recognition based on analyzing the leg and arm movements. An initial model is created based on anatomical proportions, and a posterior model is constructed from the movements of the articulated parts of the body, using active contour models and the Hough transform. Fourier analysis is used to describe the motion patterns of the moving parts. The k-nearest neighbor rule, applied to the phase-weighted Fourier magnitude of each segment's spectrum, is used for classification. In contrast to existing approaches, the main focus of this paper is on increasing the discrimination capability of the model through extra features produced from the motion of the arms. Experimental results indicate good performance of the proposed method. The technique has also proved able to reduce the adverse effects of self-occlusion, which is a common occurrence in human walking.

15.
The renowned k-nearest neighbor decision rule is widely used for classification tasks, where the label of any new sample is estimated based on a similarity criterion defined by an appropriate distance function. It has also been used successfully for regression problems, where the purpose is to predict a continuous numeric label. However, some alternative neighborhood definitions, such as the surrounding neighborhood, consider that the neighbors should fulfil not only a proximity criterion but also a spatial location criterion. In this paper, we explore the use of the k-nearest centroid neighbor rule, which is based on the concept of the surrounding neighborhood, for regression problems. Two support vector regression models were run as references. Experiments over a wide collection of real-world data sets, using fifteen different odd values of k, demonstrate that the regression algorithm based on the surrounding neighborhood significantly outperforms the traditional k-nearest neighbor method and also a support vector regression model with an RBF kernel.
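The k-nearest centroid neighbor rule is well defined, so here is a minimal sketch of its use for regression: neighbors are chosen sequentially so that the centroid of the already-selected neighbors plus the new candidate is as close as possible to the query, and the prediction is the mean of their targets. The paper may use a different (e.g., weighted) aggregation; the function name is an assumption.

```python
import numpy as np

def k_ncn_regression(x, X, y, k=5):
    """k-nearest centroid neighbour (kNCN) regression: greedily pick the
    sample whose inclusion keeps the centroid of the selected set closest
    to the query x, then average the selected targets."""
    X, y = np.asarray(X), np.asarray(y)
    selected = []
    remaining = list(range(len(X)))
    first = min(remaining, key=lambda i: np.linalg.norm(X[i] - x))
    selected.append(first)                      # first neighbour: ordinary NN
    remaining.remove(first)
    while len(selected) < k and remaining:
        centroid_sum = X[selected].sum(axis=0)
        best = min(remaining, key=lambda i: np.linalg.norm(
            (centroid_sum + X[i]) / (len(selected) + 1) - x))
        selected.append(best)
        remaining.remove(best)
    return float(np.mean(y[selected]))
```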

16.
This paper proposes a cellular automata-based solution to a binary classification problem. The proposed method is based on a two-dimensional, three-state cellular automaton (CA) with the von Neumann neighborhood. Since the number of possible CA rules (potential CA-based classifiers) is huge, the search for efficient rules is conducted with a genetic algorithm (GA). Experiments show excellent performance of the discovered rules in solving the classification problem. The best rules found perform better than a heuristic CA rule designed by a human and also better than one of the most widely used statistical methods, the k-nearest neighbors algorithm (k-NN). Experiments also show that CA rules can be successfully reused in the process of searching for new rules.

17.
A novel clustering method based on non-parametric local shrinking is proposed. Each data point is transformed in such a way that it moves a specific distance toward a cluster center, with the direction and size of each movement determined by the median of its K nearest neighbors. This process is repeated until a pre-defined convergence criterion is satisfied. The optimal number of neighbors is determined by optimizing commonly used index functions that measure the strength of the clusters generated by the algorithm. The number of clusters and the final partition are determined automatically, without any input parameter except the stopping rule for convergence. Experiments on simulated and real data sets suggest that the proposed algorithm achieves relatively high accuracy compared with classical clustering algorithms.
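A hedged sketch of the shrinking step described above, simplified so that each point is moved all the way to the coordinate-wise median of its K nearest neighbors each iteration (the paper moves points a specific distance in that direction); the tolerance, iteration cap and function name are assumptions.

```python
import numpy as np

def local_shrink(X, k=10, max_iter=50, tol=1e-4):
    """Iteratively move every point to the coordinate-wise median of its k
    nearest neighbours (including itself) until total movement is small;
    points that end up (nearly) coincident can then be grouped into clusters."""
    Z = np.asarray(X, dtype=float).copy()
    for _ in range(max_iter):
        Z_new = np.empty_like(Z)
        for i in range(len(Z)):
            d = np.linalg.norm(Z - Z[i], axis=1)
            nn = np.argsort(d)[:k]
            Z_new[i] = np.median(Z[nn], axis=0)
        moved = np.linalg.norm(Z_new - Z)
        Z = Z_new
        if moved < tol:
            break
    return Z
```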

18.
A two-stage hybrid model for data classification and rule extraction is proposed. The first stage uses a Fuzzy ARTMAP (FAM) classifier with Q-learning (known as QFAM) for incremental learning of data samples, while the second stage uses a Genetic Algorithm (GA) for rule extraction from QFAM. Given a new data sample, the resulting hybrid model, known as QFAM-GA, is able to provide a prediction of the target class of the data sample as well as a fuzzy if-then rule to explain the prediction. To reduce the network complexity, a pruning scheme using Q-values is applied to reduce the number of prototypes generated by QFAM, and a 'don't care' technique is employed with the GA to minimize the number of input features. A number of benchmark problems are used to evaluate the effectiveness of QFAM-GA in terms of test accuracy, noise tolerance and model complexity (number of rules and total rule length). The results are comparable with, if not better than, those of many other models reported in the literature. The main significance of this research is a usable and useful intelligent model (i.e., QFAM-GA) for data classification in noisy conditions, with the capability of yielding a set of explanatory rules with a minimum of antecedents. In addition, QFAM-GA is able to maximize accuracy and minimize model complexity simultaneously. The empirical outcomes positively demonstrate the potential impact of QFAM-GA in practical environments, i.e., providing an accurate prediction together with a concise justification for it, therefore allowing domain users to adopt QFAM-GA as a useful decision support tool in their decision-making processes.

19.
Nearest neighbor (NN) classification assumes locally constant class-conditional probabilities and suffers from bias in high dimensions with a small sample set. In this paper, we propose a novel cam weighted distance to ameliorate the curse of dimensionality. Different from existing neighborhood-based methods, which only analyze a small space emanating from the query sample, the proposed nearest neighbor classification using the cam weighted distance (CamNN) optimizes the distance measure based on an analysis of the inter-prototype relationship. Our motivation comes from the observation that the prototypes are not isolated: prototypes with different surroundings should have different effects on the classification. The proposed cam weighted distance is orientation and scale adaptive, taking advantage of the relevant information in the inter-prototype relationship so that better classification performance can be achieved. Experiments show that CamNN significantly outperforms one-nearest-neighbor classification (1-NN) and k-nearest-neighbor classification (k-NN) on most benchmarks, while its computational complexity is comparable with that of 1-NN classification.

20.
The use of artificial intelligence methods in biological data analysis has increased recently, since the performance of classification and detection systems has improved considerably to help medical experts in diagnosis. In this paper, we investigate the performance of an artificial immune system (AIS) based fuzzy k-NN algorithm, with and without cross-validation, on a class of imbalanced problems in bioinformatics. Furthermore, we use an unsupervised AIS algorithm in a supervised manner, with a training stage for data reduction and a classification stage using the fuzzy k-NN algorithm. The experiments show the efficacy of the proposed method, with promising results. Using the Escherichia coli and yeast databases, we compare the classification accuracy of the proposed method with those of other methods proposed in the literature. The proposed hybrid system produced much more accurate results than Horton and Nakai's method [P. Horton, K. Nakai, Better prediction of protein cellular localization sites with the k-nearest neighbors classifier, in: Proceedings of Intelligent Systems in Molecular Biology, Halkidiki, Greece, 1997, pp. 368–383]. Besides the improvement in classification accuracy, another important aspect of the proposed method is its complexity: as the proposed AIS method incorporates data reduction in the training stage, the training complexity is considerably lower than that of the k-NN classifier.
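For the classification stage, here is a minimal sketch of the standard fuzzy k-NN rule (in the style of Keller et al.), applied to the AIS-reduced training set; `U_train`, the fuzzifier `m` and the function name are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np

def fuzzy_knn(x, X_train, U_train, k=5, m=2.0):
    """Fuzzy k-NN: combine the class-membership vectors of the k nearest
    training samples with inverse-distance weights d^(-2/(m-1)).
    U_train[i] is the per-class membership vector of training sample i."""
    d = np.linalg.norm(X_train - x, axis=1) + 1e-12   # avoid division by zero
    nn = np.argsort(d)[:k]
    w = d[nn] ** (-2.0 / (m - 1.0))
    memberships = (U_train[nn] * w[:, None]).sum(axis=0) / w.sum()
    return int(np.argmax(memberships)), memberships
```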
