Similar articles
 20 similar articles found (search time: 0 ms)
1.
We present an extensive empirical comparison between nineteen prototypical supervised ensemble learning algorithms, including Boosting, Bagging, Random Forests, Rotation Forests, Arc-X4, Class-Switching and their variants, as well as more recent techniques like Random Patches. These algorithms were compared against each other in terms of threshold, ranking/ordering and probability metrics over nineteen UCI benchmark data sets with binary labels. We also examine the influence of two base learners, CART and Extremely Randomized Trees, on the bias–variance decomposition and the effect of calibrating the models via Isotonic Regression on each performance metric. The selected data sets have already been used in various empirical studies and cover different application domains. The source code and the detailed results of our study are publicly available.
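
The calibration step mentioned above can be illustrated with the Pool Adjacent Violators procedure, the standard way to fit an isotonic (non-decreasing) regression. This is a minimal sketch, not the study's own code: the function name `isotonic_fit` and the toy labels are assumptions, and the input is assumed to be the 0/1 labels already sorted by increasing classifier score.

```python
def isotonic_fit(values):
    """Pool Adjacent Violators: least-squares non-decreasing fit to `values`."""
    # Each block stores [sum, count]; a block's fitted value is its mean.
    blocks = []
    for v in values:
        blocks.append([float(v), 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


# Hypothetical 0/1 labels, listed in order of increasing classifier score.
labels_sorted_by_score = [0, 1, 0, 1, 1]
calibrated = isotonic_fit(labels_sorted_by_score)
```

Each fitted value can be read as a calibrated probability for the corresponding score; mapping new scores would additionally need interpolation between the fitted points.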

2.
The paper presents a novel framework for large-class binary pattern classification problems based on a learned combination of multiple features. In particular, the class of binary patterns that includes characters/primitives and symbols is considered in the scope of this work. We demonstrate a novel binary multiple kernel learning-based classification architecture that gives fast and efficient performance in applications involving such problems. The character/primitive classification problem concentrates primarily on Gujarati and Bangla character recognition, from both an analytical and an experimental perspective. A novel feature representation scheme for symbol images is introduced that has the necessary elastic and non-elastic deformation invariance properties. The experimental efficacy of the proposed framework for symbol classification is demonstrated on two public data sets.

3.
One of the simplest, and yet most consistently well-performing, sets of classifiers is the naïve Bayes models (a special class of Bayesian network models). However, these models rely on the (naïve) assumption that all the attributes used to describe an instance are conditionally independent given the class of that instance. To relax this independence assumption, we have in previous work proposed a family of models, called latent classification models (LCMs). LCMs are defined for continuous domains and generalize the naïve Bayes model by using latent variables to model class-conditional dependencies between the attributes. In addition to providing good classification accuracy, the LCM has several appealing properties, including a relatively small parameter space that makes it less susceptible to over-fitting. In this paper we take a first step towards generalizing LCMs to hybrid domains by proposing an LCM for domains with binary attributes. We present algorithms for learning the proposed model, and we describe a variational approximation-based inference procedure. Finally, we empirically compare the accuracy of the proposed model to the accuracy of other classifiers for a number of different domains, including the problem of recognizing symbols in black and white images.
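
To make the conditional independence assumption concrete, here is a minimal sketch of the baseline naïve Bayes classifier for binary attributes. It is not the LCM proposed in the paper; the function names, the Laplace smoothing and the toy data are all assumptions made for illustration.

```python
import math


def train_bernoulli_nb(X, y):
    """Estimate class priors and P(x_i = 1 | class) with Laplace smoothing."""
    model = {}
    n_features = len(X[0])
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        prior = len(rows) / len(X)
        theta = [(sum(r[i] for r in rows) + 1) / (len(rows) + 2)
                 for i in range(n_features)]
        model[c] = (prior, theta)
    return model


def predict(model, x):
    """Pick the class maximising log P(c) + sum_i log P(x_i | c)."""
    def log_joint(c):
        prior, theta = model[c]
        # Independence assumption: the per-attribute terms are simply summed.
        return math.log(prior) + sum(
            math.log(t if xi else 1 - t) for xi, t in zip(x, theta))
    return max(model, key=log_joint)


# Hypothetical toy data: three binary attributes, two classes.
X = [[1, 1, 0], [1, 0, 0], [0, 1, 1], [0, 0, 1]]
y = ["a", "a", "b", "b"]
model = train_bernoulli_nb(X, y)
```

Because the class-conditional likelihood factorises over attributes, the model needs only one parameter per attribute and class. That is also why it cannot represent dependencies between attributes.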

4.
Neural Computing and Applications - Feature Selection (FS) is an important preprocessing step that is involved in machine learning and data mining tasks for preparing data (especially...

5.
The effectiveness of local binary pattern (LBP) features is well proven in the field of texture image classification and retrieval. This paper presents a more effective completed modeling of the LBP. The traditional LBP has the shortcoming that it may sometimes represent different structural patterns with the same LBP code. In addition, LBP lacks global information and is sensitive to noise. In this paper, we propose binary patterns generated using a threshold defined as the sum of the center pixel value and the average local differences. The proposed local structure patterns (LSP) can classify different textural structures more accurately because they utilize both local and global information. The LSP can be combined with a simple LBP and a center pixel pattern to give a completed local structure pattern (CLSP) that achieves higher classification accuracy. To make CLSP insensitive to noise, a robust local structure pattern (RLSP) is also proposed. The proposed scheme is tested on three representative texture databases, namely Outex, CUReT, and UIUC. The experimental results indicate that the proposed method can achieve higher classification accuracy while being more robust to noise.
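
The shortcoming noted above, where different local patterns collapse to the same code, can be seen in a minimal sketch of the basic LBP operator. The sketch assumes the simplest variant (a square 3x3 neighborhood, a neighbor contributing a 1 bit when it is greater than or equal to the center); the function name, bit ordering and sample patches are illustrative only and are not taken from the paper.

```python
def lbp_code(patch):
    """8-bit LBP code of a 3x3 patch: a bit is 1 where neighbour >= centre."""
    center = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left corner.
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(order):
        if patch[r][c] >= center:
            code |= 1 << bit
    return code


# Two hypothetical patches: a faint gradient and a strong, irregular edge.
flat_edge = [[52, 52, 52], [50, 50, 50], [48, 48, 48]]
sharp_edge = [[250, 200, 255], [50, 50, 50], [0, 10, 5]]
```

Both patches map to the same code because only the sign of each difference is kept. The size of the differences is discarded, which is one way visually distinct neighborhoods can become indistinguishable.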

6.
Several meta-learning techniques for multi-label classification (MLC), such as chaining and stacking, have already been proposed in the literature, mostly aimed at improving predictive accuracy through the exploitation of label dependencies. In this paper, we propose another technique of that kind, called dependent binary relevance (DBR) learning. DBR combines properties of both chaining and stacking. We provide a careful analysis of the relationship between these and other techniques, specifically focusing on the underlying dependency structure and the type of training data used for model construction. Moreover, we offer an extensive empirical evaluation, in which we compare different techniques on MLC benchmark data. Our experiments provide evidence for the good performance of DBR in terms of several evaluation measures that are commonly used in MLC.
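
One plausible reading of how a method can combine chaining and stacking is that each per-label binary classifier is trained on the original features augmented with the values of all other labels. The abstract does not spell this out, so the sketch below is an assumption, and the helper names `binary_relevance_targets` and `dbr_inputs` are hypothetical.

```python
def binary_relevance_targets(Y, j):
    """Plain binary relevance: the target for label j is just column j of Y."""
    return [row[j] for row in Y]


def dbr_inputs(X, Y, j):
    """Assumed DBR-style inputs: features plus every label except label j."""
    return [list(x) + [v for k, v in enumerate(row) if k != j]
            for x, row in zip(X, Y)]


# Hypothetical data: two instances, two features, three labels.
X = [[0.2, 1.0], [0.9, 0.1]]
Y = [[1, 0, 1], [0, 1, 0]]
augmented = dbr_inputs(X, Y, 1)
targets = binary_relevance_targets(Y, 1)
```

Under this reading, true labels are available as extra inputs only during training. At prediction time they would have to be replaced by estimates, for example the outputs of plain binary relevance classifiers.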

7.
Pan  Zhibin  Wu  Xiuquan  Li  Zhengyi 《Multimedia Tools and Applications》2020,79(9-10):5477-5500
Multimedia Tools and Applications - Local binary pattern (LBP) has already been proved to be a powerful measure of image texture with fixed sampling scheme: all P neighbor pixels in a single-scale...

8.
9.
Recently, local binary patterns (LBP) have been widely used in texture classification. LBP methods obtain the binary pattern by comparing the gray levels of pixels in a small circular region with the gray level of the central pixel. Although many conventional LBP methods perform well in texture classification, they describe only the microstructures of texture images, such as edges, corners and spots. This limitation remains even when multi-resolution analysis is adopted by LBP methods. Moreover, the circular sampling region limits the ability of conventional LBP methods to describe anisotropic features. In this paper, we change the shape of the sampling region to obtain an extended LBP operator. A multi-structure local binary pattern (Ms-LBP) operator is then obtained by applying the extended LBP operator to different layers of an image pyramid. The proposed method is thus a simple yet efficient way to describe four types of structures: isotropic microstructure, isotropic macrostructure, anisotropic microstructure and anisotropic macrostructure. We demonstrate the performance of our method on two public texture databases, Outex and CUReT. The experimental results show the advantages of the proposed method.
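
The pyramid part of this idea can be sketched as follows. The abstract does not say how the pyramid is built or what shape the extended sampling region takes, so the 2x2 mean pooling used here is an assumption and the function names are hypothetical. Any local operator could then be applied to each returned layer.

```python
def downsample(image):
    """Halve each dimension by averaging non-overlapping 2x2 blocks."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [(image[2 * r][2 * c] + image[2 * r][2 * c + 1]
          + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4
         for c in range(cols)]
        for r in range(rows)
    ]


def pyramid(image, levels):
    """Return `levels` layers, from the full-resolution image downwards."""
    layers = [image]
    for _ in range(levels - 1):
        layers.append(downsample(layers[-1]))
    return layers


# Hypothetical 4x4 gray-level image.
image = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
layers = pyramid(image, 3)
```

A fixed-size operator applied to a coarser layer covers a larger area of the original image, which is how a pyramid lets a small local descriptor respond to macrostructures.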

10.
Chu  Hong-Min  Huang  Kuan-Hao  Lin  Hsuan-Tien 《Machine Learning》2019,108(8-9):1193-1230

We study multi-label classification (MLC) with three important real-world issues: online updating, label space dimension reduction (LSDR), and cost-sensitivity. Current MLC algorithms have not been designed to address these three issues simultaneously. In this paper, we propose a novel algorithm, cost-sensitive dynamic principal projection (CS-DPP) that resolves all three issues. The foundation of CS-DPP is an online LSDR framework derived from a leading LSDR algorithm. In particular, CS-DPP is equipped with an efficient online dimension reducer motivated by matrix stochastic gradient, and establishes its theoretical backbone when coupled with a carefully-designed online regression learner. In addition, CS-DPP embeds the cost information into label weights to achieve cost-sensitivity along with theoretical guarantees. Experimental results verify that CS-DPP achieves better practical performance than current MLC algorithms across different evaluation criteria, and demonstrate the importance of resolving the three issues simultaneously.


11.
Minimum classification error training for online handwriting recognition   Cited by: 1 (self-citations: 0, citations by others: 1)
This paper describes an application of the minimum classification error (MCE) criterion to the problem of recognizing online unconstrained-style characters and words. We describe an HMM-based, character and word-level MCE training aimed at minimizing the character or word error rate while enabling flexibility in writing style through the use of multiple allographs per character. Experiments on a writer-independent character recognition task covering alpha-numerical characters and keyboard symbols show that the MCE criterion achieves more than 30 percent character error rate reduction compared to the baseline maximum likelihood-based system. Word recognition results, on vocabularies of 5k to 10k, show that MCE training achieves around 17 percent word error rate reduction when compared to the baseline maximum likelihood system.
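
As a rough illustration of what an MCE criterion optimises, the sketch below uses a commonly cited formulation: a misclassification measure that compares the correct class score with a soft maximum over competing scores, passed through a sigmoid to give a smooth approximation of the error count. The paper's exact formulation for HMMs and allographs is not given in the abstract, so the function name and the parameters `eta` and `gamma` are assumptions.

```python
import math


def mce_loss(scores, correct, eta=1.0, gamma=1.0):
    """Smoothed 0/1 loss from discriminant scores (higher score = better)."""
    competitors = [s for i, s in enumerate(scores) if i != correct]
    # Soft maximum over the competing classes; eta controls its sharpness.
    anti = math.log(
        sum(math.exp(eta * s) for s in competitors) / len(competitors)) / eta
    # Negative d: correct class wins; positive d: a competitor wins.
    d = -scores[correct] + anti
    return 1.0 / (1.0 + math.exp(-gamma * d))


# Hypothetical log-likelihood scores for three classes.
scores = [2.0, 0.0, 0.0]
loss_when_correct = mce_loss(scores, correct=0)
loss_when_wrong = mce_loss(scores, correct=1)
```

Because the loss is differentiable in the scores, it can be minimised by gradient descent on the model parameters, unlike the raw error rate.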

12.
This paper proposes a cellular automata-based solution to a binary classification problem. The proposed method is based on a two-dimensional, three-state cellular automaton (CA) with the von Neumann neighborhood. Since the number of possible CA rules (potential CA-based classifiers) is huge, the search for efficient rules is conducted using a genetic algorithm (GA). Experiments show excellent performance of the discovered rules in solving the classification problem. The best rules found perform better than a heuristic CA rule designed by a human and also better than one of the most widely used statistical methods, the k-nearest neighbors algorithm (k-NN). Experiments also show that CA rules can be successfully reused when searching for new rules.
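
A single update of such an automaton can be sketched as below. The periodic boundary and the hand-written `plurality_rule` are assumptions for illustration; the paper's rules are found by a genetic algorithm and are not described in the abstract.

```python
def ca_step(grid, rule):
    """One synchronous update of a 2D CA with the von Neumann neighbourhood."""
    rows, cols = len(grid), len(grid[0])
    new = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Cell itself, then up, down, left, right (periodic boundary).
            neighbourhood = (
                grid[r][c],
                grid[(r - 1) % rows][c],
                grid[(r + 1) % rows][c],
                grid[r][(c - 1) % cols],
                grid[r][(c + 1) % cols],
            )
            new[r][c] = rule(neighbourhood)
    return new


def plurality_rule(neighbourhood):
    """Hypothetical rule: adopt the most common state; ties favour the cell."""
    counts = [neighbourhood.count(s) for s in (0, 1, 2)]
    winners = [s for s in (0, 1, 2) if counts[s] == max(counts)]
    return neighbourhood[0] if neighbourhood[0] in winners else winners[0]


# With 3 states and 5 cells per neighbourhood there are 3**5 = 243 table rows,
# so the number of distinct rule tables is 3**243.
n_rules = 3 ** (3 ** 5)

grid = [[1, 1, 1], [1, 2, 1], [1, 1, 1]]
next_grid = ca_step(grid, plurality_rule)
```

The size of `n_rules` shows why exhaustive search over rule tables is out of the question and a heuristic search such as a GA is used instead.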

13.

In this paper, an extended complete LBP (ECLBP) for texture classification is proposed, in which the local feature vectors are composed of the ratio of the central pixel and its neighborhood pixels to a specific threshold. ECLBP_C represents the gray level of the image and is obtained by comparing the center pixel with the global threshold. ECLBP_S and ECLBP_M represent the symbol component and the magnitude component, respectively, of the 3-neighbor region of the center pixel; they are obtained by calculating two binary codes for that region using the original LBP algorithm. To make the proposed algorithm scalable, in addition to the 3-neighbor pixels of the central pixel, the algorithm uses a circle centered on the center pixel with radius r, with α as the filter's radius, to generate extended binary codes such as ECLBP_ES_r,α and ECLBP_EM_r,α. To describe the local region feature vector in detail, specific ECLBP_ES_r,α and ECLBP_EM_r,α codes can be obtained by defining the number of extensions according to actual needs; all ECLBP gray-level histograms are then built and concatenated for statistics. In the experimental part, we analyze the performance of the proposed algorithm in detail and show that it has good scalability and robustness. The experimental results in Table 2 show that the classification accuracy of the proposed algorithm reaches up to 99% after 3 expansions. The source code of the proposed algorithm can be downloaded from https://github.com/zenqiang/ECLBP.git.


14.
15.
Adaptive binary tree for fast SVM multiclass classification   Cited by: 1 (self-citations: 0, citations by others: 1)
Jin  Cheng  Runsheng   《Neurocomputing》2009,72(13-15):3370
This paper presents an adaptive binary tree (ABT) to reduce the test-time computational complexity of multiclass support vector machines (SVMs). It achieves fast classification by: (1) reducing the number of binary SVMs needed for one classification, by using the separating planes of some binary SVMs to discriminate other binary problems; (2) selecting the binary SVMs with the smallest average number of support vectors (SVs). The average number of SVs is proposed as a measure of the computational cost of excluding one class. Experiments on many benchmark data sets, comparing against five well-known methods, demonstrate that our method can speed up the test phase while retaining the high accuracy of SVMs.

16.
We propose a scoring criterion, named mixture-based factorized conditional log-likelihood (mfCLL), which allows for efficient hybrid learning of mixtures of Bayesian networks in binary classification tasks. The learning procedure is decoupled into foreground and background learning, the foreground being the single concept of interest that we want to distinguish from a highly complex background. The overall procedure is hybrid in that the foreground is learned discriminatively, whereas the background is learned generatively. The learning algorithm is shown to run in polynomial time for network structures such as trees and consistent κ-graphs. To gauge the performance of the mfCLL scoring criterion, we carry out a comparison with state-of-the-art classifiers. Results obtained with a large suite of benchmark datasets show that mfCLL-trained classifiers are a competitive alternative and should be taken into consideration.

17.
We present a novel method for online inference of real-valued quantities on a large network from very sparse measurements. The target application is a large-scale system, such as a traffic network, where a small, varying subset of the variables is observed and predictions about the other variables have to be continuously updated. A key feature of our approach is the modeling of dependencies between the original variables through a latent binary Markov random field. This greatly simplifies both the model selection and its subsequent use. We introduce the mirror belief propagation algorithm, which performs fast inference in such a setting. The offline model estimation relies only on pairwise historical data and its complexity is linear w.r.t. the dataset size. Our method makes no assumptions about the joint and marginal distributions of the variables but is primarily designed with multimodal joint distributions in mind. Numerical experiments demonstrate both the applicability and scalability of the method in practice.

18.
19.
With the advent of Big Data, data is being collected at an unprecedentedly fast pace, and it needs to be processed in a short time. To deal with data streams that flow continuously, classical batch learning algorithms cannot be applied and it is necessary to employ online approaches. Online learning consists of continuously revising and refining a model by incorporating new data as they arrive, and it allows important problems such as concept drift or management of extremely high-dimensional datasets to be solved. In this paper, we present a unified pipeline for online learning which covers online discretization, feature selection and classification. Three classical methods (the k-means discretizer, the χ2 filter and a one-layer artificial neural network) have been reimplemented to be able to tackle online data, showing promising results on both synthetic and real datasets.
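
The idea of revising a model as each sample arrives can be illustrated with a one-dimensional sequential k-means discretizer. This is a sketch under stated assumptions rather than the paper's reimplementation: the class name is hypothetical, the update is the classic running-mean rule, and each initial centroid is counted as one observation.

```python
class OnlineKMeans1D:
    """Sequential k-means on a scalar stream; the bin index is the nearest centroid."""

    def __init__(self, initial_centroids):
        self.centroids = [float(c) for c in initial_centroids]
        # Assumption: each initial centroid counts as one observed sample.
        self.counts = [1] * len(self.centroids)

    def update(self, x):
        """Assign x to its nearest centroid, move that centroid, return the bin."""
        j = min(range(len(self.centroids)),
                key=lambda i: abs(x - self.centroids[i]))
        self.counts[j] += 1
        # Running-mean update: no past samples need to be stored.
        self.centroids[j] += (x - self.centroids[j]) / self.counts[j]
        return j


# Hypothetical stream of values arriving one at a time.
discretizer = OnlineKMeans1D([0, 10])
bins = [discretizer.update(x) for x in [1, 9, 2, 8]]
```

Each call touches only one centroid and one counter, so memory use stays constant however long the stream runs.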

20.
Syndromic surveillance can play an important role in protecting the public's health against infectious diseases. Infectious disease outbreaks can have a devastating effect on society as well as the economy, and global awareness is therefore critical to protecting against major outbreaks. By monitoring online news sources and developing an accurate news classification system for syndromic surveillance, public health personnel can be apprised of outbreaks and potential outbreak situations. In this study, we have developed a framework for automatic online news monitoring and classification for syndromic surveillance. The framework is unique and none of the techniques adopted in this study have been previously used in the context of syndromic surveillance on infectious diseases. In recent classification experiments, we compared the performance of different feature subsets on different machine learning algorithms. The results showed that the combined feature subsets including Bag of Words, Noun Phrases, and Named Entities features outperformed the Bag of Words feature subsets. Furthermore, feature selection improved the performance of feature subsets in online news classification. The highest classification performance was achieved when using SVM upon the selected combination feature subset.
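
The Bag of Words baseline mentioned above can be sketched in a few lines. The tokenisation (lowercased runs of ASCII letters), the function names and the sample headline are assumptions for illustration; noun phrase and named entity features would need a linguistic toolkit and are not shown.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Count lowercase word tokens, ignoring order and punctuation."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def to_vector(bag, vocabulary):
    """Project a bag onto a fixed vocabulary to get a feature vector."""
    return [bag[word] for word in vocabulary]


# Hypothetical news headline and vocabulary.
headline = "Flu outbreak spreads; outbreak alert issued"
bag = bag_of_words(headline)
vector = to_vector(bag, ["outbreak", "flu", "vaccine"])
```

A vector of this kind is what a classifier such as an SVM would take as input. Combining feature subsets amounts to concatenating several such vectors.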
