Similar literature
 Found 20 similar documents; search took 31 ms
1.
In classification problems with hierarchically structured labels, the target function must assign labels that are hierarchically organized, and it can be used for either single-label (one label per instance) or multi-label (more than one label per instance) classification. In parallel to these developments, semi-supervised learning has emerged as a solution to the problems found in the standard supervised learning procedure (used in most classification algorithms). It combines labelled and unlabelled data during the training phase. Some semi-supervised methods have been proposed for single-label classification. However, very little effort has been made in the context of multi-label hierarchical classification. Therefore, this paper proposes a new method for supervised hierarchical multi-label classification, called HMC-RAkEL. Additionally, we propose the use of a semi-supervised learning technique, self-training, in hierarchical multi-label classification, leading to three new methods, called HMC-SSBR, HMC-SSLP and HMC-SSRAkEL. To validate the feasibility of these methods, an empirical analysis is conducted, comparing the proposed methods with their corresponding supervised versions. The main aim of this analysis is to observe whether the semi-supervised methods proposed in this paper perform similarly to the corresponding supervised versions.
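
The self-training idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the HMC-SSBR, HMC-SSLP or HMC-SSRAkEL procedure: it assumes a flat, single-label problem, a toy nearest-centroid base learner, and a margin-based confidence score, and all function names and the `threshold` parameter are hypothetical.

```python
import math


def nearest_centroid_fit(X, y):
    """Toy base learner (an assumption): one mean vector per label."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict_with_confidence(centroids, x):
    """Return (label, confidence); confidence is 0.5 when the two closest
    centroids are equidistant and 1.0 when x lies exactly on a centroid."""
    ranked = sorted(
        ((math.dist(x, c), label) for label, c in centroids.items()),
        key=lambda pair: pair[0],
    )
    best_d, best_label = ranked[0]
    if len(ranked) == 1:
        return best_label, 1.0
    total = best_d + ranked[1][0]
    return best_label, (0.5 if total == 0 else 1.0 - best_d / total)


def self_train(X_lab, y_lab, X_unlab, threshold=0.8, max_rounds=10):
    """Repeatedly move confidently predicted unlabelled points into the
    labelled set and refit, until nothing confident is left."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    model = nearest_centroid_fit(X_lab, y_lab)
    for _ in range(max_rounds):
        confident, remaining = [], []
        for x in pool:
            label, conf = predict_with_confidence(model, x)
            (confident if conf >= threshold else remaining).append((x, label))
        if not confident:
            break
        for x, label in confident:
            X_lab.append(x)
            y_lab.append(label)
        pool = [x for x, _ in remaining]
        model = nearest_centroid_fit(X_lab, y_lab)
    return model, pool
```

For example, `self_train([(0.0,), (10.0,)], ["a", "b"], [(1.0,), (9.0,), (5.0,)])` absorbs the two points near the labelled examples and leaves the ambiguous midpoint unlabelled.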

2.
In this work, we consider dimensionality reduction in supervised settings and, specifically, we focus on regression problems. A novel algorithm, the supervised distance preserving projection (SDPP), is proposed. The SDPP locally minimizes the difference between pairwise distances among projected input covariates and distances among responses. As a result, the local geometrical structure of the low-dimensional subspace retrieved by the SDPP mimics that of the response space. This not only facilitates efficient regressor design but also uncovers useful information for visualization. The SDPP achieves this goal by learning a linear parametric mapping and, thus, it can easily handle out-of-sample data points. For nonlinear data, a kernelized version of the SDPP is also derived. In addition, an intuitive extension of the SDPP is proposed to deal with classification problems. The experimental evaluation on both synthetic and real-world data sets demonstrates the effectiveness of the SDPP, showing that it performs comparably to or better than state-of-the-art approaches.
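
The criterion described in this abstract can be made concrete with a small sketch. The abstract does not give the exact objective, so the form below is an assumption: squared Euclidean distances, neighbourhoods defined by the k nearest points in the input space, and an average over samples. The names `sdpp_loss`, `project` and `k_nearest` are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def project(W, x):
    """Apply a linear map; W is a list of rows, one per output dimension."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def k_nearest(X, i, k):
    """Indices of the k points closest to X[i] in the input space."""
    others = [j for j in range(len(X)) if j != i]
    return sorted(others, key=lambda j: sq_dist(X[i], X[j]))[:k]


def sdpp_loss(W, X, Y, k):
    """Mean over samples of the squared mismatch between projected-input
    distances and response distances within each neighbourhood."""
    Z = [project(W, x) for x in X]
    total = 0.0
    for i in range(len(X)):
        for j in k_nearest(X, i, k):
            mismatch = sq_dist(Z[i], Z[j]) - sq_dist(Y[i], Y[j])
            total += mismatch * mismatch
    return total / len(X)
```

If the response is simply the first input coordinate, projecting onto that coordinate reproduces the response distances exactly and the loss is zero, whereas projecting onto the second coordinate gives a positive loss.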

3.
In the past few years, the computer vision and pattern recognition community has witnessed rapid growth of a new kind of feature extraction method, the manifold learning methods, which attempt to project the original data into a lower-dimensional feature space while preserving the local neighborhood structure. Among these methods, locality preserving projection (LPP) is one of the most promising feature extraction techniques. Unlike the unsupervised learning scheme of LPP, this paper follows a supervised learning scheme, i.e. it uses both local information and class information to model the similarity of the data. Based on this novel similarity, we propose two feature extraction algorithms, supervised optimal locality preserving projection (SOLPP) and normalized Laplacian-based supervised optimal locality preserving projection (NL-SOLPP). Optimal here means that the features extracted via SOLPP (or NL-SOLPP) are statistically uncorrelated and orthogonal. We compare the proposed SOLPP and NL-SOLPP with LPP, orthogonal locality preserving projection (OLPP) and uncorrelated locality preserving projection (ULPP) on publicly available data sets. Experimental results show that the proposed SOLPP and NL-SOLPP achieve much higher recognition accuracy.

4.
Prototype classifiers are a type of pattern classifier in which a number of prototypes are designed for each class so that they act as representatives of the patterns of the class. Prototype classifiers are considered among the simplest and best performers in classification problems. However, they need careful positioning of prototypes to capture the distribution of each class region and/or to define the class boundaries. Standard methods, such as learning vector quantization (LVQ), are sensitive to the initial choice of the number and the locations of the prototypes and to the learning rate. In this article, a new prototype classification method is proposed, namely self-generating prototypes (SGP). The main advantage of this method is that both the number of prototypes and their locations are learned from the training set without much human intervention. The proposed method is compared with other prototype classifiers such as LVQ, self-generating neural tree (SGNT) and K-nearest neighbor (K-NN), as well as Gaussian mixture model (GMM) classifiers. In our experiments, SGP achieved the best performance on many measures, such as training speed and test (classification) speed. In number of prototypes and test classification accuracy, it was considerably better than the other methods, but about equal on average to the GMM classifiers. We also applied the SGP method to the well-known STATLOG benchmark, and it beat all other 21 methods (prototype methods and non-prototype methods) in classification accuracy.

5.
The Gaussian mixture model (GMM) is a flexible tool for image segmentation and image classification. However, one main limitation of the GMM is that it does not consider spatial information. Some authors have introduced global spatial information from neighboring pixels into the GMM without taking the image content into account. The saliency map technique, which is based on the human visual system, enhances the image regions with high perceptive information. In this paper, we propose a new model that incorporates image content-based spatial information extracted from the saliency map into the conventional GMM. The proposed method has several advantages: it is easy to integrate into the expectation–maximization algorithm for parameter estimation, and therefore it has little impact on computational cost. Experimental results on the public Berkeley database show that the proposed method outperforms state-of-the-art methods in terms of accuracy and computational time.

6.
7.
On classification with incomplete data   Total citations: 4, self-citations: 0, citations by others: 4
We address the incomplete-data problem in which feature vectors to be classified are missing data (features). A (supervised) logistic regression algorithm for the classification of incomplete data is developed. Single or multiple imputation for the missing data is avoided by performing analytic integration with an estimated conditional density function (conditioned on the observed data). Conditional density functions are estimated using a Gaussian mixture model (GMM), with parameter estimation performed using both expectation-maximization (EM) and variational Bayesian EM (VB-EM). The proposed supervised algorithm is then extended to the semisupervised case by incorporating graph-based regularization. The semisupervised algorithm utilizes all available data, both incomplete and complete, as well as labeled and unlabeled. Experimental results of the proposed classification algorithms are shown.

8.
There is great interest in dimensionality reduction techniques for tackling the problem of high-dimensional pattern classification. This paper addresses the topic of supervised learning of a linear dimension reduction mapping suitable for classification problems. The proposed optimization procedure is based on minimizing an estimate of the nearest neighbor classifier error probability, and it learns a linear projection and a small set of prototypes that support the class boundaries. The learned classifier is computationally very efficient, making classification much faster than with state-of-the-art classifiers such as SVMs, while having competitive recognition accuracy. The approach has been assessed through a series of experiments, showing uniformly good behavior and competitive results compared with some recently proposed supervised dimensionality reduction techniques.

9.
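The visual bag-of-words model has been widely applied in computer vision tasks such as image classification, retrieval and recognition, but the number of words in the model is usually set empirically or chosen by supervised cross-validation. This paper proposes an unsupervised method for determining the number of words in the visual bag-of-words model, using the idea of model selection. Gaussian mixture models are used to describe visual bags of words with different numbers of words, the Bayesian information criterion (BIC) is computed for each model, and the number of words corresponding to the minimum BIC value is selected. A comparison with supervised cross-validation in image classification experiments shows that the method is accurate and effective.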
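
The selection rule in this abstract (pick the vocabulary size whose Gaussian mixture has the smallest BIC) can be sketched as follows. The sketch assumes that each visual word corresponds to one full-covariance mixture component and that the maximized log-likelihood of each candidate mixture has already been obtained by some fitting routine; neither detail is stated in the abstract, and the function names are hypothetical.

```python
import math


def gmm_free_parameters(n_components, dim):
    """Free parameters of a full-covariance GMM (assumed covariance type)."""
    weights = n_components - 1                   # mixing weights sum to one
    means = n_components * dim
    covariances = n_components * dim * (dim + 1) // 2
    return weights + means + covariances


def bic(log_likelihood, n_params, n_samples):
    """Bayesian information criterion; lower is better."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def select_vocabulary_size(log_likelihoods, dim, n_samples):
    """log_likelihoods maps a candidate vocabulary size to the maximized
    log-likelihood of the corresponding fitted mixture."""
    scores = {
        size: bic(ll, gmm_free_parameters(size, dim), n_samples)
        for size, ll in log_likelihoods.items()
    }
    best = min(scores, key=scores.get)
    return best, scores
```

With made-up log-likelihoods `{1: -5000.0, 2: -4200.0, 3: -4195.0}` for 1000 two-dimensional descriptors, the small likelihood gain from two to three components does not offset the penalty for the extra parameters, so size 2 is selected.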

10.
Sculptured surface machining using triangular mesh slicing   Total citations: 7, self-citations: 0, citations by others: 7
In this paper, an optimized procedure for tool path generation in regional milling is presented. The proposed procedure computes tool paths by slicing a CL-surface (Cutter Location surface), which is a triangular mesh containing invalid triangles. Tool path generation consists of two steps: first, it obtains a set of line segments by slicing the triangular mesh with two-dimensional geometric elements (slicing elements); second, it extracts a valid tool path from the line segments by removing invalid portions. Two algorithms based on the slicing elements are presented: a ‘line projection’ algorithm based on the plane sweeping paradigm, which works efficiently by using the characteristics of a monotone chain; and a ‘curve projection’ algorithm for the projection of curves, which transforms the curve projection problem into a line projection problem by mapping the XYZ-space of the cylinder surface to the TZ-plane of the unfolded cylinder. The proposed procedure has been implemented and applied to tool path generation in regional milling. Performance tests show the efficiency of the proposed procedure.
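
The first step described above, obtaining line segments by slicing a triangular mesh, reduces to intersecting individual triangles with a slicing plane. The sketch below covers only that elementary operation for an axis-aligned plane; it is an illustration under that assumption, not the paper's plane-sweep algorithm, and it does not attempt the removal of invalid portions. The function name and the `axis` parameter are hypothetical.

```python
def slice_triangle(triangle, value, axis=0):
    """Intersect one triangle (three 3-D points) with the plane
    coordinate[axis] == value. Returns a pair of points when the
    intersection is a segment, otherwise None."""
    points = []
    for i in range(3):
        p, q = triangle[i], triangle[(i + 1) % 3]
        dp, dq = p[axis] - value, q[axis] - value
        if dp * dq < 0:
            # The edge crosses the plane strictly between its endpoints.
            t = dp / (dp - dq)
            points.append(tuple(a + t * (b - a) for a, b in zip(p, q)))
        elif dp == 0:
            # The start vertex lies on the plane; each vertex starts exactly
            # one edge, so it is recorded once.
            points.append(tuple(p))
    # One point means the plane only touches a vertex; three points mean the
    # whole triangle lies in the plane. Neither case yields a segment.
    return (points[0], points[1]) if len(points) == 2 else None
```

Applying this to every triangle of a mesh gives the unordered set of segments that a later step would have to chain and filter.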

11.
Text representation has received extensive attention in text mining tasks. There are various text representation models. Among them, the vector space model is the most commonly used. For the vector space model, the core technique is term weighting. To date, a great many term-weighting methods have been proposed, which can be divided into a supervised group and an unsupervised group. However, it is not advisable to use these two groups of methods directly in semi-supervised applications. In semi-supervised applications, the majority of the supervised term-weighting methods are not applicable because the label information is insufficient; meanwhile, the unsupervised term-weighting methods cannot make use of the provided category labels. Thus, a semi-supervised learning framework for iteratively revising the text representation by an EM-like strategy is proposed in this paper. Furthermore, a new supervised term-weighting method, tf.sdf, is proposed. Tf.sdf emphasizes the importance of terms that are unevenly distributed among the classes and weakens the importance of terms that are uniformly distributed. Experimental results on real text data show that the proposed semi-supervised learning framework performs well with the aid of tf.sdf. Tf.sdf is also shown to be efficient for supervised learning.

12.
The traditional CCA and 2D-CCA algorithms are unsupervised multiple feature extraction methods. Hence, introducing the supervised information of samples into these methods should improve classification performance. In this paper, a novel method is proposed to carry out multiple feature extraction for classification, called two-dimensional supervised canonical correlation analysis (2D-SCCA), in which the supervised information is added to the criterion function. Then, by analyzing the relationship between GCCA and 2D-SCCA, another feature extraction method called multiple-rank supervised canonical correlation analysis (MSCCA) is also developed. Unlike in 2D-SCCA, in MSCCA k pairs of left transforms and k pairs of right transforms are sought to maximize the correlation. The convergence behavior and computational complexity of the algorithms are analyzed. Experimental results on real-world databases demonstrate the viability of the formulation. They also show that the classification results of our methods are better than those of the other methods and that the computing time is competitive. The proposed methods thus prove to be competitive multiple feature extraction and classification methods. As such, the two methods may well help to improve image recognition tasks, which are essential in many advanced expert and intelligent systems.

13.
This work presents a classification technique for hyperspectral image analysis that applies whether or not concurrent ground truth is available. The method adopts a principal component analysis (PCA)-based projection pursuit (PP) procedure with an entropy index for dimensionality reduction, followed by a Markov random field (MRF) model-based segmentation. An ordinal optimization approach to PP determines a set of ‘good enough projections’ with high probability, the best among which is chosen with the help of MRF model-based segmentation. When ground truth is absent, the segmented output obtained is labelled with the desired number of classes so that it resembles the natural scene closely. When the land-cover classes are at a detailed level, some special reflectance characteristics based on the classes of the study area are determined and incorporated in the segmentation stage. Segments are evaluated with training samples so as to yield a classified image with respect to the type of ground-truth data. Two illustrations are presented: (i) an AVIRIS-92AV3C image with concurrent ground truth, for both supervised and unsupervised cases, and (ii) an EO-1 Hyperion sensor image with concurrent ground truth at detailed-level classes. The illustrations are accompanied by comparisons of the classification accuracies and computational times of other approaches with those of the proposed methodology. Experimental results demonstrate that the proposed method provides high classification accuracy and is not computationally intensive.

14.
In this work, we discuss a recently proposed approach for supervised dimensionality reduction, the Supervised Distance Preserving Projection (SDPP), and we investigate its applicability to monitoring material properties from spectroscopic observations. Motivated by continuity preservation, the SDPP is a linear projection method where the proximity relations between points in the low-dimensional subspace mimic the proximity relations between points in the response space. Such a projection facilitates the design of efficient regression models and it may also uncover useful information for visualisation. An experimental evaluation is conducted to show the performance of the SDPP and compare it with a number of state-of-the-art approaches for unsupervised and supervised dimensionality reduction. The regression step after projection is performed using computationally light models with low maintenance cost, such as Multiple Linear Regression and Locally Linear Regression with k-NN neighbourhoods. For the evaluation, a benchmark and a full-scale calibration problem are discussed. The case studies pertain to the estimation of a number of chemico-physical properties in diesel fuels and in light cycle oils, starting from near-infrared spectra. Based on the experimental results, we found that the SDPP leads to parsimonious projections that can be used to design light yet accurate estimation models.

15.
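Objective: Canonical correlation analysis (CCA) is a classical multi-view learning method. To improve the discriminative power of the projection directions, existing CCA methods usually introduce sample label information. However, obtaining label information requires considerable manpower and material resources. To address this, a semi-supervised CCA algorithm that jointly performs label prediction and discriminative projection learning is proposed. Method: Label prediction is integrated with model construction. Specifically, label prediction is incorporated into the CCA framework; the label matrix learned by the joint framework is used to update the projection directions, and the learned projection directions are in turn used to update the label matrix. Label prediction and projection learning depend on each other and are updated alternately, so the predicted labels steadily approach the true labels, which helps to learn the optimal projection directions. Results: Experiments were conducted on four face data sets: AR, Extended Yale B, Multi-PIE and ORL. With a feature dimension of 20, the method achieved recognition rates of 87%, 55%, 83% and 85% on AR, Extended Yale B, Multi-PIE and ORL, respectively. When 2 (3, 4, 5) face images per person in the training set were used as supervised samples, the recognition rate of the proposed method was higher than that of the other methods on all four data sets. With 5 face images per person as supervised samples, recognition rates of 94.67%, 68%, 83% and 85% were obtained on AR, Extended Yale B, Multi-PIE and ORL, respectively. The experimental results show that, when little label information is available for the training samples and the dimensionality after reduction is low, the joint learning model enables the reduced data to retain the most effective information and yields good recognition results. Conclusion: The proposed joint learning method improves the discriminative power of the learned projection directions, handles the case of a few labelled samples and many unlabelled samples effectively, and overcomes the drawbacks of the two-step learning strategy.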

16.
Massive textual data management and mining usually rely on automatic text classification technology. Term weighting is a basic problem in text classification and directly affects the classification accuracy. Since the traditional TF-IDF (term frequency & inverse document frequency) is not fully effective for text classification, various alternatives have been proposed by researchers. In this paper we make comparative studies of different term weighting schemes and propose a new term weighting scheme, TF-IGM (term frequency & inverse gravity moment), as well as its variants. TF-IGM incorporates a new statistical model to precisely measure the class-distinguishing power of a term. In particular, it makes full use of the fine-grained term distribution across different classes of text. The effectiveness of TF-IGM is validated by extensive text classification experiments using SVM (support vector machine) and kNN (k nearest neighbors) classifiers on three commonly used corpora. The experimental results show that TF-IGM outperforms the well-known TF-IDF and the state-of-the-art supervised term weighting schemes. In addition, some new findings different from previous studies are obtained and analyzed in depth in the paper.
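
The abstract above does not state the TF-IGM formula, so the following is only a sketch under an explicit assumption: the inverse gravity moment of a term is taken to be its largest per-class frequency divided by the rank-weighted sum of its per-class frequencies sorted in descending order, and the weight is `tf * (1 + lam * igm)`. The function names and the default value of the hypothetical `lam` coefficient are also assumptions.

```python
def igm(class_frequencies):
    """Assumed inverse gravity moment: close to 1 when a term is concentrated
    in one class, smaller when it is spread evenly over the classes."""
    ranked = sorted(class_frequencies, reverse=True)
    moment = sum(freq * rank for rank, freq in enumerate(ranked, start=1))
    return ranked[0] / moment if moment else 0.0


def tf_igm(tf, class_frequencies, lam=7.0):
    """Assumed weighting: term frequency scaled by the class-distinguishing
    power of the term; lam is a hypothetical adjustable coefficient."""
    return tf * (1.0 + lam * igm(class_frequencies))
```

Under this assumed form, a term that occurs only in one class gets `igm == 1.0`, while a term split evenly across two classes gets `1/3`, so the former receives a noticeably larger weight for the same term frequency.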

17.
We propose a new approach to estimate the a priori signal-to-noise ratio (SNR) based on a multiple linear regression (MLR) technique. In contrast to estimation of the a priori SNR with the decision-directed (DD) method, which uses the estimated speech spectrum in the previous frame, we propose to find the a priori SNR with the MLR technique by incorporating regression parameters such as the ratio between the local energy of the noisy speech and its derived minimum, along with the a posteriori SNR. In the experimental step, regression coefficients obtained using the MLR are assigned according to various noise types, for which we employ a real-time noise classification scheme based on a Gaussian mixture model (GMM). Evaluations using both objective speech quality measures and subjective listening tests under various ambient noise environments show that the proposed algorithm performs better than the conventional methods.

18.
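Projection pursuit regression is introduced into remote sensing image classification, and the construction and implementation of a projection pursuit regression classification model for remote sensing images are described in detail. A TM image of the Guangzhou area is used for the classification experiment, and the shuffled frog leaping algorithm is used to optimize the parameter matrix of the model, achieving satisfactory classification results. In addition, the effects of the setting and adjustment of the projection centre, the optimization algorithm, and the number of ridge functions on the classification accuracy of the model are analysed further. The experimental results show that the model is easy to optimize and implement and is highly stable, and that the number of ridge functions has no significant effect on its classification accuracy.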

19.
Text categorization is a significant technique to manage the surging text data on the Internet. The k-nearest neighbors (kNN) algorithm is an effective, but not efficient, classification model for text categorization. In this paper, we propose an effective strategy to accelerate the standard kNN, based on a simple principle: usually, near points in space are also near when they are projected into a direction, which means that distant points in the projection direction are also distant in the original space. Using the proposed strategy, most of the irrelevant points can be removed when searching for the k-nearest neighbors of a query point, which greatly decreases the computation cost. Experimental results show that the proposed strategy greatly improves the time performance of the standard kNN, with little degradation in accuracy. Specifically, it is superior in applications that have large and high-dimensional datasets.
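
The principle stated in this abstract follows from the fact that, for a unit direction u, the gap between two projections |u·a − u·b| never exceeds the Euclidean distance between a and b. The sketch below uses that bound to skip points during a neighbour search. It is an exact-pruning variant written for illustration, not the strategy evaluated in the paper (which reports a small loss of accuracy and may therefore prune differently); the class name, the single fixed direction, and the `examined` counter are assumptions.

```python
import bisect
import heapq
import math


class ProjectedKNN:
    """k-nearest-neighbour search that visits points in order of projection
    gap and stops once the gap alone rules out all remaining points."""

    def __init__(self, points, direction):
        norm = math.sqrt(sum(d * d for d in direction))
        self.u = [d / norm for d in direction]
        # Sort training points by their projection onto the unit direction.
        self.items = sorted(
            (self._proj(p), i, tuple(p)) for i, p in enumerate(points)
        )
        self.keys = [item[0] for item in self.items]

    def _proj(self, p):
        return sum(a * b for a, b in zip(self.u, p))

    def query(self, q, k):
        """Return (indices of the k nearest points, number of full distance
        computations performed)."""
        qp = self._proj(q)
        right = bisect.bisect_left(self.keys, qp)
        left = right - 1
        heap = []  # max-heap on distance, stored as (-distance, index)
        examined = 0
        while left >= 0 or right < len(self.items):
            gap_l = qp - self.keys[left] if left >= 0 else math.inf
            gap_r = self.keys[right] - qp if right < len(self.items) else math.inf
            if gap_l <= gap_r:
                gap, item = gap_l, self.items[left]
                left -= 1
            else:
                gap, item = gap_r, self.items[right]
                right += 1
            # Every unvisited point has a projection gap of at least `gap`,
            # hence a true distance of at least `gap`.
            if len(heap) == k and gap >= -heap[0][0]:
                break
            examined += 1
            dist = math.dist(q, item[2])
            if len(heap) < k:
                heapq.heappush(heap, (-dist, item[1]))
            elif dist < -heap[0][0]:
                heapq.heapreplace(heap, (-dist, item[1]))
        ordered = sorted((-neg, idx) for neg, idx in heap)
        return [idx for _, idx in ordered], examined
```

With the five points `(0,0), (1,0), (2,0), (10,0), (11,0)`, direction `(1, 0)` and query `(0.9, 0)`, a 2-neighbour search computes only two full distances before the projection gap rules out the rest.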

20.
The aims of this paper are two-fold: to define Gaussian mixture models (GMMs) of colored texture on several feature spaces and to compare the performance of these models in various classification tasks, both with each other and with other models popular in the literature. We construct GMMs over a variety of different color and texture feature spaces, with a view to the retrieval of textured color images from databases. We compare supervised classification results for different choices of color and texture features using the Vistex database, and explore the best set of features and the best GMM configuration for this task. In addition, we introduce several methods for combining the ‘color’ and ‘structure’ information in order to improve classification performance. We then apply the resulting models to the classification of texture databases and to the classification of man-made and natural areas in aerial images. We compare the GMM with other models in the literature, and show an overall improvement in performance.
