首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
The manifold regularization (MR) based semi-supervised learning could explore structural relationships from both labeled and unlabeled data. However, the model selection of MR seriously affects its predictive performance due to the inherent additional geometry regularizer of labeled and unlabeled data. In this paper, two continuous and two inherent discrete hyperparameters are selected as optimization variables, and a leave-one-out cross-validation (LOOCV) based Predicted REsidual Sum of Squares (PRESS) criterion is first presented for model selection of MR to choose appropriate regularization coefficients and kernel parameters. Considering the inherent discontinuity of the two hyperparameters, the minimization process is implemented by using a improved Nelder-Mead simplex algorithm to solve the inherent discrete and continues hybrid variables set. The manifold regularization and model selection algorithm are applied to six synthetic and real-life benchmark dataset. The proposed approach, leveraged by effectively exploiting the embedded intrinsic geometric manifolds and unbiased LOOCV estimation, outperforms the original MR and supervised learning approaches in the empirical study.  相似文献   

2.
Ranking functions are an important component of information retrieval systems. Recently there has been a surge of research in the field of “learning to rank”, which aims at using labeled training data and machine learning algorithms to construct reliable ranking functions. Machine learning methods such as neural networks, support vector machines, and least squares have been successfully applied to ranking problems, and some are already being deployed in commercial search engines.Despite these successes, most algorithms to date construct ranking functions in a supervised learning setting, which assume that relevance labels are provided by human annotators prior to training the ranking function. Such methods may perform poorly when human relevance judgments are not available for a wide range of queries. In this paper, we examine whether additional unlabeled data, which is easy to obtain, can be used to improve supervised algorithms. In particular, we investigate the transductive setting, where the unlabeled data is equivalent to the test data.We propose a simple yet flexible transductive meta-algorithm: the key idea is to adapt the training procedure to each test list after observing the documents that need to be ranked. We investigate two instantiations of this general framework: The Feature Generation approach is based on discovering more salient features from the unlabeled test data and training a ranker on this test-dependent feature-set. The importance weighting approach is based on ideas in the domain adaptation literature, and works by re-weighting the training data to match the statistics of each test list. We demonstrate that both approaches improve over supervised algorithms on the TREC and OHSUMED tasks from the LETOR dataset.  相似文献   

3.
A machine learning framework which uses unlabeled data from a related task domain in supervised classification tasks is described. The unlabeled data come from related domains, which share the same class labels or generative distribution as the labeled data. Patterns in the unlabeled data are learned via a neural network and transferred to the target domain from where the labeled data are generated, so as to improve the performance of the supervised learning task. We call this approach self-taught transfer learning from unlabeled data. We introduce a general-purpose feature learning algorithm producing features that retain information from the unlabeled data. Information preservation assures that the features obtained will be useful for improving the classification performance of the supervised tasks.  相似文献   

4.
Support vector machine (SVM) is a general and powerful learning machine, which adopts supervised manner. However, for many practical machine learning and data mining applications, unlabeled training examples are readily available but labeled ones are very expensive to be obtained. Therefore, semi-supervised learning emerges as the times require. At present, the combination of SVM and semi-supervised learning principle such as transductive learning has attracted more and more attentions. Transductive support vector machine (TSVM) learns a large margin hyperplane classifier using labeled training data, but simultaneously force this hyperplane to be far away from the unlabeled data. TSVM might seem to be the perfect semi-supervised algorithm since it combines the powerful regularization of SVMs and a direct implementation of the clustering assumption, nevertheless its objective function is non-convex and then it is difficult to be optimized. This paper aims to solve this difficult problem. We apply least square support vector machine to implement TSVM, which can ensure that the objective function is convex and the optimization solution can then be easily found by solving a set of linear equations. Simulation results demonstrate that the proposed method can exploit unlabeled data to yield good performance effectively.  相似文献   

5.
In this paper, we consider the multi-task metric learning problem, i.e., the problem of learning multiple metrics from several correlated tasks simultaneously. Despite the importance, there are only a limited number of approaches in this field. While the existing methods often straightforwardly extend existing vector-based methods, we propose to couple multiple related metric learning tasks with the von Neumann divergence. On one hand, the novel regularized approach extends previous methods from the vector regularization to a general matrix regularization framework; on the other hand and more importantly, by exploiting von Neumann divergence as the regularization, the new multi-task metric learning method has the capability to well preserve the data geometry. This leads to more appropriate propagation of side-information among tasks and provides potential for further improving the performance. We propose the concept of geometry preserving probability and show that our framework encourages a higher geometry preserving probability in theory. In addition, our formulation proves to be jointly convex and the global optimal solution can be guaranteed. We have conducted extensive experiments on six data sets (across very different disciplines), and the results verify that our proposed approach can consistently outperform almost all the current methods.  相似文献   

6.
针对标记数据不足的多标签分类问题,提出一种新的半监督Boosting算法,即基于函数梯度下降方法给出一种半监督Boosting多标签分类的框架,并将非标记数据的条件熵作为一个正则化项引入分类模型。实验结果表明,对于多标签分类问题,新的半监督Boosting算法的分类效果随着非标记数据数量的增加而显著提高,在各方面都优于传统的监督Boosting算法。  相似文献   

7.
Label Propagation through Linear Neighborhoods   总被引:8,自引:0,他引:8  
In many practical data mining applications such as text classification, unlabeled training examples are readily available, but labeled ones are fairly expensive to obtain. Therefore, semi supervised learning algorithms have aroused considerable interests from the data mining and machine learning fields. In recent years, graph-based semi supervised learning has been becoming one of the most active research areas in the semi supervised learning community. In this paper, a novel graph-based semi supervised learning approach is proposed based on a linear neighborhood model, which assumes that each data point can be linearly reconstructed from its neighborhood. Our algorithm, named linear neighborhood propagation (LNP), can propagate the labels from the labeled points to the whole data set using these linear neighborhoods with sufficient smoothness. A theoretical analysis of the properties of LNP is presented in this paper. Furthermore, we also derive an easy way to extend LNP to out-of-sample data. Promising experimental results are presented for synthetic data, digit, and text classification tasks.  相似文献   

8.
Image collections are currently widely available and are being generated in a fast pace due to mobile and accessible equipment. In principle, that is a good scenario taking into account the design of successful visual pattern recognition systems. However, in particular for classification tasks, one may need to choose which examples are more relevant in order to build a training set that well represents the data, since they often require representative and sufficient observations to be accurate. In this paper we investigated three methods for selecting relevant examples from image collections based on learning models from small portions of the available data. We considered supervised methods that need labels to allow selection, and an unsupervised method that is agnostic to labels. The image datasets studied were described using both handcrafted and deep learning features. A general purpose algorithm is proposed which uses learning methods as subroutines. We show that our relevance selection algorithm outperforms random selection, in particular when using unlabelled data in an unsupervised approach, significantly reducing the size of the training set with little decrease in the test accuracy.  相似文献   

9.
Traditional learning algorithms use only labeled data for training. However, labeled examples are often difficult or time consuming to obtain since they require substantial human labeling efforts. On the other hand, unlabeled data are often relatively easy to collect. Semisupervised learning addresses this problem by using large quantities of unlabeled data with labeled data to build better learning algorithms. In this paper, we use the manifold regularization approach to formulate the semisupervised learning problem where a regularization framework which balances a tradeoff between loss and penalty is established. We investigate different implementations of the loss function and identify the methods which have the least computational expense. The regularization hyperparameter, which determines the balance between loss and penalty, is crucial to model selection. Accordingly, we derive an algorithm that can fit the entire path of solutions for every value of the hyperparameter. Its computational complexity after preprocessing is quadratic only in the number of labeled examples rather than the total number of labeled and unlabeled examples.  相似文献   

10.
Hidden Markov model (HMM) classifier design is considered for the analysis of sequential data, incorporating both labeled and unlabeled data for training; the balance between the use of labeled and unlabeled data is controlled by an allocation parameter lambda isin (0, 1), where lambda = 0 corresponds to purely supervised HMM learning (based only on the labeled data) and lambda = 1 corresponds to unsupervised HMM-based clustering (based only on the unlabeled data). The associated estimation problem can typically be reduced to solving a set of fixed-point equations in the form of a "natural-parameter homotopy." This paper applies a homotopy method to track a continuous path of solutions, starting from a local supervised solution (lambda = 0) to a local unsupervised solution (lambda = 1). The homotopy method is guaranteed to track with probability one from lambda = 0 to lambda = 1 if the lambda = 0 solution is unique; this condition is not satisfied for the HMM since the maximum likelihood supervised solution (lambda = 0) is characterized by many local optima. A modified form of the homotopy map for HMMs assures a track from lambda = 0 to lambda = 1. Following this track leads to a formulation for selecting lambda isin (0, 1) for a semisupervised solution and it also provides a tool for selection from among multiple local-optimal supervised solutions. The results of applying the proposed method to measured and synthetic sequential data verify its robustness and feasibility compared to the conventional EM approach for semisupervised HMM training.  相似文献   

11.
Feature selection is an important step for large-scale image data analysis, which has been proved to be difficult due to large size in both dimensions and samples. Feature selection firstly eliminates redundant and irrelevant features and then chooses a subset of features that performs as efficient as the complete set. Generally, supervised feature selection yields better performance than unsupervised feature selection because of the utilization of labeled information. However, labeled data samples are always expensive to obtain, which constraints the performance of supervised feature selection, especially for the large web image datasets. In this paper, we propose a semi-supervised feature selection algorithm that is based on a hierarchical regression model. Our contribution can be highlighted as: (1) Our algorithm utilizes a statistical approach to exploit both labeled and unlabeled data, which preserves the manifold structure of each feature type. (2) The predicted label matrix of the training data and the feature selection matrix are learned simultaneously, making the two aspects mutually benefited. Extensive experiments are performed on three large-scale image datasets. Experimental results demonstrate the better performance of our algorithm, compared with the state-of-the-art algorithms.  相似文献   

12.
The problem of model selection, or determination of the number of hidden units, can be approached statistically, by generalizing Akaike's information criterion (AIC) to be applicable to unfaithful (i.e., unrealizable) models with general loss criteria including regularization terms. The relation between the training error and the generalization error is studied in terms of the number of the training examples and the complexity of a network which reduces to the number of parameters in the ordinary statistical theory of AIC. This relation leads to a new network information criterion which is useful for selecting the optimal network model based on a given training set.  相似文献   

13.
Recently, semi-supervised learning (SSL) has attracted a great deal of attention in the machine learning community. Under SSL, large amounts of unlabeled data are used to assist the learning procedure to construct a more reasonable classifier. In this paper, we propose a novel manifold proximal support vector machine (MPSVM) for semi-supervised classification. By introducing discriminant information in the manifold regularization (MR), MPSVM not only introduces MR terms to capture as much geometric information as possible from inside the data, but also utilizes the maximum distance criterion to characterize the discrepancy between different classes, leading to the solution of a pair of eigenvalue problems. In addition, an efficient particle swarm optimization (PSO)-based model selection approach is suggested for MPSVM. Experimental results on several artificial as well as real-world datasets demonstrate that MPSVM obtains significantly better performance than supervised GEPSVM, and achieves comparable or better performance than LapSVM and LapTSVM, with better learning efficiency.  相似文献   

14.
波段选择是数据降维的有效手段,但有限的标记样本影响了监督波段选择的性能。提出一种利用图Laplacian和自训练策略实现半监督波段选择的方法。该方法首先定义基于图的半监督特征评分准则以产生初始波段子集,接着在该子集基础上进行分类,采用自训练策略将部分可信度较高的非标记样本扩展至标记样本集合,再用特征评分准则对波段子集进行更新。重复该过程,获得最终波段子集。高光谱波段选择与分类实验比较了多种非监督、监督和半监督方法,实验结果表明所提算法能选择出更好的波段子集。  相似文献   

15.
This paper presents a convex optimization perspective towards the task of tuning the regularization trade-off with validation and cross-validation criteria in the context of kernel machines. We focus on the problem of tuning the regularization trade-off in the context of Least Squares Support Vector Machines (LS-SVMs) for function approximation and classification. By adopting an additive regularization trade-off scheme, the task of tuning the regularization trade-off with respect to a validation and cross-validation criterion can be written as a convex optimization problem. The solution of this problem then contains both the optimal regularization constants with respect to the model selection criterion at hand, and the corresponding training solution. We refer to such formulations as the fusion of training with model selection. The major tool to accomplish this task is found in the primal-dual derivations as occuring in convex optimization theory. The paper advances the discussion by relating the additive regularization trade-off scheme with the classical Tikhonov scheme. Motivations are given for the usefulness of the former scheme. Furthermore, it is illustrated how to restrict the additive trade-off scheme towards the solution path corresponding with a Tikhonov scheme while retaining convexity of the overall problem of fusion of model selection and training. We relate such a scheme with an ensemble learning problem and with stability of learning machines. The approach is illustrated on a number of artificial and benchmark datasets relating the proposed method with the classical practice of tuning the Tikhonov scheme with a cross-validation measure.  相似文献   

16.
Previous partially supervised classification methods can partition unlabeled data into positive examples and negative examples for a given class by learning from positive labeled examples and unlabeled examples, but they cannot further group the negative examples into meaningful clusters even if there are many different classes in the negative examples. Here we proposed an automatic method to obtain a natural partitioning of mixed data (labeled data + unlabeled data) by maximizing a stability criterion defined on classification results from an extended label propagation algorithm over all the possible values of model order (or the number of classes) in mixed data. Our experimental results on benchmark corpora for word sense disambiguation task indicate that this model order identification algorithm with the extended label propagation algorithm as the base classifier outperforms SVM, a one-class partially supervised classification algorithm, and the model order identification algorithm with semi-supervised k-means clustering as the base classifier when labeled data is incomplete.  相似文献   

17.
Semi-supervised learning has attracted a significant amount of attention in pattern recognition and machine learning. Most previous studies have focused on designing special algorithms to effectively exploit the unlabeled data in conjunction with labeled data. Our goal is to improve the classification accuracy of any given supervised learning algorithm by using the available unlabeled examples. We call this as the Semi-supervised improvement problem, to distinguish the proposed approach from the existing approaches. We design a metasemi-supervised learning algorithm that wraps around the underlying supervised algorithm and improves its performance using unlabeled data. This problem is particularly important when we need to train a supervised learning algorithm with a limited number of labeled examples and a multitude of unlabeled examples. We present a boosting framework for semi-supervised learning, termed as SemiBoost. The key advantages of the proposed semi-supervised learning approach are: 1) performance improvement of any supervised learning algorithm with a multitude of unlabeled data, 2) efficient computation by the iterative boosting algorithm, and 3) exploiting both manifold and cluster assumption in training classification models. An empirical study on 16 different data sets and text categorization demonstrates that the proposed framework improves the performance of several commonly used supervised learning algorithms, given a large number of unlabeled examples. We also show that the performance of the proposed algorithm, SemiBoost, is comparable to the state-of-the-art semi-supervised learning algorithms.  相似文献   

18.
3D shape recognition has been actively investigated in the field of computer graphics. With the rapid development of deep learning, various deep models have been introduced and achieved remarkable results. Most 3D shape recognition methods are supervised and learn only from the large amount of labeled shapes. However, it is expensive and time consuming to obtain such a large training set. In contrast to these methods, this paper studies a semi-supervised learning framework to train a deep model for 3D shape recognition by using both labeled and unlabeled shapes. Inspired by the co-training algorithm, our method iterates between model training and pseudo-label generation phases. In the model training phase, we train two deep networks based on the point cloud and multi-view representation simultaneously. In the pseudo-label generation phase, we generate the pseudo-labels of the unlabeled shapes using the joint prediction of two networks, which augments the labeled set for the next iteration. To extract more reliable consensus information from multiple representations, we propose an uncertainty-aware consistency loss function to combine the two networks into a multimodal network. This not only encourages the two networks to give similar predictions on the unlabeled set, but also eliminates the negative influence of the large performance gap between the two networks. Experiments on the benchmark ModelNet40 demonstrate that, with only 10% labeled training data, our approach achieves competitive performance to the results reported by supervised methods.  相似文献   

19.
Word sense disambiguation (WSD) is the problem of determining the right sense of a polysemous word in a certain context. This paper investigates the use of unlabeled data for WSD within a framework of semi-supervised learning, in which labeled data is iteratively extended from unlabeled data. Focusing on this approach, we first explicitly identify and analyze three problems inherently occurred piecemeal in the general bootstrapping algorithm; namely the imbalance of training data, the confidence of new labeled examples, and the final classifier generation; all of which will be considered integratedly within a common framework of bootstrapping. We then propose solutions for these problems with the help of classifier combination strategies. This results in several new variants of the general bootstrapping algorithm. Experiments conducted on the English lexical samples of Senseval-2 and Senseval-3 show that the proposed solutions are effective in comparison with previous studies, and significantly improve supervised WSD.  相似文献   

20.
We develop a supervised dimensionality reduction method, called Lorentzian discriminant projection (LDP), for feature extraction and classification. Our method represents the structures of sample data by a manifold, which is furnished with a Lorentzian metric tensor. Different from classic discriminant analysis techniques, LDP uses distances from points to their within-class neighbors and global geometric centroid to model a new manifold to detect the intrinsic local and global geometric structures of data set. In this way, both the geometry of a group of classes and global data structures can be learnt from the Lorentzian metric tensor. Thus discriminant analysis in the original sample space reduces to metric learning on a Lorentzian manifold. We also establish the kernel, tensor and regularization extensions of LDP in this paper. The experimental results on benchmark databases demonstrate the effectiveness of our proposed method and the corresponding extensions.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号