Similar Articles
(10 similar articles found)
1.
In domains like bioinformatics, information retrieval and social network analysis, one can find learning tasks where the goal consists of inferring a ranking of objects, conditioned on a particular target object. We present a general kernel framework for learning conditional rankings from various types of relational data, where rankings can be conditioned on unseen data objects. We propose efficient algorithms for conditional ranking by optimizing squared regression and ranking loss functions. We show theoretically that learning with the ranking loss is likely to generalize better than with the regression loss. Further, we prove that symmetry or reciprocity properties of relations can be efficiently enforced in the learned models. Experiments on synthetic and real-world data illustrate that the proposed methods deliver state-of-the-art performance in terms of predictive power and computational efficiency. Moreover, we also show empirically that incorporating symmetry or reciprocity properties can improve the generalization performance.
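The contrast between the two loss functions can be illustrated with a minimal sketch (plain Python on raw scores, not the paper's kernel framework; the data are invented): the squared regression loss penalizes each score's deviation from its label, while the squared pairwise ranking loss only penalizes errors in score differences, so it is invariant to a uniform shift of all scores.

```python
import itertools

def regression_loss(scores, labels):
    """Squared regression loss: penalizes each score's absolute error."""
    return sum((s - y) ** 2 for s, y in zip(scores, labels))

def ranking_loss(scores, labels):
    """Squared pairwise ranking loss: penalizes errors in score
    *differences*, so shifting every score by a constant costs nothing."""
    items = list(zip(scores, labels))
    return sum(((si - sj) - (yi - yj)) ** 2
               for (si, yi), (sj, yj) in itertools.combinations(items, 2))

labels = [3.0, 2.0, 1.0]
shifted = [4.0, 3.0, 2.0]                 # order preserved, every score off by +1
print(regression_loss(shifted, labels))   # 3.0: the shift is penalized
print(ranking_loss(shifted, labels))      # 0.0: only relative order matters
```

This is one intuition for why the ranking loss can generalize better for ranking tasks: it does not spend capacity fitting an irrelevant offset.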

2.
High dimensional data arising from diverse scientific research fields and industrial development have led to increased interest in sparse learning due to model parsimony and computational advantage. With the assumption of sparsity, many computational problems can be handled efficiently in practice. Structured sparse learning encodes the structural information of the variables and has been quite successful in numerous research fields. As various types of structure have been discovered, a corresponding range of structured regularizations has been proposed. These regularizations have greatly improved the efficacy of sparse learning algorithms through the use of specific structural information. In this article, we present a systematic review of structured sparse learning including ideas, formulations, algorithms, and applications. We present these algorithms in the unified framework of minimizing the sum of loss and penalty functions, summarize publicly accessible software implementations, and compare the computational complexity of typical optimization methods to solve structured sparse learning problems. In experiments, we present applications in unsupervised learning, for structured signal recovery and hierarchical image reconstruction, and in supervised learning in the context of a novel graph-guided logistic regression.
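As a hedged illustration of the loss-plus-penalty framework, the sketch below solves one classic structured-sparsity instance, the group lasso, with proximal gradient descent (ISTA). The data, group structure, and step size are invented for the example; the survey covers many more structures and solvers.

```python
import numpy as np

def group_soft_threshold(w, groups, t):
    """Proximal operator of the group-lasso penalty t * sum_g ||w_g||_2."""
    out = w.copy()
    for g in groups:
        norm = np.linalg.norm(w[g])
        out[g] = 0.0 if norm <= t else w[g] * (1 - t / norm)
    return out

def group_lasso(X, y, groups, lam, step, iters=500):
    """Proximal gradient (ISTA) for 0.5*||Xw - y||^2 + lam * sum_g ||w_g||_2."""
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        grad = X.T @ (X @ w - y)                       # gradient of the smooth loss
        w = group_soft_threshold(w - step * grad, groups, step * lam)
    return w

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 6))
w_true = np.array([1.0, -2.0, 0.0, 0.0, 0.0, 0.0])     # only the first group is active
y = X @ w_true
groups = [[0, 1], [2, 3], [4, 5]]
w = group_lasso(X, y, groups, lam=1.0, step=0.005, iters=2000)
print(np.round(w, 2))                                  # inactive groups shrink to (near) zero
```

The proximal operator is what encodes the structure: swapping it for a different penalty's operator changes the structure without touching the loss or the iteration.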

3.
Graph representations of data are increasingly common. Such representations arise in a variety of applications, including computational biology, social network analysis, web applications, and many others. There has been much work in recent years on developing learning algorithms for such graph data; in particular, graph learning algorithms have been developed for both classification and regression on graphs. Here we consider graph learning problems in which the goal is not to predict labels of objects in a graph, but rather to rank the objects relative to one another; for example, one may want to rank genes in a biological network by relevance to a disease, or customers in a social network by their likelihood of being interested in a certain product. We develop algorithms for such problems of learning to rank on graphs. Our algorithms build on the graph regularization ideas developed in the context of other graph learning problems, and learn a ranking function in a reproducing kernel Hilbert space (RKHS) derived from the graph. This allows us to show attractive stability and generalization properties. Experiments on several graph ranking tasks in computational biology and in cheminformatics demonstrate the benefits of our framework.
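A minimal sketch of ranking with graph regularization (assuming a toy graph and the combinatorial Laplacian; the paper's framework is more general): scores are obtained by smoothing a seed label over the graph with the regularized inverse (I + βL)^-1, which acts as a graph kernel, and objects are ranked by the resulting scores.

```python
import numpy as np

# Tiny undirected graph: two triangles {0,1,2} and {3,4,5} joined by edge (2,3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
L = np.diag(A.sum(axis=1)) - A           # combinatorial graph Laplacian

# Graph-regularized scoring: f = argmin ||f - y||^2 + beta * f' L f,
# i.e. f = (I + beta*L)^{-1} y; the inverse matrix acts as a graph kernel.
y = np.zeros(n)
y[0] = 1.0                               # node 0 is the known relevant "seed"
beta = 1.0
f = np.linalg.solve(np.eye(n) + beta * L, y)
ranking = np.argsort(-f)
print(ranking)                           # nodes in node 0's triangle rank first
```

The quadratic term f'Lf penalizes score differences across edges, so relevance diffuses from the seed to nearby nodes, which is the behavior one wants when ranking, say, genes near a known disease gene.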

4.
As many databases have been brought online, data retrieval (finding relevant data from large databases) has become a nontrivial task. A feedback-based data retrieval system was proposed to provide users with an intuitive way of expressing their preferences in queries. The system iteratively receives a partial ordering on a sample of data from the user, learns a ranking function, and returns highly ranked results according to that function. An important issue in such retrieval systems is minimizing the number of iterations, or the amount of feedback, needed to learn an accurate ranking function. This paper proposes selective sampling (or active learning) techniques for RankSVM that can be used in these retrieval systems. The proposed techniques minimize the amount of user interaction required to learn an accurate ranking function, and thus help users formulate preference queries in the data retrieval system.
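One simple instantiation of selective sampling for ranking (an illustrative heuristic, not necessarily the paper's exact criterion; data and names are invented) is to ask the user about the pairs whose current score difference is smallest, i.e. the pairs the learned function is least certain about.

```python
import numpy as np

def most_ambiguous_pairs(w, X, k=3):
    """Selective-sampling heuristic: return the k item pairs whose current
    score difference |w.(x_i - x_j)| is smallest; these are the pairs whose
    ordering the current ranking function is least certain about."""
    n = len(X)
    margins = []
    for i in range(n):
        for j in range(i + 1, n):
            margins.append((abs(float(w @ (X[i] - X[j]))), (i, j)))
    margins.sort(key=lambda t: t[0])
    return [pair for _, pair in margins[:k]]

rng = np.random.default_rng(1)
X = rng.standard_normal((6, 4))          # six candidate items, four features
w = rng.standard_normal(4)               # current linear ranking function
print(most_ambiguous_pairs(w, X, k=2))   # pairs to show the user next
```

Feedback on these near-tie pairs is the most informative for the next RankSVM update, which is how the iteration count is kept low.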

5.
Ranking is a core problem for information retrieval since the performance of the search system is directly impacted by the accuracy of ranking results. Ranking model construction has been the focus of both the fields of information retrieval and machine learning, and learning to rank in particular has attracted much interest. Many ranking models have been proposed; for example, RankSVM is a state-of-the-art method for learning to rank and has been empirically demonstrated to be effective. However, most of the proposed methods do not account for the significant differences between queries and resort to a single ranking function for all of them. In this paper, we present a novel ranking model named QoRank, which performs the learning task dependent on queries. We also propose an LSE (least-squares estimation)-based weighted method to aggregate the ranking lists produced by base decision functions as the final ranking. Comparison of QoRank with other ranking techniques is conducted, and several evaluation criteria are employed to evaluate its performance. Experimental results on the LETOR OHSUMED data set show that QoRank strikes a good balance of accuracy and complexity, and outperforms the baseline methods. © 2010 Wiley Periodicals, Inc.
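A minimal sketch of LSE-based aggregation (the data and base rankers are hypothetical): solve a least-squares problem for the weights that best reproduce a target scoring from the base decision functions' scores, then rank by the weighted combination.

```python
import numpy as np

def lse_weights(base_scores, target):
    """Least-squares weights for aggregating base ranking functions:
    w = argmin_w ||S w - target||_2, solved with numpy's lstsq."""
    S = np.column_stack(base_scores)
    w, *_ = np.linalg.lstsq(S, target, rcond=None)
    return w

# Hypothetical data: two base rankers score four documents, and the target
# utility happens to be an exact mixture of the two.
s1 = np.array([3.0, 2.0, 1.0, 0.0])
s2 = np.array([0.0, 1.0, 0.0, 1.0])
target = 2.0 * s1 + 0.5 * s2
w = lse_weights([s1, s2], target)
print(np.round(w, 3))                      # the mixture weights are recovered
aggregate = np.column_stack([s1, s2]) @ w  # final aggregated scores
print(np.argsort(-aggregate))              # final ranking from the combination
```

The same least-squares machinery works when the target is only approximately in the span of the base scores; the weights then give the best aggregate in the squared-error sense.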

6.
Linear RankSVM is one of the widely used methods for learning to rank. Existing methods, such as the Trust-Region Newton (TRON) method combined with an Order-Statistic Tree (OST), can train linear RankSVM effectively. However, training time becomes unacceptably long when any existing method is applied to large-scale linear RankSVM. To solve this problem, we focus on designing an efficient distributed method (named DLRankSVM) to train very large-scale linear RankSVM on distributed systems. First, to reduce communication overhead, we divide the training problem into subproblems by query. Second, we propose an efficient heuristic algorithm to address the load-balancing issue (which is an NP-complete problem). Third, using OST, we propose an efficient parallel algorithm (named PAV) to compute auxiliary variables at each computational node of the distributed system. Finally, based on PAV and the proposed heuristic algorithm, we develop DLRankSVM within the TRON framework. Extensive empirical evaluations show that DLRankSVM not only obtains impressive speedups on both multi-core and distributed systems, but also predicts as well as other state-of-the-art methods. To the best of our knowledge, this is the first work to train very large-scale linear RankSVM in a distributed fashion.
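The load-balancing step can be illustrated with the classic longest-processing-time greedy heuristic (an assumption for illustration; the paper proposes its own heuristic, and the query workloads here are invented): assign each query's workload to the currently least-loaded node, largest queries first.

```python
import heapq

def balance_queries(query_sizes, n_nodes):
    """Greedy longest-processing-time heuristic for the (NP-complete)
    load-balancing problem: sort queries by workload, then repeatedly
    assign the next query to the currently least-loaded node."""
    heap = [(0, node, []) for node in range(n_nodes)]   # (load, node, queries)
    heapq.heapify(heap)
    for q, size in sorted(query_sizes.items(), key=lambda kv: -kv[1]):
        load, node, qs = heapq.heappop(heap)            # least-loaded node
        qs.append(q)
        heapq.heappush(heap, (load + size, node, qs))
    return sorted(heap)                                 # (load, node, queries) per node

for load, node, qs in balance_queries({"q1": 9, "q2": 7, "q3": 6, "q4": 5, "q5": 4}, 2):
    print(node, load, qs)
```

Partitioning by query makes this possible in the first place: RankSVM's pairwise loss never compares documents across queries, so each query's subproblem is independent and no pairwise data crosses node boundaries.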

7.
We consider the problem of learning the ranking function that maximizes a generalization of the Wilcoxon-Mann-Whitney statistic on the training data. Relying on an ε-accurate approximation of the error function, we reduce the computational complexity of each iteration of a conjugate gradient algorithm for learning ranking functions from O(m²) to O(m), where m is the number of training samples. Experiments on public benchmarks for ordinal regression and collaborative filtering indicate that the proposed algorithm is as accurate as the best available methods in terms of ranking accuracy, when the algorithms are trained on the same data. However, since it is several orders of magnitude faster than the current state-of-the-art approaches, it is able to leverage much larger training datasets.
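A sketch of the underlying idea (using a sigmoid surrogate for illustration; the paper works with an ε-accurate approximation of the error function): the WMW statistic counts correctly ordered positive-negative pairs, and replacing the 0/1 step with a smooth function makes the objective amenable to gradient-based learning such as conjugate gradient.

```python
import math

def wmw(scores, labels):
    """Wilcoxon-Mann-Whitney statistic: the fraction of (positive, negative)
    pairs that the scores order correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    correct = sum(1 for p in pos for q in neg if p > q)
    return correct / (len(pos) * len(neg))

def smoothed_wmw(scores, labels, sigma=1.0):
    """Differentiable surrogate: the 0/1 step is replaced by a sigmoid,
    so the objective has gradients a conjugate gradient method can use."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    smooth = sum(1.0 / (1.0 + math.exp(-(p - q) / sigma)) for p in pos for q in neg)
    return smooth / (len(pos) * len(neg))

scores = [2.0, 1.5, 0.2, -1.0]
labels = [1, 1, 0, 0]
print(wmw(scores, labels))                       # 1.0: every pair correctly ordered
print(round(smoothed_wmw(scores, labels), 3))    # smooth value approaching 1
```

The naive double loop above is the O(m²) cost per evaluation that the paper's fast summation removes.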

8.
Learning to rank has long been one of the research hotspots in machine learning. As a common model for learning to rank, linear RankSVM has received sustained attention from researchers. Although existing linear RankSVM methods have been studied fairly effectively, the training time for large-scale linear RankSVM remains unacceptably long. Analysis of the current state-of-the-art algorithm, Tree-TRON, shows that training a linear RankSVM model with the Trust Region Newton Method (TRON) involves a large number of Hessian-vector products, and that computing each Hessian-vector product in turn requires computing many auxiliary variables and matrix operations. To effectively accelerate the computations associated with Hessian-vector products, this paper proposes an efficient parallel algorithm (named PRankSVM) for multi-core systems to speed up the training of large-scale linear RankSVM. PRankSVM has two main features: (1) the training data are partitioned into subproblems by query; (2) on a multi-core system, multiple cores are used to accelerate the computation of the auxiliary variables and the related matrices. Experimental analysis shows that, compared with existing algorithms such as Tree-TRON, PRankSVM not only effectively improves training speed but also maintains prediction accuracy.
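The auxiliary variables can be illustrated with a brute-force O(n²) loop plus per-query parallelism (a simplified stand-in for the order-statistic-tree computation; the data and margin convention are invented for the example).

```python
from concurrent.futures import ThreadPoolExecutor

def aux_vars(docs):
    """Brute-force version of RankSVM-style auxiliary variables: for each
    document, count the higher-relevance documents whose score does not beat
    it by a margin of 1, and sum their scores.  (Order-statistic trees
    compute such quantities in O(n log n); this O(n^2) loop only shows what
    is being computed.)"""
    counts, sums = [], []
    for y_i, s_i in docs:
        sel = [s_j for y_j, s_j in docs if y_j > y_i and s_j - s_i < 1.0]
        counts.append(len(sel))
        sums.append(sum(sel))
    return counts, sums

def per_query_aux(queries, workers=4):
    # Queries are independent, so their auxiliary variables can be computed
    # concurrently, mirroring the per-query parallelism described above.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(queries, pool.map(lambda q: aux_vars(queries[q]), queries)))

queries = {"q1": [(2, 0.9), (1, 0.5), (0, 0.4)],   # (relevance, current score)
           "q2": [(1, 0.2), (0, 0.8)]}
print(per_query_aux(queries))
```

Because no pair crosses a query boundary, the per-query results can simply be concatenated to assemble the full Hessian-vector product.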

9.
Ranking functions are an important component of information retrieval systems. Recently there has been a surge of research in the field of "learning to rank", which aims at using labeled training data and machine learning algorithms to construct reliable ranking functions. Machine learning methods such as neural networks, support vector machines, and least squares have been successfully applied to ranking problems, and some are already being deployed in commercial search engines. Despite these successes, most algorithms to date construct ranking functions in a supervised learning setting, which assumes that relevance labels are provided by human annotators prior to training the ranking function. Such methods may perform poorly when human relevance judgments are not available for a wide range of queries. In this paper, we examine whether additional unlabeled data, which is easy to obtain, can be used to improve supervised algorithms. In particular, we investigate the transductive setting, where the unlabeled data is equivalent to the test data. We propose a simple yet flexible transductive meta-algorithm: the key idea is to adapt the training procedure to each test list after observing the documents that need to be ranked. We investigate two instantiations of this general framework: the Feature Generation approach is based on discovering more salient features from the unlabeled test data and training a ranker on this test-dependent feature set; the Importance Weighting approach is based on ideas from the domain adaptation literature, and works by re-weighting the training data to match the statistics of each test list. We demonstrate that both approaches improve over supervised algorithms on the TREC and OHSUMED tasks from the LETOR dataset.
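The importance-weighting instantiation can be sketched as a density-ratio reweighting (a deliberately simplified one-dimensional Gaussian version on a single invented feature; the paper's method matches richer statistics of each test list): each training example is weighted by an estimate of p_test(x)/p_train(x).

```python
import math

def gaussian(x, mu, var):
    """Univariate Gaussian density."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def importance_weights(train_feats, test_feats):
    """Importance weighting as in domain adaptation: weight each training
    example by an estimated density ratio p_test(x)/p_train(x), here fit
    with one Gaussian per side on a single feature."""
    def stats(xs):
        mu = sum(xs) / len(xs)
        var = sum((x - mu) ** 2 for x in xs) / len(xs)
        return mu, var
    mu_tr, var_tr = stats(train_feats)
    mu_te, var_te = stats(test_feats)
    return [gaussian(x, mu_te, var_te) / gaussian(x, mu_tr, var_tr)
            for x in train_feats]

train = [0.0, 0.5, 1.0, 1.5, 2.0]     # training documents' feature values
test = [1.5, 2.0, 2.5]                # this test list's feature distribution is shifted
w = importance_weights(train, test)
print([round(v, 2) for v in w])       # documents near the test region get more weight
```

Retraining the ranker with these weights biases it toward the region of feature space the current test list actually occupies, which is the transductive adaptation the abstract describes.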

10.
Recently developed methods for learning sparse classifiers are among the state-of-the-art in supervised learning. These methods learn classifiers that incorporate weighted sums of basis functions with sparsity-promoting priors encouraging the weight estimates to be either significantly large or exactly zero. From a learning-theoretic perspective, these methods control the capacity of the learned classifier by minimizing the number of basis functions used, resulting in better generalization. This paper presents three contributions related to learning sparse classifiers. First, we introduce a true multiclass formulation based on multinomial logistic regression. Second, by combining a bound optimization approach with a component-wise update procedure, we derive fast exact algorithms for learning sparse multiclass classifiers that scale favorably in both the number of training samples and the feature dimensionality, making them applicable even to large data sets in high-dimensional feature spaces. To the best of our knowledge, these are the first algorithms to perform exact multinomial logistic regression with a sparsity-promoting prior. Third, we show how nontrivial generalization bounds can be derived for our classifier in the binary case. Experimental results on standard benchmark data sets attest to the accuracy, sparsity, and efficiency of the proposed methods.
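A hedged sketch of the component-wise bound-optimization idea in the binary case (not the authors' exact multinomial algorithm, and with invented data): each coordinate minimizes a quadratic upper bound on the logistic loss, using the fact that the logistic curvature is bounded by 1/4, and is then soft-thresholded to enforce the sparsity-promoting prior.

```python
import numpy as np

def soft(z, t):
    """Scalar soft-thresholding: the proximal map of an L1 prior."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def sparse_logistic(X, y, lam, sweeps=50):
    """Component-wise sparse logistic regression: each coordinate takes a
    step against a quadratic upper bound on the logistic loss (curvature
    bounded by 1/4 per sample), then is soft-thresholded."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(sweeps):
        for j in range(d):
            p = 1.0 / (1.0 + np.exp(-(X @ w)))     # probabilities under current w
            grad_j = X[:, j] @ (p - y)             # partial derivative of the loss
            h_j = 0.25 * (X[:, j] @ X[:, j])       # bound on the curvature
            w[j] = soft(w[j] - grad_j / h_j, lam / h_j)
        # (recomputing p for every coordinate keeps the sketch simple, not fast)
    return w

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
w_true = np.array([2.0, -2.0, 0.0, 0.0, 0.0])
y = (X @ w_true > 0).astype(float)
w = sparse_logistic(X, y, lam=5.0)
print(np.round(w, 2))                              # irrelevant coordinates shrink to zero
```

Because the quadratic is a true upper bound, every coordinate update decreases the penalized objective, which is the monotonicity guarantee that bound optimization provides.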
