Similar literature
 Found 20 similar documents; search took 31 ms
1.
2.
We present a meta-learning method to support selection of candidate learning algorithms. It uses a k-Nearest Neighbor algorithm to identify the datasets that are most similar to the one at hand. The distance between datasets is assessed using a relatively small set of data characteristics, which was selected to represent properties that affect algorithm performance. The performance of the candidate algorithms on those datasets is used to generate a recommendation to the user in the form of a ranking. The performance is assessed using a multicriteria evaluation measure that takes not only accuracy, but also time into account. As it is not common in Machine Learning to work with rankings, we had to identify and adapt existing statistical techniques to devise an appropriate evaluation methodology. Using that methodology, we show that the meta-learning method presented leads to significantly better rankings than the baseline ranking method. The evaluation methodology is general and can be adapted to other ranking problems. Although here we have concentrated on ranking classification algorithms, the meta-learning framework presented can provide assistance in the selection of combinations of methods or more complex problem solving strategies.
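
The nearest-neighbor recommendation described in this abstract might be sketched as follows. This is only an illustration under stated assumptions: the Euclidean distance, the averaging of ranks, the meta-feature values, and the algorithm names (`c5`, `knn`, `svm`) are all hypothetical, and the abstract's multicriteria accuracy-and-time measure is assumed to have already been turned into per-dataset ranks.

```python
import math

def recommend_ranking(target_features, meta_db, k=2):
    """Rank algorithms for a new dataset from its k most similar datasets.

    meta_db is a list of (meta_features, {algorithm: rank}) pairs.
    Distance and aggregation choices are assumptions for illustration.
    """
    # Pick the k datasets whose meta-features are closest to the target.
    nearest = sorted(
        meta_db, key=lambda entry: math.dist(target_features, entry[0])
    )[:k]
    algorithms = list(nearest[0][1])
    # Average each algorithm's rank over the selected neighbours.
    mean_rank = {
        a: sum(ranks[a] for _, ranks in nearest) / len(nearest)
        for a in algorithms
    }
    # Lower mean rank means a better recommendation.
    return sorted(algorithms, key=lambda a: mean_rank[a])

# Hypothetical meta-database: two meta-features per dataset.
meta_db = [
    ((0.1, 0.9), {"c5": 1, "knn": 2, "svm": 3}),
    ((0.2, 0.8), {"c5": 3, "knn": 1, "svm": 2}),
    ((0.9, 0.1), {"c5": 2, "knn": 3, "svm": 1}),
]

recommended = recommend_ranking((0.15, 0.9), meta_db, k=2)
print(recommended)
```

With `k=2` only the first two datasets contribute, so the third dataset's preference for `svm` has no influence on the recommendation.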

3.
Preference learning is an emerging topic that appears in different guises in the recent literature. This work focuses on a particular learning scenario called label ranking, where the problem is to learn a mapping from instances to rankings over a finite number of labels. Our approach for learning such a mapping, called ranking by pairwise comparison (RPC), first induces a binary preference relation from suitable training data using a natural extension of pairwise classification. A ranking is then derived from the preference relation thus obtained by means of a ranking procedure, whereby different ranking methods can be used for minimizing different loss functions. In particular, we show that a simple (weighted) voting strategy minimizes risk with respect to the well-known Spearman rank correlation. We compare RPC to existing label ranking methods, which are based on scoring individual labels instead of comparing pairs of labels. Both empirically and theoretically, it is shown that RPC is superior in terms of computational efficiency, and at least competitive in terms of accuracy.
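
The weighted voting step mentioned in this abstract can be illustrated with a small sketch. It assumes that each pairwise model outputs a probability that one label is preferred to another; the labels and probabilities below are invented for illustration and do not come from the paper.

```python
def complete_preferences(pairwise):
    """Add the reverse direction: P(b preferred to a) = 1 - P(a preferred to b)."""
    full = dict(pairwise)
    for (a, b), p in pairwise.items():
        full[(b, a)] = 1.0 - p
    return full

def rank_by_weighted_voting(labels, pref):
    """Score each label by the sum of its pairwise preference degrees."""
    scores = {
        a: sum(pref[(a, b)] for b in labels if b != a) for a in labels
    }
    # Higher total vote means a better (earlier) position.
    ranking = sorted(labels, key=lambda a: -scores[a])
    return ranking, scores

labels = ["A", "B", "C"]
# Hypothetical outputs of the pairwise models for one instance.
pref = complete_preferences({("A", "B"): 0.9, ("A", "C"): 0.6, ("B", "C"): 0.7})

ranking, scores = rank_by_weighted_voting(labels, pref)
print(ranking, scores)
```

Because the ranking is derived only from the predicted preferences, a different ranking procedure could be swapped in here without retraining the pairwise models.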

4.
Ranking items is an essential problem in recommendation systems. Since comparing two items is the simplest type of query for measuring the relevance of items, the problem of aggregating pairwise comparisons to obtain a global ranking has been widely studied. Furthermore, ranking with pairwise comparisons has recently received a lot of attention in crowdsourcing systems, where binary comparative queries can be used effectively to make assessments faster for precise rankings. In order to learn a ranking based on a training set of queries and their labels obtained from annotators, machine learning algorithms are generally used to find the ranking model that best describes the data set. In this paper, we propose a probabilistic model for learning multiple latent rankings by using pairwise comparisons. Our novel model can capture multiple hidden rankings underlying the pairwise comparisons. Based on the model, we develop an efficient inference algorithm to learn multiple latent rankings as well as an effective inference algorithm for active learning to update the model parameters in crowdsourcing systems whenever new pairwise comparisons are supplied. The performance study with synthetic and real-life data sets confirms the effectiveness of our model and inference algorithms.

5.
We discuss automatic rule generation techniques for learning relational properties of 2D visual patterns and 3D objects from training samples where the observed feature values are continuous. In particular, we explore a conditional rule generation method that defines patterns (or objects) in terms of ordered lists of bounds on unary (pattern part) and binary (part relation) features. The technique, termed conditional rule generation, was developed to integrate relational structure representations of patterns and the generalization characteristics of evidence-based systems. We show how this technique can be used for recognition of complex patterns and of objects in scenes. Further, we show the extent to which the learned rules can identify patterns and objects that have undergone nonrigid distortions.

6.
Feature rankings are often used for supervised dimension reduction, especially when the discriminating power of each feature is of interest, the dimensionality of the dataset is extremely high, or computational power is too limited for more complicated methods. In practice, it is recommended to start dimension reduction with simple methods such as feature rankings before applying more complex approaches. Single variable classifier (SVC) ranking is a feature ranking based on the predictive performance of a classifier built using only a single feature. While benefiting from the capabilities of classifiers, this ranking method is not as computationally intensive as wrappers. In this paper, we report the results of an extensive study on the bias and stability of this feature ranking method. We study whether the classifiers influence the SVC rankings or whether the discriminative power of the features themselves has a dominant impact on the final rankings. We show that the common intuition of using the same classifier for feature ranking and final classification does not always result in the best prediction performance. We then study whether heterogeneous classifier ensemble approaches provide less biased rankings and whether they improve final classification performance. Furthermore, we calculate the empirical loss in prediction performance, relative to the optimal choices, incurred by using the same classifier for SVC feature ranking and final classification.

7.
Many multi-class classification algorithms in statistics and machine learning typically combine several binary classifiers in order to construct an overall classifier. In the popular pairwise ensemble, one classifier is built for each pair of classes, resulting in pairwise bipartite rankings. In contrast, ordinal regression algorithms consider a single ranking function for several ordered classes. It is known in the literature that pairwise ensembles can be useful for ordinal regression. However, can single ranking models make a contribution to multi-class classification? The answer to this question should be affirmative, as supported by theoretical results presented in this article. We conduct a formal analysis of the consistency of pairwise bipartite rankings by uncovering the conditions under which they can be equivalently expressed in terms of a single ranking. Similar to the utility representability of pairwise preference relations, it turns out that transitivity plays a crucial role in the characterization of the ranking representability of pairwise bipartite rankings. To this end, we introduce the new concepts of strict ranking representability, a restrictive condition that can be verified easily, and AUC ranking representability, a practically more useful condition that is more difficult to verify. However, the link between pairwise bipartite rankings and dice games allows us to formulate necessary transitivity conditions for AUC ranking representability. A sufficient condition, on the other hand, is obtained by introducing a new type of transitivity that can be verified by solving an integer quadratic program.

8.
9.
Query-dependent cross-domain ranking in heterogeneous network   Total citations: 1, self-citations: 1, citations by others: 0
The traditional learning-to-rank problem mainly focuses on a single type of object. However, with the rapid growth of Web 2.0, ranking over multiple interrelated and heterogeneous objects has become a common situation, e.g., in the heterogeneous academic network. In this scenario, one may have much training data for some types of objects (e.g., conferences) but very little for the types of objects of interest (e.g., authors). Thus, the two important questions are: (1) Given a networked data set, how could one borrow supervision from other types of objects in order to build an accurate ranking model for the objects of interest, which have insufficient supervision? (2) If there are links between different objects, how can we exploit their relationships for improved ranking performance? In this work, we first propose a regularized framework called HCDRank to simultaneously minimize two loss functions related to these two domains. Then, we extend the approach by exploiting the link information between heterogeneous objects. We conduct a theoretical analysis of the proposed approach and derive its generalization bound to demonstrate how the two related domains could help each other in learning ranking functions. Experimental results on three different genres of data sets demonstrate the effectiveness of the proposed approaches.

10.
Graph representations of data are increasingly common. Such representations arise in a variety of applications, including computational biology, social network analysis, web applications, and many others. There has been much work in recent years on developing learning algorithms for such graph data; in particular, graph learning algorithms have been developed for both classification and regression on graphs. Here we consider graph learning problems in which the goal is not to predict labels of objects in a graph, but rather to rank the objects relative to one another; for example, one may want to rank genes in a biological network by relevance to a disease, or customers in a social network by their likelihood of being interested in a certain product. We develop algorithms for such problems of learning to rank on graphs. Our algorithms build on the graph regularization ideas developed in the context of other graph learning problems, and learn a ranking function in a reproducing kernel Hilbert space (RKHS) derived from the graph. This allows us to show attractive stability and generalization properties. Experiments on several graph ranking tasks in computational biology and in cheminformatics demonstrate the benefits of our framework.

11.
We study the problem of label ranking, a machine learning task that consists of inducing a mapping from instances to rankings over a finite number of labels. Our learning method, referred to as ranking by pairwise comparison (RPC), first induces pairwise order relations (preferences) from suitable training data, using a natural extension of so-called pairwise classification. A ranking is then derived from a set of such relations by means of a ranking procedure. In this paper, we first elaborate on a key advantage of such a decomposition, namely the fact that it allows the learner to adapt to different loss functions without re-training, by using different ranking procedures on the same predicted order relations. In this regard, we distinguish between two types of errors, called, respectively, ranking error and position error. Focusing on the position error, which has received less attention so far, we then propose a ranking procedure called ranking through iterated choice as well as an efficient pairwise implementation thereof. Apart from a theoretical justification of this procedure, we offer empirical evidence in favor of its superior performance as a risk minimizer for the position error.

12.
The problem of “learning to rank” is a popular research topic in the Information Retrieval (IR) and machine learning communities. Some existing list-wise methods, such as AdaRank, directly use the IR measures as performance functions to quantify how well a ranking function can predict rankings. However, the IR measures only account for the document ranks, and do not consider how well the algorithm predicts the relevance scores of documents. These methods do not make the best use of the available prior knowledge and may lead to suboptimal performance. Hence, we conduct research by combining both the document ranks and relevance scores. We propose a novel performance function that encodes the relevance scores. We also define performance functions by combining our proposed one with MAP or NDCG, respectively. The experimental results on the benchmark data collections show that our methods can significantly outperform the state-of-the-art AdaRank baselines.

13.
Petković  Matej  Džeroski  Sašo  Kocev  Dragi 《Machine Learning》2020,109(11):2141-2159

In this paper, we propose three ensemble-based feature ranking scores for multi-label classification (MLC), which is a generalisation of multi-class classification where the classes are not mutually exclusive. Each of the scores (Symbolic, Genie3 and Random forest) can be computed from three different ensembles of predictive clustering trees: Bagging, Random forest and Extra trees. We extensively evaluate the proposed scores on 24 benchmark MLC problems, using 15 standard MLC evaluation measures. We determine the ranking quality saturation points in terms of the ensemble sizes, for each ranking-ensemble pair, and show that quality rankings can be computed really efficiently (typically 10 or 50 trees suffice). We also show that the proposed feature rankings are relevant and determine the most appropriate ensemble method for every feature ranking score. We empirically prove that the proposed feature ranking scores outperform current state-of-the-art methods in the quality of the rankings (for the majority of the evaluation measures), and in time efficiency. Finally, we determine the best performing feature ranking scores. Taking into account the quality of the rankings first and—in the case of ties—time efficiency, we identify the Genie3 feature ranking score as the optimal one.

14.
Lin HT  Li L 《Neural computation》2012,24(5):1329-1367
We present a reduction framework from ordinal ranking to binary classification. The framework consists of three steps: extracting extended examples from the original examples, learning a binary classifier on the extended examples with any binary classification algorithm, and constructing a ranker from the binary classifier. Based on the framework, we show that a weighted 0/1 loss of the binary classifier upper-bounds the mislabeling cost of the ranker, both error-wise and regret-wise. Our framework allows not only the design of good ordinal ranking algorithms based on well-tuned binary classification approaches, but also the derivation of new generalization bounds for ordinal ranking from known bounds for binary classification. In addition, our framework unifies many existing ordinal ranking algorithms, such as perceptron ranking and support vector ordinal regression. When compared empirically on benchmark data sets, some of our newly designed algorithms enjoy advantages in terms of both training speed and generalization performance over existing algorithms. In addition, the newly designed algorithms lead to better cost-sensitive ordinal ranking performance, as well as improved listwise ranking performance.
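
The three steps named in this abstract might look roughly like the sketch below. The abstract does not spell out the encoding, so the threshold question "is the rank greater than k?" is an assumption here, the per-example weights of the weighted 0/1 loss are omitted, and the learning step is replaced by a hand-written stand-in classifier (`toy_classifier`, hypothetical) so that the block stays self-contained.

```python
def extend_example(x, y, num_ranks):
    """Step 1: turn one ranked example into num_ranks - 1 binary examples.

    Each extended example ((x, k), z) asks whether the rank y exceeds k.
    """
    return [((x, k), 1 if y > k else -1) for k in range(1, num_ranks)]

def make_ranker(binary_classifier, num_ranks):
    """Step 3: build a ranker by counting positive threshold answers."""
    def ranker(x):
        votes = sum(
            1 for k in range(1, num_ranks) if binary_classifier(x, k) > 0
        )
        return 1 + votes
    return ranker

def toy_classifier(x, k):
    """Stand-in for step 2 (normally learned from the extended examples)."""
    return 1 if x > k + 0.5 else -1

extended = extend_example(2.7, 3, num_ranks=4)
ranker = make_ranker(toy_classifier, num_ranks=4)
print(extended)
print(ranker(2.7))
```

Any binary learner trained on the extended examples could take the place of `toy_classifier`, which is the point of the reduction.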

15.
In recent years the analysis of preference rankings has become an increasingly important topic. One of the most important tasks in dealing with preference rankings is the identification of the median ranking, namely the ranking that best represents the preferences of a population of judges. This task is known by several alternative names, such as the rank aggregation problem, the consensus ranking problem, and the social choice problem. In this paper we propose a Differential Evolution algorithm for Consensus Ranking detection (DECoR) within Kemeny's axiomatic framework. The algorithm works with full, partial and incomplete rankings. A simulation study shows that our proposal is particularly suitable when working with a very large number of objects to be ranked, because it is accurate and also faster than other proposals. Some applications to real data sets show the practical utility of our proposal in helping users make decisions.
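
To make the notion of a median ranking concrete, the sketch below finds it by exhaustive search for a handful of items. This is not the DECoR algorithm: it is a brute-force baseline that assumes full rankings without ties and uses the number of pairwise disagreements (Kendall tau distance) as the distance between rankings. The judges' rankings are invented.

```python
from itertools import combinations, permutations

def kendall_tau_distance(r1, r2):
    """Count item pairs that the two rankings order differently."""
    pos1 = {item: i for i, item in enumerate(r1)}
    pos2 = {item: i for i, item in enumerate(r2)}
    return sum(
        1
        for a, b in combinations(r1, 2)
        if (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0
    )

def median_ranking(rankings):
    """Return the permutation minimizing total distance to all judges.

    Exhaustive search: only practical for a small number of items.
    """
    items = sorted(rankings[0])
    best = min(
        permutations(items),
        key=lambda cand: sum(kendall_tau_distance(cand, r) for r in rankings),
    )
    return list(best)

# Hypothetical rankings given by three judges (best item first).
judges = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
consensus = median_ranking(judges)
print(consensus)
```

The search space grows factorially with the number of items, which is why heuristic methods are of interest for large problems.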

16.
Much like relational probabilistic models, the need for relational preference models naturally arises in real-world applications involving multiple, heterogeneous, and richly interconnected objects. On the one hand, relational preferences should be represented by statements which are natural for human users to express. On the other hand, relational preference models should be endowed with a structure that supports tractable forms of reasoning and learning. Based on these criteria, this paper introduces the framework of relational conditional preference networks (RCP-nets), which maintains the spirit of the popular “CP-nets” by expressing relational preferences in a natural way using the ceteris paribus semantics. We show that acyclic RCP-nets support tractable inference for optimization and ranking tasks. In addition, we show that in the online learning model, tree-structured RCP-nets (with bipartite orderings) are efficiently learnable from both optimization tasks and ranking tasks, using linear loss functions. Our results are corroborated by experiments on a large-scale movie recommendation dataset.

17.

In this paper we present a novel moment-based skeleton detection method for representing human objects in RGB-D videos with animated 3D skeletons. An object often consists of several parts, each of which can be concisely represented with a skeleton. However, it remains a challenge to detect the skeletons of individual objects in an image, since this requires an effective part detector and a part merging algorithm to group parts into objects. In this paper, we present a novel fully unsupervised learning framework to detect the skeletons of human objects in an RGB-D video. The skeleton modeling algorithm uses a pipeline architecture which consists of a series of cascaded operations, i.e., symmetry patch detection, linear time search of symmetry patch pairs, part and symmetry detection, symmetry graph partitioning, and object segmentation. The properties of geometric moment-based functions for embedding symmetry features into centers of symmetry patches are also investigated in detail. Compared with state-of-the-art deep learning approaches for skeleton detection, the proposed approach does not require tedious human labeling work on training images to locate the skeleton pixels and their associated scale information. Although our algorithm can detect parts and objects simultaneously, a pre-learned convolutional neural network (CNN) can be used to locate the human object in each frame of the input RGB-D video in order to support real-time applications. This greatly reduces the complexity of detecting the skeleton structure of individual human objects with our proposed method. Using the segmented human object skeleton model, a video surveillance application is constructed to verify the effectiveness of the approach. Experimental results show that the proposed method gives good performance in terms of detection and recognition using publicly available datasets.

18.
Various meta-modeling techniques have been developed to replace computationally expensive simulation models. The performance of these meta-modeling techniques varies across models, which makes existing model selection/recommendation approaches (e.g., trial-and-error, ensemble) problematic. To address these research gaps, we propose a general meta-modeling recommendation system using meta-learning, which can automate the meta-modeling recommendation process by intelligently adapting the learning bias to problem characterizations. The proposed intelligent recommendation system includes four modules: (1) a problem module, (2) a meta-feature module, which includes a comprehensive set of meta-features to characterize the geometrical properties of problems, (3) a meta-learner module, which compares the performance of instance-based and model-based learning approaches for optimal framework design, and (4) a performance evaluation module, which introduces two criteria, Spearman's ranking correlation coefficient and hit ratio, to evaluate the system on the accuracy of model ranking prediction and the precision of the best model recommendation, respectively. To further improve the performance of meta-learning for meta-modeling recommendation, different types of feature reduction techniques, including singular value decomposition, stepwise regression and ReliefF, are studied. Experiments show that our proposed framework is able to achieve 94% correlation on model rankings and a 91% hit ratio on best model recommendation. Moreover, the computational cost of meta-modeling recommendation is significantly reduced, from the order of minutes to seconds, compared to traditional trial-and-error and ensemble processes. The proposed framework can significantly advance the research in meta-modeling recommendation, and can be applied to data-driven system modeling.
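
The two evaluation criteria named in this abstract can be illustrated with a short sketch. The Spearman formula below is the standard one for rankings without ties; the hit-ratio definition (fraction of problems whose recommended best model matches the true best model) is an assumption, since the abstract does not define it, and the model names are hypothetical.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two tie-free rankings of n items."""
    n = len(rank_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))

def hit_ratio(predicted_best, true_best):
    """Fraction of problems where the recommended best model is correct."""
    hits = sum(p == t for p, t in zip(predicted_best, true_best))
    return hits / len(true_best)

# Predicted versus true ranks of four meta-models on one problem.
rho = spearman_rho([1, 2, 3, 4], [2, 1, 3, 4])

# Recommended versus true best meta-model on four problems.
ratio = hit_ratio(
    ["kriging", "rbf", "svr", "kriging"],
    ["kriging", "rbf", "kriging", "kriging"],
)
print(rho, ratio)
```

Here swapping the top two models costs 0.2 in correlation, while one wrong recommendation out of four gives a hit ratio of 0.75.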

19.
Learning to rank is a supervised learning problem that aims to construct a ranking model for the given data. The most common application of learning to rank is to rank a set of documents against a query. In this work, we focus on point-wise learning to rank, where the model learns the ranking values. Multivariate adaptive regression splines (MARS) and conic multivariate adaptive regression splines (CMARS) are supervised learning techniques that have been shown to provide successful results on various prediction problems. In this article, we investigate the effectiveness of MARS and CMARS for the point-wise learning-to-rank problem. The prediction performance is analyzed in comparison to three well-known supervised learning methods, artificial neural network (ANN), support vector machine, and random forest, on two datasets under a variety of metrics including accuracy, stability, and robustness. The experimental results show that MARS and ANN are effective methods for the learning-to-rank problem and provide promising results.

20.
Finite Mixture Regression (FMR) refers to the mixture modeling scheme which learns multiple regression models from the training data set. Each of them is in charge of a subset. FMR is an effective scheme for handling sample heterogeneity, where a single regression model is not enough for capturing the complexities of the conditional distribution of the observed samples given the features. In this paper, we propose an FMR model that (1) finds sample clusters and jointly models multiple incomplete mixed-type targets simultaneously, (2) achieves shared feature selection among tasks and cluster components, and (3) detects anomaly tasks or clustered structure among tasks, and accommodates outlier samples. We provide non-asymptotic oracle performance bounds for our model under a high-dimensional learning framework. The proposed model is evaluated on both synthetic and real-world data sets. The results show that our model can achieve state-of-the-art performance.
