首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
Karp  Peter D. 《Machine Learning》1993,12(1-3):89-116
Hypothesis-formation problems occur when the outcome of an experiment as predicted by a scientific theory does not match the outcome observed by a scientist. The problem is to modify the theory, and/or the scientist's conception of the intial conditions of the experiment, such that the prediction agrees with the observation. I treat hypothesis formation as adesign problem. A program calledHypgene designs hypotheses by reasoning backward from its goal of eliminating the difference between prediction and observation. This prediction error is eliminated bydesign operators that are applied by a planning system. The synthetic, goal-directed application of these operators should prove more efficient than past generate-and-test approaches to hypothesis generation.Hypgene uses heuristic search to guide a generator that is focused on the errors in a prediction. The advantages of the design approach to hypothesis formation over the generate-and-test approach are analogous to the advantages of dependency-directed backtracking over chronological backtracking. These hypothesis-formation methods were developed in the context of a historical study of a scientific research program in molecular biology. This article describes in detail the results of applying theHypgene program to several hypothesis-formation problems identified in this historical study.Hypgene found most of the same solutions as did the biologists, which demonstrates that it is capable of solving complex, real-world hypothesis-formation problems.  相似文献   

2.
Multi-instance clustering with applications to multi-instance prediction   总被引:2,自引:0,他引:2  
In the setting of multi-instance learning, each object is represented by a bag composed of multiple instances instead of by a single instance in a traditional learning setting. Previous works in this area only concern multi-instance prediction problems where each bag is associated with a binary (classification) or real-valued (regression) label. However, unsupervised multi-instance learning where bags are without labels has not been studied. In this paper, the problem of unsupervised multi-instance learning is addressed where a multi-instance clustering algorithm named Bamic is proposed. Briefly, by regarding bags as atomic data items and using some form of distance metric to measure distances between bags, Bamic adapts the popular k -Medoids algorithm to partition the unlabeled training bags into k disjoint groups of bags. Furthermore, based on the clustering results, a novel multi-instance prediction algorithm named Bartmip is developed. Firstly, each bag is re-represented by a k-dimensional feature vector, where the value of the i-th feature is set to be the distance between the bag and the medoid of the i-th group. After that, bags are transformed into feature vectors so that common supervised learners are used to learn from the transformed feature vectors each associated with the original bag’s label. Extensive experiments show that Bamic could effectively discover the underlying structure of the data set and Bartmip works quite well on various kinds of multi-instance prediction problems.  相似文献   

3.
We formalize the problem of Structured Prediction as a Reinforcement Learning task. We first define a Structured Prediction Markov Decision Process (SP-MDP), an instantiation of Markov Decision Processes for Structured Prediction and show that learning an optimal policy for this SP-MDP is equivalent to minimizing the empirical loss. This link between the supervised learning formulation of structured prediction and reinforcement learning (RL) allows us to use approximate RL methods for learning the policy. The proposed model makes weak assumptions both on the nature of the Structured Prediction problem and on the supervision process. It does not make any assumption on the decomposition of loss functions, on data encoding, or on the availability of optimal policies for training. It then allows us to cope with a large range of structured prediction problems. Besides, it scales well and can be used for solving both complex and large-scale real-world problems. We describe two series of experiments. The first one provides an analysis of RL on classical sequence prediction benchmarks and compares our approach with state-of-the-art SP algorithms. The second one introduces a tree transformation problem where most previous models fail. This is a complex instance of the general labeled tree mapping problem. We show that RL exploration is effective and leads to successful results on this challenging task. This is a clear confirmation that RL could be used for large size and complex structured prediction problems.  相似文献   

4.
Cost-sensitive learning with conditional Markov networks   总被引:1,自引:0,他引:1  
There has been a recent, growing interest in classification and link prediction in structured domains. Methods such as conditional random fields and relational Markov networks support flexible mechanisms for modeling correlations due to the link structure. In addition, in many structured domains, there is an interesting structure in the risk or cost function associated with different misclassifications. There is a rich tradition of cost-sensitive learning applied to unstructured (IID) data. Here we propose a general framework which can capture correlations in the link structure and handle structured cost functions. We present two new cost-sensitive structured classifiers based on maximum entropy principles. The first determines the cost-sensitive classification by minimizing the expected cost of misclassification. The second directly determines the cost-sensitive classification without going through a probability estimation step. We contrast these approaches with an approach which employs a standard 0/1-loss structured classifier to estimate class conditional probabilities followed by minimization of the expected cost of misclassification and with a cost-sensitive IID classifier that does not utilize the correlations present in the link structure. We demonstrate the utility of our cost-sensitive structured classifiers with experiments on both synthetic and real-world data.  相似文献   

5.

Recently, extreme learning machine (ELM) has attracted increasing attention due to its successful applications in classification, regression, and ranking. Normally, the desired output of the learning system using these machine learning techniques is a simple scalar output. However, there are many applications in machine learning which require more complex output rather than a simple scalar one. Therefore, structured output is used for such applications where the system is trained to predict structured output instead of simple one. Previously, support vector machine (SVM) has been introduced for structured output learning in various applications. However, from machine learning point of view, ELM is known to offer better generalization performance compared to other learning techniques. In this study, we extend ELM to more generalized framework to handle complex outputs where simple outputs are considered as special cases of it. Besides the good generalization property of ELM, the resulting model will possesses rich internal structure that reflects task-specific relations and constraints. The experimental results show that structured ELM achieves similar (for binary problems) or better (for multi-class problems) generalization performance when compared to ELM. Moreover, as verified by the simulation results, structured ELM has comparable or better precision performance with structured SVM when tested for more complex output such as object localization problem on PASCAL VOC2006. Also, the investigation on parameter selections is presented and discussed for all problems.

  相似文献   

6.
There exist numerous state of the art classification algorithms that are designed to handle the data with nominal or binary class labels. Unfortunately, less attention is given to the genre of classification problems where the classes are organized as a structured hierarchy; such as protein function prediction (target area in this work), test scores, gene ontology, web page categorization, text categorization etc. The structured hierarchy is usually represented as a tree or a directed acyclic graph (DAG) where there exist IS-A relationship among the class labels. Class labels at upper level of the hierarchy are more abstract and easy to predict whereas class labels at deeper level are most specific and challenging for correct prediction. It is helpful to consider this class hierarchy for designing a hypothesis that can handle the tradeoff between prediction accuracy and prediction specificity. In this paper, a novel ant colony optimization (ACO) based single path hierarchical classification algorithm is proposed that incorporates the given class hierarchy during its learning phase. The algorithm produces IF–THEN ordered rule list and thus offer comprehensible classification model. Detailed discussion on the architecture and design of the proposed technique is provided which is followed by the empirical evaluation on six ion-channels data sets (related to protein function prediction) and two publicly available data sets. The performance of the algorithm is encouraging as compared to the existing methods based on the statistically significant Student's t-test (keeping in view, prediction accuracy and specificity) and thus confirm the promising ability of the proposed technique for hierarchical classification task.  相似文献   

7.
It has been established that the second-order stochastic gradient descent (SGD) method can potentially achieve generalization performance as well as empirical optimum in a single pass through the training examples. However, second-order SGD requires computing the inverse of the Hessian matrix of the loss function, which is prohibitively expensive for structured prediction problems that usually involve a very high dimensional feature space. This paper presents a new second-order SGD method, called Periodic Step-size Adaptation (PSA). PSA approximates the Jacobian matrix of the mapping function and explores a linear relation between the Jacobian and Hessian to approximate the Hessian, which is proved to be simpler and more effective than directly approximating Hessian in an on-line setting. We tested PSA on a wide variety of models and tasks, including large scale sequence labeling tasks using conditional random fields and large scale classification tasks using linear support vector machines and convolutional neural networks. Experimental results show that single-pass performance of PSA is always very close to empirical optimum.  相似文献   

8.
We analyze in detail the content retrieval process in kad. kad implements content search (publish and retrieval) functions that use the Kademlia Distributed Hash Table for content routing. Node churn is quite common in peer-to-peer systems and results in information loss and stale routing table entries. To deal with node churn, kad issues parallel route requests and publishes multiple redundant copies of each piece of information. We identify the key design parameters in kad and present an analytical model to evaluate the impact of changes in the values of these parameters on the overall lookup latency and message overhead. Extensive measurements of the lookup performance using an instrumented client allow us to validate the model. The overall lookup latency is in most cases 5 s or larger. We elucidate the cause for such high lookup latencies and propose an improved scheme that significantly decreases the overall lookup latency without increasing the overhead.  相似文献   

9.
Hierarchical Fusion of Multiple Classifiers for Hyperspectral Data Analysis   总被引:3,自引:0,他引:3  
Many classification problems involve high dimensional inputs and a large number of classes. Multiclassifier fusion approaches to such difficult problems typically centre around smart feature extraction, input resampling methods, or input space partitioning to exploit modular learning. In this paper, we investigate how partitioning of the output space (i.e. the set of class labels) can be exploited in a multiclassifier fusion framework to simplify such problems and to yield better solutions. Specifically, we introduce a hierarchical technique to recursively decompose a C-class problem into C_1 two-(meta) class problems. A generalised modular learning framework is used to partition a set of classes into two disjoint groups called meta-classes. The coupled problems of finding a good partition and of searching for a linear feature extractor that best discriminates the resulting two meta-classes are solved simultaneously at each stage of the recursive algorithm. This results in a binary tree whose leaf nodes represent the original C classes. The proposed hierarchical multiclassifier framework is particularly effective for difficult classification problems involving a moderately large number of classes. The proposed method is illustrated on a problem related to classification of landcover using hyperspectral data: a 12-class AVIRIS subset with 180 bands. For this problem, the classification accuracies obtained were superior to most other techniques developed for hyperspectral classification. Moreover, the class hierarchies that were automatically discovered conformed very well with human domain experts’ opinions, which demonstrates the potential of using such a modular learning approach for discovering domain knowledge automatically from data. Received: 21 November 2000, Received in revised form: 02 November 2001, Accepted: 13 December 2001  相似文献   

10.
We study a family of problems, called Maximum Solution (Max Sol), where the objective is to maximise a linear goal function over the feasible integer assignments to a set of variables subject to a set of constraints. When the domain is Boolean (i.e. restricted to {0,1}), the maximum solution problem is identical to the well-studied Max Ones problem, and the complexity and approximability is completely understood for all restrictions on the underlying constraints. We continue this line of research by considering the Max Sol problem for relations defined by regular signed logic over finite subsets of the natural numbers; the complexity of the corresponding decision problem has recently been classified by Creignou et al. (Theory Comput. Syst. 42(2):239–255, 2008). We give sufficient conditions for when such problems are polynomial-time solvable and we prove that they are APX-hard otherwise. Similar dichotomies are also obtained for variants of the Max Sol problem.  相似文献   

11.
Essence is a formal language for specifying combinatorial problems in a manner similar to natural rigorous specifications that use a mixture of natural language and discrete mathematics. Essence provides a high level of abstraction, much of which is the consequence of the provision of decision variables whose values can be combinatorial objects, such as tuples, sets, multisets, relations, partitions and functions. Essence also allows these combinatorial objects to be nested to arbitrary depth, providing for example sets of partitions, sets of sets of partitions, and so forth. Therefore, a problem that requires finding a complex combinatorial object can be specified directly by using a decision variable whose type is precisely that combinatorial object.  相似文献   

12.
We tackle the structured output classification problem using the Conditional Random Fields (CRFs). Unlike the standard 0/1 loss case, we consider a cost-sensitive learning setting where we are given a non-0/1 misclassification cost matrix at the individual output level. Although the task of cost-sensitive classification has many interesting practical applications that retain domain-specific scales in the output space (e.g., hierarchical or ordinal scale), most CRF learning algorithms are unable to effectively deal with the cost-sensitive scenarios as they merely assume a nominal scale (hence 0/1 loss) in the output space. In this paper, we incorporate the cost-sensitive loss into the large margin learning framework. By large margin learning, the proposed algorithm inherits most benefits from the SVM-like margin-based classifiers, such as the provable generalization error bounds. Moreover, the soft-max approximation employed in our approach yields a convex optimization similar to the standard CRF learning with only slight modification in the potential functions. We also provide the theoretical cost-sensitive generalization error bound. We demonstrate the improved prediction performance of the proposed method over the existing approaches in a diverse set of sequence/image structured prediction problems that often arise in pattern recognition and computer vision domains.  相似文献   

13.
In pattern recognition, an elegant and powerful way to deal with classification problems is based on the minimisation of the classification risk. The risk function is defined in terms of loss functions that measure the penalty for wrong decisions. However, in practice a trivial loss function is usually adopted (the so-called 0–1 loss function) that do no make the most of this framework. This work is focused on the study of different loss functions, and specially on those loss functions that do not depend on the class proposed by the system. Loss functions of this kind have allowed us to theoretically explain heuristics that are successfully used with very complex pattern recognition problem, such as (statistical) machine translation. A comparative experimental work has also been carried out to compare different proposals of loss functions in the practical scenario of machine translation.  相似文献   

14.
This paper studies the problem of stabilizing reference trajectories (also called as the trajectory tracking problem) for underactuated marine vehicles under predefined tracking error constraints. The boundary functions of the predefined constraints are asymmetric and time‐varying. The time‐varying boundary functions allow us to quantify prescribed performance of tracking errors on both transient and steady‐state stages. To overcome difficulties raised by underactuation and nonzero off‐diagonal terms in the system matrices, we develop a novel transverse function control approach to introduce an additional control input in backstepping procedure. This approach provides practical stabilization of any smooth reference trajectory, whether this trajectory is feasible or not. By practical stabilization, we mean that the tracking errors of vehicle position and orientation converge to a small neighborhood of zero. With the introduction of an error transformation function, we construct an inverse‐hyperbolic‐tangent‐like barrier Lyapunov function to show practical stability of the closed‐loop systems with prescribed transient and steady‐state performances. To deal with unmodeled dynamic uncertainties and external disturbances, we employ neural network (NN) approximators to estimate uncertain dynamics and present disturbance observers to estimate unknown disturbances. Subsequently, we develop adaptive control, based on NN approximators and disturbance estimates, that guarantees the prescribed performance of tracking errors during the transient stage of on‐line NN weight adaptations and disturbance estimates. Simulation results show the performance of the proposed tracking control.  相似文献   

15.
Geometric model fitting is a typical chicken-&-egg problem: data points should be clustered based on geometric proximity to models whose unknown parameters must be estimated at the same time. Most existing methods, including generalizations of RANSAC, greedily search for models with most inliers (within a threshold) ignoring overall classification of points. We formulate geometric multi-model fitting as an optimal labeling problem with a global energy function balancing geometric errors and regularity of inlier clusters. Regularization based on spatial coherence (on some near-neighbor graph) and/or label costs is NP hard. Standard combinatorial algorithms with guaranteed approximation bounds (e.g. α-expansion) can minimize such regularization energies over a finite set of labels, but they are not directly applicable to a continuum of labels, e.g. R2{\mathcal{R}}^{2} in line fitting. Our proposed approach (PEaRL) combines model sampling from data points as in RANSAC with iterative re-estimation of inliers and models’ parameters based on a global regularization functional. This technique efficiently explores the continuum of labels in the context of energy minimization. In practice, PEaRL converges to a good quality local minimum of the energy automatically selecting a small number of models that best explain the whole data set. Our tests demonstrate that our energy-based approach significantly improves the current state of the art in geometric model fitting currently dominated by various greedy generalizations of RANSAC.  相似文献   

16.
Frequently invoked large functions are common in non‐numeric applications. These large functions present challenges to modern compilers not only because they require more time and resources at compilation time, but also because they may prevent optimizations such as function inlining. Often large portions of the code in a hot function fhost are executed much less frequently than fhost itself. Partial inlining is a natural solution to the problems caused by including cold code segments that are seldom executed into hot functions that are frequently invoked. When applying partial inlining, a compiler outlines cold statements from a hot function fhost. After outlining, fhost becomes smaller and thus can be easily inlined. This paper presents Ablego, a framework for function outlining and partial inlining that includes several innovations: (1) an abstract‐syntax‐tree‐based analysis and transformation to form cold regions for outlining; (2) a set of flexible heuristics to control the aggressiveness of function outlining; (3) several possible function outlining strategies; (4) explicit variable spilling, a new technique that overcomes negative side‐effects of function outlining. With the proper strategy, partial inlining improves performance by up to 5.75%. A performance study also suggests that partial inlining's effect on enabling more aggressive inlining is limited. The performance improvement from partial inlining actually comes from better code placement and better code generation. Copyright © 2006 John Wiley & Sons, Ltd.  相似文献   

17.

A large amount of research on Convolutional Neural Networks (CNN) has focused on flat Classification in the multi-class domain. In the real world, many problems are naturally expressed as hierarchical classification problems, in which the classes to be predicted are organized in a hierarchy of classes. In this paper, we propose a new architecture for hierarchical classification, introducing a stack of deep linear layers using cross-entropy loss functions combined to a center loss function. The proposed architecture can extend any neural network model and simultaneously optimizes loss functions to discover local hierarchical class relationships and a loss function to discover global information from the whole class hierarchy while penalizing class hierarchy violations. We experimentally show that our hierarchical classifier presents advantages to the traditional classification approaches finding application in computer vision tasks. The same approach can also be applied to some CNN for text classification.

  相似文献   

18.
Ribeiro  Rita P.  Moniz  Nuno 《Machine Learning》2020,109(9-10):1803-1835

Research in imbalanced domain learning has almost exclusively focused on solving classification tasks for accurate prediction of cases labelled with a rare class. Approaches for addressing such problems in regression tasks are still scarce due to two main factors. First, standard regression tasks assume each domain value as equally important. Second, standard evaluation metrics focus on assessing the performance of models on the most common values of data distributions. In this paper, we present an approach to tackle imbalanced regression tasks where the objective is to predict extreme (rare) values. We propose an approach to formalise such tasks and to optimise/evaluate predictive models, overcoming the factors mentioned and issues in related work. We present an automatic and non-parametric method to obtain relevance functions, building on the concept of relevance as the mapping of target values into non-uniform domain preferences. Then, we propose SERA, a new evaluation metric capable of assessing the effectiveness and of optimising models towards the prediction of extreme values while penalising severe model bias. An experimental study demonstrates how SERA provides valid and useful insights into the performance of models in imbalanced regression tasks.

  相似文献   

19.
In this article, the optimisation of the weighting functions for an H controller using genetic algorithms and structured genetic algorithms is considered. The choice of the weighting functions is one of the key steps in the design of an H controller. The performance of the controller depends on these weighting functions since poorly chosen weighting functions will provide a poor controller. One approach that can solve this problem is the use of evolutionary techniques to tune the weighting parameters. The article presents the improved performance of structured genetic algorithms over conventional genetic algorithms and how this technique can assist with the identification of appropriate weighting functions’ orders.  相似文献   

20.
针对说话人识别实际应用中训练数据不足的问题,选取GMM-UBM作为基准系统模型,用EigenVoice对其作自适应,应用泛化能力较强的多项式核函数和学习能力较强的径向基核函数进行线性加权组合后的组合核函数进行模型参数优化,并用多重网格搜索法确定核函数的最优参数,采用DAG方法实现SVM核函数的多元分类.在仿真实验中评估了线性核、多项式核、径向基核以及组合核函数,实验结果表明,在采用正确的参数前提下,在不同的多分类策略、自适应时间、信噪比和不同的说话人数量的情况下,组合核函数的识别性能明显都优于其它三个单核函数.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号