
1.

Transfer active learning by querying committee

Hao SHAO Feng TAO Rui XU 《Journal of Zhejiang University SCIENCE C (English edition)》2014,15(2):107-118

In real applications of inductive learning for classification, labeled instances are often deficient, and labeling them by an oracle is often expensive and time-consuming. Active learning on a single task aims to select only informative unlabeled instances for querying to improve the classification accuracy while decreasing the querying cost. However, an inevitable problem in active learning is that the informative measures for selecting queries are commonly based on the initial hypotheses sampled from only a few labeled instances. In such a circumstance, the initial hypotheses are not reliable and may deviate from the true distribution underlying the target task. Consequently, the informative measures will possibly select irrelevant instances. A promising way to compensate for this problem is to borrow useful knowledge from other sources with abundant labeled information, which is called transfer learning. However, a significant challenge in transfer learning is how to measure the similarity between the source and the target tasks. One needs to be aware of different distributions or label assignments from unrelated source tasks; otherwise, they will lead to degenerated performance while transferring. Also, how to design an effective strategy to avoid selecting irrelevant samples to query is still an open question. To tackle these issues, we propose a hybrid algorithm for active learning with the help of transfer learning by adopting a divergence measure to alleviate the negative transfer caused by distribution differences. To avoid querying irrelevant instances, we also present an adaptive strategy which could eliminate unnecessary instances in the input space and models in the model space. Extensive experiments on both the synthetic and the real data sets show that the proposed algorithm is able to query fewer instances with a higher accuracy and that it converges faster than the state-of-the-art methods.
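
The committee-based querying named in the title can be illustrated with a generic query-by-committee step. This is a minimal sketch, not the hybrid algorithm of the paper: it assumes a committee whose members have already predicted a label for each unlabeled instance, and it picks the instance with the highest vote entropy. The function names and the example votes are hypothetical, and the divergence measure and adaptive pruning from the abstract are not modeled.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Entropy of the labels voted by the committee members for one instance."""
    total = len(votes)
    counts = Counter(votes)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def select_query(committee_votes):
    """Index of the unlabeled instance on which the committee disagrees most."""
    return max(range(len(committee_votes)),
               key=lambda i: vote_entropy(committee_votes[i]))


# Hypothetical votes: each row holds the labels predicted by 4 committee members.
committee_votes = [
    ["a", "a", "a", "a"],  # full agreement, nothing to learn from a query
    ["a", "b", "a", "b"],  # an even split, the most informative instance here
    ["a", "a", "a", "b"],
]
query_index = select_query(committee_votes)
print(query_index)
```

Vote entropy is only one possible disagreement measure; it is used here because it needs nothing beyond the predicted labels.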

2.

Active evaluation of ranking functions based on graded relevance

Christoph Sawade Steffen Bickel Timo von Oertzen Tobias Scheffer Niels Landwehr 《Machine Learning》2013,92(1):41-64

Evaluating the quality of ranking functions is a core task in web search and other information retrieval domains. Because query distributions and item relevance change over time, ranking models often cannot be evaluated accurately on held-out training data. Instead, considerable effort is spent on manually labeling the relevance of query results for test queries in order to track ranking performance. We address the problem of estimating ranking performance as accurately as possible on a fixed labeling budget. Estimates are based on a set of most informative test queries selected by an active sampling distribution. Query labeling costs depend on the number of result items as well as item-specific attributes such as document length. We derive cost-optimal sampling distributions for the commonly used performance measures Discounted Cumulative Gain and Expected Reciprocal Rank. Experiments on web search engine data illustrate significant reductions in labeling costs.
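
For readers unfamiliar with the two performance measures named above, the sketch below computes Discounted Cumulative Gain and Expected Reciprocal Rank for one ranked list of graded relevance labels. It assumes the common exponential-gain formulations (gain `2**g - 1`, a `log2` rank discount for DCG, and the cascade model for ERR); the abstract does not state which variants the authors use, and the cost-optimal sampling distributions themselves are not reproduced here.

```python
import math


def dcg(grades):
    """Discounted Cumulative Gain with exponential gain and log2 rank discount."""
    return sum((2 ** g - 1) / math.log2(rank + 1)
               for rank, g in enumerate(grades, start=1))


def err(grades, max_grade):
    """Expected Reciprocal Rank under the cascade user model."""
    p_continue = 1.0  # probability that the user reaches the current rank
    total = 0.0
    for rank, g in enumerate(grades, start=1):
        p_satisfied = (2 ** g - 1) / (2 ** max_grade)
        total += p_continue * p_satisfied / rank
        p_continue *= 1.0 - p_satisfied
    return total


# Hypothetical relevance grades on a 0..3 scale, listed in ranked order.
print(dcg([3, 2, 0]))
print(err([3, 0], max_grade=3))
```

Both functions score a single query; an evaluation would average them over the labeled test queries.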

3.

Unsupervised law article mining based on deep pre-trained language representation models with application to the Italian civil code

Tagarelli Andrea Simeri Andrea 《Artificial Intelligence and Law》2022,30(3):417-473

Modeling law search and retrieval as prediction problems has recently emerged as a predominant approach in law intelligence. Focusing on the law article retrieval task, we present a deep learning framework named LamBERTa, which is designed for civil-law codes, and specifically trained on the Italian civil code. To our knowledge, this is the first study proposing an advanced approach to law article prediction for the Italian legal system based on a BERT (Bidirectional Encoder Representations from Transformers) learning framework, which has recently attracted increased attention among deep learning approaches, showing outstanding effectiveness in several natural language processing and learning tasks. We define LamBERTa models by fine-tuning an Italian pre-trained BERT on the Italian civil code or its portions, for law article retrieval as a classification task. One key aspect of our LamBERTa framework is that we conceived it to address an extreme classification scenario, which is characterized by a high number of classes, the few-shot learning problem, and the lack of test query benchmarks for Italian legal prediction tasks. To solve such issues, we define different methods for the unsupervised labeling of the law articles, which can in principle be applied to any law article code system. We provide insights into the explainability and interpretability of our LamBERTa models, and we present an extensive experimental analysis over query sets of different type, for single-label as well as multi-label evaluation tasks. Empirical evidence has shown the effectiveness of LamBERTa, and also its superiority against widely used deep-learning text classifiers and a few-shot learner conceived for an attribute-aware prediction task.


4.

Mapping queries to the Linking Open Data cloud: A case study using DBpedia

Edgar Meij Marc Bron Laura Hollink Bouke Huurnink Maarten de Rijke 《Journal of Web Semantics》2011,9(4):418-433

We introduce the task of mapping search engine queries to DBpedia, a major linking hub in the Linking Open Data cloud. We propose and compare various methods for addressing this task, using a mixture of information retrieval and machine learning techniques. Specifically, we present a supervised machine learning-based method to determine which concepts are intended by a user issuing a query. The concepts are obtained from an ontology and may be used to provide contextual information, related concepts, or navigational suggestions to the user submitting the query. Our approach first ranks candidate concepts using a language modeling for information retrieval framework. We then extract query, concept, and search-history feature vectors for these concepts. Using manual annotations we inform a machine learning algorithm that learns how to select concepts from the candidates given an input query. Simply performing a lexical match between the queries and concepts is found to perform poorly and so does using retrieval alone, i.e., omitting the concept selection stage. Our proposed method significantly improves upon these baselines and we find that support vector machines are able to achieve the best performance out of the machine learning algorithms evaluated.

5.

A cost model for spatio-temporal queries using the TPR-tree

《Journal of Systems and Software》2004,73(1):101-112

A query optimizer requires cost models to calculate the costs of various access plans for a query. An effective method to estimate the number of disk (or page) accesses for spatio-temporal queries has not yet been proposed. The TPR-tree is an efficient index that supports spatio-temporal queries for moving objects. Existing cost models for spatial indexes such as the R-tree do not accurately estimate the number of disk accesses for spatio-temporal queries using the TPR-tree, because they do not consider the future locations of moving objects, which change continuously as time passes. In this paper, we propose an efficient cost model for spatio-temporal queries to solve this problem. We present analytical formulas which accurately calculate the number of disk accesses for spatio-temporal queries. Extensive experimental results show that our proposed method accurately estimates the number of disk accesses over various queries to spatio-temporal data combining real-life spatial data and synthetic temporal data. To evaluate the effectiveness of our method, we compared our spatio-temporal cost model (STCM) with an existing spatial cost model (SCM). The existing SCM has an average error ratio between 52% and 93%, whereas our STCM has an average error ratio between 11% and 32%.

6.

Cross-domain video concept detection: A joint discriminative and generative active learning approach

Huan Li Yuan Shi Yang Liu Alexander G. Hauptmann Zhang Xiong 《Expert systems with applications》2012,39(15):12220-12228

In this work, we study the problem of cross-domain video concept detection, where the distributions of the source and target domains are different. Active learning can be used to iteratively refine a source domain classifier by querying labels for a few samples in the target domain, which could reduce the labeling effort. However, traditional active learning methods, which often use a discriminative query strategy that asks for labels of the samples most ambiguous to the source domain classifier, fail when the distribution difference between the two domains is too large. In this paper, we tackle this problem by proposing a joint active learning approach which combines a novel generative query strategy and the existing discriminative one. The approach adaptively fits the distribution difference and shows higher robustness than approaches using a single strategy. Experimental results on two synthetic datasets and the TRECVID video concept detection task highlight the effectiveness of our joint active learning approach.

7.

Robot introspection through learned hidden Markov models

Maria Fox Malik Ghallab Derek Long 《Artificial Intelligence》2006,170(2):59-113

In this paper we describe a machine learning approach for acquiring a model of a robot's behaviour from raw sensor data. We are interested in automating the acquisition of behavioural models to provide a robot with an introspective capability. We assume that the behaviour of a robot in achieving a task can be modelled as a finite stochastic state transition system. Beginning with data recorded by a robot in the execution of a task, we use unsupervised learning techniques to estimate a hidden Markov model (HMM) that can be used both for predicting and explaining the behaviour of the robot in subsequent executions of the task. We demonstrate that it is feasible to automate the entire process of learning a high quality HMM from the data recorded by the robot during execution of its task. The learned HMM can be used both for monitoring and controlling the behaviour of the robot. The ultimate purpose of our work is to learn models for the full set of tasks associated with a given problem domain, and to integrate these models with a generative task planner. We want to show that these models can be used successfully in controlling the execution of a plan. However, this paper does not develop the planning and control aspects of our work, focussing instead on the learning methodology and the evaluation of a learned model. The essential property of the models we seek to construct is that the most probable trajectory through a model, given the observations made by the robot, accurately diagnoses, or explains, the behaviour that the robot actually performed when making these observations. In the work reported here we consider a navigation task. We explain the learning process, the experimental setup and the structure of the resulting learned behavioural models. We then evaluate the extent to which explanations proposed by the learned models accord with a human observer's interpretation of the behaviour exhibited by the robot in its execution of the task.
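
The "most probable trajectory through a model, given the observations" is what the standard Viterbi algorithm computes for an HMM. The sketch below shows that decoding step only, on a toy model: the state names, observation symbols, and probabilities are invented for illustration and do not come from the paper, and the unsupervised estimation of the HMM itself is not shown.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state sequence and its joint probability."""
    # best[t][s] = (probability of the best path ending in s at time t, predecessor)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        row = {}
        for s in states:
            p, back = max(((prev[r][0] * trans_p[r][s], r) for r in states),
                          key=lambda pair: pair[0])
            row[s] = (p * emit_p[s][obs], back)
        best.append(row)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for row in reversed(best[1:]):
        path.append(row[path[-1]][1])
    path.reverse()
    return path, best[-1][last][0]


# Hypothetical two-state behaviour model with three sensor symbols.
states = ["moving", "stuck"]
start_p = {"moving": 0.6, "stuck": 0.4}
trans_p = {"moving": {"moving": 0.7, "stuck": 0.3},
           "stuck": {"moving": 0.4, "stuck": 0.6}}
emit_p = {"moving": {"clear": 0.5, "noisy": 0.4, "blocked": 0.1},
          "stuck": {"clear": 0.1, "noisy": 0.3, "blocked": 0.6}}

path, prob = viterbi(["clear", "noisy", "blocked"], states, start_p, trans_p, emit_p)
print(path, prob)
```

For long observation sequences the products underflow, so a practical version would work with log probabilities.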

8.

SWGMM: a semi-wrapped Gaussian mixture model for clustering of circular–linear data

Anandarup Roy Swapan K. Parui Utpal Roy 《Pattern Analysis & Applications》2016,19(3):631-645

Finite mixture models are widely used to perform model-based clustering of multivariate data sets. Most of the existing mixture models work with linear data, whereas real-life applications may involve multivariate data having both circular and linear characteristics. No existing mixture models can accommodate such correlated circular–linear data. In this paper, we consider designing a mixture model for multivariate data having one circular variable. In order to construct a circular–linear joint distribution with proper inclusion of correlation terms, we use the semi-wrapped Gaussian distribution. Further, we construct a mixture model (termed SWGMM) of such joint distributions. This mixture model is capable of approximating the distribution of multi-modal circular–linear data. Unsupervised learning of the mixture parameters is proposed based on the expectation maximization method. Clustering is performed using the maximum a posteriori criterion. To evaluate the performance of SWGMM, we choose the task of color image segmentation in LCH space. We present comprehensive results and compare SWGMM with existing methods. Our study reveals that the proposed mixture model outperforms the other methods in most cases.

9.

Active concept learning in image databases. Cited by: 2 (self-citations: 0, citations by others: 2)

Anlei Dong Bir Bhanu 《IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics》2005,35(3):450-466

Concept learning in content-based image retrieval systems is a challenging task. This paper presents an active concept learning approach based on the mixture model to deal with the two basic aspects of a database system: the changing (image insertion or removal) nature of a database and user queries. To achieve concept learning, we a) propose a new user-directed semi-supervised expectation-maximization algorithm for mixture parameter estimation, and b) develop a novel model selection method based on Bayesian analysis that evaluates the consistency of hypothesized models with the available information. The analysis of exploitation versus exploration in the search space helps to find the optimal model efficiently. Our concept knowledge transduction approach is able to deal with the cases of image insertion and query images being outside the database. The system handles the situation where users may mislabel images during relevance feedback. Experimental results on the Corel database show the efficacy of our active concept learning approach and the improvement in retrieval performance by concept transduction.

10.

Parsimonious reduction of Gaussian mixture models with a variational-Bayes approach

Pierrick Bruneau Marc Gelgon Fabien Picarougne 《Pattern recognition》2010,43(3):850-858

Aggregating statistical representations of classes is an important task for current trends in scaling up learning and recognition, or for addressing them in distributed infrastructures. In this perspective, we address the problem of merging probabilistic Gaussian mixture models in an efficient way, through the search for a suitable combination of components from mixtures to be merged. We propose a new Bayesian modelling of this combination problem, in association with a variational estimation technique, that handles efficiently the model complexity issue. A main feature of the present scheme is that it merely resorts to the parameters of the original mixture, ensuring low computational cost and possibly low communication cost, should we operate on a distributed system. Experimental results are reported on real data.

11.

Continuous mixture modeling via goodness-of-fit ridges

Stephen R. Aylward 《Pattern recognition》2002,35(9):1821-1833

We present a novel method for representing “extruded” distributions. An extruded distribution is an M-dimensional manifold in the parameter space of the component distribution. Representations of that manifold are “continuous mixture models”. We present a method for forming one-dimensional continuous Gaussian mixture models of sampled extruded Gaussian distributions via ridges of goodness-of-fit. Using Monte Carlo simulations and ROC analysis, we explore the utility of a variety of binning techniques and goodness-of-fit functions. We demonstrate that extruded Gaussian distributions are more accurately and consistently represented by continuous Gaussian mixture models than by finite Gaussian mixture models formed via maximum likelihood expectation maximization.

12.

Active Learning with Local Models

Hasenjäger Martina Ritter Helge 《Neural Processing Letters》1998,7(2):107-117

In this contribution, we deal with active learning, which gives the learner the power to select training samples. We propose a novel query algorithm for local learning models, a class of learners that has not been considered in the context of active learning until now. Our query algorithm is based on the idea of selecting a query on the borderline of the actual classification. This is done by drawing on the geometrical properties of local models that typically induce a Voronoi tessellation on the input space, so that the Voronoi vertices of this tessellation offer themselves as prospective query points. The performance of the new query algorithm is tested on the two-spirals problem with promising results.

13.

Non-Gaussian Data Clustering via Expectation Propagation Learning of Finite Dirichlet Mixture Models and Applications

Wentao Fan Nizar Bouguila 《Neural Processing Letters》2014,39(2):115-135

Learning appropriate statistical models is a fundamental data analysis task which has been the topic of continuing interest. Recently, finite Dirichlet mixture models have proved to be an effective and flexible model learning technique in several machine learning and data mining applications. In this article, the problem of learning and selecting finite Dirichlet mixture models is addressed using an expectation propagation (EP) inference framework. Within the proposed EP learning method, for finite mixture models, all the involved parameters and the model complexity (i.e. the number of mixture components) can be evaluated simultaneously in a single optimization framework. Extensive simulations using synthetic data along with two challenging real-world applications involving automatic image annotation and human action video categorization demonstrate that our approach is able to achieve better results than comparable techniques.

14.

A Multistrategy Approach to Classifier Learning from Time Series Cited by: 1 (self-citations: 0, citations by others: 1)

Hsu William H. Ray Sylvian R. Wilkins David C. 《Machine Learning》2000,38(1-2):213-236

We present an approach to inductive concept learning using multiple models for time series. Our objective is to improve the efficiency and accuracy of concept learning by decomposing learning tasks that admit multiple types of learning architectures and mixture estimation methods. The decomposition method adapts attribute subset selection and constructive induction (cluster definition) to define new subproblems. To these problem definitions, we can apply metric-based model selection to select from a database of learning components, thereby producing a specification for supervised learning using a mixture model. We report positive learning results using temporal artificial neural networks (ANNs), on a synthetic, multiattribute learning problem and on a real-world time series monitoring application.

15.

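Research on a task-relevant few-shot deep learning method for image classification

Chen Chen Wang Yali Qiao Yu 《Journal of Integration Technology》2020,9(3):15-25

Traditional few-shot image classification methods based on metric learning are task-agnostic, which leads to poor generalization to new query tasks. To address this problem, this study proposes a task-relevant few-shot deep learning method for images that adaptively adjusts the support-set sample features according to the query task, thereby effectively forming a task-relevant metric classifier. In addition, by introducing several regularization methods, the study resolves the overfitting caused by a severe shortage of data. On two commonly used standard datasets, miniImageNet and tieredImageNet, and with the same feature extraction network, the proposed method achieves an accuracy of 66.05% on 1-shot miniImageNet, an improvement of 4.29% over the current best model.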

16.

Semantic Annotation and Retrieval of Music and Sound Effects

Turnbull D. Barrington L. Torres D. Lanckriet G. 《IEEE transactions on audio, speech, and language processing》2008,16(2):467-476

We present a computer audition system that can both annotate novel audio tracks with semantically meaningful words and retrieve relevant tracks from a database of unlabeled audio content given a text-based query. We consider the related tasks of content-based audio annotation and retrieval as one supervised multiclass, multilabel problem in which we model the joint probability of acoustic features and words. We collect a data set of 1700 human-generated annotations that describe 500 Western popular music tracks. For each word in a vocabulary, we use this data to train a Gaussian mixture model (GMM) over an audio feature space. We estimate the parameters of the model using the weighted mixture hierarchies expectation maximization algorithm. This algorithm is more scalable to large data sets and produces better density estimates than standard parameter estimation techniques. The quality of the music annotations produced by our system is comparable with the performance of humans on the same task. Our “query-by-text” system can retrieve appropriate songs for a large number of musically relevant words. We also show that our audition system is general by learning a model that can annotate and retrieve sound effects.

17.

Beyond independence: probabilistic models for query approximation on binary transaction data Cited by: 1 (self-citations: 0, citations by others: 1)

Pavlov D.N. Mannila H. Smyth P. 《Knowledge and Data Engineering, IEEE Transactions on》2003,15(6):1409-1421

We investigate the problem of generating fast approximate answers to queries posed to large sparse binary data sets. We focus in particular on probabilistic model-based approaches to this problem and develop a number of techniques that are significantly more accurate than a baseline independence model. In particular, we introduce two techniques for building probabilistic models from frequent itemsets: the itemset maximum entropy model and the itemset inclusion-exclusion model. In the maximum entropy model, we treat itemsets as constraints on the distribution of the query variables and use the maximum entropy principle to build a joint probability model for the query attributes online. In the inclusion-exclusion model, itemsets and their frequencies are stored in a data structure, called an ADtree, that supports an efficient implementation of the inclusion-exclusion principle in order to answer the query. We empirically compare these two itemset-based models to direct querying of the original data, querying of samples of the original data, as well as other probabilistic models such as the independence model, the Chow-Liu tree model, and the Bernoulli mixture model. These models are able to handle high-dimensionality (hundreds or thousands of attributes), whereas most other work on this topic has focused on relatively low-dimensional OLAP problems. Experimental results on both simulated and real-world transaction data sets illustrate various fundamental trade-offs between approximation error, model complexity, and the online time required to compute a query answer.
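
The inclusion-exclusion principle mentioned above can be shown on a tiny example: the probability that some items are present and others absent is an alternating sum of itemset frequencies. The sketch below is an illustration under simplifying assumptions, not the paper's implementation: frequencies are obtained by scanning a small hypothetical transaction list rather than from an ADtree, and every itemset frequency is assumed to be available, whereas the paper works from frequent itemsets only.

```python
from itertools import combinations


def itemset_frequency(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = frozenset(items)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def query_probability(freq, present, absent):
    """P(all `present` items are 1 and all `absent` items are 0) by inclusion-exclusion."""
    present = frozenset(present)
    absent = list(absent)
    total = 0.0
    for k in range(len(absent) + 1):
        for subset in combinations(absent, k):
            # Each k-subset of the absent items contributes with sign (-1)^k.
            total += (-1) ** k * freq(present | frozenset(subset))
    return total


# Hypothetical binary transaction data over the items a, b, c.
transactions = [frozenset("ab"), frozenset("a"), frozenset("abc"),
                frozenset("c"), frozenset("ac")]


def freq(items):
    return itemset_frequency(transactions, items)


# Probability that a is present while b and c are both absent.
answer = query_probability(freq, present="a", absent="bc")
print(answer)
```

The sum has one term per subset of the absent items, so its cost grows exponentially with the number of negated attributes in the query.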

18.

Voting techniques for expert search Cited by: 4 (self-citations: 2, citations by others: 2)

Craig Macdonald Iadh Ounis 《Knowledge and Information Systems》2008,16(3):259-280

In an expert search task, the users’ need is to identify people who have relevant expertise to a topic of interest. An expert search system predicts and ranks the expertise of a set of candidate persons with respect to the users’ query. In this paper, we propose a novel approach for predicting and ranking candidate expertise with respect to a query, called the Voting Model for Expert Search. In the Voting Model, we see the problem of ranking experts as a voting problem. We model the voting problem using 12 different voting techniques, which are inspired by the data fusion field. We investigate the effectiveness of the Voting Model and the associated voting techniques across a range of document weighting models, in the context of the TREC 2005 and TREC 2006 Enterprise tracks. The evaluation results show that the voting paradigm is very effective, without using any query or collection-specific heuristics. Moreover, we show that improving the quality of the underlying document representation can significantly improve the retrieval performance of the voting techniques on an expert search task. In particular, we demonstrate that applying field-based weighting models improves the ranking of candidates. Finally, we demonstrate that the relative performance of the voting techniques for the proposed approach is stable on a given task regardless of the weighting models used, suggesting that some of the proposed voting techniques will always perform better than other voting techniques. Extended version of ‘Voting for candidates: adapting data fusion techniques for an expert search task’. C. Macdonald and I. Ounis. In Proceedings of ACM CIKM 2006, Arlington, VA. 2006. doi: 10.1145/1183614.1183671.
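
The idea of treating expert ranking as voting can be sketched with three standard data-fusion rules: a plain vote count, CombSUM, and CombMNZ. The abstract does not list its 12 techniques, so these three are chosen here only for illustration. The sketch assumes each retrieved document has a relevance score and a known set of associated candidates; the document ids, scores, and candidate names are hypothetical.

```python
from collections import defaultdict


def rank_candidates(doc_scores, doc_candidates, technique="combsum"):
    """Rank candidates by aggregating the scores of their associated documents."""
    votes = defaultdict(int)        # number of retrieved documents per candidate
    score_sum = defaultdict(float)  # sum of document scores per candidate
    for doc, score in doc_scores.items():
        for candidate in doc_candidates.get(doc, ()):
            votes[candidate] += 1
            score_sum[candidate] += score
    if technique == "votes":
        final = dict(votes)
    elif technique == "combsum":
        final = dict(score_sum)
    elif technique == "combmnz":
        final = {c: votes[c] * score_sum[c] for c in votes}
    else:
        raise ValueError(f"unknown technique: {technique}")
    # Highest aggregate first; ties are broken alphabetically for determinism.
    return sorted(final, key=lambda c: (-final[c], c))


# Hypothetical retrieval scores and document-to-candidate associations.
doc_scores = {"d1": 1.2, "d2": 0.6, "d3": 0.4}
doc_candidates = {"d1": ["alice"], "d2": ["bob"], "d3": ["bob", "carol"]}

print(rank_candidates(doc_scores, doc_candidates, "combsum"))
print(rank_candidates(doc_scores, doc_candidates, "votes"))
print(rank_candidates(doc_scores, doc_candidates, "combmnz"))
```

The example is built so that the rules disagree: one strongly scored document favours a candidate under CombSUM, while several weaker documents favour another under vote counting and CombMNZ.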

19.

Monte-Carlo tree search for Bayesian reinforcement learning Cited by: 2 (self-citations: 2, citations by others: 0)

Ngo Anh Vien Wolfgang Ertel Viet-Hung Dang TaeChoong Chung 《Applied Intelligence》2013,39(2):345-353

Bayesian model-based reinforcement learning can be formulated as a partially observable Markov decision process (POMDP) to provide a principled framework for optimally balancing exploitation and exploration. Then, a POMDP solver can be used to solve the problem. If the prior distribution over the environment’s dynamics is a product of Dirichlet distributions, the POMDP’s optimal value function can be represented using a set of multivariate polynomials. Unfortunately, the size of the polynomials grows exponentially with the problem horizon. In this paper, we examine the use of an online Monte-Carlo tree search (MCTS) algorithm for large POMDPs, to solve the Bayesian reinforcement learning problem online. We will show that such an algorithm successfully searches for a near-optimal policy. In addition, we examine the use of a parameter tying method to keep the model search space small, and propose the use of nested mixture of tied models to increase robustness of the method when our prior information does not allow us to specify the structure of tied models exactly. Experiments show that the proposed methods substantially improve scalability of current Bayesian reinforcement learning methods.

20.

A fusion spatial attention approach for few-shot learning

《Information Fusion》2022

Few-shot learning is a challenging problem in computer vision that aims to learn a new visual concept from very limited data. A core issue is that there is a large amount of uncertainty introduced by the small training set. For example, the few images may include cluttered backgrounds or different scales of objects. Existing approaches mostly address this problem from either the original image space or the embedding space by using meta-learning. To the best of our knowledge, none of them tackle this problem from both spaces jointly. To this end, we propose a fusion spatial attention approach that performs spatial attention in both image and embedding spaces. In the image space, we employ a Saliency Object Detection (SOD) module to extract the saliency map of an image and provide it to the network as an additional channel. In the embedding space, we propose an Adaptive Pooling (Ada-P) module tailored to few-shot learning that introduces a meta-learner to adaptively fuse local features of the feature maps for each individual embedding. The fusion process assigns different pooling weights to the features at different spatial locations. Then, weighted pooling can be conducted over an embedding to fuse local information, which can avoid losing useful information by considering the spatial importance of the features. The SOD and Ada-P modules can be used within a plug-and-play module and incorporated into various existing few-shot learning approaches. We empirically demonstrate that designing spatial attention methods for few-shot learning is a nontrivial task and our method has proven effective for it. We evaluate our method using both shallow and deeper networks on three widely used few-shot learning benchmarks, miniImageNet, tieredImageNet and CUB, and demonstrate very competitive performance.