Similar Documents
20 similar documents found (search time: 78 ms)
1.
Given multiple prediction problems such as regression or classification, we are interested in a joint inference framework that can effectively share information between tasks to improve the prediction accuracy, especially when the number of training examples per problem is small. In this paper we propose a probabilistic framework which can support a set of latent variable models for different multi-task learning scenarios. We show that the framework is a generalization of standard learning methods for single prediction problems and that it can effectively model the shared structure among different prediction tasks. Furthermore, we present efficient algorithms for the empirical Bayes method as well as point estimation. Our experiments on both simulated datasets and real-world classification datasets show the effectiveness of the proposed models in two evaluation settings: a standard multi-task learning setting and a transfer learning setting.

2.
To improve the accuracy of target-object prediction from instructions given to domestic service robots, a multimodal natural language processing (NLP) instruction classification method based on hybrid deep learning is proposed. The method draws on multimodal linguistic, visual, and relational features, and uses two deep learning methods to encode these multimodal features. For the linguistic instruction, a multi-layer bidirectional long short-term memory (Bi-LSTM…

3.
A stopping criterion for active learning
Active learning (AL) is a framework that attempts to reduce the cost of annotating training material for statistical learning methods. While many papers on applying AL to natural language processing tasks have reported impressive annotation savings, little work has been done on defining a stopping criterion. In this work, we present a stopping criterion for active learning based on the way instances are selected during uncertainty-based sampling, and verify its applicability in a variety of settings. The statistical learning models used in our study are support vector machines (SVMs), maximum entropy models, and Bayesian logistic regression, and the tasks performed are text classification, named entity recognition, and shallow parsing. In addition, we present a method for multiclass mutually exclusive SVM active learning.
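Below is a minimal sketch of the kind of uncertainty-based stopping rule the abstract describes, for a binary linear SVM: querying stops once even the most uncertain remaining pool instance lies confidently outside the margin. The threshold, batch size, and choice of a binary SVC are illustrative assumptions, not details taken from the paper.

```python
import numpy as np
from sklearn.svm import SVC

def active_learn(X_pool, y_pool, X_seed, y_seed, margin_threshold=1.0, batch=10):
    X_train, y_train = X_seed.copy(), y_seed.copy()
    pool_idx = np.arange(len(X_pool))
    clf = SVC(kernel="linear").fit(X_train, y_train)
    while len(pool_idx) > 0:
        # Uncertainty = closeness to the separating hyperplane (binary case).
        margins = np.abs(clf.decision_function(X_pool[pool_idx]))
        order = np.argsort(margins)  # most uncertain first
        # Stopping criterion: even the most uncertain remaining instance
        # already lies confidently outside the margin.
        if margins[order[0]] >= margin_threshold:
            break
        picked = pool_idx[order[:batch]]
        X_train = np.vstack([X_train, X_pool[picked]])
        y_train = np.concatenate([y_train, y_pool[picked]])  # oracle labels
        pool_idx = np.setdiff1d(pool_idx, picked)
        clf = SVC(kernel="linear").fit(X_train, y_train)
    return clf
```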

4.
Research progress on deep-learning-based language models
王乃钰  叶育鑫  刘露  凤丽洲  包铁  彭涛 《软件学报》2021,32(4):1082-1115
Language models aim to represent the implicit knowledge of a language and, as a fundamental problem in natural language processing, have long attracted wide attention. Deep-learning-based language models are currently a research hotspot in NLP: through pre-training and fine-tuning they exhibit strong inherent representational power and can substantially improve performance on downstream tasks. Organized around the basic principles of language models and their different application directions, and taking neural probabilistic language models and pre-trained language models as the entry point where deep learning meets natural language processing, …

5.
Deep learning offers a new way of approaching combinatorial optimization problems. Current research in this direction mostly focuses on improving models and training methods, with many papers importing new models from natural language processing to improve solution quality, while little attention is paid to generalization and robustness from the direction of instance data generation. To address this, drawing on the idea of adversarial learning, this work studies the classic combinatorial optimization problem, the travelling salesman problem (TSP), from the data generation side: a generator network is designed and trained with supervised learning to produce adversarial samples, which are then mixed with random samples during training to improve the model's generalization on this class of problems. In addition, an adaptive mechanism, based on the way the discriminator model is updated during reinforcement learning training, is proposed to train the adversarial model, finally yielding a model that achieves good results on both randomly distributed samples and adversarial samples. Simulations verify the effectiveness of the proposed method.

6.
Cardie  Claire 《Machine Learning》2000,41(1):85-116
Research in psychology, psycholinguistics, and cognitive science has discovered and examined numerous psychological constraints on human information processing. Short term memory limitations, a focus of attention bias, and a preference for the use of temporally recent information are three examples. This paper shows that psychological constraints such as these can be used effectively as domain-independent sources of bias to guide feature set selection and weighting for case-based learning algorithms. We first show that cognitive biases can be automatically and explicitly encoded into the baseline instance representation: each bias modifies the representation by changing features, deleting features, or modifying feature weights. Next, we investigate the related problems of cognitive bias selection and cognitive bias interaction for the feature weighting approach. In particular, we compare two cross-validation algorithms for bias selection that make different assumptions about the independence of individual component biases. In evaluations on four natural language learning tasks, we show that the bias selection algorithms can determine which cognitive bias or biases are relevant for each learning task, and that the accuracy of the case-based learning algorithm improves significantly when the selected bias(es) are incorporated into the baseline instance representation.

7.
The CCDM 2014 data mining competition, based on medical diagnosis data, posed a multi-label problem and a multi-class classification problem of the kind that arise widely in real life. Both problems exhibit class imbalance and have few training samples. To better complete the data mining task, and drawing on the ideas of two-pass learning and ensemble learning, a new learning framework, two-stage ensemble learning, is proposed. In this framework, a first round of ensemble learning yields a number of high-confidence samples, which are added to the original training set; a second round of learning on the new training set then produces a classifier with better generalization. The competition results show that, compared with standard ensemble learning, two-stage ensemble learning achieved very good results on both problems.
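A minimal sketch of the two-stage idea: a first ensemble scores an unlabeled pool, its high-confidence predictions are added to the training set, and a second ensemble is trained on the enlarged set. The RandomForest learner and the 0.95 confidence threshold are illustrative assumptions, not choices taken from the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def two_stage_ensemble(X_train, y_train, X_unlabeled, threshold=0.95):
    # Stage 1: train an ensemble and score the unlabeled pool.
    stage1 = RandomForestClassifier(n_estimators=200, random_state=0)
    stage1.fit(X_train, y_train)
    proba = stage1.predict_proba(X_unlabeled)
    confident = proba.max(axis=1) >= threshold
    pseudo_y = stage1.predict(X_unlabeled)
    # Stage 2: retrain on the training set enlarged with confident samples.
    X2 = np.vstack([X_train, X_unlabeled[confident]])
    y2 = np.concatenate([y_train, pseudo_y[confident]])
    stage2 = RandomForestClassifier(n_estimators=200, random_state=0)
    return stage2.fit(X2, y2)
```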

8.
Nearest neighbor editing aided by unlabeled data
This paper proposes a novel method for nearest neighbor editing. Nearest neighbor editing aims to increase the classifier's generalization ability by removing noisy instances from the training set. Traditionally, nearest neighbor editing edits (removes/retains) each instance by the voting of the instances in the training set (labeled instances). However, motivated by semi-supervised learning, we propose a novel editing methodology which edits each training instance by the voting of all the available instances (both labeled and unlabeled). We expect that the editing performance can be boosted by appropriately using unlabeled data. Our idea relies on the fact that in many applications, in addition to the training instances, many unlabeled instances are also available, since they do not require human annotation effort. Three popular data editing methods, including edited nearest neighbor, repeated edited nearest neighbor, and All k-NN, are adopted to verify our idea. They are tested on a set of UCI data sets. Experimental results indicate that all three editing methods can achieve improved performance with the aid of unlabeled data. Moreover, the improvement is more remarkable when the ratio of training data to unlabeled data is small.
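The idea can be sketched as follows: the unlabeled pool first receives provisional labels, then labeled and unlabeled instances vote together on which training instances to keep. The 1-NN labeling step and k=5 are assumptions; note also that classic ENN excludes an instance from its own vote, which this simplified version omits.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def enn_with_unlabeled(X_train, y_train, X_unlabeled, k=5):
    # Give the unlabeled pool provisional labels from the labeled data.
    labeler = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)
    y_unlabeled = labeler.predict(X_unlabeled)
    # Pool everything together; all instances take part in the vote.
    X_all = np.vstack([X_train, X_unlabeled])
    y_all = np.concatenate([y_train, y_unlabeled])
    voter = KNeighborsClassifier(n_neighbors=k).fit(X_all, y_all)
    # Keep a training instance only if the pooled k-NN vote agrees with it.
    # (Simplification: the instance itself is among its own neighbors here.)
    keep = voter.predict(X_train) == y_train
    return X_train[keep], y_train[keep]
```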

9.
Local deterministic string-to-string transductions arise in natural language processing (NLP) tasks such as letter-to-sound translation or pronunciation modeling. This class of transductions is a simple generalization of morphisms of free monoids; learning local transductions is essentially the same as inference of certain monoid morphisms. However, learning even a highly restricted class of morphisms, the so-called fine morphisms, leads to intractable problems: deciding whether a hypothesized fine morphism is consistent with observations is an NP-complete problem; and maximizing classification accuracy of the even smaller class of alphabetic substitution morphisms is APX-hard. These theoretical results provide some justification for using the kinds of heuristics that are commonly used for this learning task.

10.
Reduction Techniques for Instance-Based Learning Algorithms

11.
Deep learning has achieved excellent performance on analysis tasks over image, text, and speech data. Data augmentation can very effectively increase the scale and diversity of training data and thereby improve model generalization, but for a given dataset, designing a good augmentation policy depends heavily on expert experience and domain knowledge and requires repeated trial and error, which is time-consuming and laborious. In recent years, automated data augmentation, where machines design augmentation policies automatically, has drawn wide attention from academia and industry. To address the problem that existing automated data augmentation algorithms cannot yet strike a good balance between prediction accuracy and search efficiency, this work proposes SGES AA, an automated data augmentation algorithm based on a self-guided evolution strategy. First, an effective continuous vector representation of data augmentation policies is designed, turning automated data augmentation into a search over continuous policy vectors. Second, a policy vector search method based on a self-guided evolution strategy is proposed: historical estimated gradient information is introduced to guide the sampling and updating of exploration points, which helps avoid local optima while speeding up the convergence of the search. Extensive experiments on image, text, and speech datasets show that, without significantly increasing search time, the algorithm's prediction accuracy is better than or matches the current best automated data augmentation methods.
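One plausible reading of the self-guided search step, sketched in numpy: perturbations of the policy vector are drawn partly from a subspace spanned by recent estimated gradients, and antithetic evaluations refine the gradient estimate that is fed back into the history. The exact guiding rule and all hyperparameters are assumptions; the abstract does not specify them.

```python
import numpy as np

def guided_es_step(theta, objective, grad_history, sigma=0.1, alpha=0.5,
                   pop=16, lr=0.05, rng=None):
    # theta: current policy vector; grad_history: non-empty list of past
    # gradient estimates (the "self-guidance" state).
    rng = rng or np.random.default_rng(0)
    d = theta.size
    # Orthonormal basis of the subspace spanned by recent gradient estimates.
    U, _ = np.linalg.qr(np.stack(grad_history, axis=1))
    k = U.shape[1]
    grad_est = np.zeros(d)
    for _ in range(pop):
        # Perturbation: part isotropic, part inside the guiding subspace.
        eps = (np.sqrt(alpha / d) * rng.standard_normal(d)
               + np.sqrt((1 - alpha) / k) * (U @ rng.standard_normal(k)))
        # Antithetic finite-difference contribution to the gradient estimate.
        grad_est += (objective(theta + sigma * eps)
                     - objective(theta - sigma * eps)) * eps
    grad_est /= (2 * sigma * pop)
    grad_history.append(grad_est)      # feed the history for the next step
    return theta + lr * grad_est       # gradient ascent on the objective
```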

12.
Multi-task learning is widely used in natural language processing, but multi-task models are often sensitive to the relatedness between tasks: if task relatedness is low or information is passed between tasks poorly, performance can suffer severely. This paper proposes a new shared-private multi-task learning model, BB-MTL (BERT-BiLSTM multi-task learning model), and, borrowing ideas from meta-learning, designs a special parameter optimization scheme for it, MLL-TM (meta-learning-like train methods). A new information fusion gate, SoWLG (Softmax weighted linear gate), is further introduced to selectively fuse the shared and private features of each task. To evaluate the proposed multi-task learning method, and considering that users' behavior on the Internet is closely tied to their individual characteristics, a series of experiments combining toxic speech detection, personality detection, and emotion detection is conducted. The results show that BB-MTL effectively learns the feature information in related tasks, reaching accuracies of 81.56%, 77.09%, and 70.82% on the three tasks, respectively.
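The abstract does not give the internals of SoWLG, but a softmax-weighted linear gate over shared and private features might look like the following PyTorch sketch, where per-dimension softmax weights decide how much of each source enters the fused representation. The architecture is an assumed form, not the paper's exact design.

```python
import torch
import torch.nn as nn

class SoftmaxWeightedGate(nn.Module):
    def __init__(self, dim):
        super().__init__()
        # One linear score per source (shared, private) and per dimension.
        self.score = nn.Linear(2 * dim, 2 * dim)
        self.dim = dim

    def forward(self, shared, private):
        # shared, private: (batch, dim)
        scores = self.score(torch.cat([shared, private], dim=-1))
        scores = scores.view(-1, 2, self.dim)
        weights = torch.softmax(scores, dim=1)   # softmax over the two sources
        fused = weights[:, 0] * shared + weights[:, 1] * private
        return fused
```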

13.
Application of multi-task LS-SVM to time series prediction
To address the insufficient information mining and low prediction accuracy of single-task time series prediction, a time series prediction method based on the multi-task least squares support vector machine (MTLS-SVM) is proposed. The method learns multiple time series tasks simultaneously, so that during training the tasks constrain one another and act as an inductive bias, which ultimately improves the model's prediction accuracy. First, exploiting the close correlation between adjacent time points, learning tasks are constructed for several adjacent time points; the datasets corresponding to all tasks are then used to train the MTLS-SVM model jointly, and the trained model is used for prediction. The method is applied to several time series datasets and compared with single-task LS-SVM; the experimental results show higher prediction accuracy, verifying the feasibility and effectiveness of the method.
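For context, the single-task core of MTLS-SVM reduces training to one linear system over the kernel matrix, as in this numpy sketch of a plain LS-SVM; the multi-task extension couples several such systems through shared parameters. The RBF kernel and the gamma value are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(A, B, width=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * width ** 2))

def lssvm_fit(X, y, gamma=10.0):
    n = len(X)
    K = rbf_kernel(X, X)
    # LS-SVM dual system:  [0    1^T   ] [b]   [0]
    #                      [1  K + I/g ] [a] = [y]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    rhs = np.concatenate([[0.0], y])
    sol = np.linalg.solve(A, rhs)
    b, alpha = sol[0], sol[1:]
    # Prediction: f(x) = sum_i alpha_i k(x, x_i) + b
    return lambda Xq: rbf_kernel(Xq, X) @ alpha + b

# Usage: predict = lssvm_fit(X_train, y_train); y_hat = predict(X_test)
```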

14.
Not all instances in a data set are equally beneficial for inferring a model of the data, and some instances (such as outliers) can be detrimental. Several machine learning techniques treat the instances in a data set differently during training, such as curriculum learning, filtering, and boosting. However, it is difficult to determine how beneficial an instance is for inferring a model of the data. In this article, we present an automated method for supervised classification problems that orders the instances in a data set by complexity based on their likelihood of being misclassified (instance hardness), producing a hardness ordering. The underlying assumption of this method is that instances with a high likelihood of being misclassified represent more complex concepts in a data set. Using a hardness ordering allows a learning algorithm to focus on the most beneficial instances. We integrate a hardness ordering into the learning process using curriculum learning, filtering, and boosting. We find that focusing on the simpler instances during training significantly increases generalization accuracy. Also, the effects of curriculum learning depend on the learning algorithm that is used. In general, filtering and boosting outperform curriculum learning, and filtering has the most significant effect on accuracy. © 2014 Wiley Periodicals, Inc.
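A minimal sketch of estimating instance hardness with scikit-learn: the out-of-fold predicted probability of an instance's true class approximates its likelihood of being misclassified, and sorting on it yields a hardness ordering. The article averages over a set of diverse learners; using a single RandomForest here is a simplifying assumption.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_predict

def hardness_ordering(X, y, cv=5):
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    # Out-of-fold class probabilities, so no instance scores itself in-sample.
    proba = cross_val_predict(clf, X, y, cv=cv, method="predict_proba")
    classes = np.unique(y)                       # sorted, matching proba columns
    col = np.searchsorted(classes, y)
    p_true = proba[np.arange(len(y)), col]
    hardness = 1.0 - p_true                      # high = likely misclassified
    return np.argsort(hardness)                  # easiest instances first

# Usage: order = hardness_ordering(X, y); train easiest-first (curriculum),
# or drop the hardest tail (filtering).
```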

15.
Multi-dimensional classification assigns data instances, based on their feature vectors, to classes along multiple dimensions, and has broad application prospects. When learning a multi-dimensional classification model, massive training data make accurate classification algorithms take a long time to train. To improve the efficiency of multi-dimensional classification while maintaining high prediction accuracy, this paper proposes a Bayesian-network-based multi-dimensional classification learning method. First, the multi-dimensional classification problem is formulated as a conditional probability distribution problem. Second, a conditional tree Bayesian network model is built from the dependencies among the class vectors. Finally, the structure and parameters of the conditional tree Bayesian network are learned from the training data, and a multi-dimensional classification prediction algorithm is proposed. Extensive experiments on real datasets show that, compared with MMOC, the current best multi-dimensional classification algorithm, the proposed method maintains high accuracy while reducing model training time by two orders of magnitude, making it better suited to multi-dimensional classification over massive data.

16.
We describe the IGTree learning algorithm, which compresses an instance base into a tree structure. The concept of information gain is used as a heuristic function for performing this compression. IGTree produces trees that, compared to other lazy learning approaches, reduce storage requirements and the time required to compute classifications. Furthermore, IGTree obtained similar or better generalization accuracy on two complex linguistic tasks, viz. letter-phoneme transliteration and part-of-speech tagging, compared to alternative lazy learning and decision tree approaches (viz. IB1, information-gain-weighted IB1, and C4.5). A third experiment, with the task of word hyphenation, demonstrates that when the mutual differences in information gain of features are too small, IGTree as well as information-gain-weighted IB1 perform worse than IB1. These results indicate that IGTree is a useful algorithm for problems characterized by the availability of a large number of training instances described by symbolic features with sufficiently differing information gain values.
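In outline, the algorithm orders features once by information gain, then builds a tree that branches on the next feature's values and stores a majority-class default at every node, as in this condensed Python sketch. It omits IGTree's pruning of redundant leaves, and assumes symbolic features encoded as integers plus small non-negative integer labels.

```python
import numpy as np
from collections import Counter

def info_gain_order(X, y):
    def entropy(labels):
        p = np.bincount(labels) / len(labels)
        p = p[p > 0]
        return -(p * np.log2(p)).sum()
    gains = []
    for j in range(X.shape[1]):
        rem = sum(np.mean(X[:, j] == v) * entropy(y[X[:, j] == v])
                  for v in np.unique(X[:, j]))
        gains.append(entropy(y) - rem)
    return np.argsort(gains)[::-1]              # most informative first

def build_igtree(X, y, order, depth=0):
    node = {"default": Counter(y).most_common(1)[0][0], "children": {}}
    if depth == len(order) or len(set(y)) == 1:  # unambiguous: stop early
        return node
    f = order[depth]
    for v in np.unique(X[:, f]):
        m = X[:, f] == v
        node["children"][v] = build_igtree(X[m], y[m], order, depth + 1)
    return node

def classify(tree, x, order, depth=0):
    if depth == len(order):
        return tree["default"]
    child = tree["children"].get(x[order[depth]])
    # Unseen feature value: fall back on the node's majority class.
    return tree["default"] if child is None else classify(child, x, order, depth + 1)
```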

17.
Introducing a mechanism for exploiting unlabeled examples into multi-instance learning can reduce training costs and improve the learner's generalization. Most current semi-supervised multi-instance learning algorithms work by labeling every instance inside each bag, converting multi-instance learning into a single-instance semi-supervised learning problem. Considering that a bag's class label is determined by its instances and its structure, this paper proposes a multi-instance learning algorithm that performs semi-supervised learning directly at the bag level. A multi-instance kernel is defined, and all bags (labeled and unlabeled) are used to compute a bag-level graph Laplacian matrix, which serves as the smoothness penalty term in the optimization objective. Finding the optimal solution in the RKHS spanned by the multi-instance kernel reduces to determining a multi-instance kernel function modified by the unlabeled data, which can be used directly with classical kernel learning methods. The algorithm is tested on experimental datasets and compared with existing algorithms. The results show that the algorithm based on the semi-supervised multi-instance kernel reaches the same accuracy as the supervised learning algorithm with far less labeled training data, and that, for the same amount of labeled data, using unlabeled data effectively improves the learner's generalization.
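The bag-level construction can be sketched directly: a simple set kernel (here, the mean of pairwise RBF values between two bags' instances, an assumed choice) is computed over all labeled and unlabeled bags, and the resulting similarity matrix yields the graph Laplacian used as the smoothness penalty. The kernel width is illustrative.

```python
import numpy as np

def set_kernel(bag_a, bag_b, width=1.0):
    # bag_a: (n_a, d) and bag_b: (n_b, d) instance matrices for two bags.
    d2 = ((bag_a[:, None, :] - bag_b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * width ** 2)).mean()

def bag_laplacian(bags):
    # bags: list of instance matrices, labeled and unlabeled together.
    n = len(bags)
    W = np.array([[set_kernel(bags[i], bags[j]) for j in range(n)]
                  for i in range(n)])
    np.fill_diagonal(W, 0.0)             # no self-loops in the bag graph
    D = np.diag(W.sum(axis=1))
    return D - W                          # unnormalized graph Laplacian
```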

18.
Text representation learning is an important foundational task in natural language processing. Having progressed through vector space models, word embedding models, and contextual distributed representations, its semantic expressiveness has improved greatly, directly driving steady gains on downstream tasks such as machine reading and text retrieval. However, pre-trained language models, the current state of the art in text representation learning, have high time and space complexity both at training time and at prediction time, creating a high barrier to use. This paper therefore proposes a new text representation learning method based on deep hashing and pre-training, aiming to achieve the highest possible representational power at a much lower computational cost. Experimental results show that, at the cost of a limited loss in performance, the proposed method can greatly reduce the model's computational complexity at prediction time and largely improves its efficiency at prediction time.

19.
Text classification assigns natural language texts, according to their content, to one or more predefined categories, and is an important part of much information organization and management. Different algorithms vary in classification accuracy; text classifiers with high accuracy can be obtained from training examples.

20.
As one of the low-level tasks in natural language processing, text classification has very important auxiliary value for upstream tasks. With the recent trend of deep learning being widely applied across upstream and downstream NLP tasks, deep learning performs well on the downstream task of text classification. However, current deep-network-based models still fall short in capturing and modeling long-range contextual semantic information in text sequences, and they do not bring in linguistic information to assist the classifier. To address these problems, a novel English text classification model combining BERT and Bi-LSTM is proposed. The model not only injects linguistic information through the BERT pre-trained language model to improve classification accuracy, but also explicitly models the text with the bidirectional contextual semantic dependencies captured by a Bi-LSTM network. Concretely, the model is built from an input layer, a BERT pre-trained language model layer, a Bi-LSTM layer, and a classifier layer. Experimental results show that, compared with existing classification models, the proposed Bert-Bi-LSTM model achieves the highest classification accuracy on the MR, SST-2, and CoLA datasets: 86.2%, 91.5%, and 83.2%, respectively, greatly improving the performance of English text classification models.
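A minimal PyTorch sketch of the described input, BERT, Bi-LSTM, and classifier stack, using Hugging Face transformers. The hidden size and the mean-pooling step are illustrative assumptions, since the abstract does not give the exact hyperparameters.

```python
import torch
import torch.nn as nn
from transformers import BertModel

class BertBiLSTM(nn.Module):
    def __init__(self, num_classes=2, lstm_hidden=256):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        self.lstm = nn.LSTM(self.bert.config.hidden_size, lstm_hidden,
                            batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * lstm_hidden, num_classes)

    def forward(self, input_ids, attention_mask):
        # Token-level contextual embeddings from BERT.
        hidden = self.bert(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state
        # Bi-LSTM re-encodes the sequence in both directions.
        lstm_out, _ = self.lstm(hidden)
        # Mean-pool over non-padding tokens, then classify.
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (lstm_out * mask).sum(1) / mask.sum(1).clamp(min=1e-9)
        return self.classifier(pooled)
```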

