Similar Documents
20 similar documents found (search time: 31 ms)
1.
Structural Support Vector Machines (SSVMs) and Conditional Random Fields (CRFs) are popular discriminative methods used for classifying structured and complex objects like parse trees, image segments and part-of-speech tags. The datasets involved are very high dimensional, and the models designed using typical training algorithms for SSVMs and CRFs are non-sparse. This non-sparseness results in slow inference, so there is a need for new algorithms that design sparse SSVM and CRF classifiers. The use of the elastic net and the L1-regularizer has already been explored for solving the primal CRF and SSVM problems, respectively, to design sparse classifiers. In this work, we focus on the dual elastic net regularized SSVM and CRF. By exploiting the weakly coupled structure of these convex programming problems, we propose a new sequential alternating proximal (SAP) algorithm to solve the dual problems. The algorithm sequentially visits each training example and solves a simple subproblem restricted to a small subset of variables associated with that example (see the sketch below). Numerical experiments on various benchmark sequence labeling datasets demonstrate that the proposed algorithm scales well. Further, the classifiers designed are sparser than those obtained by solving the respective primal problems and show comparable generalization performance. The proposed SAP algorithm is thus a useful alternative for sparse SSVM and CRF classifier design.
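A schematic sketch of such a per-example proximal pass, in Python. The elastic-net proximal operator is standard; the step size, penalty weights, and the block-gradient oracle `grad_block` are illustrative assumptions, not the paper's exact subproblem.

```python
import numpy as np

def prox_elastic_net(v, l1, l2):
    """Prox of l1*||.||_1 + (l2/2)*||.||^2: soft-threshold, then shrink."""
    return np.sign(v) * np.maximum(np.abs(v) - l1, 0.0) / (1.0 + l2)

def sap_epoch(alpha_blocks, grad_block, step=0.1, l1=0.01, l2=0.01):
    """One sequential pass over the per-example blocks of dual variables."""
    for i, a in enumerate(alpha_blocks):
        g = grad_block(i, alpha_blocks)   # gradient w.r.t. block i only
        alpha_blocks[i] = prox_elastic_net(a - step * g, step * l1, step * l2)
    return alpha_blocks
```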

2.
We consider the problem of predicting a sequence of real-valued multivariate states that are correlated by some unknown dynamics, from a given measurement sequence. Although dynamic systems such as the State-Space Models are popular probabilistic models for the problem, their joint modeling of states and observations, as well as the traditional generative learning by maximizing a joint likelihood may not be optimal for the ultimate prediction goal. In this paper, we suggest two novel discriminative approaches to the dynamic state prediction: 1) learning generative state-space models with discriminative objectives and 2) developing an undirected conditional model. These approaches are motivated by the success of recent discriminative approaches to the structured output classification in discrete-state domains, namely, discriminative training of Hidden Markov Models and Conditional Random Fields (CRFs). Extending CRFs to real multivariate state domains generally entails imposing density integrability constraints on the CRF parameter space, which can make the parameter learning difficult. We introduce an efficient convex learning algorithm to handle this task. Experiments on several problem domains, including human motion and robot-arm state estimation, indicate that the proposed approaches yield high prediction accuracy comparable to or better than state-of-the-art methods.

3.
When extracting and categorizing the names of agriculture-related commodities, traditional CRF-based information extraction requires a large training corpus, entails heavy annotation work, and achieves limited accuracy. To address this, a method combining an agricultural ontology with CRFs is proposed, treating the automatic extraction and classification of agriculture-related commodity names as a sequence labeling task. First, the raw data are word-segmented and features are selected: the word itself, its part of speech, a geographic attribute, and an ontology concept. Then an improved quasi-Newton algorithm is used to train the CRF parameters and the Viterbi algorithm performs decoding; four comparative experiments are run, seven categories are recognized, and the CRF is compared experimentally with a hidden Markov model (HMM) and a maximum-entropy Markov model (MEMM). Finally, the CRF is applied to the analysis of supply-and-demand trends for agricultural products. With a suitable feature template (sketched below), adding the ontology-concept feature raises the CRF's overall open-test precision by 10.20%, recall by 59.78%, and F-measure by 37.17%, demonstrating that combining an ontology with CRFs is feasible and effective for extracting agriculture-related commodity names and categories, and can help match agricultural supply with demand.
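A minimal sketch of the feature-template idea using the sklearn-crfsuite library, whose L-BFGS trainer stands in for the paper's improved quasi-Newton optimizer; the toy ontology lookup, corpus, and labels are hypothetical stand-ins.

```python
import sklearn_crfsuite

# Tiny stand-in ontology; a real system would query an agricultural ontology.
ONTOLOGY = {"苹果": "水果", "小麦": "粮食作物"}

def token_features(sent, i):
    word, pos = sent[i]                         # sent: list of (word, POS) pairs
    feats = {
        "word": word,
        "pos": pos,
        "concept": ONTOLOGY.get(word, "NONE"),  # ontology-concept feature
        "is_place": pos == "ns",                # crude geographic-attribute feature
    }
    if i > 0:
        feats["prev_word"] = sent[i - 1][0]
    if i + 1 < len(sent):
        feats["next_word"] = sent[i + 1][0]
    return feats

train_sents = [[("小麦", "n"), ("价格", "n")]]   # toy corpus
y_train = [["B-粮食作物", "O"]]                  # toy BIO-style category labels
X_train = [[token_features(s, i) for i in range(len(s))] for s in train_sents]

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=50)
crf.fit(X_train, y_train)
print(crf.predict(X_train))
```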

4.
Chinese word segmentation underlies many natural language processing tasks. This paper proposes a two-layer model for Chinese word segmentation. The lower layer produces a coarse segmentation with the forward maximum matching (FMM) algorithm (sketched below) and passes the result to the upper layer, where CRFs re-label the text with the lower layer's output serving as one CRF feature; finally, the per-character labels are converted into the corresponding segmentation. Compared with earlier models that use a CRF alone, adding the lower layer provides important auxiliary information for the CRF labeling. Extensive experiments on the Peking University-annotated People's Daily corpus of January 1998 achieved 93.31% precision and 92.75% recall, showing that the method is practical.
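A minimal sketch of the lower layer, forward maximum matching, in plain Python; the lexicon and maximum word length are toy assumptions, and the CRF layer that consumes its output is not shown.

```python
def fmm_segment(text, lexicon, max_len=6):
    """Greedily match the longest dictionary word starting at each position."""
    out, i = [], 0
    while i < len(text):
        for j in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + j] in lexicon or j == 1:   # fall back to one character
                out.append(text[i:i + j])
                i += j
                break
    return out

print(fmm_segment("南京市长江大桥", {"南京市", "长江大桥", "南京", "市长"}))
# ['南京市', '长江大桥']
```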

5.
许勇, 宋柔. 《计算机工程》, 2007, 33(10): 16-18
The CRF is a relatively recent probabilistic model for labeling and segmenting sequence data that has attracted wide attention in information extraction and other text processing areas. This paper introduces the CRF method and applies it to paragraph segmentation of encyclopedia text, using the CRF's feature mechanism to add long-distance constraints over the sequence of text units and obtaining better results than the traditional hidden Markov approach.

6.
The current mainstream approach to Chinese word segmentation is character-based tagging with traditional machine learning, which requires manually configuring and extracting features from Chinese text and suffers from a high-dimensional lexicon and long training times when only a CPU is used. To address these problems, this paper proposes an improved method based on the LSTM (Long Short-Term Memory) network model, using different word-position tag sets and adding pre-trained character embeddings for Chinese word segmentation. Comparative experiments on corpora commonly used in Chinese word segmentation evaluations show that the LSTM-based method outperforms current traditional machine learning methods; a six-position tag set combined with pre-trained character embeddings gives the best segmentation performance; a GPU greatly shortens the training time of the deep neural network model; and the LSTM approach (a minimal skeleton is sketched below) generalizes readily to other sequence labeling tasks in natural language processing (NLP).
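A minimal PyTorch skeleton of the character tagger described above; the vocabulary size, dimensions, six-position tag-set size, and the randomly generated stand-in embedding table are assumptions, not the paper's settings.

```python
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    def __init__(self, char_vectors, hidden=128, n_tags=6):  # e.g. 6-position tags
        super().__init__()
        # Initialize from pre-trained character embeddings, then fine-tune.
        self.embed = nn.Embedding.from_pretrained(char_vectors, freeze=False)
        self.lstm = nn.LSTM(char_vectors.size(1), hidden,
                            batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_tags)

    def forward(self, char_ids):            # (batch, seq_len) character indices
        h, _ = self.lstm(self.embed(char_ids))
        return self.out(h)                  # per-character tag scores

model = BiLSTMTagger(torch.randn(5000, 100))    # stand-in embedding table
scores = model(torch.randint(0, 5000, (2, 20)))
print(scores.shape)                             # torch.Size([2, 20, 6])
```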

7.
This paper proposes a method for predicting prosodic phrase boundaries in Chinese text that uses shallow syntactic information, namely Chinese chunks, together with a conditional random field model. It first introduces the definition of chunks and a chunk annotation algorithm; then, on a corpus annotated with chunk structure and merged accordingly, CRFs are trained to recognize prosodic phrases. Experimental results show that the chunk-informed CRF model outperforms a model that ignores chunk structure, raising the F-measure by about ten percentage points on average.

8.
Conditional random fields (CRFs) are a statistical framework that has recently gained in popularity in both the automatic speech recognition (ASR) and natural language processing communities because of the different nature of assumptions that are made in predicting sequences of labels compared to the more traditional hidden Markov model (HMM). In the ASR community, CRFs have been employed in a method similar to that of HMMs, using the sufficient statistics of input data to compute the probability of label sequences given acoustic input. In this paper, we explore the application of CRFs to combine local posterior estimates provided by multilayer perceptrons (MLPs) corresponding to the frame-level prediction of phone classes and phonological attribute classes. We compare phonetic recognition using CRFs to an HMM system trained on the same input features and show that the monophone label CRF is able to achieve superior performance to a monophone-based HMM and performance comparable to a 16 Gaussian mixture triphone-based HMM; in both of these cases, the CRF obtains these results with far fewer free parameters. The CRF is also able to better combine these posterior estimators, achieving a substantial increase in performance over an HMM-based triphone system by mixing the two highly correlated sets of phone class and phonetic attribute class posteriors.

9.
10.
This paper presents a novel online relevant set algorithm for a linearly scored block sequence translation model. The key component is a new procedure to directly optimize the global scoring function used by a statistical machine translation (SMT) decoder. This training procedure treats the decoder as a black-box, and thus can be used to optimize any decoding scheme. The novel algorithm is evaluated using different feature types: 1) commonly used probabilistic features, such as translation, language, or distortion model probabilities, and 2) binary features. In particular, encouraging results on a standard Arabic–English translation task are presented for a translation system that uses only binary feature functions. To further demonstrate the effectiveness of the novel training algorithm, a detailed comparison with the widely used minimum-error-rate (MER) training algorithm is presented using the same decoder and feature set. The online algorithm is simplified by introducing so-called “seed” block sequences which enable the training to be carried out without a gold standard block translation. While the online training algorithm is extremely fast, it also improves translation scores over the MER algorithm in some experiments.

11.
Discriminative techniques, such as conditional random fields (CRFs) or structure aware maximum-margin techniques (maximum margin Markov networks (M3N), structured output support vector machines (S-SVM)), are state-of-the-art in the prediction of structured data. However, to achieve good results these techniques require complete and reliable ground truth, which is not always available in realistic problems. Furthermore, training either CRFs or margin-based techniques is computationally costly, because the runtime of current training methods depends not only on the size of the training set but also on properties of the output space to which the training samples are assigned. We propose an alternative model for structured output prediction, Joint Kernel Support Estimation (JKSE), which is rather generative in nature as it relies on estimating the joint probability density of samples and labels in the training set. This makes it tolerant against incomplete or incorrect labels and also opens the possibility of learning in situations where more than one output label can be considered correct. At the same time, we avoid typical problems of generative models as we do not attempt to learn the full joint probability distribution, but model only its support in a joint reproducing kernel Hilbert space. As a consequence, JKSE training is possible by an adaptation of the classical one-class SVM procedure. The resulting optimization problem is convex and efficiently solvable even with tens of thousands of training examples. A particular advantage of JKSE is that the training speed depends only on the size of the training set, and not on the total size of the label space. No inference step is required during training (as it would be for M3N and S-SVM), nor do we have to calculate a partition function (as CRFs do). Experiments on realistic data show that, for suitable kernel functions, our method works efficiently and robustly in situations that discriminative techniques have problems with or that are computationally infeasible for them.
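A minimal sketch of the JKSE idea using scikit-learn's one-class SVM with a precomputed joint kernel k((x,y),(x',y')) = k_x(x,x') * k_y(y,y'). The RBF factor kernels, the toy data, and prediction by enumerating a small candidate label set are assumptions for illustration.

```python
import numpy as np
from sklearn.svm import OneClassSVM
from sklearn.metrics.pairwise import rbf_kernel

def joint_kernel(X1, Y1, X2, Y2):
    """Product joint kernel over input and output representations."""
    return rbf_kernel(X1, X2) * rbf_kernel(Y1, Y2)

# Toy training set: inputs X with (structured) outputs Y encoded as vectors.
X = np.random.randn(100, 5)
Y = np.random.randn(100, 3)

svm = OneClassSVM(kernel="precomputed", nu=0.1)
svm.fit(joint_kernel(X, Y, X, Y))           # estimate the joint support

def predict(x, candidate_ys):
    """Pick the candidate output with the highest joint support score."""
    x = x.reshape(1, -1)
    scores = [svm.decision_function(joint_kernel(x, y.reshape(1, -1), X, Y))[0]
              for y in candidate_ys]
    return candidate_ys[int(np.argmax(scores))]
```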

12.
侯旭东, 滕飞, 张艺. 《计算机应用》, 2022, 42(9): 2686-2692
To address the imbalance between recognition accuracy and computational cost that arises in deep-learning models for medical named entity recognition (MNER) as networks deepen, a medical named entity recognition model based on deep auto-encoding, CasSAttMNER, is proposed. First, a strategy that balances the depth difference between encoding and decoding is used: the distilled Transformer language model RBT6 serves as the encoder, reducing the encoding depth and the computational requirements for training and deployment. Second, a cascaded multi-task dual decoder built from a bidirectional long short-term memory (BiLSTM) network and a conditional random field (CRF) performs entity-mention sequence labeling and entity-category classification. Finally, based on a self-attention mechanism, latent decoding information extracted during mention labeling is added to the entity-category decoder to refine the model design. Experimental results show that CasSAttMNER achieves F-scores of 0.943 9 and 0.945 7 on two Chinese medical entity datasets, 3 and 8 percentage points higher than the baseline models, respectively, verifying that the model further improves decoder performance.

13.
Chinese chunking is an important subtask in Chinese information processing. Building on a new structural SVM (support vector machines) model, this paper proposes a large-margin method for Chinese chunking. First, a sequence labeling model is designed for the chunking problem; then, following the large-margin principle, the optimization objective for a discriminative sequence labeling function is given, and a cutting-plane algorithm is applied to approximately optimize the feature parameters. An improved F1 loss function is designed for chunk recognition (sketched below), allowing the F1 loss value to be adjusted according to the actual length of each sentence, which introduces more effective constraint inequalities. Experiments on the Penn Chinese Treebank CTB4 dataset show that the improved F1 loss yields better recognition results than the Hamming loss, with an overall F1 of 91.61% across chunk types, outperforming CRF (conditional random fields) and SVM methods.
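A minimal sketch of a chunk-level F1 loss over BIO tag sequences in plain Python; it computes the per-sentence quantity that such a loss rescales, but the paper's exact length-dependent adjustment is not reproduced.

```python
def chunks(tags):
    """Extract (start, end, type) chunks from BIO tags."""
    spans, start = set(), None
    for i, t in enumerate(list(tags) + ["O"]):
        if start is not None and not t.startswith("I-"):
            spans.add((start, i, tags[start][2:]))   # close the open chunk
            start = None
        if t.startswith("B-"):
            start = i
    return spans

def f1_loss(gold_tags, pred_tags):
    g, p = chunks(gold_tags), chunks(pred_tags)
    if not g and not p:
        return 0.0
    tp = len(g & p)
    prec = tp / len(p) if p else 0.0
    rec = tp / len(g) if g else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return 1.0 - f1                      # per-sentence loss in [0, 1]

print(f1_loss(["B-NP", "I-NP", "O"], ["B-NP", "O", "O"]))  # 1.0 (no chunk match)
```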

14.
Identifying the predicate verb is key to understanding a sentence. Because Chinese predicate verbs are structurally complex, flexibly used, and variable in form, recognizing them is a challenging task in Chinese natural language processing. From an information extraction perspective, this paper introduces the concepts related to Chinese predicate verb recognition and proposes an annotation scheme for Chinese predicate verbs. On this basis, a recognition method based on an Attentional-BiLSTM-CRF neural network is studied: a bidirectional recurrent neural network captures dependencies within the sentence, an attention mechanism models the focal roles of the sentence, and a conditional random field (CRF) layer returns the maximum-scoring label path. In addition, to ensure that the predicate-verb output is unique, a convolutional-neural-network-based uniqueness recognition model is proposed. In experiments, the algorithm surpasses the traditional CRF sequence labeling model, reaching an F-measure of 76.75% on the Chinese predicate-verb data annotated in this work.

15.
Intent detection (ID) and slot filling (constraint analysis, SF) are two important processes in spoken language understanding (SLU). The former is a classification problem that determines the intent of an utterance; the latter can be viewed as a sequence labeling problem that assigns specific labels to key information. This paper proposes an LSTM-based joint model that combines a CRF with an attention mechanism. For the ID problem, a weighted sum of the output-layer vectors of all words is used for classification (sketched below); for the SF problem, transitions between labels are considered and the global likelihood of the label sequence is computed. Experiments on a Chinese dataset and the English ATIS dataset verify the effectiveness of the proposed method.
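A minimal PyTorch sketch of the attention-weighted sum used for intent classification; the dimensions are assumptions, and the CRF-based slot-filling branch is not shown.

```python
import torch
import torch.nn as nn

class AttentionIntent(nn.Module):
    def __init__(self, hidden=128, n_intents=10):
        super().__init__()
        self.score = nn.Linear(hidden, 1)        # per-word attention score
        self.cls = nn.Linear(hidden, n_intents)

    def forward(self, h):                        # h: (batch, seq_len, hidden)
        a = torch.softmax(self.score(h), dim=1)  # attention weights over words
        pooled = (a * h).sum(dim=1)              # weighted sum of word vectors
        return self.cls(pooled)                  # intent logits

logits = AttentionIntent()(torch.randn(4, 12, 128))
print(logits.shape)                              # torch.Size([4, 10])
```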

16.
Automatic Keyword Extraction Based on BiLSTM-CRF
陈伟, 吴友政, 陈文亮, 张民. 《计算机科学》, 2018, 45(Z6): 91-96, 113
Automatic keyword extraction is an important task in natural language processing (NLP) and provides key technical support for applications such as personalized recommendation and online shopping. For this task, a new method based on a bidirectional long short-term memory network with a conditional random field (BiLSTM-CRF) is proposed, casting the problem as sequence labeling. The method first models the input text, representing it as low-dimensional dense vectors; it then classifies each word; finally, the CRF decodes the whole label sequence (see the Viterbi sketch below) to obtain the result. Experiments on a large-scale real-world dataset show an improvement of about one percentage point over the baseline system.
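A minimal NumPy sketch of the CRF decoding step (Viterbi); in the full model the emission scores would come from the BiLSTM and the transition matrix from the trained CRF layer.

```python
import numpy as np

def viterbi(emissions, transitions):
    """emissions: (seq_len, n_tags) scores; transitions: (n_tags, n_tags)."""
    seq_len, n_tags = emissions.shape
    score = emissions[0].copy()
    back = np.zeros((seq_len, n_tags), dtype=int)
    for t in range(1, seq_len):
        # Score of the best path ending in each tag, via every previous tag.
        cand = score[:, None] + transitions + emissions[t][None, :]
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    path = [int(score.argmax())]
    for t in range(seq_len - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]

print(viterbi(np.array([[2., 0.], [0., 2.], [2., 0.]]), np.zeros((2, 2))))
# [0, 1, 0]
```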

17.
Tibetan word segmentation is one of the basic problems in Tibetan natural language processing. This paper first annotates a 35.1 MB Tibetan corpus, trains a conditional random field model on it to obtain model parameters, and then segments unsegmented text with templates. For the error types remaining in the CRF-based segmentation (mis-segmented non-Tibetan characters, misrecognized Tibetan clitic words, mis-segmented stop words, and mis-segmented out-of-vocabulary words), rules are summarized for each case and used to post-process the output into the final segmentation. Open tests show that the system achieves a precision of 96.11%, a recall of 96.03%, and an F-measure of 96.06%.

18.
Named entity recognition aims to identify the boundaries and categories of entity mentions in text, and training such models usually requires many annotated samples. This paper implements effective selection algorithms that choose, from a large pool, the samples most useful for updating the model, reducing annotation work. Five comparative experiments verify that effective selection yields a better sample set and more targeted annotation. Experiments on a Weibo (microblog) dataset verify that the proposed stream-based active learning algorithm (sketched below) can select a more suitable sample set from large amounts of web text and effectively reduce the cost of manual annotation. Two models handle boundary extraction and category discrimination separately: a sequence labeling model locates each entity's position in the sequence, an entity classification model classifies the labeled results, and active learning enables training on unlabeled data. With this training procedure, experiments were run on two datasets. On the Weibo dataset, the algorithm learns text features from unlabeled data. On the MSRA dataset, once the pre-training subset exceeds 40% of the data, the model's F1 on the test set stabilizes at around 90%, close to the result obtained with the full dataset, indicating that the model has some feature extraction capability on unlabeled data.
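A minimal sketch of stream-based selective sampling in Python: a sample is kept for human annotation only when the model is uncertain about it. The margin-based uncertainty measure and the threshold are assumptions, not the paper's exact criterion; `predict_proba` follows the scikit-learn convention.

```python
import numpy as np

def stream_select(model, stream, threshold=0.2):
    """Yield samples whose top-two class-probability margin is small."""
    for x in stream:
        probs = np.sort(model.predict_proba([x])[0])[::-1]
        margin = probs[0] - probs[1]     # confident model -> large margin
        if margin < threshold:
            yield x                      # send to annotator, then retrain
```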

19.
Nowadays, deep neural networks (DNNs) for image processing are becoming more complex; thus, reducing computational cost is increasingly important. This study highlights the construction of a DNN for real-time image processing, training various image processing operators efficiently through multitask learning. For real-time image processing, the proposed algorithm takes a joint upsampling approach through bilateral guided upsampling. For multitask learning, the overall network is based on an encoder-decoder architecture, which consists of encoding, processing, and decoding components, in which the encoding and decoding components are shared by all the image processing operators. In the processing component, a semantic guidance map, which contains processing information for each image processing operator, is estimated using simple linear shifts of the shared deep features. Through these components, the proposed algorithm requires an increase of only 5% in the number of parameters to add another image processing operator and achieves faster and higher performance than that of deep-learning-based joint upsampling methods in local image processing as well as global image processing.

20.
In building resources for Russian speech processing, grapheme-to-phoneme (G2P) conversion plays a crucial role. This paper refines a SAMPA-based Russian phoneme set so that transcriptions reflect word stress position and vowel reduction. Using the improved phoneme set, a Russian pronunciation dictionary of 20,000 words is built. On this basis, a data-driven Russian G2P algorithm is implemented, applying weighted finite-state transducers (WFSTs) in the alignment, modeling, and decoding stages. Graphemes and phonemes are first aligned many-to-many using an expectation-maximization algorithm; the alignments are then used to train a joint N-gram model, which is converted into a WFST pronunciation model; finally, a WFST decoding algorithm predicts the pronunciation of an arbitrary word. Cross-validation experiments show an average word accuracy of 62.9% and an average phoneme accuracy of 92.2%.
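A minimal plain-Python sketch of the joint N-gram ("graphone") modeling step, assuming the many-to-many EM alignment has already produced grapheme:phoneme token sequences; the conversion to a WFST and the decoding stage are not shown, and the example tokens are hypothetical.

```python
from collections import defaultdict
import math

def train_joint_bigram(aligned_seqs):
    """aligned_seqs: graphone sequences such as ("м:m", "о:o", "л:l", ...)."""
    pair_counts = defaultdict(int)   # bigram counts over joint tokens
    ctx_counts = defaultdict(int)    # unigram context counts
    for seq in aligned_seqs:
        prev = "<s>"
        for g in tuple(seq) + ("</s>",):
            pair_counts[(prev, g)] += 1
            ctx_counts[prev] += 1
            prev = g
    return pair_counts, ctx_counts

def log_prob(seq, pair_counts, ctx_counts, alpha=0.1, vocab=5000):
    """Add-alpha smoothed bigram log-probability of one graphone sequence."""
    prev, lp = "<s>", 0.0
    for g in tuple(seq) + ("</s>",):
        lp += math.log((pair_counts[(prev, g)] + alpha)
                       / (ctx_counts[prev] + alpha * vocab))
        prev = g
    return lp
```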
