
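Trigger Detection Based on Bidirectional LSTM and Two-stage Method
Cite as: HE Xinyu, LI Lishuang. Trigger Detection Based on Bidirectional LSTM and Two-stage Method[J]. Journal of Chinese Information Processing, 2017, 31(6): 147-154
Authors: HE Xinyu  LI Lishuang
Affiliation: School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116023
Funding: National Natural Science Foundation of China (61672126)
Abstract: Biomedical event extraction is an important branch of biomedical text mining, and trigger detection, a key subtask of event extraction, has attracted considerable attention. Most existing trigger detection methods are shallow, one-stage methods that are expensive to train and that rely on rich domain knowledge to extract large numbers of features, at a high manual cost. This paper therefore proposes a trigger detection method based on a two-stage approach and a bidirectional LSTM neural network. First, trigger detection is split into a recognition stage and a classification stage, which effectively alleviates the class imbalance problem in training. Second, both stages use a bidirectional LSTM network, which currently performs well, to carry out the binary classification task and the multi-class classification task, avoiding the cost of hand-crafted feature extraction in shallow machine learning methods. In addition, a large-scale corpus downloaded from the PubMed database is used to train word embeddings that incorporate dependency relations, which yields richer semantic information and effectively improves trigger detection performance. On MLEE, a general-purpose corpus for biomedical event extraction, the method achieves the best extraction performance reported to date, with an F-score of 78.46%.

Keywords: trigger detection; two-stage method; bidirectional LSTM; dependency word embeddings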

Trigger Detection Based on Bidirectional LSTM and Two-stage Method
HE Xinyu,LI Lishuang. Trigger Detection Based on Bidirectional LSTM and Two-stage Method[J]. Journal of Chinese Information Processing, 2017, 31(6): 147-154
Authors: HE Xinyu  LI Lishuang
Affiliation: School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116023, China
Abstract: Trigger detection is an important step in biomedical event extraction. Most existing trigger detection methods are one-stage methods based on shallow machine learning, which are costly to train and depend on rich domain knowledge and a large number of manually engineered features. In this paper, we propose a two-stage trigger detection method based on bidirectional long short-term memory (BLSTM) networks, which divides trigger detection into a recognition stage and a classification stage. This approach effectively relieves the class imbalance problem and avoids the cost of manual feature extraction. In addition, to obtain richer semantic information, we train dependency word embeddings on a large-scale corpus downloaded from the PubMed database, which effectively improves trigger detection performance. On the Multi-Level Event Extraction (MLEE) corpus, our method achieves an F-score of 78.46%, which outperforms previous state-of-the-art systems.
Keywords: trigger detection; two-stage method; bidirectional LSTM; dependency word embeddings
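
The split into a recognition stage and a classification stage can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the non-trigger label `O`, the helper names `stage1_labels`, `stage2_examples` and `two_stage_predict`, and the toy sentence with its event-type labels are all hypothetical, and simple lexicon lookups stand in for the two bidirectional LSTM classifiers.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

NON_TRIGGER = "O"  # assumed label for tokens that are not triggers


def stage1_labels(labels: Sequence[str]) -> List[int]:
    """Recognition stage targets: 1 for a trigger token, 0 otherwise."""
    return [0 if y == NON_TRIGGER else 1 for y in labels]


def stage2_examples(tokens: Sequence[str],
                    labels: Sequence[str]) -> List[Tuple[str, str]]:
    """Classification stage targets: trigger tokens only, with their types."""
    return [(t, y) for t, y in zip(tokens, labels) if y != NON_TRIGGER]


def two_stage_predict(tokens: Sequence[str],
                      is_trigger: Callable[[str], bool],
                      trigger_type: Callable[[str], str]) -> List[str]:
    """Run recognition first, then type only the tokens flagged as triggers."""
    return [trigger_type(t) if is_trigger(t) else NON_TRIGGER for t in tokens]


# Toy sentence with made-up labels (illustrative only, not taken from MLEE).
tokens = ["VEGF", "induces", "angiogenesis", "in", "tumor",
          "tissue", "and", "binds", "its", "receptor"]
labels = ["O", "Regulation", "Development", "O", "O",
          "O", "O", "Binding", "O", "O"]

binary = stage1_labels(labels)
typed = stage2_examples(tokens, labels)

print("one-stage label counts:", Counter(labels))
print("stage 1 label counts:  ", Counter(binary))
print("stage 2 label counts:  ", Counter(y for _, y in typed))

# Stand-ins for the two classifiers: a lexicon built from the toy data.
lexicon = dict(typed)
predicted = two_stage_predict(tokens,
                              lambda t: t in lexicon,
                              lambda t: lexicon[t])
print("matches gold labels:", predicted == labels)
```

In a one-stage formulation the dominant non-trigger label competes directly with every event type in a single multi-class problem. After the split, that label appears only in the binary recognition task, and the multi-class classifier is trained on trigger tokens alone.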