A Simple and Effective Deep Model for Semantic Role Labeling

Citation: WANG Mingxuan, LIU Qun. A Simple and Effective Deep Model for Semantic Role Labeling [J]. Journal of Chinese Information Processing, 2018, 32(2): 50-57.
Authors: WANG Mingxuan, LIU Qun
Affiliation: 1. Key Lab of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China;
2. ADAPT Centre, School of Computing, Dublin City University, Glasnevin, Dublin 9, Ireland

Abstract: We propose a new deep long short-term memory (LSTM) network equipped with a novel "elevator unit" (EU) for semantic role labeling (SRL). The EU computes a linear combination of each layer's input and output, allowing information to flow unimpeded across layers, so that a very deep stacked LSTM of up to 20 layers can be optimized effectively. Crucially, this linear connection contains a gate function that regulates and controls the information flow in both the temporal and spatial directions, and representations from the appropriate levels of abstraction are guided directly to the output layer for predicting the corresponding semantic roles. Although the model is quite simple, taking only the raw utterance as input without any extra features, it yields strong empirical results: it achieves an F1 score of 81.56% on the CoNLL-2005 shared dataset and 82.53% on the CoNLL-2012 shared dataset, outperforming the previous state-of-the-art results by 0.5% and 1.26%, respectively. Remarkably, it also improves over the previous best system by 2.2% F1 on out-of-domain data. The model is simple, easy to implement, and easy to parallelize, parsing 11.8k tokens per second on a single K40 GPU, far faster than previous methods.

Keywords: semantic role labeling; deep learning
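The elevator unit described in the abstract gates a linear combination of a layer's input and its transformed output, in the spirit of highway connections. A minimal numpy sketch of that combination (the paper's exact parameterization may differ; all names and the gate form here are illustrative assumptions):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class ElevatorUnit:
    """Gated linear bypass between a layer's input and its output (hypothetical sketch)."""

    def __init__(self, dim, rng):
        self.W_g = rng.standard_normal((dim, dim)) * 0.1  # gate weights (toy init)
        self.b_g = np.full(dim, -1.0)                     # negative bias favors carrying the input

    def __call__(self, x, h):
        # x: layer input, h: layer (e.g. LSTM) output; both have shape (dim,)
        g = sigmoid(self.W_g @ x + self.b_g)              # gate values in (0, 1)
        return g * h + (1.0 - g) * x                      # gated linear combination

rng = np.random.default_rng(0)
eu = ElevatorUnit(4, rng)
x = rng.standard_normal(4)
h = np.tanh(x)   # stand-in for an LSTM layer's output
y = eu(x, h)     # elementwise convex mix of x and h, so deep stacks keep a linear path
```

Because the output is an elementwise convex combination of `x` and `h`, gradients always have a short linear path back to lower layers, which is what lets a 20-layer stack train stably.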