
Semantic Slot Filling Based on BERT and BiLSTM
ZHANG Yu-shuai, ZHAO Huan, LI Bo. Semantic Slot Filling Based on BERT and BiLSTM[J]. Computer Science, 2021, 48(1): 247-252.
Authors: ZHANG Yu-shuai, ZHAO Huan, LI Bo
Affiliation: College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China
Abstract: Semantic slot filling is an important task in dialogue systems, which aims to assign the correct label to each word of an input sentence; its performance has a marked impact on the downstream dialogue management module. At present, deep learning models for this task are usually initialized with random word vectors or pre-trained word vectors. However, random word vectors carry no semantic or grammatical information, while pre-trained word vectors encode only a single meaning per word; neither can provide the model with context-dependent word vectors. To address this problem, this paper proposes an end-to-end neural network model based on the pre-trained model BERT and the Long Short-Term Memory (LSTM) network. First, the pre-trained Bidirectional Encoder Representations from Transformers (BERT) model encodes the input sentence into context-dependent word embeddings. These embeddings then serve as input to a Bidirectional Long Short-Term Memory (BiLSTM) network, and finally the Softmax function and a conditional random field are used to decode the predicted labels. The pre-trained BERT model and the BiLSTM network are trained as a whole to improve the performance of the semantic slot filling task. The model achieves maximum F1 scores of 78.74%, 87.60% and 71.54% on three data sets (MIT Restaurant Corpus, MIT Movie Corpus and MIT Movie trivial Corpus) respectively. The experimental results show that the proposed model significantly improves the F1 score of the semantic slot filling task.
Keywords: Slot filling, Pre-trained model, Long short-term memory network, Context-dependent, Word embedding
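
The pipeline described in the abstract (BERT producing context-dependent embeddings, a BiLSTM over those embeddings, and Softmax/CRF decoding, with the whole stack fine-tuned jointly) can be sketched as a short model definition. This is a minimal illustration assuming the HuggingFace `transformers` and `pytorch-crf` packages, not the authors' released code; the model name, hidden size and tag count are hypothetical placeholders.

```python
# Minimal sketch of a BERT + BiLSTM + CRF slot-filling model.
# Assumptions: HuggingFace `transformers` and `pytorch-crf` are installed;
# "bert-base-uncased" and lstm_hidden=256 are illustrative choices only.
import torch
import torch.nn as nn
from transformers import BertModel
from torchcrf import CRF

class BertBiLstmCrf(nn.Module):
    def __init__(self, num_tags: int, lstm_hidden: int = 256,
                 bert_name: str = "bert-base-uncased"):
        super().__init__()
        # BERT produces context-dependent word embeddings (768-dim for base).
        self.bert = BertModel.from_pretrained(bert_name)
        # BiLSTM over the BERT outputs; forward and backward hidden states
        # are concatenated, hence 2 * lstm_hidden emission features.
        self.bilstm = nn.LSTM(self.bert.config.hidden_size, lstm_hidden,
                              batch_first=True, bidirectional=True)
        # Linear layer maps BiLSTM features to per-tag emission scores
        # (the Softmax over these scores is folded into the CRF).
        self.emission = nn.Linear(2 * lstm_hidden, num_tags)
        # Linear-chain CRF scores whole tag sequences, not tokens in isolation.
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, input_ids, attention_mask, tags=None):
        hidden = self.bert(input_ids,
                           attention_mask=attention_mask).last_hidden_state
        lstm_out, _ = self.bilstm(hidden)
        emissions = self.emission(lstm_out)
        mask = attention_mask.bool()
        if tags is not None:
            # Training: negative log-likelihood of the gold tag sequence;
            # BERT and the BiLSTM are fine-tuned jointly through this loss.
            return -self.crf(emissions, tags, mask=mask)
        # Inference: Viterbi decoding of the best tag sequence per sentence.
        return self.crf.decode(emissions, mask=mask)
```

In this sketch the CRF's negative log-likelihood drives joint fine-tuning of BERT and the BiLSTM, matching the paper's end-to-end training setup, while at inference Viterbi decoding replaces independent per-token Softmax decisions.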