Semantic Slot Filling Based on BERT and BiLSTM
ZHANG Yu-shuai, ZHAO Huan, LI Bo. Semantic Slot Filling Based on BERT and BiLSTM[J]. Computer Science, 2021, 48(1): 247-252. DOI: 10.11896/jsjkx.191200088
Authors: ZHANG Yu-shuai  ZHAO Huan  LI Bo
Affiliation: (College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China)
Abstract: Semantic slot filling is an important task in dialogue systems, which aims to assign the correct label to each word of an input sentence; its performance has a marked impact on the downstream dialogue management module. At present, deep learning models for slot filling are usually initialized with either random word vectors or pre-trained word vectors. However, random word vectors carry no semantic or grammatical information, and pre-trained word vectors assign only a single meaning to each word, so neither can provide the model with context-dependent word vectors. We propose an end-to-end neural network model based on the pre-trained model BERT and the Long Short-Term Memory network (LSTM). First, BERT encodes the input sentence into context-dependent word embeddings. These embeddings are then fed into a Bidirectional Long Short-Term Memory network (BiLSTM), and finally the Softmax function and a conditional random field decode the predicted labels. The pre-trained BERT model and the BiLSTM network are trained jointly as a whole to improve the performance of the semantic slot filling task. The model achieves F1 scores of 78.74%, 87.60% and 71.54% on three datasets (MIT Restaurant Corpus, MIT Movie Corpus and MIT Movie Trivia Corpus), respectively. The experimental results show that our model significantly improves the F1 value of the semantic slot filling task.
Keywords: Slot filling  Pre-trained model  Long short-term memory network  Context-dependent  Word embedding
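
To make the described architecture concrete, below is a minimal PyTorch sketch of a BERT + BiLSTM slot tagger of the kind the abstract outlines, using the Hugging Face transformers library. The checkpoint name ("bert-base-uncased"), hidden size, and label count are illustrative assumptions, not values taken from the paper, and plain argmax decoding stands in for the Softmax/CRF decoding the paper describes.

    # Sketch of a jointly trained BERT + BiLSTM slot tagger (not the authors'
    # exact configuration; sizes and checkpoint are assumptions).
    import torch
    import torch.nn as nn
    from transformers import BertModel, BertTokenizerFast

    class BertBiLstmTagger(nn.Module):
        def __init__(self, num_labels: int, lstm_hidden: int = 256):
            super().__init__()
            # Pre-trained BERT encoder yields context-dependent embeddings.
            self.bert = BertModel.from_pretrained("bert-base-uncased")
            # BiLSTM consumes the BERT embeddings; both parts are trained
            # together end to end, as the abstract describes.
            self.bilstm = nn.LSTM(
                input_size=self.bert.config.hidden_size,
                hidden_size=lstm_hidden,
                batch_first=True,
                bidirectional=True,
            )
            # Per-token slot-label scores; in the paper a CRF layer refines
            # the decoding, which is omitted here for brevity.
            self.classifier = nn.Linear(2 * lstm_hidden, num_labels)

        def forward(self, input_ids, attention_mask):
            encoded = self.bert(input_ids=input_ids, attention_mask=attention_mask)
            lstm_out, _ = self.bilstm(encoded.last_hidden_state)
            return self.classifier(lstm_out)  # (batch, seq_len, num_labels)

    tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
    model = BertBiLstmTagger(num_labels=9)  # label count is a placeholder
    batch = tokenizer(["book a table for two"], return_tensors="pt")
    logits = model(batch["input_ids"], batch["attention_mask"])
    pred = logits.argmax(dim=-1)  # greedy per-token decoding of slot labels

Note that BERT's WordPiece tokenizer can split one word into several subword tokens, so word-level slot labels would need to be aligned to subword positions (e.g., labeling only each word's first subword) before training; the sketch omits that bookkeeping.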