Semantic Slot Filling Based on BERT and BiLSTM
ZHANG Yu-shuai, ZHAO Huan, LI Bo. Semantic Slot Filling Based on BERT and BiLSTM[J]. Computer Science, 2021, 48(1): 247-252. DOI: 10.11896/jsjkx.191200088
Authors: ZHANG Yu-shuai  ZHAO Huan  LI Bo
Affiliation: (College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China)
Abstract: Semantic slot filling is an important task in dialogue systems, which aims to assign the correct label to each word of an input sentence; its performance has a marked impact on the downstream dialogue management module. At present, deep learning models for slot filling are usually initialized with either random word vectors or pre-trained word vectors. However, random word vectors carry no semantic or grammatical information, and pre-trained word vectors assign only a single meaning to each word, so neither can provide the model with context-dependent word vectors. We propose an end-to-end neural network model based on the pre-trained model BERT and the Long Short-Term Memory network (LSTM). First, BERT encodes the input sentence into context-dependent word embeddings. These embeddings are then fed into a Bidirectional Long Short-Term Memory network (BiLSTM), and finally the Softmax function and a conditional random field decode the predicted labels. The pre-trained BERT model and the BiLSTM network are trained jointly as a whole to improve the performance of the semantic slot filling task. The model achieves F1 scores of 78.74%, 87.60% and 71.54% on three datasets (MIT Restaurant Corpus, MIT Movie Corpus and MIT Movie Trivia Corpus), respectively. The experimental results show that our model significantly improves the F1 value of the semantic slot filling task.
Keywords: Slot filling  Pre-trained model  Long short-term memory network  Context-dependent  Word embedding
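
To make the described architecture concrete, below is a minimal PyTorch sketch of a BERT + BiLSTM slot tagger of the kind the abstract outlines, using the Hugging Face transformers library. The checkpoint name ("bert-base-uncased"), hidden size, and label count are illustrative assumptions, not values taken from the paper, and plain argmax decoding stands in for the Softmax/CRF decoding the paper describes.

    # Sketch of a jointly trained BERT + BiLSTM slot tagger (not the authors'
    # exact configuration; sizes and checkpoint are assumptions).
    import torch
    import torch.nn as nn
    from transformers import BertModel, BertTokenizerFast

    class BertBiLstmTagger(nn.Module):
        def __init__(self, num_labels: int, lstm_hidden: int = 256):
            super().__init__()
            # Pre-trained BERT encoder yields context-dependent embeddings.
            self.bert = BertModel.from_pretrained("bert-base-uncased")
            # BiLSTM consumes the BERT embeddings; both parts are trained
            # together end to end, as the abstract describes.
            self.bilstm = nn.LSTM(
                input_size=self.bert.config.hidden_size,
                hidden_size=lstm_hidden,
                batch_first=True,
                bidirectional=True,
            )
            # Per-token slot-label scores; in the paper a CRF layer refines
            # the decoding, which is omitted here for brevity.
            self.classifier = nn.Linear(2 * lstm_hidden, num_labels)

        def forward(self, input_ids, attention_mask):
            encoded = self.bert(input_ids=input_ids, attention_mask=attention_mask)
            lstm_out, _ = self.bilstm(encoded.last_hidden_state)
            return self.classifier(lstm_out)  # (batch, seq_len, num_labels)

    tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
    model = BertBiLstmTagger(num_labels=9)  # label count is a placeholder
    batch = tokenizer(["book a table for two"], return_tensors="pt")
    logits = model(batch["input_ids"], batch["attention_mask"])
    pred = logits.argmax(dim=-1)  # greedy per-token decoding of slot labels

Note that BERT's WordPiece tokenizer can split one word into several subword tokens, so word-level slot labels would need to be aligned to subword positions (e.g., labeling only each word's first subword) before training; the sketch omits that bookkeeping.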