Recurrent neural network word segmentation based on the minimalist gated unit
Cite this article: LIU Zhi-ming, SUN Yan-wei, OUYANG Chun-ping, WAN Ya-ping. Recurrent neural network word segmentation based on the minimalist gated unit[J]. Computer Engineering and Design, 2019, 40(5): 1328-1333
Authors: LIU Zhi-ming  SUN Yan-wei  OUYANG Chun-ping  WAN Ya-ping
Affiliation: Computer School, University of South China, Hengyang 421001, Hunan, China (all authors)
Funding: National Natural Science Foundation of China; Hunan Provincial Philosophy and Social Science Foundation
Abstract: To address the complex structure, long training time, and slow tagging-inference speed of recurrent neural networks built on long short-term memory (LSTM) units, this work analyzes the theoretical basis of recurrent neural networks and their unit structures in light of the existing literature, and proposes a recurrent neural network based on the minimalist gated unit (MGU) for Chinese word segmentation. MGU units replace LSTM units to extract features automatically and capture long-range dependencies. Experiments on the Bakeoff 2005 datasets, the corpora commonly used for Chinese word segmentation evaluation, show that the MGU network matches the LSTM network in accuracy while cutting training time in half and tripling tagging-inference speed.

Keywords: natural language processing  Chinese word segmentation  recurrent neural network  long short-term memory  minimalist gated unit

MGU RNN for Chinese word segmentation
LIU Zhi-ming, SUN Yan-wei, OUYANG Chun-ping, WAN Ya-ping. MGU RNN for Chinese word segmentation[J]. Computer Engineering and Design, 2019, 40(5): 1328-1333
Authors: LIU Zhi-ming  SUN Yan-wei  OUYANG Chun-ping  WAN Ya-ping
Affiliation: Computer School, University of South China, Hengyang 421001, China
Abstract: To address the long training time and slow tagging speed of long short-term memory (LSTM) recurrent neural networks (RNNs) caused by their complex unit structure, the basic theory of RNNs and their unit structures was reviewed against the current literature, and an RNN based on the minimalist gated unit (MGU) was proposed for Chinese word segmentation. LSTM units were replaced by MGUs to extract features automatically and capture long-term dependencies. Experiments on the commonly used Bakeoff 2005 datasets show that the MGU network achieves accuracy comparable to the LSTM network while halving training time and tripling tagging speed.
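The efficiency gain described in the abstract comes from the MGU collapsing the LSTM's three gates and separate cell state into a single forget gate that both filters the previous state and interpolates the update. The sketch below shows one MGU time step in plain Python/NumPy following the standard MGU formulation; the paper gives no code, so the weight names, layer sizes, and the closing note about B/M/E/S tag prediction are illustrative assumptions, not the authors' implementation.

# A minimal sketch of a single MGU (minimalist gated unit) step.
# Assumed names and sizes; illustrative only, not the paper's code.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class MGUCell:
    def __init__(self, input_size, hidden_size, rng=None):
        rng = rng or np.random.default_rng(0)
        scale = 0.1
        # One forget gate only -- this is what makes MGU cheaper than an LSTM,
        # which needs input/forget/output gates plus a separate cell state.
        self.W_f = rng.normal(0, scale, (hidden_size, input_size + hidden_size))
        self.b_f = np.zeros(hidden_size)
        # Candidate hidden state.
        self.W_h = rng.normal(0, scale, (hidden_size, input_size + hidden_size))
        self.b_h = np.zeros(hidden_size)

    def step(self, x_t, h_prev):
        # f_t = sigmoid(W_f [h_{t-1}; x_t] + b_f)
        f_t = sigmoid(self.W_f @ np.concatenate([h_prev, x_t]) + self.b_f)
        # Candidate uses the gated previous state: tanh(W_h [f_t * h_{t-1}; x_t] + b_h)
        h_tilde = np.tanh(self.W_h @ np.concatenate([f_t * h_prev, x_t]) + self.b_h)
        # Interpolate between the old state and the candidate with the same gate.
        return (1.0 - f_t) * h_prev + f_t * h_tilde

# Usage sketch: run the cell over a sequence of (hypothetical) character embeddings.
# For segmentation, each h_t would feed a classifier over {B, M, E, S} position tags.
if __name__ == "__main__":
    cell = MGUCell(input_size=64, hidden_size=128)
    h = np.zeros(128)
    for x in np.random.default_rng(1).normal(size=(10, 64)):  # 10 characters
        h = cell.step(x, h)
    print(h.shape)  # (128,)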
Keywords: natural language processing  Chinese word segmentation  recurrent neural networks (RNN)  long short-term memory (LSTM)  minimalist gated unit (MGU)
This article is indexed by the VIP and Wanfang Data databases.