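A Chinese Word Segmentation Method Based on a Dilated Convolutional Neural Network Model
Cite this article: WANG Xing, LI Chao, CHEN Ji. A Chinese Word Segmentation Method Based on a Dilated Convolutional Neural Network Model [J]. Journal of Chinese Information Processing, 2019, 33(9): 24-30.
Authors: WANG Xing  LI Chao  CHEN Ji
Affiliation: School of Electronic and Information Engineering, Liaoning Technical University, Huludao, Liaoning 125105
Funding: National Natural Science Foundation of China (61402212); Liaoning Province Program for Outstanding Young Scholars in Higher Education Institutions (LJQ2015045); China Postdoctoral Science Foundation (2016M591452); Natural Science Foundation of Liaoning Province (2015020098)
Abstract: Many current deep neural network models handle Chinese word segmentation with a bidirectional long short-term memory (BiLSTM) network structure, and they suffer from insufficiently rich input features, incomplete semantic understanding, and slow computation. To address these problems, this paper proposes a Chinese word segmentation method based on a dilated convolutional neural network model. Input features are enriched by adding Chinese character radical information and extracting features from it with a convolutional neural network. Training a dilated convolutional neural network model with an added residual structure allows better understanding of semantic information and faster computation. Experiments were designed on the four datasets of the Bakeoff 2005 corpus and compared against a BiLSTM-based Chinese word segmentation method; the results show that the proposed model achieves better segmentation performance and faster computation.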

Keywords: Chinese word segmentation  dilated convolution  deep learning  natural language processing

Dilated Convolution Neural Networks for Chinese Word Segmentation
WANG Xing, LI Chao, CHEN Ji. Dilated Convolution Neural Networks for Chinese Word Segmentation [J]. Journal of Chinese Information Processing, 2019, 33(9): 24-30.
Authors:WANG Xing  LI Chao  CHEN Ji
Affiliation:School of Electronic and Information Engineering, Liaoning Technical University, Huludao, Liaoning 125105, China
Abstract: Many current deep neural network models for Chinese word segmentation are built on bidirectional long short-term memory (BiLSTM) networks. These models suffer from insufficiently rich input features, incomplete semantic understanding, and slow computation. To address these problems, this paper proposes a Chinese word segmentation method based on a dilated convolutional neural network. Chinese radical information is added to the input, and a convolutional neural network extracts features from it to enrich the input representation. The dilated convolutional neural network, trained with a residual structure, can better capture semantic information and improve training efficiency. In experiments on the four datasets of Bakeoff 2005, the proposed method achieves better accuracy and efficiency than the BiLSTM-based method.
Keywords:Chinese word segmentation  dilated convolution  deep learning  NLP  
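
The abstract names two building blocks, dilated convolution and a residual structure, without giving their details. The following is a minimal pure-Python sketch of those two ideas on a single-channel sequence, not the authors' implementation. The function names, the kernel size of 3, the ReLU activation, and the dilation schedule (1, 2, 4) are all illustrative assumptions that do not come from the paper.

```python
from typing import List, Sequence


def dilated_conv1d(x: Sequence[float], kernel: Sequence[float], dilation: int) -> List[float]:
    """Zero-padded, same-length 1-D dilated convolution (cross-correlation form).

    Kernel tap j reads the input at offset (j - center) * dilation, so a
    larger dilation widens the context without adding parameters.
    """
    center = len(kernel) // 2
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t + (j - center) * dilation
            if 0 <= idx < len(x):  # positions outside the sequence count as zero
                acc += w * x[idx]
        out.append(acc)
    return out


def residual_block(x: Sequence[float], kernel: Sequence[float], dilation: int) -> List[float]:
    """Dilated convolution followed by ReLU, added back onto the input.

    ReLU is an assumption made for this sketch; the abstract does not say
    which activation is used.
    """
    conv = dilated_conv1d(x, kernel, dilation)
    activated = [max(0.0, v) for v in conv]
    return [a + b for a, b in zip(x, activated)]


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Number of input positions seen by one output after stacking the layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


if __name__ == "__main__":
    # Hypothetical dilation schedule; the paper's actual schedule is not in the abstract.
    dilations = [1, 2, 4]
    print("receptive field:", receptive_field(3, dilations))

    # Push a single impulse through the stack to see how far it spreads.
    signal = [0.0] * 17
    signal[8] = 1.0
    for d in dilations:
        signal = residual_block(signal, [1.0, 1.0, 1.0], d)
    print("nonzero positions:", [i for i, v in enumerate(signal) if v != 0.0])
```

Under these assumptions, three layers of kernel size 3 cover 15 input positions, while three undilated layers of the same size would cover only 7. Every position in a layer is computed independently of the others, unlike the step-by-step recurrence of a BiLSTM, which is one plausible reading of the speed advantage the abstract reports.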