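Named Entity Recognition in Chinese Micro-blog Based on Deep Learning
Cite this article: LIU Yujiao. Named Entity Recognition in Chinese Micro-blog Based on Deep Learning[J]. Journal of Sichuan University (Engineering Science Edition), 2016, 48(Z2): 142-146.
Author: LIU Yujiao
Funding: National Natural Science Foundation projects (61332066, 81373239)
Abstract: Micro-blog text is characterized by non-standard language, heavy noise, rapid turnover, frequent abbreviations, and large data volume. To address these characteristics, this paper proposes a deep learning method for recognizing named entities in micro-blogs. A large amount of unlabeled micro-blog text is first used to train an autoencoder and obtain abstract features; these features are then fed into a deep learning network, which finally outputs a class label for each character in the sentence. During autoencoder training, a convolution method is proposed in place of the moving-window method in order to capture long-range dependency information within the sentence. Experimental results on Sina Weibo data show that the proposed deep learning method improves the F1 score of named entity recognition in micro-blogs, which demonstrates the effectiveness of the algorithm.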
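The abstract reports results as an F1 score over per-character labels but does not say how the score is computed. The following is a minimal sketch, assuming BIO-style character tags and entity-level exact matching; both are assumptions, not details taken from the paper, and the names `extract_spans` and `f1_score` are hypothetical.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, entity type); illustrative representation
Span = Tuple[int, int, str]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a per-character BIO tag sequence."""
    spans: Set[Span] = set()
    start, etype = None, None
    # The trailing "O" is a sentinel that closes an entity ending the sentence.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def f1_score(gold: List[str], pred: List[str]) -> float:
    """Entity-level F1: a predicted span counts only if it matches exactly."""
    gold_spans, pred_spans = extract_spans(gold), extract_spans(pred)
    correct = len(gold_spans & pred_spans)
    if correct == 0:
        return 0.0
    precision = correct / len(pred_spans)
    recall = correct / len(gold_spans)
    return 2 * precision * recall / (precision + recall)
```

Under this scheme a prediction that finds the right entity but cuts it one character short scores nothing for that entity, which is why boundary errors on noisy micro-blog text weigh heavily on F1.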

Keywords: Micro-blog  Deep learning  Autoencoder  Convolution  Named entity recognition
Received: 2015-09-09
Revised: 2016-04-17

Named Entity Recognition in Chinese Micro-blog Based on Deep Learning
LIU Yujiao. Named Entity Recognition in Chinese Micro-blog Based on Deep Learning[J]. Journal of Sichuan University (Engineering Science Edition), 2016, 48(Z2): 142-146.
Authors: LIU Yujiao
Abstract: Micro-blog text is characterized by non-standard language, heavy noise, rapid updates, frequent abbreviations, and a large volume of data. To recognize named entities in such text, this paper proposes a method based on deep learning. First, an autoencoder network is trained on a large amount of unlabeled micro-blog data to obtain abstract features. These features are then fed into the deep learning network, which finally tags each character in the sentence. When training the autoencoder network, a periodic convolution method is used instead of a moving window in order to capture long-range dependency information in the sentence. Experimental results on Sina Weibo data show that the deep learning method improves named entity recognition in micro-blogs, which illustrates the effectiveness of the proposed method.
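The contrast between a moving window and a convolution can be illustrated with a small sketch. The abstract gives no architectural details, so everything here is an assumption: the function names are hypothetical, "periodic" is interpreted as wrap-around indexing at the sentence boundaries, and max-over-time pooling is a commonly used choice rather than something the abstract states.

```python
from typing import List

Vector = List[float]


def window_features(emb: List[Vector], i: int, half: int) -> Vector:
    """Moving window: concatenate the embeddings of characters i-half..i+half.

    Positions outside the sentence are zero-padded, so the feature only ever
    sees 2*half + 1 characters.
    """
    zero = [0.0] * len(emb[0])
    feats: Vector = []
    for j in range(i - half, i + half + 1):
        feats.extend(emb[j] if 0 <= j < len(emb) else zero)
    return feats


def conv_max_pool(emb: List[Vector], filters: List[List[Vector]],
                  periodic: bool = True) -> Vector:
    """Slide each filter over the whole sentence and keep its maximum response.

    Each filter is a list of `width` weight vectors. With periodic=True the
    index wraps around the sentence ends (an assumed reading of "periodic").
    """
    n = len(emb)
    pooled: Vector = []
    for filt in filters:
        width = len(filt)
        half = width // 2
        best = float("-inf")
        for t in range(n):
            act = 0.0
            for k in range(width):
                j = t + k - half
                if periodic:
                    j %= n            # wrap around the sentence boundary
                elif not 0 <= j < n:
                    continue          # zero padding contributes nothing
                act += sum(w * x for w, x in zip(filt[k], emb[j]))
            best = max(best, act)
        pooled.append(best)
    return pooled
```

The window feature grows with the window size and still ignores everything outside it, whereas the pooled convolution output has a fixed length (one value per filter) and can respond to a pattern anywhere in the sentence.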
Keywords: Micro-blog  Deep learning  Autoencoder  Convolution  Named entity recognition