
A Rhetorical Question Identification Method Based on Automatic Language Feature Acquisition
Citation: LI Yang, WU Zhuojia, WANG Suge, LIANG Jiye. A Rhetorical Question Identification Method Based on Automatic Language Feature Acquisition[J]. Journal of Chinese Information Processing (中文信息学报), 2020, 34(2): 96-104.
Authors: LI Yang (李旸), WU Zhuojia (吴卓嘉), WANG Suge (王素格), LIANG Jiye (梁吉业)
Affiliations: 1. School of Computer & Information Technology, Shanxi University, Taiyuan, Shanxi 030006, China;
2. Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan, Shanxi 030006, China
Funding: National Natural Science Foundation of China (61632011, 61573231, 61432011, 61672331); Key Research and Development Program of Shanxi Province (201803D421024)
Abstract: A rhetorical question is a rhetorical device that expresses strong emotion in the form of a question, and identifying it effectively can support sentiment analysis tasks in natural language processing. This paper proposes a rhetorical question identification method based on the automatic acquisition of language features. First, a data-driven feature extraction model is built with a label attention mechanism to obtain task-relevant language features: words, syntactic structures, symbolic markers, and topics. Second, a Bi-LSTM model encodes the sentence and the language features separately, and the interactive attention between the two yields an attention weight vector over the words and symbols of the sentence. This weight vector is applied to the sentence representation to build a rhetorical question identification model that reinforces the language features. Experimental results on a Chinese microblog dataset show that the proposed method significantly improves rhetorical question identification performance compared with previous work.

Keywords: rhetorical questions; feature extraction; attention mechanism; identification model
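
The abstract does not give the equations behind the interactive attention step, so the following is only a minimal sketch of the general idea, not the authors' model. It assumes dot-product scoring against the mean feature vector followed by a softmax, and it uses small hand-written vectors as stand-ins for Bi-LSTM hidden states. The names `interactive_attention`, `sentence_states`, and `feature_states` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def interactive_attention(sentence_states, feature_states):
    """Weight each token of the sentence by its affinity to the language features.

    ASSUMPTION: scoring is a dot product between each token state and the
    mean feature state; the paper may use a different interaction function.
    Returns (attention weights over tokens, attention-weighted sentence vector).
    """
    dim = len(feature_states[0])
    # Summarize all language-feature states as their element-wise mean.
    feature_summary = [
        sum(f[k] for f in feature_states) / len(feature_states)
        for k in range(dim)
    ]
    # One score per token, normalized into an attention weight vector.
    scores = [dot(h, feature_summary) for h in sentence_states]
    weights = softmax(scores)
    # Apply the weights to the token states to get the sentence representation.
    weighted = [
        sum(w * h[k] for w, h in zip(weights, sentence_states))
        for k in range(dim)
    ]
    return weights, weighted


# Toy stand-ins for Bi-LSTM outputs (three tokens, two feature items, dimension 2).
sentence_states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
feature_states = [[0.0, 2.0], [0.0, 1.0]]
weights, weighted = interactive_attention(sentence_states, feature_states)
print(weights, weighted)
```

In this toy input the second token points in the same direction as the feature vectors, so it receives the largest weight. In a full model the weighted sentence vector would feed a classifier that decides whether the sentence is a rhetorical question.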
This article is indexed in 维普 (VIP) and other databases.