Automatic Recognition of Indonesian Compound Noun Phrases with a Combination of Self-Attention Mechanism and n-gram Convolution Kernel

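Cite this article:	QIU Xinying, CHEN Hanwu, CHEN Yuan, TAN Licong, ZHANG Hao, XIAO Lixian. Automatic Recognition of Indonesian Compound Noun Phrases with a Combination of Self-Attention Mechanism and n-gram Convolution Kernel[J]. Journal of Hunan University of Technology, 2020, 34(3): 1-9.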

Authors:	QIU Xinying, CHEN Hanwu, CHEN Yuan, TAN Licong, ZHANG Hao, XIAO Lixian

Affiliations:	QIU Xinying, CHEN Hanwu, CHEN Yuan, TAN Licong, ZHANG Hao: Guangzhou Key Laboratory of Multilingual Intelligent Processing and School of Information Science and Technology, Guangdong University of Foreign Studies; XIAO Lixian: School of Oriental Languages and Cultures, Guangdong University of Foreign Studies

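Funding:	Characteristic Innovation Fund Project of the Guangdong Provincial Department of Education (2015KTSCX033); National Social Science Fund of China Project (17BGL068)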

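Abstract:	For the automatic recognition of Indonesian compound noun phrases, this paper proposes a method that combines a neural network using the Self-Attention mechanism and n-gram convolution kernels with a statistical model, improving on existing multi-word expression extraction models. The existing SHOMA model is extended with a multi-layer CNN and the Self-Attention mechanism. Comparative experiments on compound noun phrase recognition were run on the Indonesian data released by Universal Dependencies. The results show that the TextCNN+Self-Attention+CRF model achieves an F1 of 32.20 for multi-word phrase recognition and an F1 of 32.34 for single-word phrase recognition, improvements of 4.93% and 3.04%, respectively, over the SHOMA model.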
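Keywords:	Indonesian compound noun phrases; Self-Attention mechanism; convolutional neural network; automatic recognition; conditional random field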
Received:	2020/3/29
Automatic Recognition of Indonesian Compound Noun Phrases with a Combination of Self-Attention Mechanism and n-gram Convolution Kernel

QIU Xinying, CHEN Hanwu, CHEN Yuan, TAN Licong, ZHANG Hao and XIAO Lixian. Automatic Recognition of Indonesian Compound Noun Phrases with a Combination of Self-Attention Mechanism and n-gram Convolution Kernel[J]. Journal of Hunan University of Technology, 2020, 34(3): 1-9.

Authors:	QIU Xinying, CHEN Hanwu, CHEN Yuan, TAN Licong, ZHANG Hao and XIAO Lixian

Abstract:	To recognize Indonesian compound noun phrases automatically, this paper proposes a method that combines a neural network using the Self-Attention mechanism and n-gram convolution kernels with a statistical model, improving on existing multi-word expression extraction models. Building on the existing SHOMA model, the method adds a multi-layer CNN and a Self-Attention mechanism. Comparative experiments on compound noun phrase recognition were run on the Indonesian data released by Universal Dependencies. The results show that the TextCNN+Self-Attention+CRF model achieves an F1 of 32.20 for multi-word phrase recognition and an F1 of 32.34 for single-word recognition, which are 4.93% and 3.04% higher, respectively, than those of the SHOMA model.
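The abstract names two building blocks, n-gram convolution kernels and Self-Attention, without giving details. The following is a minimal pure-Python sketch of how such blocks can be chained. It is not the authors' implementation: the function names (`ngram_conv`, `self_attention`, `encode`), the zero padding, the ReLU activation, the identity query/key/value projections, and the toy inputs are all assumptions made for illustration, and the CRF decoding layer is omitted.

```python
import math

def ngram_conv(embeddings, kernel, bias=0.0):
    """Slide one n-gram kernel over the token embeddings.

    embeddings: list of token vectors; kernel: list of n weight vectors,
    one per position in the n-gram window. Zero padding keeps one output
    value per token (an assumption, not taken from the paper).
    """
    n = len(kernel)                      # window width, the "n" of the n-gram
    dim = len(embeddings[0])
    left = (n - 1) // 2                  # padding before the first token
    zero = [0.0] * dim
    padded = [zero] * left + list(embeddings) + [zero] * (n - 1 - left)
    out = []
    for i in range(len(embeddings)):
        s = bias
        for k in range(n):
            s += sum(w * x for w, x in zip(kernel[k], padded[i + k]))
        out.append(max(0.0, s))          # ReLU activation (assumed)
    return out

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(features):
    """Scaled dot-product self-attention with identity Q/K/V projections."""
    d = len(features[0])
    out = []
    for q in features:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in features]
        weights = softmax(scores)        # attention of this token over all tokens
        out.append([sum(w * v[j] for w, v in zip(weights, features))
                    for j in range(d)])
    return out

def encode(embeddings, kernels):
    """Apply each n-gram kernel, stack the results per token, then attend."""
    channels = [ngram_conv(embeddings, k) for k in kernels]
    features = [list(col) for col in zip(*channels)]  # one row per token
    return self_attention(features)

# Toy input: 4 tokens with 2-dimensional embeddings (made-up values).
toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
toy_kernels = [
    [[1.0, 0.0], [0.0, 1.0]],              # a bigram kernel
    [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]],  # a trigram kernel
]
token_features = encode(toy_embeddings, toy_kernels)
```

In a full tagger, per-token features like `token_features` would typically feed a sequence-labeling layer such as a CRF; the kernel widths and number of layers used in the paper are not stated in this record.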

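Keywords:	Indonesian compound noun phrases; Self-Attention mechanism; convolutional neural network; automatic recognition; conditional random field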
