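Part of Speech Tagging of Zhuang Language Based on Reinforcement Learning
Cite this article: TANG Suqin, SUN Yaru, LI Zhixin, ZHANG Canlong. Part of Speech Tagging of Zhuang Language Based on Reinforcement Learning[J]. Computer Engineering, 2020, 46(4): 309-315.
Authors: TANG Suqin  SUN Yaru  LI Zhixin  ZHANG Canlong
Affiliation: Guangxi Key Lab of Multi-source Information Mining and Security, Guangxi Normal University, Guilin, Guangxi 541004; Department of Educational Technology, Faculty of Education, Guangxi Normal University, Guilin, Guangxi 541004; Guangxi Key Lab of Multi-source Information Mining and Security, Guangxi Normal University, Guilin, Guangxi 541004
Funding: Guangxi Natural Science Foundation; National Natural Science Foundation of China; Guangxi Science and Technology Base and Talent Special Project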
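Abstract: Research on intelligent information processing for the Zhuang language is still in its infancy, and no automatic part-of-speech (POS) tagging method is available. To address the scarcity of annotated Zhuang corpora, the time and labor cost of manual annotation, and the poor performance of machine annotation, this paper proposes a POS tagging method for Zhuang based on reinforcement learning. A tagging dictionary is built from the grammatical characteristics of Zhuang and the tag set of the Penn Chinese Treebank, semantic features are fused through dependency parsing, and a long short-term memory (LSTM) network serves as the policy network, using its recurrent memory to fill in partially observed information. On this basis, a reinforcement learning framework is introduced in which the target part of speech acts as the environment's feedback, and feature learning gradually brings the prediction closer to the true target value. Experimental results show that the method reduces the dependence of the POS tagging model on the training corpus, can rapidly expand the Zhuang tagging dictionary, and achieves automatic POS tagging of Zhuang.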

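Keywords: intelligent information processing  part-of-speech tagging  reinforcement learning  long short-term memory network  policy network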

Part of Speech Tagging of Zhuang Language Based on Reinforcement Learning
TANG Suqin, SUN Yaru, LI Zhixin, ZHANG Canlong. Part of Speech Tagging of Zhuang Language Based on Reinforcement Learning[J]. Computer Engineering, 2020, 46(4): 309-315.
Authors: TANG Suqin  SUN Yaru  LI Zhixin  ZHANG Canlong
Affiliation: (Guangxi Key Lab of Multi-source Information Mining and Security, Faculty of Education, Guangxi Normal University, Guilin, Guangxi 541004, China; Department of Educational Technology, Faculty of Education, Guangxi Normal University, Guilin, Guangxi 541004, China)
Abstract: Intelligent information processing of the Zhuang language is currently in its infancy and lacks an automatic part-of-speech tagging method. To address the scarcity of annotated Zhuang corpora, the cost of manual tagging, and the poor performance of machine tagging, this paper proposes a part-of-speech tagging method for the Zhuang language based on reinforcement learning. The method builds a tag dictionary from the grammatical features of Zhuang and the tag set of the Penn Chinese Treebank (CTB), and uses dependency parsing to fuse semantic features. A Long Short-Term Memory (LSTM) network serves as the policy network, using its recurrent memory to fill in partially observed information. On this basis, a reinforcement learning framework is introduced in which the target part of speech is used as the environment's feedback, and the true value of the target is gradually approached through feature learning. Experimental results show that the method performs well on part-of-speech tagging of Zhuang. It reduces the dependence of the tagging model on the training corpus, quickly enlarges the Zhuang tag dictionary, and tags Zhuang parts of speech automatically.
Keywords: intelligent information processing  part-of-speech tagging  reinforcement learning  Long Short-Term Memory (LSTM) network  policy network
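
The feedback loop described in the abstract, where the policy proposes a tag and the target part of speech supplies the reward, can be illustrated with a small sketch. This is not the authors' implementation: the paper uses an LSTM policy network with dependency-based features, whereas the sketch below substitutes a per-word softmax table trained with a plain REINFORCE update. The three-tag set, the reward values (+1 for a match, -1 otherwise), the function names `train_policy` and `tag`, and the toy word list are all assumptions made for illustration.

```python
import math
import random

# Illustrative CTB-style tag subset (an assumption, not the paper's dictionary).
TAGS = ["PN", "VV", "NN"]


def softmax(scores):
    """Turn a list of logits into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train_policy(data, epochs=300, lr=0.5, seed=0):
    """Train a per-word softmax policy with REINFORCE.

    data: list of (word, gold_tag) pairs. The gold tag plays the role of
    the environment: it only tells the policy whether its sampled tag matched.
    """
    rng = random.Random(seed)
    prefs = {}  # word -> list of tag logits
    for _ in range(epochs):
        for word, gold in data:
            logits = prefs.setdefault(word, [0.0] * len(TAGS))
            probs = softmax(logits)
            # The policy acts by sampling a tag for the current word.
            action = rng.choices(range(len(TAGS)), weights=probs)[0]
            # Environment feedback: +1 if the sampled tag is the target, else -1.
            reward = 1.0 if TAGS[action] == gold else -1.0
            # REINFORCE: d log pi(action) / d logit_k = 1[k == action] - probs[k].
            for k in range(len(TAGS)):
                grad = (1.0 if k == action else 0.0) - probs[k]
                logits[k] += lr * reward * grad
    return prefs


def tag(prefs, word):
    """Return the highest-scoring tag for a word seen during training."""
    logits = prefs[word]
    return TAGS[max(range(len(TAGS)), key=logits.__getitem__)]


if __name__ == "__main__":
    # Toy examples chosen for illustration only; not drawn from the paper's corpus.
    toy = [("gou", "PN"), ("gwn", "VV"), ("haeux", "NN")]
    policy = train_policy(toy)
    for w, _ in toy:
        print(w, tag(policy, w))
```

A table indexed by word cannot generalize to unseen words or use sentence context, which is the role the LSTM policy network and the fused dependency features play in the method the abstract describes.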
This article is indexed in VIP, Wanfang Data, and other databases.