
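A Survey of Natural Language Processing Research on Indonesian and Malay
Cite this article: JIANG Shengyi, LI Shanshan, FU Sihui, LIN Nankai. A Survey of Natural Language Processing Research on Indonesian and Malay[J]. Pattern Recognition and Artificial Intelligence, 2020, 33(6): 530-541. DOI: 10.16451/j.cnki.issn1003-6059.202006006
Authors: JIANG Shengyi  LI Shanshan  FU Sihui  LIN Nankai
Affiliation: 1. School of Information Science and Technology, Guangdong University of Foreign Studies, Guangzhou 510006
2. Guangzhou Key Laboratory of Multilingual Intelligent Processing, Guangdong University of Foreign Studies
Funding: National Natural Science Foundation of China; Guangzhou Science and Technology Program
Abstract: As internet penetration among Indonesian and Malay speakers rises, there is substantial demand for information processing of massive volumes of Indonesian and Malay text. Although researchers have studied the two languages fairly widely, as low-resource languages they receive far less attention than widely used languages, and cutting-edge deep learning methods have not been well exploited. This paper reviews and summarizes natural language processing techniques for Indonesian and Malay, including lexical analysis, syntactic parsing, machine translation, and spell checking. A comparative analysis of the related work shows that, in most studies, differences between algorithms are hard to compare objectively because corpus sizes and evaluation standards differ. Finally, in light of how openly available the existing language resources for Indonesian and Malay are in each area, the paper identifies the problems facing natural language processing research on the two languages and looks ahead to future trends.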

Keywords: Indonesian  Malay  Agglutinative Language  Low-Resource Language  Natural Language Processing
Received: 2020-03-26

An Overview of Natural Language Processing for Indonesian and Malay
JIANG Shengyi,LI Shanshan,FU Sihui,LIN Nankai. An Overview of Natural Language Processing for Indonesian and Malay[J]. Pattern Recognition and Artificial Intelligence, 2020, 33(6): 530-541. DOI: 10.16451/j.cnki.issn1003-6059.202006006
Authors: JIANG Shengyi  LI Shanshan  FU Sihui  LIN Nankai
Affiliation: 1. School of Information Science and Technology, Guangdong University of Foreign Studies, Guangzhou 510006
2. Guangzhou Key Laboratory of Multilingual Intelligent Processing, Guangdong University of Foreign Studies, Guangzhou 510006
Abstract: As internet penetration among Indonesian and Malay speakers rises, there is significant demand for processing massive volumes of text in these two languages. Although Indonesian and Malay have been studied fairly extensively, as low-resource languages they draw far less attention than widely used languages, and state-of-the-art deep learning methods have not been fully exploited. In this paper, research on Indonesian and Malay morphological analysis, syntactic parsing, machine translation, spell checking, and related tasks is analyzed and summarized. A comparison of the findings shows that, in most studies, the algorithms cannot be compared objectively because corpus sizes and evaluation standards differ. Finally, problems and future directions of natural language processing for Indonesian and Malay are discussed in light of the openly available language resources in each area.
Keywords: Indonesian  Malay  Agglutinative Language  Low-Resource Language  Natural Language Processing
This article is indexed in Wanfang Data and other databases.