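Chinese Multiword Expression Extraction Based on a Discriminative Latent Variable Model
Cite this article: Sun Xiao. Chinese Multiword Expression Extraction Based on a Discriminative Latent Variable Model[J]. 中国通信学报, 2012, 9(3): 124-133.
Author: Sun Xiao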
Received: 2012-04-20

Discriminative Latent Model Based Chinese Multiword Expression Extraction
Sun Xiao. Discriminative Latent Model Based Chinese Multiword Expression Extraction[J]. China communications magazine, 2012, 9(3): 124-133.
Authors: Sun Xiao
Affiliation: Anhui Province Key Laboratory of Affective Computing and Advanced Intelligent Machine, Hefei University of Technology, Hefei 230009, P. R. China; School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, P. R. China
Abstract: A Discriminative Latent Model (DLM) is proposed for extracting Multiword Expressions (MWEs) from Chinese text, with the aim of improving Machine Translation (MT) systems such as Template Based MT (TBMT). For MT systems to become more practically useful, they need MWE processing capability. Toward this goal, we propose DLM, a model developed for sequence labeling tasks that involve hidden structures, to extract MWEs for MT systems. DLM combines the advantages of existing discriminative models and can learn hidden structures in sequence labeling tasks. In our evaluations, DLM achieves precision of up to 90.73% for some types of MWEs, which is higher than that of state-of-the-art discriminative models. These results demonstrate that it is feasible to automatically identify many Chinese MWEs using our DLM tool. With the MWE processing model, the BLEU score of the MT system also increases by up to 0.3 in the closed test.
Keywords: information processing; natural language processing; MT; DLM; multiword expressions
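
The abstract frames MWE extraction as a sequence labeling task and reports precision as its main metric. The sketch below illustrates only that framing, not the DLM itself: it assumes a simple B/I/O tag scheme (an assumption, since the abstract does not state the tag set), and the function names `extract_mwes` and `precision` as well as the example sentence are hypothetical.

```python
def extract_mwes(tokens, tags):
    """Collect token runs labeled B (begin) followed by zero or more I (inside) tags.

    The B/I/O scheme is an assumed tag set, not one taken from the paper.
    """
    spans, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            # A new expression starts; close any expression still open.
            if current:
                spans.append("".join(current))
            current = [token]
        elif tag == "I" and current:
            current.append(token)
        else:
            # An O tag (or a stray I with no open span) ends the current expression.
            if current:
                spans.append("".join(current))
            current = []
    if current:
        spans.append("".join(current))
    return spans


def precision(predicted, gold):
    """Fraction of predicted expressions that also appear in the gold list."""
    if not predicted:
        return 0.0
    gold_set = set(gold)
    return sum(1 for p in predicted if p in gold_set) / len(predicted)


# Hypothetical segmented sentence with tags a labeler might output.
tokens = ["他", "打", "酱油", "去", "了"]
tags = ["O", "B", "I", "O", "O"]
mwes = extract_mwes(tokens, tags)
```

Under this framing, any sequence labeler that outputs one tag per token can be evaluated the same way, so the extraction and scoring steps stay independent of the model that produces the tags.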