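Chinese-Thai Bilingual Name Alignment with Merging Name Knowledge Distribution Characteristics
Cite this article: ZHANG Jinpeng, SU Jiao, YANG Bei, ZHANG Zhan. Chinese-Thai Bilingual Name Alignment with Merging Name Knowledge Distribution Characteristics[J]. Computer Engineering and Applications, 2019, 55(23): 163-169. DOI: 10.3778/j.issn.1002-8331.1809-0240
Authors: ZHANG Jinpeng  SU Jiao  YANG Bei  ZHANG Zhan
Affiliations: 1. Center of Information Management, Yunnan University of Finance and Economics, Kunming 650221; 2. School of International Languages and Cultures, Yunnan University of Finance and Economics, Kunming 650221; 3. School of Information Engineering, Wuchang University of Technology, Wuhan 430223; 4. School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510006
Abstract: Research on bilingual person name alignment directly affects the quality of cross-language information processing. Because Thai and Chinese pronunciation differ greatly and Chinese-Thai parallel corpus resources are limited, statistical transliteration-based name alignment models struggle to solve the Chinese-Thai name alignment problem. This paper proposes a Chinese-Thai bilingual name alignment method that builds on transliteration features and additionally incorporates the similarity of name knowledge distribution features. The method computes a transliteration similarity feature for bilingual names, computes a knowledge distribution similarity feature between Chinese and Thai names using the chi-square test and related techniques, uses a support vector machine to learn both features of Chinese-Thai name translation pairs and produce a name translation pair classifier, and then tunes the classifier's output to generate the alignment results. Experimental results show that the method achieves good results even when Chinese and Thai name pronunciations differ greatly and bilingual corpus resources are lacking.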

Keywords: Chinese  Thai  bilingual name alignment  name knowledge distribution  classification result tuning

Chinese-Thai Bilingual Name Alignment with Merging Name Knowledge Distribution Characteristics
ZHANG Jinpeng, SU Jiao, YANG Bei, ZHANG Zhan. Chinese-Thai Bilingual Name Alignment with Merging Name Knowledge Distribution Characteristics[J]. Computer Engineering and Applications, 2019, 55(23): 163-169. DOI: 10.3778/j.issn.1002-8331.1809-0240
Authors:ZHANG Jinpeng  SU Jiao  YANG Bei  ZHANG Zhan
Affiliation: 1. Center of Information Management, Yunnan University of Finance and Economics, Kunming 650221, China; 2. School of International Languages and Cultures, Yunnan University of Finance and Economics, Kunming 650221, China; 3. School of Information Engineering, Wuchang University of Technology, Wuhan 430223, China; 4. School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510006, China
Abstract: Bilingual name alignment directly affects the quality of cross-language information processing. Chinese and Thai pronunciation differ greatly, Chinese-Thai bilingual corpus resources are limited, and existing statistical transliteration-based name alignment models are not sufficient to handle these problems. This paper therefore proposes a method that builds on transliteration features and merges in the similarity of name knowledge distribution characteristics. First, the transliteration similarity feature of a bilingual name pair is calculated. Then the similarity of knowledge distribution characteristics between Chinese and Thai names is calculated using the chi-square test and other techniques. A support vector machine learns these two features of Chinese-Thai name translation pairs to produce a name translation pair classifier, and the alignment results are generated by tuning the classifier's output. Experimental results show that the method achieves good results even when the pronunciations of Chinese and Thai names differ greatly and bilingual corpus resources are lacking.
Keywords:Chinese  Thai  bilingual name alignment  name knowledge distribution  adjusting and optimizing classification results  
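
The two-feature pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how either feature is computed, so the sketch assumes that names are already romanized, that transliteration similarity is a normalized edit distance, and that each name's knowledge distribution is a dictionary of chi-square association scores compared by cosine similarity. All function names and the profile format are hypothetical. The resulting two-element feature vector is what a classifier such as a support vector machine would be trained on.

```python
import math


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def transliteration_similarity(zh_roman: str, th_roman: str) -> float:
    """Assumed stand-in: 1 minus normalized edit distance of romanized names."""
    longest = max(len(zh_roman), len(th_roman))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(zh_roman, th_roman) / longest


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Chi-square statistic for a 2x2 contingency table [[a, b], [c, d]].

    Here a could count documents containing both a name and a knowledge
    term, b the name only, c the term only, and d neither (an assumption).
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def cosine_similarity(u: dict, v: dict) -> float:
    """Cosine similarity of two sparse vectors keyed by knowledge term."""
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def pair_features(zh_roman, th_roman, zh_profile, th_profile):
    """Two-element feature vector for one candidate name translation pair."""
    return [
        transliteration_similarity(zh_roman, th_roman),
        cosine_similarity(zh_profile, th_profile),
    ]
```

In this sketch the profiles would need to be keyed by language-independent knowledge terms (for example, shared category labels) so that Chinese and Thai vectors are comparable; how the paper achieves that mapping is not stated in the abstract.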