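Research on Chinese Personal Name Recognition Based on Statistical Methods
Cite this article: JIA Pin-gui, YANG Yi-ping, LU Peng. Research on Chinese Personal Name Recognition Based on Statistical Methods[J]. Computer Engineering and Applications, 2006, 42(31): 168-170.
Authors: JIA Pin-gui, YANG Yi-ping, LU Peng
Affiliation: Integrated Information System Research Center, Institute of Automation, Chinese Academy of Sciences, Beijing 100080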
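Abstract: A statistical method is used to recognize Chinese personal names. The method divides recognition into two stages: name candidate generation and name confirmation. A Hidden Markov Model (HMM) classifier proposes candidate names from unsegmented Chinese character strings. The mutual information between a personal name and its context words is then used for the final confirmation of each candidate. The method is entirely data-driven and needs no name recognition templates or rules. Experimental results show a recall of 82.7% and a precision of 89.6%.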

Keywords: Chinese personal name recognition; character-based; Hidden Markov Model; mutual information
Article ID: 1002-8331(2006)31-0168-03
Received: 2006-01
Revised: 2006-01
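
The candidate stage described in the abstract can be illustrated with a character-level HMM decoded by the Viterbi algorithm. This is a minimal sketch only: the tag set (`O`, `S`, `G`), every probability, the `FLOOR` smoothing constant, and the helper names `viterbi` and `candidate_names` are hypothetical and are not taken from the paper, whose actual model design is not given here.

```python
import math

# Hypothetical tag set: O = non-name character, S = surname, G = given-name character.
STATES = ["O", "S", "G"]

# Toy probabilities invented for illustration only (not from the paper).
START = {"O": 0.8, "S": 0.2, "G": 0.0}
TRANS = {
    "O": {"O": 0.8, "S": 0.2, "G": 0.0},
    "S": {"O": 0.0, "S": 0.0, "G": 1.0},
    "G": {"O": 0.6, "S": 0.1, "G": 0.3},
}
EMIT = {
    "O": {"说": 0.3, "了": 0.3, "小": 0.05, "平": 0.05},
    "S": {"王": 0.5, "李": 0.4},
    "G": {"小": 0.3, "平": 0.3, "明": 0.3},
}
FLOOR = 1e-6  # assumed smoothing floor for characters unseen under a state


def log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(chars):
    """Return the most probable tag sequence for a string of characters."""
    if not chars:
        return []
    # scores[t][s] = best log-probability of any path ending in state s at position t
    scores = [{s: log(START[s]) + log(EMIT[s].get(chars[0], FLOOR)) for s in STATES}]
    back = []
    for ch in chars[1:]:
        row, ptr = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: scores[-1][p] + log(TRANS[p][s]))
            row[s] = scores[-1][prev] + log(TRANS[prev][s]) + log(EMIT[s].get(ch, FLOOR))
            ptr[s] = prev
        scores.append(row)
        back.append(ptr)
    # Follow back-pointers from the best final state.
    state = max(STATES, key=lambda s: scores[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


def candidate_names(text):
    """Collect each S followed by zero or more G characters as a candidate name."""
    names, current = [], ""
    for ch, tag in zip(text, viterbi(text)):
        if tag == "S":
            if current:
                names.append(current)
            current = ch
        elif tag == "G" and current:
            current += ch
        else:
            if current:
                names.append(current)
            current = ""
    if current:
        names.append(current)
    return names


print(candidate_names("王小平说了"))
```

Working directly on characters means no word segmentation is needed before decoding, which matches the "unsegmented character strings" setting of the abstract. In a real system the three probability tables would be estimated from an annotated corpus.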

Statistical Chinese Personal Name Recognition
JIA Pin-gui, YANG Yi-ping, LU Peng. Statistical Chinese Personal Name Recognition[J]. Computer Engineering and Applications, 2006, 42(31): 168-170.
Authors: JIA Pin-gui, YANG Yi-ping, LU Peng
Affiliation: Integrated Information System Research Center, Institute of Automation, Chinese Academy of Sciences, Beijing 100080, China
Abstract: Automatic recognition of Chinese personal names is an important part of Chinese named entity recognition. This paper presents a statistical approach to Chinese personal name recognition. A Hidden Markov Model (HMM) classifier extracts candidate names from the character sequence, and the mutual information between a name and its context words is then used for final confirmation of each candidate. The approach is entirely data-driven and requires no templates or rules. Experiments show that precision and recall reach 89.6% and 82.7%, respectively.
Keywords: Chinese personal name recognition; character-based; HMM; mutual information
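
The confirmation stage relies on mutual information between a name and its context words. One common way to score this is pointwise mutual information, PMI(name, w) = log2(P(name, w) / (P(name) P(w))). The sketch below assumes that formulation; the paper's exact formula is not stated in the abstract. All counts, the threshold, and the function names `pmi` and `confirm` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy corpus statistics, invented for illustration (not from the paper).
# pair_counts[w]: how often context word w occurs adjacent to a true personal name.
pair_counts = Counter({"说": 30, "先生": 40, "的": 10})
# word_counts[w]: how often w occurs anywhere in the corpus.
word_counts = Counter({"说": 100, "先生": 50, "的": 1000})
NAME_COUNT = 200   # corpus positions occupied by a personal name
TOTAL = 10000      # all corpus positions


def pmi(word):
    """Pointwise mutual information (in bits) between 'a name is here' and a context word."""
    if pair_counts[word] == 0:
        return float("-inf")  # never observed next to a name
    p_joint = pair_counts[word] / TOTAL
    p_name = NAME_COUNT / TOTAL
    p_word = word_counts[word] / TOTAL
    return math.log2(p_joint / (p_name * p_word))


def confirm(context_words, threshold=1.0):
    """Accept a candidate when the mean PMI of its observed context words exceeds the threshold."""
    scores = [pmi(w) for w in context_words if pair_counts[w] > 0]
    if not scores:
        return False  # no evidence either way, so reject
    return sum(scores) / len(scores) > threshold


print(confirm(["先生", "说"]))  # context typical of names
print(confirm(["的"]))          # context that is common everywhere
```

A word such as "先生" that appears mostly beside names gets a high score, while a frequent function word such as "的" scores below zero, so it lends little support to a candidate.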
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.