Second-order Hidden Markov Model Based on Context
Cite this article: LIU Jie-bin, SONG Mao-qiang, ZHAO Fang, YANG Zhi-yu. Second-order Hidden Markov Model Based on Context[J]. Computer Engineering, 2010, 36(10): 231-232.
Authors: LIU Jie-bin, SONG Mao-qiang, ZHAO Fang, YANG Zhi-yu
Affiliation: 1. College of Software Engineering, Beijing University of Posts and Telecommunications, Beijing 100876; 2. College of Software, Beihang University, Beijing 100083
Funding: National "863" Program project "Research on Key Technologies and Devices for High-Precision, High-Robustness Indoor Positioning" (2007AA12Z321)
Abstract: To capture the influence of context on the part of speech of the current word, this paper proposes a context-based second-order hidden Markov model (HMM) that extends the traditional HMM, and applies it to Chinese part-of-speech tagging. Because the improved statistical model suffers from data sparsity when training data are limited, an improved smoothing algorithm based on exponential linear interpolation is given to smooth the parameters effectively. Experiments show that the context-based second-order HMM achieves a higher tagging accuracy and a higher disambiguation rate than the traditional HMM.
Keywords: part-of-speech tagging; second-order hidden Markov model (HMM); parameter smoothing; Viterbi algorithm
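
To make the ingredients named in the abstract concrete (a second-order HMM, parameter smoothing, and Viterbi decoding), the following is a minimal Python sketch, not the authors' implementation. The abstract does not give the paper's context-dependent formulation or its exponential linear interpolation scheme, so this sketch substitutes plain linear interpolation of unigram, bigram, and trigram tag estimates with fixed weights. The weights in `lambdas`, the add-one emission smoothing, the `START` padding tag, and the toy English corpus are all illustrative assumptions.

```python
import math
from collections import Counter

START = "<s>"  # hypothetical padding tag marking the sentence start


def train(tagged_sentences):
    """Collect tag n-gram counts, history counts, and emission counts."""
    ngram, hist, emit = Counter(), Counter(), Counter()
    for sent in tagged_sentences:
        tags = [START, START] + [t for _, t in sent]
        for i in range(2, len(tags)):
            for n in (1, 2, 3):
                ngram[tuple(tags[i - n + 1:i + 1])] += 1  # n-gram ending at i
                hist[tuple(tags[i - n + 1:i])] += 1       # its (n-1)-tag history
        for w, t in sent:
            emit[(t, w)] += 1
    vocab_size = len({w for _, w in emit})
    return ngram, hist, emit, vocab_size


def trans_prob(model, t1, t2, t3, lambdas=(0.1, 0.3, 0.6)):
    """Interpolated estimate of P(t3 | t1, t2).

    Assumption: fixed weights stand in for the paper's smoothing method.
    Terms whose history was never observed are simply dropped.
    """
    ngram, hist = model[0], model[1]
    p = 0.0
    for lam, gram in zip(lambdas, ((t3,), (t2, t3), (t1, t2, t3))):
        denom = hist[gram[:-1]]
        if denom:
            p += lam * ngram[gram] / denom
    return p


def viterbi(model, words):
    """Second-order Viterbi: each state is a pair (previous tag, current tag)."""
    ngram, hist, emit, vocab_size = model
    tags = [g[0] for g in ngram if len(g) == 1]
    # best[(t_prev, t_cur)] = (log-probability, best tag path ending in that pair)
    best = {(START, START): (0.0, [])}
    for w in words:
        nxt = {}
        for (t1, t2), (score, path) in best.items():
            for t3 in tags:
                # Add-one smoothed emission P(w | t3); an assumption of this sketch.
                e = (emit[(t3, w)] + 1) / (ngram[(t3,)] + vocab_size + 1)
                s = score + math.log(trans_prob(model, t1, t2, t3)) + math.log(e)
                if (t2, t3) not in nxt or s > nxt[(t2, t3)][0]:
                    nxt[(t2, t3)] = (s, path + [t3])
        best = nxt
    return max(best.values(), key=lambda v: v[0])[1]


# Toy corpus, purely illustrative (the paper evaluates on Chinese text).
corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]
model = train(corpus)
print(viterbi(model, ["the", "cat", "barks"]))
```

Tracking tag pairs as states is what makes the decoder second-order: the transition into a new tag can depend on the two preceding tags, at the cost of a state space quadratic in the number of tags. That larger parameter space is also why sparse trigram counts need smoothing.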