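Chinese Part-of-Speech Tagging Based on the Maximum Entropy Method (基于最大熵方法的汉语词性标注)
Cite this article:LIN Hong, YUAN Chunfa, GUO Shujun. Chinese Part-of-Speech Tagging Based on the Maximum Entropy Method[J]. Journal of Computer Applications (计算机应用), 2004, 24(1): 14-16
Authors:LIN Hong (林红), YUAN Chunfa (苑春法), GUO Shujun (郭树军)
Affiliations:Provincial Meteorological Observatory, Hebei Provincial Meteorological Bureau, Shijiazhuang, Hebei 050021; Department of Computer Science and Technology, Tsinghua University, Beijing 100084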
Funding:National Natural Science Foundation of China (69975008); National 973 Program (G1998030507)
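Abstract:Applications of the maximum entropy model have attracted attention in natural language processing. This paper uses the contextual information surrounding part-of-speech tags in a corpus to build a Chinese part-of-speech tagging system based on the maximum entropy method. The study focuses on feature selection: because Chinese differs from other languages and has its own particularities, the features selected differ from those used for English. Experimental results show that the model is effective, with a part-of-speech tagging accuracy of 97.34%.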

Keywords:language model  maximum entropy model  part-of-speech tagging
Article ID:1001-9081(2004)01-0014-03

A Chinese Part of Speech Tagging Method Based on Maximum Entropy Principle
LIN Hong, YUAN Chunfa, GUO Shujun. A Chinese Part of Speech Tagging Method Based on Maximum Entropy Principle[J]. Journal of Computer Applications, 2004, 24(1): 14-16
Authors:LIN Hong, YUAN Chunfa, GUO Shujun
Abstract:The application of maximum entropy modeling to natural language processing has received considerable attention in recent years. This paper presents a new Chinese part-of-speech tagging method based on the maximum entropy principle. Feature selection is the key issue in this system: because Chinese is quite different from many other languages, the features differ from those used for English. Experimental results show that the part-of-speech tagging accuracy of the system reaches 97.34%.
Keywords:language model  maximum entropy  part of speech tagging
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.