
Perplexity Measuring of Language Model and the Entropy Estimating of Chinese
Cite this article: ZHANG Yang-sen, CAO Yuan-da, YU Shi-wen. Perplexity Measuring of Language Model and the Entropy Estimating of Chinese[J]. Mini-micro Systems, 2006, 27(10): 1931-1934
Authors: ZHANG Yang-sen  CAO Yuan-da  YU Shi-wen
Affiliations: 1. Institute of Computational Linguistics, Peking University, Beijing 100871, China; Department of Computer and Automation, Beijing Information Science and Technology University, Beijing 100085, China
2. Department of Computer Science and Engineering, Beijing Institute of Technology, Beijing 100081, China
3. Institute of Computational Linguistics, Peking University, Beijing 100871, China
Funding: National Basic Research Program of China (973 Program); National High-Tech Research and Development Program of China (863 Program); China Postdoctoral Science Foundation
Abstract: Using information theory, the perplexity measures of statistical language models are derived and described quantitatively from the viewpoint of information entropy. Two conclusions are drawn: the smaller the language entropy estimated by a model, the more accurately that model describes the language; and a new model formed by linearly interpolating two (n-1)-gram models performs better than an (n-1)-gram model but not as well as an n-gram model. Methods for estimating the information entropy of Chinese with language models are also discussed.

Key words: language model  perplexity  entropy  language model evaluation
Article ID: 1000-1220(2006)10-1931-04
Received: 2005-06-27
Revised: 2005-06-27

Perplexity Measuring of Language Model and the Entropy Estimating of Chinese
ZHANG Yang-sen,CAO Yuan-da,YU Shi-wen. Perplexity Measuring of Language Model and the Entropy Estimating of Chinese[J]. Mini-micro Systems, 2006, 27(10): 1931-1934
Authors:ZHANG Yang-sen  CAO Yuan-da  YU Shi-wen
Affiliation: 1. Institute of Computational Linguistics, Peking University, Beijing 100871, China; 2. Department of Computer and Automation, Beijing Information Science and Technology University, Beijing 100085, China; 3. Department of Computer Science and Engineering, Beijing Institute of Technology, Beijing 100081, China
Abstract: The perplexity measures of statistical language models are derived and expressed quantitatively from the viewpoint of entropy, using information theory. We conclude that the smaller the entropy estimated by a language model, the better that model's performance, and that a new model obtained by linearly interpolating two (n-1)-gram models performs better than an (n-1)-gram model but not as well as an n-gram model. Methods for estimating the information entropy of Chinese with language models are also discussed.
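The link between a model's entropy estimate and its quality can be made concrete with a small sketch. The following toy example (an illustration under assumed data, not code from the paper) computes the per-character cross-entropy H of a unigram model on a test string and the corresponding perplexity 2**H; a model that assigns higher probability to the text yields a lower entropy estimate, i.e. it describes the language more accurately.

```python
import math
from collections import Counter

def cross_entropy(model, text):
    """Average negative log2 probability of the text (bits per symbol)."""
    return -sum(math.log2(model[ch]) for ch in text) / len(text)

# Toy character-level unigram model estimated from a tiny "corpus".
train = "abababab"
counts = Counter(train)
total = sum(counts.values())
unigram = {ch: c / total for ch, c in counts.items()}  # p(a) = p(b) = 0.5

test = "abab"
h = cross_entropy(unigram, test)  # 1.0 bit per symbol for this 50/50 model
pp = 2 ** h                       # perplexity = 2**H = 2.0
print(h, pp)
```

The same machinery extends to the paper's interpolation claim: a model of the form p(w) = lam * p1(w) + (1 - lam) * p2(w), combining two lower-order models, sits between them and the full higher-order model in the cross-entropy it achieves on held-out text.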
Keywords: language model  perplexity  entropy  perplexity measuring
This article is indexed by CNKI, VIP (Weipu), Wanfang Data, and other databases.