A method to build a super small but practically accurate language model for handheld devices (Total citations: 3; self-citations: 0; citations by others: 3)

Wu Genqing (吴根清), Zheng Fang (郑方). Journal of Computer Science and Technology (《计算机科学技术学报》), 2003, 18(6): 0-0

This paper raises an important question: can a small language model be practically accurate enough? It then analyzes the purpose of a language model, the problems a language model faces, and the factors that affect its performance. Finally, it proposes a novel method for language model compression that makes a large language model usable in applications on handheld devices such as mobile phones, smart phones, personal digital assistants (PDAs), and handheld personal computers (HPCs). The proposed compression method has three parts. First, the language model parameters are analyzed, and a criterion based on an importance measure of n-grams determines which n-grams are kept and which are removed. Second, a piecewise linear warping method compresses the uni-gram count values in the full language model. Third, a rank-based quantization method quantizes the bi-gram probability values. Experiments show that this compression method reduces the language model dramatically, to only about 1M bytes, with almost no loss in performance. This is good evidence that a language model compressed with a well-designed technique is practically accurate enough, and it makes the language model usable in handheld devices.
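The abstract does not say how the rank-based quantization is carried out. As a rough illustration only, the sketch below assumes one common reading: bi-gram probabilities are sorted, split by rank into equally populated bins, and each bin is represented by its mean value. The function name `rank_quantize`, the `bits` parameter, and the binning rule are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

Bigram = Tuple[str, str]


def rank_quantize(
    probs: Dict[Bigram, float], bits: int = 4
) -> Tuple[Dict[Bigram, int], List[float]]:
    """Quantize probabilities into 2**bits levels by rank.

    Assumption (not from the paper): bins hold roughly equal numbers of
    bi-grams, and each bin's codebook entry is the mean of its members.
    """
    levels = 2 ** bits
    items = sorted(probs.items(), key=lambda kv: kv[1])  # ascending by probability
    n = len(items)
    codes: Dict[Bigram, int] = {}
    sums = [0.0] * levels
    counts = [0] * levels
    for rank, (bigram, p) in enumerate(items):
        code = rank * levels // n  # the rank, not the value, picks the bin
        codes[bigram] = code
        sums[code] += p
        counts[code] += 1
    # Empty bins (possible when n < levels) get a placeholder of 0.0.
    codebook = [sums[i] / counts[i] if counts[i] else 0.0 for i in range(levels)]
    return codes, codebook


if __name__ == "__main__":
    toy = {("a", "b"): 0.1, ("a", "c"): 0.2, ("b", "a"): 0.3, ("b", "c"): 0.4}
    codes, codebook = rank_quantize(toy, bits=1)
    print(codes, codebook)
```

Under this reading, each bi-gram stores only a `bits`-wide code instead of a full floating-point value, which is where the space saving would come from.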

An Online Incremental Language Model Adaptation Method

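Wu Genqing (吴根清), Zheng Fang (郑方), Jin Ling (金凌), Wu Wenhu (吴文虎). Journal of Chinese Information Processing (《中文信息学报》), 2002, 16(1): 61-66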

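To address the limitations of traditional offline adaptation of statistical language models, this paper proposes an online, real-time incremental adaptation method. The method must solve several problems. The first is to design a language model structure suited to online adaptation; the second is how to use the corpus collected online to update the language model parameters in real time. In the Chinese pinyin-to-character conversion platform designed by the authors, the language model is split into two parts: a general model and a user model. For the general model, an efficient storage structure combined with parameter prefetching improves the model's speed. For the user model, a dynamic weighting method combined with MAP adjusts the parameters dynamically. The experiments reported in the paper show that the method substantially reduces the error rate of Chinese pinyin-to-character conversion.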
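The abstract gives no formulas for the user model. The sketch below is one possible reading, assuming MAP means maximum a posteriori estimation with the general model as the prior: user bi-gram counts are accumulated online, and the prior's influence shrinks as user data grows, which behaves like a dynamic weight. The class name `AdaptiveBigramModel`, the prior strength `tau`, and the exact estimate are hypothetical illustrations, not the authors' formulation.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Bigram = Tuple[str, str]


class AdaptiveBigramModel:
    """General model as prior, user counts updated online (illustrative only)."""

    def __init__(self, general_probs: Dict[Bigram, float], tau: float = 5.0) -> None:
        self.general = general_probs          # fixed general model P_g(word | prev)
        self.tau = tau                        # assumed prior strength, not from the paper
        self.user_bigram = defaultdict(int)   # user counts c_u(prev, word)
        self.user_context = defaultdict(int)  # user counts c_u(prev)

    def observe(self, words: List[str]) -> None:
        """Incrementally update the user counts from newly confirmed text."""
        for prev, word in zip(words, words[1:]):
            self.user_bigram[(prev, word)] += 1
            self.user_context[prev] += 1

    def prob(self, prev: str, word: str) -> float:
        """MAP-style estimate: (c_u(prev, word) + tau * P_g) / (c_u(prev) + tau)."""
        prior = self.general.get((prev, word), 0.0)
        c_hw = self.user_bigram.get((prev, word), 0)
        c_h = self.user_context.get(prev, 0)
        return (c_hw + self.tau * prior) / (c_h + self.tau)


if __name__ == "__main__":
    model = AdaptiveBigramModel({("a", "b"): 0.5, ("a", "c"): 0.5}, tau=2.0)
    print(model.prob("a", "b"))   # general model only, before any user data
    model.observe(["a", "b", "a", "b"])
    print(model.prob("a", "b"))   # shifted toward the user's observed usage
```

With no user data the estimate equals the general model's probability; as `observe` is called, the user counts gradually dominate for contexts the user actually types.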

Distance-Weighted Statistical Language Model and Its Application (Total citations: 5; self-citations: 2; citations by others: 3)

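Jin Ling (金凌), Wu Wenhu (吴文虎), Zheng Fang (郑方), Wu Genqing (吴根清). Journal of Chinese Information Processing (《中文信息学报》), 2001, 15(6): 48-53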

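This paper proposes incorporating inter-word distance information into the N-gram statistical language model, and calls the result a distance-weighted related-word statistical language model. The model can take into account relationships between non-adjacent words in a sentence. Following the principle that "the closer two words are, the more closely they are related", it introduces distance information through a distance weighting function and thereby improves the model's predictive power. The paper also applies the model to a Chinese whole-sentence pinyin input method system. Experiments show that, compared with the traditional N-gram statistical language model, the Chinese character error rate decreases and model performance improves to some extent.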
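The abstract does not specify the distance weighting function or how the weighted terms are combined. Purely as an illustration of the "closer words matter more" principle, the sketch below assumes a weight of 1/d for a word at distance d and a normalized weighted sum of pairwise association scores. The names `distance_weight`, `distance_weighted_score`, `assoc`, and `max_distance` are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Tuple


def distance_weight(d: int) -> float:
    """Assumed weighting function: decreases as the distance d (>= 1) grows."""
    return 1.0 / d


def distance_weighted_score(
    history: List[str],
    candidate: str,
    assoc: Dict[Tuple[str, str], float],
    max_distance: int = 4,
) -> float:
    """Combine pairwise association scores of earlier words with a candidate.

    `assoc[(prev, candidate)]` is a hypothetical association strength between
    two words, e.g. estimated from co-occurrence counts in a corpus.
    """
    score = 0.0
    total_weight = 0.0
    recent = history[-max_distance:]
    # d = 1 is the immediately preceding word, d = 2 the one before it, etc.
    for d, prev in enumerate(reversed(recent), start=1):
        w = distance_weight(d)
        score += w * assoc.get((prev, candidate), 0.0)
        total_weight += w
    return score / total_weight if total_weight else 0.0


if __name__ == "__main__":
    assoc = {("y", "z"): 0.6, ("x", "z"): 0.3}
    print(distance_weighted_score(["x", "y"], "z", assoc))
```

In an input method, a score like this could be used alongside the ordinary N-gram probability to rank candidate characters, with non-adjacent context words contributing less the farther away they are.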