Combining Trigram and Automatic Weight Distribution in Chinese Spelling Error Correction

Authors:	Jianhua Li, Xiaolong Wang

Affiliation:	School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, P.R. China

Funding:	This research is supported by the National Natural Science Foundation of China under Grant No. 69973015.

Abstract:	Research on spelling correction, which aims to detect errors in text, tends to focus on context-sensitive spelling error correction, which is more difficult than traditional isolated-word error correction. This paper presents CInsunSpell, a novel and efficient algorithm for Chinese spelling error correction. Correction in this system proceeds in two phases: a checking phase and a correcting phase. In the checking phase, a trigram algorithm operating within a fixed-size window locates potential errors in a local area. The correcting phase employs a new method that automatically and dynamically distributes weights among the characters in the confusion set as well as in the Bayesian language model. The approach exhibits good performance.
Keywords (from the Chinese record):	Chinese information processing; pinyin error correction; Bayesian language model
Received:	16 August 2006
Jianhua Li, Xiaolong Wang. Combining Trigram and Automatic Weight Distribution in Chinese Spelling Error Correction[J]. Journal of Computer Science and Technology, 2002, 17(6): 0-0.

Keywords:	spelling error correction; language model; edit distance; weight distribution
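
The two phases described in the abstract can be illustrated with a small sketch. This is not the CInsunSpell implementation: the toy corpus, the add-one smoothing, the three-trigram window, the threshold value, and the confusion set are all assumptions chosen for illustration, and the paper's dynamic weight distribution is not reproduced (candidates are simply ranked by their trigram score). All function names are hypothetical.

```python
from collections import Counter

PAD_L, PAD_R = "^^", "$$"  # padding so every character is covered by three trigrams

def train(corpus):
    """Count character trigrams and their bigram prefixes over a list of sentences."""
    tri, bi, vocab = Counter(), Counter(), set()
    for sent in corpus:
        padded = PAD_L + sent + PAD_R
        vocab.update(padded)
        for i in range(len(padded) - 2):
            tri[padded[i:i + 3]] += 1
            bi[padded[i:i + 2]] += 1
    return tri, bi, len(vocab)

def window_score(model, sent, pos):
    """Product of add-one smoothed probabilities of the three trigrams covering sent[pos]."""
    tri, bi, vocab_size = model
    padded = PAD_L + sent + PAD_R
    score = 1.0
    for start in (pos, pos + 1, pos + 2):  # trigram start offsets in the padded string
        gram = padded[start:start + 3]
        score *= (tri[gram] + 1) / (bi[gram[:2]] + vocab_size)
    return score

def suspects(model, sent, threshold):
    """Checking phase (assumed form): positions scoring below threshold, lowest first."""
    scored = [(window_score(model, sent, p), p) for p in range(len(sent))]
    return [p for s, p in sorted(scored) if s < threshold]

def correct(model, sent, pos, confusion):
    """Correcting phase (assumed form): try each confusion-set character, keep the best."""
    candidates = [sent[pos]] + confusion.get(sent[pos], [])
    variants = (sent[:pos] + c + sent[pos + 1:] for c in candidates)
    return max(variants, key=lambda s: window_score(model, s, pos))

# Toy data, invented for this example only.
corpus = ["我们在学校学习", "他们在学校读书", "我们在家里学习"]
confusion = {"笑": ["校", "效"]}  # hypothetical homophone confusion set
model = train(corpus)

sentence = "我们在学笑学习"  # 笑 typed in place of 校
flagged = suspects(model, sentence, threshold=5e-4)
print(flagged)                                          # [4]
print(correct(model, sentence, flagged[0], confusion))  # 我们在学校学习
```

The threshold here is tuned to the toy corpus. With real data it would have to be estimated from held-out text, and neighbouring positions of a true error also score low because their windows overlap it, which is why the sketch sorts suspects by score.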