
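A Statistics-Based Algorithm for Generating Correction Suggestions and Its Implementation
Cite this article: ZHANG Yangsen, CAO Yuanda, XU Bo. A Statistics-Based Algorithm for Generating Correction Suggestions and Its Implementation [J]. Computer Engineering (计算机工程), 2004, 30(11): 106-109.
Authors: ZHANG Yangsen, CAO Yuanda, XU Bo
Affiliations: 1. Department of Computer Science, Beijing Institute of Technology, Beijing 100081; Department of Computer Science, Shanxi University, Taiyuan 030006; National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100080
2. Department of Computer Science, Beijing Institute of Technology, Beijing 100081
3. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100080
Funding: Shanxi Province Youth Science and Technology Research Fund (20021015)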
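Abstract: This paper presents an algorithm that provides effective correction suggestions for the erroneous strings detected by an automatic proofreading system. Targeting errors that arise from characters with the same or similar pronunciation, similar shape, or adjacent input-code key positions, the algorithm builds a character-driven bidirectional dictionary and a similar-character dictionary, uses a fuzzy matching algorithm to produce correction suggestions for each erroneous string, and then ranks all suggestions by contextual information and statistical frequency. Experiments with a system implemented under Windows show that the recall of correct suggestions reaches 91.8%, and the accuracy of the top 5 suggestions is 76.4%.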
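The pipeline described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the data structures, so everything here is an assumption: the "bidirectional dictionary" is modeled as two indexes keyed by a word's first and last character, fuzzy matching is reduced to a single-character substitution drawn from a similar-character table, and context is approximated by a count of (previous word, candidate) pairs. The words, counts, and the names `WORD_FREQ`, `SIMILAR`, `BIGRAM`, and `suggest` are invented for illustration and are not taken from the paper.

```python
from collections import defaultdict

# Toy data: every entry and count below is hypothetical.
WORD_FREQ = {"北京": 120, "背景": 80, "北极": 15, "南京": 60}
# Similar-character table: confusable characters (sound, shape, or key position).
SIMILAR = {"背": ["北", "贝"], "景": ["京", "境"], "京": ["景"], "北": ["背"]}
# Assumed context statistics: counts of (previous word, candidate) pairs.
BIGRAM = {("去", "北京"): 30, ("去", "南京"): 10}


def build_bidirectional_index(words):
    """Index each word by its first character (forward) and last character (backward)."""
    forward, backward = defaultdict(set), defaultdict(set)
    for word in words:
        forward[word[0]].add(word)
        backward[word[-1]].add(word)
    return forward, backward


FORWARD, BACKWARD = build_bidirectional_index(WORD_FREQ)


def differs_by_similar_char(error, word):
    """True if word matches error except at one position holding a confusable character."""
    if len(error) != len(word):
        return False
    diffs = [(e, w) for e, w in zip(error, word) if e != w]
    return len(diffs) == 1 and diffs[0][1] in SIMILAR.get(diffs[0][0], [])


def suggest(error, prev_word=None, top_n=5):
    """Return up to top_n candidates, ranked by context count and then word frequency."""
    # A single wrong character leaves the first or the last character intact
    # (for strings of two or more characters), so the two indexes together
    # cover every single-substitution candidate.
    pool = FORWARD.get(error[0], set()) | BACKWARD.get(error[-1], set())
    candidates = [w for w in pool if differs_by_similar_char(error, w)]
    candidates.sort(
        key=lambda w: (BIGRAM.get((prev_word, w), 0), WORD_FREQ[w]),
        reverse=True,
    )
    return candidates[:top_n]


print(suggest("背京", prev_word="去"))  # ['北京', '背景']
```

In this toy run, "南京" shares the last character with the erroneous string but is rejected because "南" is not listed as confusable with "背". A real system would also need to handle insertions, deletions, and multi-character errors, which this sketch omits.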

Keywords: correction suggestion; dictionary construction; ranking algorithm
Article ID: 1000-3428(2004)11-0106-04

Correcting Candidate Suggestion Algorithm and Its Realization Based on Statistics
ZHANG Yangsen, CAO Yuanda, XU Bo. Correcting Candidate Suggestion Algorithm and Its Realization Based on Statistics [J]. Computer Engineering, 2004, 30(11): 106-109.
Authors: ZHANG Yangsen, CAO Yuanda, XU Bo
Affiliation: ZHANG Yangsen 1,2,3; CAO Yuanda 1; XU Bo 3
Abstract: This paper introduces an algorithm that offers effective correction candidates for the error strings detected by an automatic proofreading system. Based on the characteristics of errors caused by identical or similar pronunciation, similar shape, or adjacent input-code key positions, the algorithm constructs a character-driven bidirectional dictionary and a similar-character dictionary, generates candidates for each error string through fuzzy matching, and then ranks the candidates by contextual information and statistical frequency. Tests on a system implemented under Windows show that the recall of correct suggestions is 91.8%, and the accuracy of the top 5 candidates is 76.4%.
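The two figures reported in the abstract can be read as follows; this is a hedged sketch, since the abstract does not give the exact definitions. It assumes that recall is the fraction of error strings whose correct word appears anywhere in the suggestion list, and that top-5 accuracy is the fraction whose correct word appears among the first five suggestions. The function name `suggestion_metrics` and the test cases are hypothetical.

```python
def suggestion_metrics(cases, top_n=5):
    """Compute (recall, top-n accuracy) over (ranked_suggestions, correct_word) pairs."""
    total = len(cases)
    if total == 0:
        raise ValueError("at least one test case is required")
    # Recall: the correct word appears somewhere in the suggestion list.
    recalled = sum(1 for suggestions, gold in cases if gold in suggestions)
    # Top-n accuracy: the correct word appears among the first top_n suggestions.
    in_top_n = sum(1 for suggestions, gold in cases if gold in suggestions[:top_n])
    return recalled / total, in_top_n / total


# Invented cases: (ranked suggestions, correct word).
cases = [
    (["a", "b"], "a"),                                   # found, within top 5
    (["x1", "x2", "x3", "x4", "x5", "x6", "gold"], "gold"),  # found, but ranked 7th
    (["p"], "q"),                                        # not found
    (["m", "n"], "n"),                                   # found, within top 5
]
recall, top5 = suggestion_metrics(cases)
print(recall, top5)  # 0.75 0.5
```

Under these assumed definitions, top-n accuracy can never exceed recall, which is consistent with the reported 76.4% against 91.8%.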
Keywords:Correcting candidate suggestion  Dictionary construction  Suggestion sort algorithm
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.