
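A Character Segmentation Algorithm Based on Multi-Knowledge Synthetic Decision
Cite this article: Liu Gang, Ding Xiaoqing, Peng Liangrui, Liu Changsong. A Character Segmentation Algorithm Based on Multi-Knowledge Synthetic Decision[J]. Computer Engineering and Applications, 2002, 38(17): 59-61, 72.
Authors: Liu Gang, Ding Xiaoqing, Peng Liangrui, Liu Changsong
Affiliation: State Key Laboratory of Intelligent Technology and Systems, Department of Electronic Engineering, Tsinghua University, Beijing 100084
Funding: National High-Tech Research and Development Program of China (863 Program) (No. 2001AA114081); National Natural Science Foundation of China (No. 69972024)
Abstract: In high-performance printed-text recognition systems, now that single-character recognition is relatively mature, character segmentation has become a key step. Character segmentation can be viewed as a decision process over the correct cut positions at character boundaries, and this decision must consider both the local recognition results for individual characters and the global contextual relationships. Based on a study of character segmentation for Chinese, Japanese, and Korean text, this paper proposes a character segmentation algorithm based on multi-knowledge synthetic decision. The algorithm has been successfully applied in the AsiaOCR project and also handles well the mixed-in English text that is common in East Asian documents. Experimental results show that, compared with the previous algorithm, the new algorithm reduces the segmentation error rate in Chinese, Japanese, and Korean recognition systems by 50% on average.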

Keywords: optical character recognition; character segmentation; contextual analysis
Article ID: 1002-8331-(2002)17-0059-03

A Character Segmentation Algorithm Based on Synthetic Decision
Liu Gang, Ding Xiaoqing, Peng Liangrui, Liu Changsong. A Character Segmentation Algorithm Based on Synthetic Decision[J]. Computer Engineering and Applications, 2002, 38(17): 59-61, 72.
Authors: Liu Gang, Ding Xiaoqing, Peng Liangrui, Liu Changsong
Abstract: Character segmentation has become a crucial step in high-performance document recognition systems. In the segmentation process, geometric information alone is not enough to accurately decide the positions of character boundaries; character recognition results and contextual information are also needed. Based on research on Chinese/Japanese/Korean character segmentation, a novel character segmentation decision algorithm based on multiple sources of information is proposed in this paper. The new algorithm has been successfully used in the AsiaOCR project, and it can handle the English-mixed bilingual text that is common in current documents. Experiments show that the character segmentation error rate is reduced by 50% on average.
Keywords: optical character recognition; character segmentation; contextual analysis
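
The abstract frames segmentation as a decision over candidate cut positions that weighs local recognition results against global context. The Python sketch below illustrates that general idea with a small dynamic program. It is not the authors' algorithm, whose details the abstract does not give. The names `best_segmentation`, `recognize`, `context`, and `max_span`, the additive log-style scores, and the toy score tables are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

Span = Tuple[int, int]  # indices into the list of candidate cut positions


def best_segmentation(
    cuts: List[int],
    recognize: Callable[[int, int], Tuple[str, float]],
    context: Callable[[str, str], float],
    max_span: int = 3,
) -> List[Tuple[int, int, str]]:
    """Pick the cut sequence with the highest combined score.

    recognize(x0, x1) -> (label, score) is a hypothetical single-character
    recognizer; context(prev, cur) -> score is a hypothetical bigram model.
    Both scores are assumed additive (for example, log-probabilities).
    """
    n = len(cuts)
    if n < 2:
        return []
    # best[span] = (total score of the best path ending with span, previous span)
    best: Dict[Span, Tuple[float, Optional[Span]]] = {}
    label: Dict[Span, str] = {}
    for i in range(1, n):
        for j in range(max(0, i - max_span), i):
            char, score = recognize(cuts[j], cuts[i])
            label[(j, i)] = char
            if j == 0:
                best[(j, i)] = (score, None)  # first character on the line
                continue
            # Extend every path that ends at cut j with the new character.
            options = [
                (best[(k, j)][0] + context(label[(k, j)], char) + score, (k, j))
                for k in range(max(0, j - max_span), j)
                if (k, j) in best
            ]
            if options:
                best[(j, i)] = max(options)
    last = n - 1
    finals = [(best[(j, last)][0], (j, last))
              for j in range(max(0, last - max_span), last)
              if (j, last) in best]
    span: Optional[Span] = max(finals)[1]
    path: List[Tuple[int, int, str]] = []
    while span is not None:  # walk back through the stored predecessors
        j, i = span
        path.append((cuts[j], cuts[i], label[span]))
        span = best[span][1]
    path.reverse()
    return path


# Toy data (invented): "明" can be over-split into "日" + "月".
REC = {(0, 10): ("日", -0.5), (10, 20): ("月", -0.5),
       (0, 20): ("明", -0.1), (20, 30): ("天", -0.1)}
CTX = {("明", "天"): -0.2}


def toy_recognize(x0: int, x1: int) -> Tuple[str, float]:
    return REC.get((x0, x1), ("?", -5.0))


def toy_context(prev: str, cur: str) -> float:
    return CTX.get((prev, cur), -2.0)


segments = best_segmentation([0, 10, 20, 30], toy_recognize, toy_context)
print(segments)
```

In the toy data, geometry alone would allow cutting "明" into "日" and "月". The recognition and context scores make the two-character reading "明天" win, which is the kind of joint decision the abstract describes.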