
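Implementation of a High-Performance Multi-Font Printed English Recognition System
Cite this article: Chen Guoping, Zhang Mingxin, Fu Yuewen, Wang Jinlin. Implementation of a High-Performance Multi-Font Printed English Recognition System[J]. Computer Engineering and Applications, 2006, 42(12): 183-186.
Authors: Chen Guoping  Zhang Mingxin  Fu Yuewen  Wang Jinlin
Affiliations: 1. Speech Interaction Center, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100080; Graduate School of the Chinese Academy of Sciences, Beijing 100039
2. College of Information Science and Engineering, Nanjing University of Technology, Nanjing 210009
Abstract: Improving the recognition rate for low-quality text images is an important direction in current character recognition research. This paper studies in some depth the segmentation of skewed text lines, the segmentation of broken, touching, and overlapping characters, and post-processing, and proposes several new algorithms. The system can recognize as many as 260 fonts, including bold and italic typefaces, reaches a recognition rate of 98.5% on the training set, and has performed well in practical use.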

Keywords: optical character recognition  text line segmentation  character segmentation  post-processing
Article ID: 1002-8331-(2006)12-0184-04
Received: 2005-07
Revised: 2005-07

Implementation of High Performance Multi-Font Printed English Character Recognition System
Chen Guoping, Zhang Mingxin, Fu Yuewen, Wang Jinlin. Implementation of High Performance Multi-Font Printed English Character Recognition System[J]. Computer Engineering and Applications, 2006, 42(12): 183-186.
Authors: Chen Guoping  Zhang Mingxin  Fu Yuewen  Wang Jinlin
Affiliation: 1. Speech Interaction Technology Research, Institute of Acoustics, CAS, Beijing 100080; 2. Graduate School of the Chinese Academy of Sciences, Beijing 100039
Abstract: Improving the recognition rate for low-quality text images is an important research direction. This paper examines algorithms for skewed text line segmentation, for segmentation of broken, touching, and overlapping characters, and for post-processing, and proposes some novel algorithms. The system can recognize as many as 260 fonts, including bold and italic fonts. The recognition rate on the training set is 98.5%, and experiments on real-world documents are promising.
Keywords: OCR  text line segmentation  character segmentation  post-processing
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.