Similar literature
1.
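Segmenting and recognizing character lines that contain touching and broken characters is one of the main difficulties in many practical OCR applications. For printed numeral lines with touching and broken characters, this paper proposes a segmentation-recognition scheme based on the Viterbi algorithm, organized as a hierarchy of two segmentation-recognition passes. In the second pass, candidate cut regions are first split along non-straight paths found by a Viterbi search that combines grayscale image and binary contour information, yielding valid segmentation paths. The Viterbi algorithm is then applied again, together with the confidence output by the classifier, to merge the resulting candidate image blocks, so that segmentation and recognition proceed dynamically. Experiments on a real financial document recognition system show that the proposed method copes well with touching and broken characters in printed numeral lines and improves the recognition rate and robustness of the system.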
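
The merging step described above can be illustrated with a small dynamic program. This is only a sketch under stated assumptions, not the authors' implementation: the function name `best_grouping`, the limit `max_merge`, and the toy confidence table are all hypothetical, and the score of a merged block is assumed to be the log of a classifier confidence.

```python
import math
from typing import Callable, List, Tuple

def best_grouping(n_blocks: int,
                  score: Callable[[int, int], float],
                  max_merge: int = 3) -> Tuple[float, List[Tuple[int, int]]]:
    """Viterbi-style search over ways to merge consecutive blocks.

    score(i, j) is the (hypothetical) log-confidence that blocks i..j-1
    together form one character. best[j] is the highest total score of
    any grouping that covers blocks 0..j-1.
    """
    neg_inf = float("-inf")
    best = [neg_inf] * (n_blocks + 1)
    back = [0] * (n_blocks + 1)
    best[0] = 0.0
    for j in range(1, n_blocks + 1):
        # Try every allowed start i for the last character [i, j).
        for i in range(max(0, j - max_merge), j):
            cand = best[i] + score(i, j)
            if cand > best[j]:
                best[j] = cand
                back[j] = i
    # Trace the back-pointers to recover the chosen groups.
    groups = []
    j = n_blocks
    while j > 0:
        groups.append((back[j], j))
        j = back[j]
    groups.reverse()
    return best[n_blocks], groups

# Toy confidences (assumed values): blocks 1 and 2 look like one broken digit.
toy_conf = {(0, 1): 0.9, (1, 3): 0.8, (3, 4): 0.9}

def toy_score(i: int, j: int) -> float:
    return math.log(toy_conf.get((i, j), 0.1))

total, groups = best_grouping(4, toy_score)
```

With these toy values the search merges blocks 1 and 2 into one character and keeps blocks 0 and 3 on their own. In a real system the score would come from the recognizer rather than a lookup table.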

2.
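刘阳兴 《计算机应用研究》 (Application Research of Computers), 2011, 28(10): 3998-4000
To address the shortcomings of existing algorithms for segmenting touching and overlapping characters, a character segmentation algorithm based on polyline cut paths is proposed. The algorithm uses projection to separate touching and overlapping characters from the rest. Then, exploiting shape features specific to touching and overlapping characters, a path search algorithm with penalty weights quickly and accurately finds the polyline cut path between them. To keep characters from being wrongly broken apart during this process, recognition feedback is used to merge some character sub-images. Experiments show that the algorithm adapts well to printed text that mixes Japanese and English and achieves good segmentation results.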
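
A path search with penalty weights can be sketched as a row-by-row dynamic program. The abstract does not give the cost function, so the one here is an assumption: a fixed `ink_penalty` for stepping on a foreground pixel and a `shift_penalty` for moving sideways. The function name `cut_path` is hypothetical.

```python
from typing import List, Tuple

def cut_path(img: List[List[int]],
             ink_penalty: float = 10.0,
             shift_penalty: float = 1.0) -> Tuple[float, List[int]]:
    """Find a cheap top-to-bottom cut through a binary image (1 = ink).

    The path visits one column per row and may shift by at most one
    column between rows, so the result is a polyline rather than a
    straight vertical cut.
    """
    h, w = len(img), len(img[0])
    inf = float("inf")
    cost = [[inf] * w for _ in range(h)]
    back = [[0] * w for _ in range(h)]
    for x in range(w):
        cost[0][x] = ink_penalty * img[0][x]
    for y in range(1, h):
        for x in range(w):
            for px in (x - 1, x, x + 1):
                if 0 <= px < w:
                    step = shift_penalty if px != x else 0.0
                    c = cost[y - 1][px] + step + ink_penalty * img[y][x]
                    if c < cost[y][x]:
                        cost[y][x] = c
                        back[y][x] = px
    # Pick the cheapest end column, then walk the back-pointers upward.
    x = min(range(w), key=lambda k: cost[h - 1][k])
    total = cost[h - 1][x]
    path = [x]
    for y in range(h - 1, 0, -1):
        x = back[y][x]
        path.append(x)
    path.reverse()
    return total, path

toy = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [1, 1, 0, 1],
]
total_cost, columns = cut_path(toy)
```

On the toy image the only ink-free route starts in column 1 and shifts once to column 2, so the cut bends around the strokes instead of slicing through them. A practical version would restrict the search to the candidate cut region.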

3.
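An integrated segmentation and recognition algorithm for characters in video   Total citations: 3, self-citations: 0, citations by others: 3
杨武夷  张树武 《自动化学报》 (Acta Automatica Sinica), 2010, 36(10): 1468-1476
The technical difficulties of recognizing text line images in video come mainly from two sources: 1) segmenting and recognizing touching characters; 2) segmenting and recognizing characters on complex backgrounds. To segment and recognize characters in both cases at once, an integrated character segmentation and recognition algorithm is proposed. The algorithm first binarizes the text line image and estimates the text line height from the horizontal projection of the binarized image. Next, depending on how heavily the character strokes touch, wide connected components in the binary image are split using either image analysis or character recognition. Connected components are then combined on the basis of character recognition to obtain candidate recognition results. Finally, a word graph is built from the candidates, and a language model selects the recognition result from the graph. Experiments show that the integrated algorithm greatly reduces the recognition error rate for touching characters and for characters on complex backgrounds.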

4.
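An adaptive algorithm for segmenting offline handwritten Chinese characters   Total citations: 2, self-citations: 0, citations by others: 2
Current character segmentation algorithms, mainly grayscale histogram projection and connected-component methods, are not suited to segmenting adjacent touching Chinese characters. Taking mail envelope addresses as the target, this paper proposes an adaptive algorithm for segmenting offline handwritten Chinese characters that touch. The basic steps are as follows. First, the handwritten address is coarsely divided into several fields by gray-value projection. Second, a Fourier transform is used to decide whether each field contains touching characters. Third, the height-to-width ratio of each non-touching unit is used to decide whether it is a whole character or a radical of a character. Finally, touching characters are split with a stretchable-box method (伸缩框法), and over-segmented radicals are merged. The advantage of the algorithm is that the stretchable box is chosen according to each writer's habits.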

5.
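Building on an analysis of the traditional projection-based segmentation algorithm for handwritten Chinese characters, an improved algorithm is proposed. In the initial segmentation stage, horizontal projection is applied to local image regions to determine the line cut points and thus obtain the characters of the current line. Each line is then split into single characters using several strategies, including strategies based on inter-character spacing and punctuation marks. Experimental results show that the algorithm effectively reduces the mis-segmentation of the traditional vertical projection method, runs fast, and is easy to implement...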
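
For orientation, the traditional projection method that this abstract takes as its baseline can be sketched as follows. This is the plain baseline only, not the improved algorithm, whose spacing and punctuation strategies are not detailed in the abstract. The names `runs` and `segment` and the threshold `min_ink` are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary image, 1 = ink, 0 = background

def runs(profile: List[int], min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return the [start, end) index ranges where profile >= min_ink."""
    out = []
    start = None
    for k, v in enumerate(profile):
        if v >= min_ink and start is None:
            start = k
        elif v < min_ink and start is not None:
            out.append((start, k))
            start = None
    if start is not None:
        out.append((start, len(profile)))
    return out

def segment(img: Image) -> List[List[Tuple[int, int, int, int]]]:
    """Split into lines by horizontal projection, then into characters
    by vertical projection. Boxes are (top, bottom, left, right),
    with bottom and right exclusive."""
    row_profile = [sum(row) for row in img]
    result = []
    for top, bottom in runs(row_profile):
        line = img[top:bottom]
        col_profile = [sum(col) for col in zip(*line)]
        result.append([(top, bottom, l, r) for l, r in runs(col_profile)])
    return result

toy = [
    [1, 1, 0, 1, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1],
]
boxes = segment(toy)
```

Touching characters leave no empty column between them, which is exactly where this baseline fails and where the strategies mentioned in the abstract would have to step in.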

6.
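Handwritten text recognition is mainly applied to text input and plays a key role in the development of human-computer interaction. Because most online input methods cannot recognize handwriting that mixes Chinese and English, an online recognition method for mixed Chinese-English handwritten text is proposed. Strokes are merged according to rules on horizontal relative position, vertical overlap ratio, and area overlap ratio, and cursively connected strokes are split, producing a series of character segments. Geometric features such as stroke count, aspect ratio, center deviation, and smoothness, together with recognition confidence, are then used to classify each segment as Chinese or English. Based on this classification, and combining path evaluation by a natural language model with a dynamic programming search, the candidate Chinese and English segments are merged separately into the Chinese and English character sequences to be recognized. These sequences are fed to the Chinese and English recognition models of a convolutional neural network to produce the recognition result. Experiments show a recognition accuracy of 93.67% on online handwritten mixed Chinese-English text. The method segments online handwritten Chinese text lines and also performs well on online handwritten Chinese-English lines that contain cursive connections.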

7.
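Implementation of a high-performance multi-font printed English recognition system   Total citations: 3, self-citations: 0, citations by others: 3
Raising the recognition rate on low-quality text images is an important direction in current character recognition research. This paper studies in depth the segmentation of skewed text lines, the segmentation of broken, touching, and overlapping characters, and post-processing, and proposes several new algorithms. The system recognizes up to 260 fonts, including bold and italic faces, reaches a recognition rate of 98.5% on the training set, and has performed well in practical use.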

8.
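Research on connected-component-based Chinese character segmentation   Total citations: 3, self-citations: 0, citations by others: 3
Character segmentation has become a key problem in the design of Chinese character recognition systems. For low-quality text images, replacing the traditional binarized black-and-white image with a grayscale image gives better segmentation, and connected-component-based algorithms segment grayscale images well. A connected-component-based Chinese character segmentation algorithm can effectively merge the components of Chinese characters in a text image and split touching characters.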
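
As a minimal illustration of the connected-component step, the sketch below labels 8-connected foreground regions and returns their bounding boxes. It is a simplification: it works on a binary mask, whereas the abstract applies the idea to grayscale images, and the merging and splitting rules are not reproduced. The function name `component_boxes` is hypothetical.

```python
from collections import deque
from typing import List, Tuple

def component_boxes(img: List[List[int]]) -> List[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) boxes, inclusive, one per
    8-connected foreground region, in raster-scan order of discovery."""
    h = len(img)
    w = len(img[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not img[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from this unvisited ink pixel.
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes

toy = [
    [1, 0, 0, 0],
    [0, 1, 0, 1],
]
found = component_boxes(toy)
```

The two diagonal pixels form one component under 8-connectivity, while the isolated pixel on the right becomes its own box. Merging the parts of a multi-component Chinese character would then operate on these boxes.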

9.
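Because address lines on mail contain many crossing and touching strokes, a handwritten Chinese character segmentation method based on stroke extraction and merging is adopted, and dynamic programming combined with address interpretation is used to obtain the final segmentation and the delivery region. In tests on 443 binary images of address lines captured from a postal sorting machine, the correct recognition rate for delivery addresses at the province-city and city-county levels reached 66%.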

10.
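To develop a camera-based word-capture translation system for Uyghur and to solve the problem of segmenting Uyghur word images, an adaptive character segmentation algorithm for printed Uyghur is proposed. The target word is first extracted accurately according to the characteristics of camera-captured images. An integral pixel projection is then computed over the main body of the word above the Uyghur baseline, and the segmentation threshold is derived automatically from the projection. Characters are segmented with this threshold, which makes the method adaptive. Experiments show a segmentation accuracy above 96% and good adaptability across images, which helps advance research on Uyghur camera-based word-capture translation systems.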

11.
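Software system for automatic letter sorting   Total citations: 3, self-citations: 0, citations by others: 3
This paper describes in detail the software of a letter sorting system and the methods used to implement it, covering image preprocessing, postal code location and recognition, layout analysis and layout understanding, single-character segmentation, and post-processing. The system sorts by cross-checking the postal code against the address, which effectively raises the sorting rate. The system is being piloted at one site and has shown good results.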

12.
This paper describes an algorithm for zip code field recognition. The algorithm consists of an initial character segmentation algorithm and a connected-numeral splitting algorithm. The initial character segmentation algorithm employs connected component analysis with a proximity-based component merging technique. The numeral splitting algorithm consists of a slant splitting algorithm based on discriminant analysis and two postprocessing algorithms based on local shape analysis. The splitting algorithm is integrated with a statistical classifier to form a segmentation-recognition algorithm that resolves the ambiguity of connected numeral splitting. The performance is tested by recognition experiments on zip code fields collected from real USPS mail envelopes.

13.
《Pattern recognition letters》2001,22(6-7):639-656
Virtually all mail sorting machines currently used in China recognize only the postal code and ignore the useful destination address information on the envelopes. This paper discusses how to use this information on handwritten Chinese envelopes efficiently in order to improve sorting performance. Two problems are addressed. One is the location of the destination address block (DAB) on the envelope, for which a new bottom-up location method is described in detail. The other is the interpretation of handwritten Chinese destination address strings. We present our effort to use as many geometric constraints as possible in the string segmentation. A novel address interpretation algorithm with global optimization is then proposed. It combines segmentation, recognition, and address context information through a best-path search. The effectiveness of the proposed algorithms is demonstrated by our experiments on real envelopes.

14.
The large volume of mail and the increased cost of handling it have made postal automation an important domain for pattern recognition and computer vision research. A substantial amount of work is being done to design an automatic mail sorting system which can read and interpret the destination address on a mail piece and direct it to the appropriate bin. Robust optical character recognition (OCR) systems are now available which can read printed characters with great accuracy (> 99%). But, in order to read the destination address, the region in the image containing the address must first be located. Even though several approaches to address block location have been proposed in the literature, it remains a difficult problem. A simple method is presented for automatically identifying regions in envelope images which are candidates for being the destination address. The envelope image is considered to contain different textured regions, one of which corresponds to the text content in the image. Thus, a texture-based segmentation method is used to identify the regions of text in the image. The method for texture discrimination is based on Gabor filters, which have been used successfully for a variety of texture classification and segmentation tasks. It is shown that only a small number of even-symmetric Gabor filters are needed in this application. The success of the texture-based segmentation algorithm for identifying address blocks is demonstrated on a number of test images. These results also demonstrate the invariance of the method to the orientation of text in the envelope image and to variations in the size and font of the text.
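
To make "even-symmetric Gabor filter" concrete, the sketch below builds one such kernel in the common textbook form, a Gaussian envelope multiplied by a cosine carrier. The abstract gives no parameters, so the isotropic envelope, the function name `even_gabor_kernel`, and the example values are all assumptions.

```python
import math
from typing import List

def even_gabor_kernel(size: int, freq: float, theta: float,
                      sigma: float) -> List[List[float]]:
    """Build a size x size even-symmetric Gabor kernel (size should be odd).

    freq is the carrier frequency in cycles per pixel, theta the carrier
    orientation in radians, sigma the width of an isotropic Gaussian
    envelope (an assumed simplification).
    """
    half = size // 2
    c, s = math.cos(theta), math.sin(theta)
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * c + y * s          # coordinate along the carrier
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            # Cosine carrier makes the kernel even: h(-x, -y) == h(x, y).
            row.append(envelope * math.cos(2.0 * math.pi * freq * xr))
        kernel.append(row)
    return kernel

k = even_gabor_kernel(size=7, freq=0.25, theta=math.pi / 4, sigma=2.0)
```

Convolving an envelope image with a few kernels of this kind, at different orientations and frequencies, yields texture features. How those features are clustered into text and non-text regions is outside this sketch.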

15.
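Because some characters on instrument nameplates are closely spaced, traditional segmentation is inaccurate and the character recognition rate is low. To address this, an adaptive locate-segment-rebuild recognition algorithm for touching nameplate characters is proposed. The nameplate image is first preprocessed with median filtering and binarization. Morphological opening and erosion are then applied to the preprocessed image to remove unwanted information between characters and widen the character spacing. Next, a centroid algorithm finds the geometric center coordinates of each character, the Sobel edge detection operator is used with those coordinates to obtain each character's bounding box, and an ROI is established. The ROIs are then applied to the original nameplate image to cut out the characters, a 5-pixel-wide rectangular spacer strip is appended after each cut character to rebuild the character image, and OCR is run on the result. In recognition experiments on 993 nameplates the algorithm reached a recognition rate of 95.7%, indicating that it is effective for nameplate character recognition.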
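
Two of the steps above, the centroid computation and the rebuild with spacer strips, are simple enough to sketch. The code assumes binary glyph images stored as lists of rows, and the function names, the bottom alignment of shorter glyphs, and the background value are hypothetical choices. Only the 5-pixel gap comes from the abstract.

```python
from typing import List, Tuple

Glyph = List[List[int]]  # binary character image, 1 = ink

def centroid(glyph: Glyph) -> Tuple[float, float]:
    """Geometric center (row, column) of the ink pixels.
    Assumes the glyph contains at least one ink pixel."""
    pts = [(y, x) for y, row in enumerate(glyph)
           for x, v in enumerate(row) if v]
    n = len(pts)
    return (sum(y for y, _ in pts) / n, sum(x for _, x in pts) / n)

def rebuild_with_spacers(chars: List[Glyph], gap: int = 5,
                         background: int = 0) -> Glyph:
    """Concatenate cut characters left to right, appending a blank strip
    of `gap` columns after each one. Shorter glyphs are padded at the
    top so that all glyphs share a bottom edge (an assumed choice)."""
    height = max(len(c) for c in chars)
    rows: Glyph = [[] for _ in range(height)]
    for c in chars:
        pad = height - len(c)
        width = len(c[0])
        for y in range(height):
            src = c[y - pad] if y >= pad else [background] * width
            rows[y].extend(src)
            rows[y].extend([background] * gap)
    return rows

block = [[1, 1], [1, 1]]
dot = [[1]]
rebuilt = rebuild_with_spacers([block, dot])
center = centroid(block)
```

The rebuilt image is 13 columns wide here: 2 for the block, 1 for the dot, and a 5-column spacer after each. The guaranteed gaps are what lets a standard OCR engine separate characters that touched in the original image.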

16.
This paper describes a handwritten character string recognition system for Japanese mail address reading on a very large vocabulary. The address phrases are recognized as a whole because there is no extra space between words. The lexicon contains 111,349 address phrases, which are stored in a trie structure. In recognition, the text line image is matched with the lexicon entries (phrases) to obtain reliable segmentation and retrieve valid address phrases. The paper first introduces some effective techniques for text line image preprocessing and presegmentation. In presegmentation, the text line image is separated into primitive segments by connected component analysis and touching pattern splitting based on contour shape analysis. In lexicon matching, consecutive segments are dynamically combined into candidate character patterns. An accurate character classifier is embedded in lexicon matching to select characters matched with a candidate pattern from a dynamic category set. A beam search strategy is used to control the lexicon matching so as to achieve real-time recognition. In experiments on 3,589 live mail images, the proposed method achieved a correct rate of 83.68 percent with an error rate below 1 percent.
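
The role of the trie can be shown with a short sketch. A trie lets the matcher ask, for any prefix already recognized, which characters may legally come next, and that answer is one plausible reading of the "dynamic category set" handed to the classifier. The class names, methods, and the three sample phrases are hypothetical, and the beam search itself is not reproduced.

```python
from typing import Dict, Iterable, Set

class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.is_phrase = False  # True if a lexicon phrase ends here

class Lexicon:
    def __init__(self, phrases: Iterable[str]) -> None:
        self.root = TrieNode()
        for phrase in phrases:
            self.add(phrase)

    def add(self, phrase: str) -> None:
        node = self.root
        for ch in phrase:
            node = node.children.setdefault(ch, TrieNode())
        node.is_phrase = True

    def _walk(self, prefix: str):
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def next_chars(self, prefix: str) -> Set[str]:
        """Characters that can follow `prefix` in some lexicon phrase."""
        node = self._walk(prefix)
        return set(node.children) if node else set()

    def contains(self, phrase: str) -> bool:
        node = self._walk(phrase)
        return bool(node and node.is_phrase)

# Sample phrases chosen for illustration only.
lex = Lexicon(["東京都港区", "東京都新宿区", "大阪府大阪市"])
candidates = lex.next_chars("東京都")
```

After the prefix 東京都, only 港 and 新 remain as candidates, so the classifier needs to discriminate between two categories instead of the whole character set. Shared prefixes are stored once, which keeps a lexicon of over a hundred thousand phrases compact.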

17.
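Application of the Fourier transform to segmenting touching character images   Total citations: 3, self-citations: 0, citations by others: 3
朱小燕  王松 《计算机学报》 (Chinese Journal of Computers), 1999, 22(12): 1246-1252
For handwritten character recognition systems that already achieve a fairly high recognition rate, the segmentation algorithm has become a key technique, and its accuracy strongly affects system performance. This paper discusses the properties of the Fourier transform of character images and proposes an algorithm that removes the influence of stroke width from the transform. On this basis, a basic criterion for deciding whether an image contains one character or several is established, along with an algorithm built on that criterion for detecting touching characters. Experiments show that the algorithm judges touching characters correctly 96% of the time, opening a new route to the correct segmentation of touching characters.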

18.
Correct segmentation of handwritten Chinese characters is crucial to their successful recognition. However, due to the many difficulties involved, little work has been reported in this area. In this paper, a two-stage approach is presented to segment unconstrained handwritten Chinese characters. After suitable image preprocessing, a handwritten Chinese character string is first coarsely segmented according to the background skeleton and vertical projection. Using several geometric features, all possible segmentation paths are evaluated with fuzzy decision rules learned from examples, and unsuitable segmentation paths are discarded. In the fine segmentation stage that follows, the strokes that may contain segmentation points are first identified. Feature points are then extracted from the candidate strokes and taken as segmentation point candidates, through each of which a segmentation path may be formed. Geometric features similar to those of the coarse segmentation stage are used, and corresponding fuzzy decision rules are generated to evaluate fine segmentation paths. Experimental results on 1000 Chinese character strings from postal mail show that our approach achieves reasonably good overall accuracy in segmenting unconstrained handwritten Chinese characters.

19.
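Objective: Some characters on nameplates of instruments, elevators, and similar equipment are closely spaced, so traditional segmentation is inaccurate and the character recognition rate is low. An adaptive locate-segment-rebuild recognition algorithm for touching nameplate characters is proposed. Methods: The nameplate image is first preprocessed with median filtering and binarization. Morphological opening and erosion are then applied to the preprocessed image to remove unwanted information between characters and widen the character spacing. Next, a centroid algorithm finds the geometric center of each character, the Sobel edge detection operator is used with the geometric center to obtain each character's bounding box, and an ROI (region of interest) is established. The ROIs are then applied to the original nameplate image to cut out the characters. Following the relevant national standard on character spacing, a rectangular spacer strip of a set pixel width is appended after each cut character to rebuild the character image, and OCR (optical character recognition) is run on the result. Results: In recognition experiments on 993 nameplates, the algorithm reached a recognition rate of 95.7%. Conclusion: The experimental results show that the proposed algorithm is effective for nameplate character recognition.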

20.
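A character segmentation algorithm based on multi-knowledge integrated decision   Total citations: 3, self-citations: 0, citations by others: 3
In a high-performance printed character recognition system, once single-character recognition is fairly mature, character segmentation becomes the more critical step. Character segmentation can be viewed as a decision process over the correct cut positions at character boundaries, and the decision must consider both local character recognition results and the global context. Based on a study of character segmentation for Chinese, Japanese, and Korean scripts, this paper proposes a character segmentation algorithm that makes an integrated decision from multiple knowledge sources. The algorithm has been applied successfully in the AsiaOCR project and also handles the English text that is commonly mixed into East Asian documents. Experiments show that, compared with earlier algorithms, the new algorithm reduces the segmentation error rate of Chinese, Japanese, and Korean recognition systems by 50% on average.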
