Similar literature
1.
Since Chinese characters are composed from a small set of fundamental shapes (radicals), the problem of recognising large numbers of characters can be converted to that of extracting a small number of radicals and then finding their optimal combination. In this paper, radical extraction is carried out by nonlinear active shape models, in which kernel principal component analysis is employed to capture the nonlinear variation. Treating Chinese character composition as a discrete Markov process, we also propose an approach to recognition with the Viterbi algorithm. Our initial experiments are conducted on off-line recognition of 430,800 loosely-constrained characters, comprising 200 radical categories covering 2154 character categories from 200 writers. The recognition rate is 93.5% of characters correct (writer-independent). Comparison with published figures for existing radical approaches suggests that our method achieves superior performance.
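
To make the decoding step concrete, the following is a minimal Viterbi sketch, assuming a character is modelled as a sequence of radical states. The function name `viterbi`, the state labels, the observation codes, and every probability are hypothetical illustrations, not values from the paper.

```python
import math

def _log(p):
    # Log that tolerates zero probabilities.
    return math.log(p) if p > 0 else float("-inf")

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (best_log_prob, best_path) for a discrete Markov model."""
    # best[s] holds the best log-probability and path ending in state s.
    best = {
        s: (_log(start_p[s]) + _log(emit_p[s][observations[0]]), [s])
        for s in states
    }
    for obs in observations[1:]:
        new_best = {}
        for s in states:
            # Pick the predecessor that maximises the score of reaching s.
            score, path = max(
                ((best[p][0] + _log(trans_p[p][s]), best[p][1]) for p in states),
                key=lambda t: t[0],
            )
            new_best[s] = (score + _log(emit_p[s][obs]), path + [s])
        best = new_best
    return max(best.values(), key=lambda t: t[0])

# Hypothetical two-radical model: states are radicals, observations are shape codes.
states = ["water", "tree"]
start_p = {"water": 0.6, "tree": 0.4}
trans_p = {"water": {"water": 0.1, "tree": 0.9},
           "tree": {"water": 0.5, "tree": 0.5}}
emit_p = {"water": {"a": 0.8, "b": 0.2},
          "tree": {"a": 0.3, "b": 0.7}}
log_prob, path = viterbi(["a", "b"], states, start_p, trans_p, emit_p)
```

Working in log space avoids underflow when the radical sequence is long.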

2.
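Formation rules of Chinese character stroke segments and a method for extracting them   Total citations: 8, self-citations: 0, citations by others: 8
Starting from the runs of connected pixels in each row (or column) of a dot-matrix image, this paper studies how Chinese character images are composed of stroke segments. It finds that a Chinese character dot-matrix image consists of only two types of stroke segment, staircase segments and long parallel segments, and summarizes the rules by which each type forms. Based on these formation rules, a stroke-segment extraction method is proposed that converts a pixel-level character image into an image whose units are stroke segments, which benefits Chinese character recognition, thinning, and automatic font generation. Finally, experimental results for stroke-segment extraction on printed and handwritten Chinese characters are given.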
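
The starting point described above, runs of connected pixels in each row, can be sketched as follows. This covers only run extraction, not the paper's rules for grouping runs into staircase or long parallel segments. The function name `row_runs` and the sample bitmap are hypothetical.

```python
def row_runs(image):
    """Return (row, start_col, end_col) for every maximal run of 1-pixels in each row."""
    runs = []
    for r, row in enumerate(image):
        start = None
        for c, pixel in enumerate(row):
            if pixel and start is None:
                start = c                      # a run begins
            elif not pixel and start is not None:
                runs.append((r, start, c - 1))  # the run ended at the previous column
                start = None
        if start is not None:
            runs.append((r, start, len(row) - 1))  # run reaches the right border
    return runs

# Hypothetical 3x4 binary bitmap.
bitmap = [
    [0, 1, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
runs = row_runs(bitmap)
```

Column runs can be obtained by calling the same function on the transposed image, `list(zip(*bitmap))`.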

3.
A stroke-based approach to extract skeletons and structural features for handwritten Chinese character recognition is proposed. We first determine stroke directions based on the directional run-length information of binary character patterns. According to the stroke directions and their adjacent relationships, we split strokes into stroke and fork segments, and then extract the skeletons of the stroke segments, called skeleton segments. After all skeleton segments are extracted, fork segments are processed to find the fork points and fork degrees. Skeleton segments that touch a fork segment are connected at the fork point, and all connected skeleton segments form the character skeleton. According to the extracted skeletons and fork points, we can extract primitive strokes and stroke direction maps for recognition. A simple classifier based on the stroke direction map is presented to recognize regular and rotated characters to verify the ability of the proposed feature extraction for handwritten Chinese character recognition. Several experiments are carried out, and the experimental results show that the proposed approach can easily and effectively extract skeletons and structural features, and works well for handwritten Chinese character recognition.

4.
Chinese characters are constructed by strokes according to structural rules. Therefore, the geometric configurations of characters are important features for character recognition. In handwritten characters, stroke shapes and their spatial relations may vary to some extent. The attribute value of a structural identification is then a fuzzy quantity rather than a binary quantity. Recognizing these facts, we propose a fuzzy attribute representation (FAR) to describe the structural features of handwritten Chinese characters for an on-line Chinese character recognition (OLCCR) system. With a FAR, a fuzzy attribute graph for each handwritten character is created, and the character recognition process is thus transformed into a simple graph matching problem. This character representation and our proposed recognition method allow us to relax the constraints on stroke order and stroke connection. The graph model provides a generalized character representation that can easily incorporate newly added characters into an OLCCR system with an automatic learning capability. The fuzzy representation can describe the degree of structural deformation in handwritten characters. The character matching algorithm is designed to tolerate structural deformations to some extent. Therefore, even input characters with deformations can be recognized correctly once the reference dictionary of the recognition system has been trained using a few representative learning samples. Experimental results are provided to show the effectiveness of the proposed method.

5.
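Whether similar characters are recognized correctly has a major impact on the accuracy and usability of a whole recognition system. In practical applications, we found that misrecognition between similar Chinese characters is asymmetric, and we examine and analyze the causes of this asymmetry in detail. Based on this asymmetry, this paper proposes a classified partial-space method for recognizing similar characters. Similar characters are divided into several basic categories according to their structural characteristics, and for each category different features are extracted and compared in the corresponding partial space so that similar characters are recognized correctly. Experimental results demonstrate the effectiveness of the method: recognition accuracy for similar characters improves considerably, with the average recognition rate rising by 4.55 percentage points for error-prone similar characters and by 0.38 percentage points for less error-prone ones.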

6.
卢达  浦炜  陈琦玮  谢铭培 《计算机应用》2005,25(10):2418-2421
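For handwritten Chinese character recognition, a new method is proposed for preclassifying handwritten characters before recognition. The method uses a Neocognitron network to extract stroke features of the characters, then uses a supervised extended ART neural network (SEART) to generate a number of preclassification groups, and performs preclassification with a matching algorithm based on fuzzy similarity measurement. Experiments show that the method classifies handwritten Chinese characters well, with a preclassification accuracy of 98.22%.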

7.
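This paper studies nonlinear normalization based on the density equalization principle for recognizing unconstrained handwritten Chinese characters in the legal amount (the amount written in Chinese capital numerals) on bank cheque images. An improved nonlinear normalization method is proposed. Its density function, defined from stroke spacing and stroke width, not only copes well with the local and irregular nature of stroke deformation but also makes stroke thickness more uniform both within a character and across characters. In addition, the effective region of each character in the image is determined and used to improve the general expression of the density equalization principle, which effectively handles overall character slant and individual strokes that protrude. Experimental results show that the method outperforms other methods of the same kind and raises the recognition accuracy of a legal-amount recognition system for bank cheque images by about 1.5%.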
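
The density equalization idea can be sketched in one dimension: columns with higher density receive a larger share of the output width. The sketch below assumes a stand-in density function (stroke crossings per column plus one), not the spacing-and-width density defined in the paper. The names `column_density` and `equalized_columns` are hypothetical.

```python
def column_density(image):
    """Assumed density: background-to-foreground crossings in each column, plus 1."""
    density = []
    for c in range(len(image[0])):
        crossings, prev = 0, 0
        for row in image:
            if row[c] and not prev:
                crossings += 1
            prev = row[c]
        density.append(crossings + 1)  # +1 keeps empty columns from vanishing
    return density

def equalized_columns(density, out_width):
    """Map each source column to the target column where its right edge lands.

    The cumulative density is rescaled to out_width, so the cumulative density
    grows linearly across the output image.
    """
    total = sum(density)
    mapping, cum = [], 0
    for d in density:
        cum += d
        # ceil(cum / total * out_width) - 1, computed with integers
        mapping.append((cum * out_width + total - 1) // total - 1)
    return mapping

# Hypothetical 3x3 bitmap; the middle column has two separate strokes.
bitmap = [
    [0, 1, 0],
    [0, 0, 0],
    [0, 1, 1],
]
mapping = equalized_columns(column_density(bitmap), out_width=6)
```

Applying the same mapping to rows gives the vertical half of the normalization.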

8.
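To detect embossed (steel-stamped) characters on paper pharmaceutical packaging in real time, a recognition system based on image processing and deep learning is designed. The system first preprocesses the image captured under the original lighting with several image processing methods so as to extract the region of interest automatically, and then feeds it into a trained Mask-RCNN network for instance segmentation, obtaining the pixel positions and values of the individual characters in each image. Experimental results show that, compared with traditional character recognition methods, the method copes well with the weak gray-level transitions in images of embossed characters on paper pharmaceutical packaging, and accurately segments and labels the embossed characters in images of paper boxes. Character recognition accuracy reaches 99%. The approach offers a new solution for recognizing and recording embossed characters on production lines and has high practical value.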

9.
We propose a variational method for model-based segmentation of gray-scale images of highly degraded historical documents. Given a training set of characters (of a certain letter), we construct a small set of shape models that cover most of the training set's shape variance. For each gray-scale image of a respective degraded character, we construct a custom-made shape prior using those fragments of the shape models that best fit the character's boundary. Therefore, we are not limited to any particular shape in the shape model set. In addition, we demonstrate the application of our shape prior to degraded character recognition. Experiments show that our method achieves very accurate results both in segmentation of highly degraded characters and in recognition. When compared with manual segmentation, the average distance between the boundaries of respective segmented characters was 0.8 pixels (the average size of the characters was 70×70 pixels).

10.
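Thinning the text in an image helps to highlight the shape of the characters and reduce redundant information, and it has important applications in character recognition. After analyzing traditional thinning algorithms, and in response to the distortion and incomplete thinning they produce, a thinning method for images of International Phonetic Alphabet (IPA) characters is proposed. The algorithm classifies and marks the edges of the text region, tests whether each marked point satisfies the removal condition, and then removes edge pixels step by step until the IPA characters are thinned to a width of one pixel. Experiments on IPA character images show that the algorithm thins them accurately and is simple and efficient.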

11.
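Based on the holistic features of digit characters, a two-pass digit recognition method using a BP neural network is proposed. The method first pre-recognizes the digit characters with a BP neural network, and then performs a second recognition pass, based on holistic character features, on those characters whose pre-recognition results are ambiguous, so that an accurate result is obtained. The method combines the nonlinearity and self-learning of neural networks with the shape and structure invariance of holistic character features, and achieves high recognition accuracy even with a small number of samples.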

12.
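This paper proposes a new method for preclassifying handwritten Chinese characters. The method has two steps. First, stroke density features are extracted and fuzzy rules are used to generate four preclassification groups. Then, fuzzy logic processing converts the characters of each group into fuzzy templates based on a nonlinear weighting function, and preclassification is carried out with a matching algorithm based on fuzzy similarity measurement together with hierarchical classification of the similarity-measurement templates. Test results show that the method works well, with a preclassification accuracy of 98.17%.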
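
Matching a sample against fuzzy templates can be illustrated with one common fuzzy similarity measure, the ratio of summed minima to summed maxima of two membership vectors. The abstract does not state which measure the authors use, so this choice is an assumption, as are the names `fuzzy_similarity` and `preclassify`, the group labels, and the membership values.

```python
def fuzzy_similarity(a, b):
    """sum(min) / sum(max) for two membership vectors with values in [0, 1]."""
    num = sum(min(x, y) for x, y in zip(a, b))
    den = sum(max(x, y) for x, y in zip(a, b))
    return 1.0 if den == 0 else num / den  # two all-zero vectors count as identical

def preclassify(sample, templates):
    """Return the label of the fuzzy template most similar to the sample."""
    return max(templates, key=lambda label: fuzzy_similarity(sample, templates[label]))

# Hypothetical fuzzy templates for two preclassification groups.
templates = {
    "group_1": [1.0, 0.5, 0.0],
    "group_2": [0.0, 0.5, 1.0],
}
group = preclassify([0.75, 0.5, 0.25], templates)
```

The measure equals 1 only for identical vectors and falls towards 0 as the memberships diverge, which makes it usable as a ranking score across groups.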

13.
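A study of recognition methods for printed Chinese characters in multiple fonts and sizes   Total citations: 2, self-citations: 0, citations by others: 2
This paper studies methods for recognizing printed Chinese characters in multiple fonts and sizes. The proposed method first normalizes printed characters of different sizes, then extracts four kinds of feature: the number of stroke ends around the character periphery, an improved coarse peripheral feature, stroke crossing counts, and projection transform features. The combined features are then used for multi-stage classification and recognition. Experiments were carried out on an IBM PC AT microcomputer, and the results show that the experimental system achieves a recognition rate above 98% on real printed text.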

14.
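Dot-matrix characters on food and pharmaceutical packaging usually carry the production date and other important information. Existing single-method dot-matrix character recognition has low accuracy, and character localization is unreliable in complex settings where dot-matrix and continuous characters appear together. To address this, a combined dot-matrix character recognition method based on template matching and a support vector machine (SVM) is proposed. The method uses the discrete nature of dot-matrix characters to locate them accurately, and then obtains two decisions, one from gray-level-based template matching and one from feature-based template matching. If the two decisions agree, the character is recognized; if they differ, both results are passed to the SVM, which produces the final result. Experimental results show that the method outperforms traditional character recognition methods in both localization accuracy and recognition rate, is robust, and reaches a character recognition rate of 96.10%.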
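
The decision logic described above (accept when the two matchers agree, otherwise defer to the SVM) can be sketched as below. The three classifiers are passed in as callables because the abstract gives no implementation details; `recognize`, `match_gray`, `match_feature`, and `svm_decide` are hypothetical names, and the stand-in lambdas exist only to exercise the control flow.

```python
def recognize(glyph, match_gray, match_feature, svm_decide):
    """Combine two template matchers, using an SVM only to break disagreements."""
    by_gray = match_gray(glyph)        # decision from gray-level template matching
    by_feature = match_feature(glyph)  # decision from feature-based template matching
    if by_gray == by_feature:
        return by_gray                 # both matchers agree: accept directly
    # Matchers disagree: let the SVM choose, given both candidate labels.
    return svm_decide(glyph, (by_gray, by_feature))

# Stand-in classifiers (assumptions for illustration only).
agree = recognize("glyph", lambda g: "8", lambda g: "8", lambda g, c: "?")
split = recognize("glyph", lambda g: "8", lambda g: "3", lambda g, c: c[1])
```

Calling the SVM only on disagreements keeps the common case cheap, at the cost of trusting the matchers whenever they make the same mistake.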

15.
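License plate recognition is an important means of automating traffic management and plays a major role in traffic monitoring and control. The recognition process can be divided into four parts: plate localization, plate rectification, character segmentation, and character recognition. In plate localization, relying on texture features or color features alone tends to work only for scenes with simple backgrounds, and results on complex backgrounds still need improvement. In character segmentation, segmentation of single-row plates is fairly mature, but segmentation of double-row plates remains unsatisfactory. This paper proposes a localization method that combines two passes of color marking in HSV space with texture features, and a character segmentation method for both single-row and double-row plates. The localization method exploits the fixed color combinations of license plates, marks the image twice, and locates the plate by projection; tested on 200 images with different backgrounds, it reaches a localization accuracy of 98%. For character segmentation, an improved template matching method is used that handles both single-row and double-row plates, with an accuracy of 95%.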
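
A single color-marking pass followed by projection can be sketched with the standard-library `colorsys` module. This is a simplified illustration, assuming a blue plate background and assumed HSV thresholds; the paper's second marking pass and its texture features are not reproduced. The names `blue_mask` and `projection_bounds` are hypothetical.

```python
import colorsys

def blue_mask(rgb_image, h_range=(0.55, 0.72), s_min=0.4, v_min=0.3):
    """Mark pixels whose HSV value falls in an assumed 'plate blue' range."""
    mask = []
    for row in rgb_image:
        marked = []
        for r, g, b in row:
            # colorsys expects and returns values in [0, 1].
            h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            hit = h_range[0] <= h <= h_range[1] and s >= s_min and v >= v_min
            marked.append(1 if hit else 0)
        mask.append(marked)
    return mask

def projection_bounds(mask, min_count=1):
    """Return (top, bottom, left, right) from row and column projections, or None."""
    rows = [i for i, row in enumerate(mask) if sum(row) >= min_count]
    cols = [j for j, col in enumerate(zip(*mask)) if sum(col) >= min_count]
    if not rows or not cols:
        return None
    return rows[0], rows[-1], cols[0], cols[-1]

# Hypothetical 3x4 image: two blue pixels on a white background.
WHITE, BLUE = (255, 255, 255), (0, 0, 255)
image = [
    [WHITE, WHITE, WHITE, WHITE],
    [WHITE, BLUE, BLUE, WHITE],
    [WHITE, WHITE, WHITE, WHITE],
]
bounds = projection_bounds(blue_mask(image))
```

Raising `min_count` makes the projection step ignore isolated marked pixels, which matters on cluttered backgrounds.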

16.
17.
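This paper studies the application of LeNet-5 to recognizing handwritten date characters in scanned documents. Because scanning introduces various kinds of noise, especially illumination and color interference, applying LeNet-5 directly does not give good results. The characters to be recognized are first located and cropped from the whole document, and the cropped character images are preprocessed by denoising, gray-scale conversion, and binarization. The character image is then segmented into individual characters, and the handwritten date characters are recognized by combining the LeNet-5 network with model matching. Recognition results under different parameter combinations are analyzed, and tuning the model parameters effectively improves performance on real data, yielding an algorithm that recognizes the handwritten date character set well. Experimental results demonstrate the effectiveness of the algorithm, which has been applied in engineering practice.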

18.
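License plate localization is the key to a license plate recognition system. This paper proposes a method for plate localization and plate character segmentation against the complex backgrounds found on expressways. The method locates the plate using horizontal correlation features, the gradient morphology of the plate region, and plate color-combination features, and segments the plate characters by multi-scale template matching based on the structural features of the plate. Experiments show that the method localizes and segments well against complex backgrounds and is robust.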

19.
The present work is an attempt to develop a robust character recognizer for Telugu texts. We aim at designing a recognizer that exploits the inherent characteristics of the Telugu script. Our proposed method uses wavelet multi-resolution analysis for feature extraction and an associative memory model to accomplish the recognition tasks. Our system learns the style and font from the document itself and then recognizes the remaining characters in the document. The major contributions of the present study can be outlined as follows. It is a robust OCR system for printed Telugu text. It avoids an explicit feature extraction process and exploits the inherent characteristics of the Telugu characters through a careful selection of the wavelet basis function, which extracts the invariant features of the characters. It has a Hopfield-based Dynamic Neural Network (DNN) for learning and recognition. This is important because it overcomes the inherent difficulties of memory limitation and spurious states in the Hopfield network. The DNN has been demonstrated to be efficient for associative memory recall. Although it is normally not suitable for image processing applications, the multi-resolution analysis reduces the sizes of the images enough to make the DNN applicable to the present domain. Our experiments show extremely promising results.
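
Associative recall can be illustrated with a plain Hopfield network trained by the Hebbian rule. This is the classical model, not the paper's Hopfield-based Dynamic Neural Network, and it omits the wavelet stage. The names `train_hopfield` and `recall` and the two bipolar patterns are hypothetical.

```python
def train_hopfield(patterns):
    """Hebbian weights for bipolar (+1/-1) patterns, with a zero diagonal."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / n
    return w

def recall(w, state, max_sweeps=10):
    """Asynchronously update units until no unit changes or max_sweeps is reached."""
    state = list(state)
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            net = sum(w[i][j] * state[j] for j in range(n))
            new = 1 if net >= 0 else -1
            if new != state[i]:
                state[i] = new
                changed = True
        if not changed:
            break
    return state

# Two orthogonal stored patterns and a probe with one flipped unit.
p1 = [1, 1, 1, 1, -1, -1, -1, -1]
p2 = [1, -1, 1, -1, 1, -1, 1, -1]
w = train_hopfield([p1, p2])
noisy = [-1, 1, 1, 1, -1, -1, -1, -1]
restored = recall(w, noisy)
```

The plain model's capacity is small relative to the number of units, which is the memory limitation the abstract says its dynamic variant is meant to overcome.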

20.
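Because Chinese characters have complex strokes, Chinese character images extracted from video are often of poor quality, and conventional optical character recognition (OCR) gives unsatisfactory results. To recognize low-quality Chinese character images, a two-level recognition method based on block search is proposed. First, a block structure is built for Chinese character images and a training set is generated by simulating low-quality characters; principal component analysis is then applied to each block image in the training set to extract features, and an index is built. For an image to be recognized, block search and voting retrieve a set of candidate characters from the index (first-level recognition), and the character is then recognized according to the significance of the voting result, supplemented by matching of global structural features (second-level recognition). Experimental results show that the method achieves a higher recognition rate on low-quality Chinese character images than ordinary OCR methods.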
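
The first-level step, block search followed by voting, can be sketched as below. The sketch assumes the per-block feature vectors already exist (the PCA step is omitted) and uses a brute-force nearest-neighbor search in place of whatever index structure the authors built. The names `nearest_label` and `vote_candidates` and all feature values are hypothetical.

```python
from collections import Counter

def nearest_label(feature, index):
    """index is a list of (feature_vector, label); return the nearest vector's label."""
    def sq_dist(entry):
        return sum((a - b) ** 2 for a, b in zip(feature, entry[0]))
    return min(index, key=sq_dist)[1]

def vote_candidates(block_features, block_indexes, top_k=3):
    """Each block votes for one character; return the top_k (label, votes) pairs."""
    votes = Counter(
        nearest_label(feature, index)
        for feature, index in zip(block_features, block_indexes)
    )
    return votes.most_common(top_k)

# Hypothetical index: one list per block position, two similar characters.
block_indexes = [
    [([0.0, 0.0], "日"), ([1.0, 1.0], "目")],
    [([0.0, 1.0], "日"), ([1.0, 0.0], "目")],
    [([0.5, 0.5], "日"), ([0.9, 0.9], "目")],
]
query_blocks = [[0.1, 0.1], [0.2, 0.9], [1.0, 1.0]]
candidates = vote_candidates(query_blocks, block_indexes)
```

A narrow margin between the top two vote counts is the kind of low-significance result that, per the abstract, would be resolved by the second-level global structure matching.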
