Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
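Application of SVM multi-class classifiers to offline recognition of similar handwritten Chinese characters   Total citations: 7, self-citations: 0, citations by others: 7
The prevalence of similar-looking characters is one of the main reasons for the low recognition rate of offline handwritten Chinese characters. This paper studies the application of support vector machine (SVM) multi-class classifiers to the recognition of similar handwritten Chinese characters. The proposed method extracts character features with a wavelet elastic mesh technique, and experiments compare the classification performance of three different SVM classifier combination strategies.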

2.
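This paper proposes a structural-feature-based algorithm for recognizing handwritten Uyghur characters. The characters to be recognized are first divided into five subsets according to their number of strokes, and the character set is then divided further according to features such as the "position of additional strokes". Feature vectors of different lengths are extracted according to the distribution of characters in each subset, and an SVM classifier is then constructed for each character set for training and recognition.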

3.
4.
The task of handwritten Chinese character recognition is one of the most challenging areas of human handwriting classification. The main reason for this is the writing system itself, which encompasses thousands of characters, coupled with high levels of diversity in personal writing styles and attributes. Much of the existing work on both online and off-line handwritten Chinese character recognition has focused on methods that employ feature extraction and segmentation steps. The preprocessed data from these steps form the basis for the subsequent classification and recognition phases. This paper proposes an approach to handwritten Chinese character recognition and classification that uses only an image alignment technique and does not require the aforementioned steps. Rather than extracting features from the image, which often means building models from very large training data, the proposed method uses the mean image transformations as a basis for model building. The use of an image-only model means that no subjective tuning of the feature extraction is required. In addition, by employing a fuzzy-entropy-based metric, the work offers an improved ability to model different types of uncertainty. The classifier is a simple distance-based nearest neighbour classification system based on template matching. The approach is applied to a publicly available real-world database of handwritten Chinese characters and demonstrates that it can achieve high classification accuracy and is robust in the presence of noise.
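As a rough illustration of the distance-based nearest-neighbour template matching this abstract mentions, the sketch below assigns a tiny binary image to its closest class template. The plain pixel-difference distance, the function names and the 3x3 templates are all assumptions made for illustration; the paper's image alignment step and fuzzy-entropy-based metric are not reproduced here.

```python
def pixel_distance(a, b):
    """Count the pixels at which two equally sized binary images differ."""
    return sum(pa != pb
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))


def classify(image, templates):
    """Return the label of the template closest to `image`."""
    return min(templates,
               key=lambda label: pixel_distance(image, templates[label]))


# Hypothetical 3x3 class templates (1 = ink, 0 = background).
templates = {
    "vertical": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "horizontal": [[0, 0, 0], [1, 1, 1], [0, 0, 0]],
}

# A vertical stroke with one noisy pixel in the bottom-left corner.
noisy = [[0, 1, 0], [0, 1, 0], [1, 1, 0]]
label = classify(noisy, templates)
```

Swapping `pixel_distance` for another metric leaves `classify` unchanged, which is the main appeal of a template-matching classifier of this kind.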

5.
This paper deals with an optical character recognition (OCR) system for handwritten Gujarati numbers. Much work exists for Indian languages such as Hindi, Kannada, Tamil, Bangla, Malayalam and Gurumukhi, but hardly any is traceable for Gujarati, especially for handwritten characters. In this work, a multilayer feed-forward neural network is proposed for classifying handwritten Gujarati digits. The features of Gujarati digits are abstracted by four different profiles of the digits. Thinning and skew correction are also applied to the handwritten numerals as preprocessing before classification. This work achieves a success rate of approximately 82% for Gujarati handwritten digit identification.
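The abstract does not say which four profiles are used. A minimal sketch, assuming they are the left, right, top and bottom profiles of a binary digit image (a common choice, but an assumption here), could look like this; the function name and the 4x4 sample are hypothetical.

```python
def profiles(image):
    """Return left, right, top and bottom profiles of a binary image.

    Each entry is the number of background pixels crossed before the first
    ink pixel (1) is met when scanning inward from that side. A row or
    column without ink yields the full width or height.
    """
    height, width = len(image), len(image[0])

    def first_hit(line):
        for i, pixel in enumerate(line):
            if pixel:
                return i
        return len(line)

    columns = [[image[r][c] for r in range(height)] for c in range(width)]
    return {
        "left": [first_hit(row) for row in image],
        "right": [first_hit(row[::-1]) for row in image],
        "top": [first_hit(col) for col in columns],
        "bottom": [first_hit(col[::-1]) for col in columns],
    }


# Hypothetical 4x4 digit-like shape.
digit = [
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 1, 1, 1],
]
features = profiles(digit)
```

Concatenating the four lists would give one fixed-length feature vector per normalized image, which is the kind of input a feed-forward network expects.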

6.
Correct segmentation of handwritten Chinese characters is crucial to their successful recognition. However, due to the many difficulties involved, little work has been reported in this area. In this paper, a two-stage approach is presented to segment unconstrained handwritten Chinese characters. A handwritten Chinese character string is first coarsely segmented according to the background skeleton and vertical projection after proper image preprocessing. With several geometric features, all possible segmentation paths are evaluated using fuzzy decision rules learned from examples. As a result, unsuitable segmentation paths are discarded. In the fine segmentation stage that follows, the strokes that may contain segmentation points are first identified. The feature points are then extracted from candidate strokes and taken as segmentation point candidates, through each of which a segmentation path may be formed. Geometric features similar to those of the coarse segmentation stage are used, and corresponding fuzzy decision rules are generated to evaluate fine segmentation paths. Experimental results on 1000 Chinese character strings from postal mail show that the approach achieves reasonably good overall accuracy in segmenting unconstrained handwritten Chinese characters.
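The vertical-projection part of the coarse stage can be sketched as follows. This is only an assumed simplification: it cuts a binary text-line image at empty columns, and it does not reproduce the background skeleton analysis or the learned fuzzy decision rules described in the abstract. The function names and sample image are hypothetical.

```python
def vertical_projection(image):
    """Number of ink pixels (1s) in each column of a binary image."""
    return [sum(column) for column in zip(*image)]


def coarse_segments(image):
    """Return (start, end) column ranges, end exclusive, that contain ink."""
    segments, start = [], None
    for x, count in enumerate(vertical_projection(image)):
        if count and start is None:
            start = x                      # a run of inked columns begins
        elif not count and start is not None:
            segments.append((start, x))    # the run ends at an empty column
            start = None
    if start is not None:                  # run reaches the right border
        segments.append((start, len(image[0])))
    return segments


# Hypothetical two-row text line with three separated blobs.
line = [
    [1, 1, 0, 0, 1, 0, 1, 1],
    [1, 0, 0, 0, 1, 0, 0, 1],
]
projection = vertical_projection(line)
segments = coarse_segments(line)
```

Touching or overlapping characters leave no empty column between them, which is why a projection-only cut is usually treated as a first pass that a finer stage must refine.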

7.
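Wang Jianping, Lin Fei, Chen Jun. 《计算机工程》 (Computer Engineering), 2007, 33(10): 230-232, 248
This paper proposes a method for extracting the stroke width of handwritten Chinese characters and for normalizing the characters based on the extracted stroke width. It presents the idea of stroke reconstruction for handwritten Chinese characters, implements an algorithm that reconstructs characters from extracted strokes and ultimately recognizes them, and builds a handwritten Chinese character recognition system. Experiments confirm that the method preserves the original stroke feature information and recognizes handwritten Chinese characters effectively.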

8.
In this paper, we describe a system for rapid verification of unconstrained off-line handwritten phrases using perceptual holistic features of the handwritten phrase image. The system is used to verify handwritten street names automatically extracted from live US mail against recognition results of analytical classifiers. Presented with a binary image of a street name and an ASCII street name, holistic features (reference lines, large gaps and local contour extrema) of the street name hypothesis are “predicted” from the expected features of the constituent characters using heuristic rules. A dynamic programming algorithm is used to match the predicted features with the extracted image features. Classes of holistic features are matched sequentially in increasing order of cost, allowing an ACCEPT/REJECT decision to be arrived at in a time-efficient manner. The system rejects errors with 98 percent accuracy at the 30 percent accept level, while consuming approximately 20 msec per image on average on a 150 MHz SPARC 10.

9.
Chinese characters are constructed by strokes according to structural rules. Therefore, the geometric configurations of characters are important features for character recognition. In handwritten characters, stroke shapes and their spatial relations may vary to some extent. The attribute value of a structural identification is then a fuzzy quantity rather than a binary quantity. Recognizing these facts, we propose a fuzzy attribute representation (FAR) to describe the structural features of handwritten Chinese characters for an on-line Chinese character recognition (OLCCR) system. With a FAR, a fuzzy attribute graph for each handwritten character is created, and the character recognition process is thus transformed into a simple graph matching problem. This character representation and our proposed recognition method allow us to relax the constraints on stroke order and stroke connection. The graph model provides a generalized character representation that can easily incorporate newly added characters into an OLCCR system with an automatic learning capability. The fuzzy representation can describe the degree of structural deformation in handwritten characters. The character matching algorithm is designed to tolerate structural deformations to some extent. Therefore, even input characters with deformations can be recognized correctly once the reference dictionary of the recognition system has been trained using a few representative learning samples. Experimental results are provided to show the effectiveness of the proposed method.

10.
We propose support vector machine (SVM) based hierarchical classification schemes for recognition of handwritten Bangla characters. A comparative study is made among multilayer perceptron, radial basis function network and SVM classifiers for this 45-class recognition problem. The SVM classifier is found to outperform the other classifiers. A fusion scheme using the three classifiers is proposed, which is marginally better than the SVM classifier. It is observed that there are groups of characters having similar shapes. These groups are determined in two different ways on the basis of the confusion matrix obtained from the SVM classifier. In the former, the groups are disjoint, while in the latter they overlap. Another grouping scheme is proposed based on the confusion matrix obtained from the neural gas algorithm; its groups are disjoint. Three different two-stage hierarchical learning architectures (HLAs) are proposed using the three grouping schemes. An unknown character image is classified into a group in the first stage. The second stage recognizes the class within this group. The HLA schemes are found to perform better than single-stage classification schemes. The HLA scheme with overlapped groups outperforms the other two HLA schemes.
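The abstract does not state how the groups are derived from the confusion matrix. One possible rule for the disjoint case, shown purely as an assumption, links two classes whenever their mutual confusion count reaches a threshold and takes the connected components as groups. The function name, threshold and 4-class matrix below are hypothetical.

```python
def confusion_groups(confusion, threshold):
    """Partition class indices into disjoint groups of mutually confused classes.

    Classes i and j are linked when confusion[i][j] + confusion[j][i] is at
    least `threshold`; groups are the connected components of those links.
    """
    n = len(confusion)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if confusion[i][j] + confusion[j][i] >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical confusion matrix: rows are true classes, columns predictions.
confusion = [
    [50, 8, 0, 1],
    [6, 45, 1, 0],
    [0, 1, 48, 9],
    [1, 0, 7, 52],
]
groups = confusion_groups(confusion, threshold=10)
```

In a two-stage scheme of the kind described, a first classifier would predict the group and a per-group classifier would then choose among its members.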

11.
A handwritten Chinese character recognition method based on primitive and compound fuzzy features using the SEART neural network model is proposed. The primitive features are extracted in local and global view. Since handwritten Chinese characters vary a great deal, the fuzzy concept is used to extract the compound features in structural view. We combine the two categories of features and use a fast classifier, called the Supervised Extended ART (SEART) neural network model, to recognize handwritten Chinese characters. The SEART classifier has excellent performance, is fast, and has good generalization and exception handling abilities in complex problems. Using the fuzzy set theory in feature extraction and the neural network model as a classifier is helpful for reducing distortions, noise and variations. In spite of the poor thinning, a 90.24% recognition rate on average for the 605 test character categories was obtained. The database used is CCL/HCCR3 (provided by CCL, ITRI, Taiwan). The experiment not only confirms the feasibility of the proposed system, but also suggests that applying the fuzzy set theory and neural networks to recognition of handwritten Chinese characters is an efficient and promising approach.

12.
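Handwritten text image completion is an important branch of image completion research. Its difficulty lies in completing the structural relationships of characters written in unconstrained styles. To simulate complex and difficult real-world application scenarios, and inspired by prior work on image completion, this paper performs handwritten pictographic character image completion under complex conditions such as a large number of classes, few samples, multiple styles and unknown languages. It adopts a generative adversarial network that preserves global and local consistency (GLC-GAN). When completing handwritten text images with many classes and styles, the completed region is often blurry because there are many plausible completion candidates. A two-stage completion system is therefore proposed: the first-stage coarse completion module addresses the integrity of the character structure, and the second-stage fine completion module sharpens and refines the characters. Experiments on the large-class handwritten Chinese character database CASIA-HWDB1.1 verify the effectiveness of the two-stage system, and the completion results are analysed for different writing styles and different missing regions.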

13.
The main problem in handwritten character recognition (HCR) systems is to describe each character by a set of features that can distinguish it from the other characters. In this paper, we propose a robust set of features extracted from isolated Amazigh characters by decomposing the character image into zones and calculating the density and the total length of the histogram projection in each zone. In the experimental evaluation, we test the proposed set of features with different classification algorithms on a large database of handwritten Amazigh characters. The recognition rates obtained reach 99.03%, which we consider good and satisfactory compared to other approaches, and they show that the proposed set of features is useful for describing Amazigh characters.
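The zoning idea in this abstract can be illustrated with a minimal sketch that computes only the ink density per zone. The function name, the assumption that the image size is an exact multiple of the zone grid, and the 4x4 sample are all hypothetical; the histogram projection length feature is not shown.

```python
def zone_densities(image, rows, cols):
    """Fraction of ink pixels in each zone of a rows x cols grid, row-major.

    Assumes the image height and width are exact multiples of rows and cols.
    """
    height, width = len(image), len(image[0])
    zone_h, zone_w = height // rows, width // cols
    densities = []
    for zr in range(rows):
        for zc in range(cols):
            ink = sum(
                image[r][c]
                for r in range(zr * zone_h, (zr + 1) * zone_h)
                for c in range(zc * zone_w, (zc + 1) * zone_w)
            )
            densities.append(ink / (zone_h * zone_w))
    return densities


# Hypothetical 4x4 binary character image split into a 2x2 grid of zones.
character = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
densities = zone_densities(character, rows=2, cols=2)
```

The resulting list is a fixed-length feature vector, so it can be passed to any of the classification algorithms the abstract alludes to.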

14.
The aim of our work is to present a new method based on structural characteristics and a fuzzy classifier for off-line recognition of handwritten Arabic characters in all their forms (beginning, end, middle and isolated). The proposed method can be integrated into any handwritten Arabic word recognition system based on an explicit segmentation process. First, three preprocessing operations are applied to character images: thinning, contour tracing and connected components detection. These operations extract structural characteristics used to divide the set of characters into five subsets. Next, features are extracted using invariant pseudo-Zernike moments. Classification is done using the Fuzzy ARTMAP neural network, which is very fast in training and supports incremental learning. Five Fuzzy ARTMAP neural networks are employed, each designed to recognize one subset of characters. The recognition process is achieved in two steps: in the first, a clustering method assigns characters to one of the five character subsets. In the second, the pseudo-Zernike features are used by the appropriate Fuzzy ARTMAP classifier to identify the character. Training and tests were performed on a set of character images manually extracted from the IFN/ENIT database. A high recognition rate was reported.

15.
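Based on the characteristics of the wavelet packet transform, this paper proposes a wavelet-packet-based recognition algorithm for handwritten financial Chinese characters. The algorithm first applies a two-dimensional wavelet packet decomposition to the character image and selects a suitable partial decomposition tree using a criterion based on the energy variance of the sub-images. The resulting sub-images are then divided into multiple local windows, and the energy values of the local windows form the feature vector. Principal component analysis (PCA) then selects the group of features with the strongest discriminative power, reducing the dimensionality of the feature space. Finally, the feature vector is fed to a support vector machine for classification. Experimental results show that the algorithm achieves good recognition performance.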

16.
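Drawing on the cognitive viewpoint of biomimetic pattern recognition, and starting from the construction mechanism of Chinese characters and the way humans habitually recognize them, this paper proposes a wavelet-transform-based method for recognizing Chinese characters in images. Specific rules are defined for extracting the stroke features of Chinese characters in images. The wavelet transform is used to detect character edges and stroke contours, and stroke-segment information is extracted and merged to produce characters or their basic strokes. Simulation results show that the method improves the accuracy and stability of stroke feature extraction and achieves a very high recognition rate for printed characters and neatly written handwritten characters in images.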

17.
An off-line handwritten word recognition system is described. Images of handwritten words are matched to lexicons of candidate strings. A word image is segmented into primitives. The best match between sequences of unions of primitives and a lexicon string is found using dynamic programming. Neural networks assign match scores between characters and segments. Two distinctive features are that neural networks assign confidence that pairs of segments are compatible with character confidence assignments, and that this confidence is integrated into the dynamic programming. Experimental results are provided on data from the U.S. Postal Service.
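The dynamic programming match between unions of primitives and a lexicon string can be sketched as below. This is an assumed simplification: the neural network match scores are replaced by a hypothetical lookup table, the pairwise compatibility confidence is omitted, and the names `best_match`, `max_union` and `toy_score` are illustrative only.

```python
def best_match(num_primitives, word, score, max_union=3):
    """Best total score for mapping consecutive groups of primitives onto
    the characters of `word`, or None if no such mapping exists.

    `score(start, end, char)` rates primitives[start:end] read as `char`.
    Each character consumes between 1 and `max_union` primitives.
    """
    neg = float("-inf")
    # best[i][j]: best score using the first i primitives for the first j chars.
    best = [[neg] * (len(word) + 1) for _ in range(num_primitives + 1)]
    best[0][0] = 0.0
    for j, char in enumerate(word, start=1):
        for i in range(1, num_primitives + 1):
            for k in range(1, min(max_union, i) + 1):
                previous = best[i - k][j - 1]
                if previous != neg:
                    candidate = previous + score(i - k, i, char)
                    best[i][j] = max(best[i][j], candidate)
    result = best[num_primitives][len(word)]
    return None if result == neg else result


# Hypothetical match scores standing in for neural network outputs.
table = {(0, 1, "a"): 0.9, (1, 3, "b"): 0.8, (0, 2, "a"): 0.4, (2, 3, "b"): 0.3}


def toy_score(start, end, char):
    return table.get((start, end, char), 0.0)


total = best_match(3, "ab", toy_score)
```

Running the same routine for every string in the lexicon and keeping the highest total gives a ranked list of word hypotheses.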

18.
19.
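Single-character font discrimination based on manifold learning   Total citations: 1, self-citations: 1, citations by others: 0
Script identification and font discrimination have become new research hotspots at home and abroad following printed character recognition. Discrimination between handwritten and printed single characters has received little study, yet it is very commonly needed in forms. For the font discrimination problem, the manifold learning algorithm locally linear embedding (LLE) is introduced, under the assumption that the data lie on a low-dimensional manifold embedded in a high-dimensional space. An LLE generalization method for single-character font discrimination is proposed, together with parameter estimation methods for the neighbourhood size and the intrinsic dimension. Experiments on discriminating printed from handwritten Chinese characters and digits show that the approach outperforms direct support vector machine (SVM) classification, and that classifying the LLE-reduced data directly with linear discriminant analysis (LDA) achieves accuracy close to or even higher than SVM classification after LLE, with faster classification.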

20.
This paper presents an effective automated analysis system for mixed documents consisting of handwritten texts and graphic images. In the preprocessing step, an input image is binarized, then graphic regions are separated from text parts using chain codes of connected components. In the character recognition step, we recognize two different sets of handwritten characters: Korean and alphanumeric characters. Considering the structural complexity and variations of Korean characters, we separate them based on partial recognition results of vowels and extract primitive phonemes using a branch and bound algorithm based on dynamic programming (DP) matching. Finally, to validate recognition results, a dictionary and knowledge are employed. Computer simulation with 50 test documents shows that the proposed algorithm effectively analyzes mixed documents.
