Similar literature
1.
2.
This paper compares the current state of the art in online Japanese character recognition with techniques in Western handwriting recognition. It discusses important recent developments in preprocessing, classification, and postprocessing for Japanese character recognition and relates them to developments in Western handwriting recognition. Comparing Eastern and Western techniques makes it possible to learn from very different approaches and to understand the common foundations of handwriting recognition. This matters when developing compact modules for integrated systems that support many writing systems and can recognize multilanguage documents. Received: January 12, 2002; Accepted: March 6, 2003; Published online: 4 July 2003

3.
4.
5.
Retrieving information from scanned handwritten documents is becoming vital as the number of digitized documents grows rapidly, and word spotting systems have been developed to search for words within documents. These systems are either template matching algorithms or learning based. This paper presents a coherent learning-based Arabic handwritten word spotting system that adapts to the nature of Arabic handwriting, in which the boundaries between words may be unclear. The system therefore recognizes Pieces of Arabic Words (PAWs), then reconstructs and spots words using language models. The proposed system produced promising results for Arabic handwritten word spotting when tested on the CENPARMI Arabic documents database.

6.
Printed Arabic character recognition using HMM
The Arabic language has a very rich vocabulary. More than 200 million people speak it as their native language, and over 1 billion people use it in several religion-related activities. This paper presents a new technique for recognizing printed Arabic characters. After a word is segmented, each character/word is entirely transformed into a feature vector. The features of printed Arabic characters include strokes and bays in various directions, endpoints, intersection points, loops, dots, and zigzags. The word skeleton is decomposed into a number of links in orthographic order and then converted into a sequence of symbols using vector quantization. A single hidden Markov model is used to recognize the printed Arabic characters. Experimental results show that a high recognition rate depends on the number of states in each sample.

7.
8.
9.
Off-line recognition of realistic Chinese handwriting faces great challenges. This paper presents a segmentation-free strategy based on the hidden Markov model (HMM) to handle this problem, in which no character segmentation stage precedes recognition. Handwritten text lines are first converted to observation sequences by sliding windows. An embedded Baum-Welch algorithm is then used to train the character HMMs. Finally, the character string that maximizes the a posteriori probability is found with the Viterbi algorithm. Experiments are conducted on the HIT-MW database, written by more than 780 writers. The results show the feasibility of such systems and reveal that segmentation-free and segmentation-based systems have clearly complementary capacities.
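
The final decoding step named in the abstract above can be illustrated with a minimal Viterbi sketch. It assumes a small discrete HMM with hand-written probabilities; the function name, the table layout, and the toy numbers are hypothetical, and the sliding-window feature extraction and embedded Baum-Welch training from the paper are not shown.

```python
import math


def viterbi(obs, start_p, trans_p, emit_p):
    """Return the most likely state path for a discrete HMM (log-space Viterbi).

    obs     : list of observation symbol indices
    start_p : start_p[s]    = P(first state is s)
    trans_p : trans_p[r][s] = P(next state is s | current state is r)
    emit_p  : emit_p[s][o]  = P(symbol o | state s)
    """
    if not obs:
        return []

    def log(x):
        # Zero probabilities map to -inf so they can never win a max().
        return math.log(x) if x > 0 else float("-inf")

    n_states = len(start_p)
    score = [log(start_p[s]) + log(emit_p[s][obs[0]]) for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1

    for o in obs[1:]:
        prev = score
        pointers, score = [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log(trans_p[r][s]))
            pointers.append(best)
            score.append(prev[best] + log(trans_p[best][s]) + log(emit_p[s][o]))
        back.append(pointers)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Toy model (assumed values): two states that mostly emit their own symbol.
START = [0.5, 0.5]
TRANS = [[0.9, 0.1], [0.1, 0.9]]
EMIT = [[0.9, 0.1], [0.1, 0.9]]
decoded = viterbi([0, 0, 1, 1], START, TRANS, EMIT)
```

In a segmentation-free recognizer the states would belong to concatenated character models, so the decoded path yields both the character string and its implicit boundaries.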

10.
Given the number and variety of methods used for handwriting recognition, it has been shown that there is no single method that can be called the "best". In recent years, the combination of different classifiers and the use of contextual information have become major areas of interest in improving recognition results. This paper addresses a case study on the combination of multiple classifiers and the integration of syntactic level information for the recognition of handwritten Arabic literal amounts. To the best of our knowledge, this is the first time either of these methods has been applied to Arabic word recognition. Using three individual classifiers with high level global features, we performed word recognition experiments. A parallel combination method was tested for all possible configuration cases of the three chosen classifiers. A syntactic analyzer makes a final decision on the candidate words generated by the best configuration scheme. The effectiveness of contextual knowledge integration in our application is confirmed by the obtained results.

11.
黄弋石, 梁艳, 陆峥嵘. 《软件》 (Software), 2013, 34(5): 67-70, 90
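To address online handwritten Chinese character recognition on mobile phones, we present a complete solution. We first define a set of six groups of basic definitions, which form a specialized but very simple method of describing shapes. The strokes of commonly used characters in regular script (kaishu) are counted, summarized, and classified to obtain a finite set of component strokes. Using the basic definitions, each standalone regular-script stroke is given a description, and these descriptions are mutually distinct. This avoids the complex methods of traditional two-dimensional graphics. Exhaustive verification shows that the decompositions of almost all commonly used characters differ from one another, so the method can be judged logically valid. We also provide and publish a partial solution for running script (xingshu) and cursive script (caoshu). Together these form a system for recognizing connected-stroke Chinese handwriting on mobile phones.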

12.
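Applying character recognition to library systems can effectively raise the degree of digitization and automation in libraries. We designed a method that segments and recognizes label characters in images containing multiple books, so that several books can be recognized in a single operation. First, individual books are segmented using statistics computed after connecting image edges. Next, character regions are extracted based on RGB color features, and characters are segmented by contour. Finally, a BP (backpropagation) network is designed to recognize the characters. The processing pipeline was simulated in MATLAB, and the results show that the algorithm can, for the most part, recognize book labels and has good practical value and application prospects.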

13.
张堃, 张习文. 《计算机应用研究》 (Application Research of Computers), 2008, 25(11): 3486-3489
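When recognizing vector handwriting text, different types of single characters require different recognizers, so determining the detailed category is a prerequisite for single-character recognition. Single characters in real Chinese vector handwriting text are classified in detail as Chinese characters, punctuation, digits, letters, and words. Intrinsic and relative features (the latter including nearest-neighbor and same-line features) are proposed, and four classifiers are used: decision tree, logistic model tree, Bayesian network, and support vector machine. The performance of the various features and classifiers is tested and compared on a large amount of real data. Experiments show that the combined features of neighboring characters have good discriminative power, and that the support vector machine classifies all types of single characters well.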

14.
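Objective: Some characters on nameplates, such as those of instruments and elevators, are closely spaced, so traditional segmentation methods segment them inaccurately and the character recognition rate is low. To address this, an adaptive localization, segmentation, reconstruction, and recognition algorithm for touching characters on nameplates is proposed. Method: First, the nameplate image is preprocessed with median filtering, binarization, and similar steps. Next, mathematical morphology (opening and erosion) is applied to the preprocessed image to remove useless information between characters and enlarge the character spacing. The geometric center of each character is then found with a centroid algorithm, and the bounding box of each character is obtained from the geometric center using the Sobel edge detection operator, establishing an ROI (region of interest). The characters are then segmented from the original nameplate image using the established ROIs. In accordance with the relevant national standard on character spacing, a rectangular spacer a certain number of pixels wide is added after each segmented character to rebuild the character image, which is then passed to OCR (optical character recognition). Results: In character recognition experiments on 993 nameplates, the algorithm achieved a recognition rate of 95.7%. Conclusion: The experimental results show that the proposed algorithm is an effective method for nameplate character recognition.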

15.
The purpose of this study is to investigate a new representation of shape and its use in handwritten online character recognition by a Kohonen associative memory. This representation is based on the empirical distribution of features such as tangents and tangent differences at regularly spaced points along the character signal. Recognition is carried out by a Kohonen neural network trained using the representation. In addition to the Euclidean distance traditionally used in the Kohonen training algorithm to measure the similarities among feature vectors, we also investigate the Kullback–Leibler divergence and the Hellinger distance, functions that measure distance between distributions. Furthermore, we perform operations (pruning and filtering) on the trained memory to improve its classification potency. We report on extensive experiments using a database of online Arabic characters produced without constraints by a large number of writers. Comparative results show the pertinence of the representation and the superior performance of the scheme.
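
The two distribution measures named in the abstract above have standard definitions, sketched below for discrete histograms. The function names and the small `eps` smoothing term are assumptions for illustration; how the paper bins tangent features or plugs these measures into Kohonen training is not shown.

```python
import math


def normalize(hist):
    """Scale non-negative bin counts so that they sum to 1."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must have positive mass")
    return [h / total for h in hist]


def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) for discrete distributions.

    eps is an assumed smoothing term that avoids division by zero when
    a bin of q is empty; bins where p is zero contribute nothing.
    """
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)


def hellinger(p, q):
    """Hellinger distance between discrete distributions, in [0, 1]."""
    return math.sqrt(0.5 * sum((math.sqrt(pi) - math.sqrt(qi)) ** 2
                               for pi, qi in zip(p, q)))
```

Unlike the Hellinger distance, the Kullback-Leibler divergence is not symmetric, so the order of the two arguments matters when it replaces the Euclidean distance in a nearest-prototype search.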

16.
Segmentation of cursive words into letters has been one of the major problems in handwriting recognition. We introduce a new segmentation algorithm, guided in part by the global characteristics of the handwriting. We find the successive segmentation points by evaluating a cost function at each point along the baseline. The cost of segmenting at a point is a weighted sum of four feature values at that point. The weights of the features are determined using linear programming. In our tests with 750 words written by 10 writers, 97% of the letter boundaries were correctly located.
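
The cost evaluation described in the abstract above can be sketched as follows. The abstract does not name the four features or say how a cut is picked from the costs, so the search window (`min_gap`, `max_gap`) and the lowest-cost selection rule are assumptions, and the weights are taken as given rather than derived by linear programming.

```python
def segmentation_cost(features, weights):
    """Weighted sum of the feature values at one point along the baseline."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(w * f for w, f in zip(weights, features))


def next_cut(feature_rows, weights, start, min_gap, max_gap):
    """Pick the lowest-cost point in the window [start + min_gap, start + max_gap].

    feature_rows[i] holds the feature values at baseline position i.
    Returns None when the window falls outside the word.
    (The window and the argmin rule are assumptions, not the paper's rule.)
    """
    lo = start + min_gap
    hi = min(start + max_gap, len(feature_rows) - 1)
    if lo > hi:
        return None
    return min(range(lo, hi + 1),
               key=lambda i: segmentation_cost(feature_rows[i], weights))


def segment(feature_rows, weights, min_gap, max_gap):
    """Collect successive cut positions from left to right."""
    cuts, pos = [], 0
    while True:
        cut = next_cut(feature_rows, weights, pos, min_gap, max_gap)
        if cut is None:
            return cuts
        cuts.append(cut)
        pos = cut
```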

17.
刘阳兴. 《计算机应用研究》 (Application Research of Computers), 2011, 28(10): 3998-4000
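To address the shortcomings of existing algorithms for segmenting touching and overlapping characters, a character segmentation algorithm based on polyline cut paths is proposed. The algorithm uses the projection method to separate touching and overlapping characters from the other characters. Then, drawing on the shape features peculiar to touching and overlapping characters, a path search algorithm with penalty weights quickly and accurately finds the polyline cut paths between them. To keep characters from being wrongly fragmented during this process, recognition feedback is used to merge some character sub-images. Experimental results show that the algorithm adapts well to printed mixed Japanese-English text and achieves satisfactory segmentation.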

18.
This paper presents an integrated offline recognition system for unconstrained handwriting. The proposed system consists of seven main modules: skew angle estimation and correction, printed-handwritten text discrimination, line segmentation, slant removal, word segmentation, and character segmentation and recognition. The modules implement both existing and novel algorithms. The system has been tested on the NIST, IAM-DB, and GRUHD databases and has achieved accuracy that varies from 65.6% to 100% depending on the database and the experiment.

19.
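In the training stage of online Uyghur handwriting recognition, words are segmented into letters, and feature extraction and clustering produce feature vectors that serve as the model input. Character-level hidden Markov models (HMMs) are constructed and embedded into the recognition dictionary network. An HMM-based classifier then produces the final recognition result. For the first time, the approach of removing delayed strokes and building dictionaries with and without delayed strokes is applied to Uyghur handwriting recognition, and it achieves a relatively high recognition rate.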

20.
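To read segment-type LCD digits automatically, a recognition algorithm based on character cutting and splicing is proposed. The peaks of the characters' vertical projection are located, a cutting line is determined between each pair of peaks, and the characters are cut. The resulting blocks are then spliced using projection features to achieve accurate character segmentation. Experiments show that the method solves the problem that traditional algorithms have in accurately segmenting segment-type characters, and that it improves the recognition rate and accuracy.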
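
The cutting step in the abstract above (vertical projection, peaks, a cutting line between neighboring peaks) can be sketched as follows. It assumes a binary image stored as a list of rows with 1 for foreground pixels; the peak definition and the choice of the lowest column between two peaks as the cutting line are assumptions, and the splicing step is not shown.

```python
def vertical_projection(image):
    """Count foreground pixels in each column of a binary image (list of rows)."""
    return [sum(col) for col in zip(*image)]


def find_peaks(profile):
    """Indices that rise above the left neighbor and do not fall below the right.

    Columns outside the image count as 0, and a flat plateau yields a
    single peak at its left edge. (This peak rule is an assumption.)
    """
    peaks = []
    for i, v in enumerate(profile):
        left = profile[i - 1] if i > 0 else 0
        right = profile[i + 1] if i < len(profile) - 1 else 0
        if v > left and v >= right:
            peaks.append(i)
    return peaks


def cut_columns(profile):
    """Place one cutting line at the lowest column between each pair of peaks."""
    peaks = find_peaks(profile)
    cuts = []
    for a, b in zip(peaks, peaks[1:]):
        # Adjacent peaks are always at least two columns apart,
        # so the range below is never empty.
        cuts.append(min(range(a + 1, b), key=lambda i: profile[i]))
    return cuts


# Two blobs separated by an empty column (toy data).
IMAGE = [[0, 1, 1, 0, 1, 1, 0]] * 3
PROFILE = vertical_projection(IMAGE)
```

Cutting at every valley tends to over-segment digits whose strokes are separated by gaps, which is why the abstract follows the cut with a splicing step.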
