Found 20 similar documents (search time: 15 ms)
1.
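In the training stage of online Uyghur handwriting recognition, words are segmented into letters, and feature extraction and clustering produce the feature vectors used as model input. Character-level hidden Markov models (HMMs) are constructed and embedded in a recognition dictionary network. An HMM-based classifier then produces the final recognition result. This work is the first to apply delayed-stroke removal, together with separate dictionaries built with and without delayed strokes, to Uyghur handwriting recognition, and it reports a high recognition rate.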
2.
This paper presents a handwriting recognition system that deals with unconstrained handwriting and large vocabularies. The system is based on the segmentation-recognition paradigm, in which words are first loosely segmented into characters or pseudocharacters and the final segmentation is obtained during the recognition process, which is carried out with a lexicon. Characters are modeled by multiple hidden Markov models (HMMs), which are concatenated to build up word models. The lexicon is organized as a tree structure, so that during decoding, words with similar prefixes share the same computation steps. To avoid an explosion of the search space due to the presence of multiple character models, a lexicon-driven level building algorithm (LDLBA) is used to decode the lexical tree and to choose at each level the more likely models. Bigram probabilities related to the variation of writing styles within the words are inserted between the levels of the LDLBA to improve the recognition accuracy. To further speed up the recognition process, some constraints are added to limit the search effort to the more likely parts of the search space. Experimental results on a dataset of 4674 unconstrained words show that the proposed recognition system achieves recognition rates from 98% for a 10-word vocabulary to 71% for a 30,000-word vocabulary and recognition times from 9 ms to 18.4 s, respectively.
Received: 8 July 2002, Accepted: 1 July 2003, Published online: 12 September 2003
Correspondence to: Alessandro L. Koerich
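The prefix sharing described in this abstract can be illustrated with a small prefix tree. This is only a minimal sketch, not the paper's LDLBA decoder: the class `LexiconNode`, the helper functions, and the four-word toy lexicon are all hypothetical, and "cost" here simply counts character models to evaluate.

```python
from typing import Dict, List


class LexiconNode:
    """One node of a prefix tree; each edge is labelled with a character."""

    def __init__(self) -> None:
        self.children: Dict[str, "LexiconNode"] = {}
        self.is_word = False


def build_lexicon_tree(words: List[str]) -> LexiconNode:
    """Insert every word so that common prefixes share the same nodes."""
    root = LexiconNode()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, LexiconNode())
        node.is_word = True
    return root


def count_nodes(node: LexiconNode) -> int:
    """Count the character nodes below `node` (the root itself is not counted)."""
    return sum(1 + count_nodes(child) for child in node.children.values())


# Toy lexicon (an assumption for illustration only).
lexicon = ["car", "card", "care", "cat"]
tree = build_lexicon_tree(lexicon)

# Flat lexicon: every character of every word is scored separately.
flat_cost = sum(len(word) for word in lexicon)
# Tree lexicon: each shared prefix character is scored once.
shared_cost = count_nodes(tree)
print(flat_cost, shared_cost)
```

In this toy case the flat lexicon needs 14 character evaluations and the tree needs 6, which is the kind of saving a tree-organized lexicon is meant to provide.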
3.
Claus Bahlmann 《Pattern recognition》2006,39(1):115-125
The selection of valuable features is crucial in pattern recognition. In this paper we deal with the issue that some features originate from directional rather than common linear data. Theories of statistical modeling exist for both directional and linear data. However, none of these theories gives an integrated solution to problems where linear and directional variables are to be combined in a single, multivariate probability density function. We describe a general approach for a unified statistical modeling, given the constraint that the variances of the circular variables are small. The method is practically evaluated in the context of our online handwriting recognition system frog on hand and the so-called tangent slope angle feature. Recognition results are compared with two alternative modeling approaches. The proposed solution gives significant improvements in recognition accuracy, computational speed and memory requirements.
4.
This paper compares the current state of the art in online Japanese character recognition with techniques in western handwriting recognition. It discusses important developments in preprocessing, classification, and postprocessing for Japanese character recognition in recent years and relates them to the developments in western handwriting recognition. Comparing eastern and western handwriting recognition techniques allows learning from very different approaches and understanding the underlying common foundations of handwriting recognition. This is very important when it comes to developing compact modules for integrated systems supporting many writing systems capable of recognizing multilanguage documents.
Received: January 12, 2002, Accepted: March 6, 2003, Published online: 4 July 2003
5.
Christian Viard-Gaudin Pierre-Michel Lallican Stefan Knerr 《Pattern recognition letters》2005,26(16):2537-2548
This paper analyses a handwriting recognition system for offline cursive words based on HMMs. It compares two approaches for transforming offline handwriting available as two-dimensional images into one-dimensional input signals that can be processed by HMMs. In the first approach, a left-right scan of the word is performed, resulting in a sequence of feature vectors. In the second approach, a more subtle process attempts to recover the temporal order of the strokes that form words as they were written. This is accomplished by a graph model that generates a set of paths, each path being a possible temporal order of the handwriting. The recognition process then selects the most likely temporal stroke order based on knowledge that has been acquired from a large set of handwriting samples for which the temporal information was available. We show experimentally that such an offline recognition system using the recovered temporal order can achieve recognition performances that are much better than those obtained with the simple left-right order, and that come close to those of an online recognition system. We have been able to assess the ordering quality of handwriting by comparing the true ordering with the recovered one, and we also analyze the situations where offline and online information differ and what the consequences are for recognition performance. For these evaluations, we have used about 30,000 words from the IRONOFF database, which features both the online signal and the offline signal for each word.
6.
We propose a general framework to combine multiple sequence classifiers working on different sequence representations of a given input. This framework, based on Multi-Stream Hidden Markov Models (MS-HMMs), allows the combination of multiple HMMs operating on partially asynchronous information streams. This combination may operate at different levels of modeling: from the feature level to the post-processing level. This framework is applied to on-line handwriting word recognition by combining temporal and spatial representation of the signal. Different combination schemes are compared experimentally on isolated character recognition and word recognition tasks, using the UNIPEN international database.
Received: 16 August 2002, Accepted: 21 November 2002, Published online: 6 June 2003
7.
Giovanni Seni John Seybold 《International Journal on Document Analysis and Recognition》1999,2(1):24-29
Out-of-order diacriticals introduce significant complexity to the design of an online handwriting recognizer, because they require some reordering of the time domain information. It is common in cursive writing to write the body of an `i' or `t' during the writing of the word, and then to return and dot or cross the letter once the word is complete. The difficulty arises because we have to look ahead, when scoring one of these letters, to find the mark occurring later in the writing stream that completes the letter. We should also remember that we have used this mark, so that we don't use it again for a different letter, and we should also penalize a word if there are some marks that look like diacriticals that are not used. One approach to this problem is to scan the writing some distance into the future to identify candidate diacriticals, remove them in a preprocessing step, and associate them with the matching letters earlier in the word. If done as a preliminary operation, this approach is error-prone: marks that are not diacriticals may be incorrectly identified and removed, and true diacriticals may be skipped. This paper describes a novel extension to a forward search algorithm that provides a natural mechanism for considering alternative treatments of potential diacriticals, to see whether it is better to treat a given mark as a diacritical or not, and directly compare the two outcomes by score.
Received October 30, 1998 / Revised January 25, 1999
8.
C.V. Jawahar A. Balasubramanian Anoop M. Namboodiri 《Pattern recognition》2009,42(7):1445-1457
Search and retrieval is gaining importance in the ink domain due to the increase in the availability of online handwritten data. However, the problem is challenging due to variations in handwriting between various writers, digitizers and writing conditions. In this paper, we propose a retrieval mechanism for online handwriting, which can handle different writing styles, specifically for Indian languages. The proposed approach provides a keyboard-based search interface that enables searching handwritten data from any platform, in addition to pen-based and example-based queries. One of the major advantages of this framework is that information retrieval techniques such as ranking relevance, detecting stopwords and controlling word forms can be extended to work with search and retrieval in the ink domain. The framework also allows cross-lingual document retrieval across Indian languages.
9.
Tong-Hua Su Tian-Wen Zhang Hu-Jie Huang 《Pattern recognition》2009,42(1):167-182
The off-line recognition of realistic Chinese handwriting faces great challenges. This paper presents a segmentation-free strategy based on the Hidden Markov Model (HMM) to handle this problem, in which the character segmentation stage prior to recognition is avoided. Handwritten text lines are first converted to observation sequences by sliding windows. Then the embedded Baum-Welch algorithm is adopted to train character HMMs. Finally, the character string that maximizes the a posteriori probability is located through the Viterbi algorithm. Experiments are conducted on the HIT-MW database, written by more than 780 writers. The results show the feasibility of such systems and reveal apparent complementary capacities between the segmentation-free systems and the segmentation-based ones.
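The Viterbi decoding step mentioned in this abstract can be sketched generically as follows. This is a textbook formulation under stated assumptions, not the paper's system: the function name `viterbi`, the two-state toy model, and all probabilities are hypothetical, and the per-frame emission scores stand in for whatever a sliding-window feature extractor would produce.

```python
import math
from typing import List, Sequence


def viterbi(
    log_init: Sequence[float],
    log_trans: Sequence[Sequence[float]],
    log_emit: Sequence[Sequence[float]],
) -> List[int]:
    """Return the most likely state path.

    log_init[s]     : log probability of starting in state s
    log_trans[p][s] : log probability of moving from state p to state s
    log_emit[t][s]  : log likelihood of frame t under state s
    """
    n_states = len(log_init)
    score = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    backpointers = []
    for t in range(1, len(log_emit)):
        new_score, pointers = [], []
        for s in range(n_states):
            # Best predecessor of state s at frame t.
            best_prev = max(range(n_states), key=lambda p: score[p] + log_trans[p][s])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + log_trans[best_prev][s] + log_emit[t][s])
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def logs(row: Sequence[float]) -> List[float]:
    """Convert a row of probabilities to log probabilities."""
    return [math.log(p) for p in row]


# Toy two-state model with three observation frames (all values are assumptions).
init = logs([0.6, 0.4])
trans = [logs([0.7, 0.3]), logs([0.4, 0.6])]
emit = [logs([0.5, 0.1]), logs([0.4, 0.3]), logs([0.1, 0.6])]
path = viterbi(init, trans, emit)
print(path)  # [0, 0, 1]
```

Working in log probabilities avoids numerical underflow on long observation sequences, which matters when a whole text line is decoded at once.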
10.
John F. Pitrelli Jayashree Subrahmonia Michael P. Perrone 《International Journal on Document Analysis and Recognition》2006,8(1):35-46
Confidence scoring can assist in determining how to use imperfect handwriting-recognition output. We explore a confidence-scoring framework for post-processing recognition for two purposes: deciding when to reject the recognizer's output, and detecting when to change recognition parameters, e.g., to relax a word-set constraint. Varied confidence scores, including likelihood ratios and posterior probabilities, are applied to a hidden Markov model (HMM)-based on-line recognizer. Receiver-operating characteristic curves reveal that we successfully reject 90% of word recognition errors while rejecting only 33% of correctly-recognized words. For isolated digit recognition, we achieve 90% correct rejection while limiting false rejection to 13%.
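The two quantities reported in this abstract (fraction of errors rejected versus fraction of correct outputs rejected) can be computed for one threshold as sketched below; sweeping the threshold traces out a receiver-operating characteristic curve. The function name `rejection_rates`, the confidence values, and the correctness labels are all hypothetical toy data, not the paper's scores or results.

```python
from typing import Sequence, Tuple


def rejection_rates(
    confidences: Sequence[float],
    is_correct: Sequence[bool],
    threshold: float,
) -> Tuple[float, float]:
    """Reject every output whose confidence is below `threshold`.

    Returns (fraction of recognition errors rejected,
             fraction of correct recognitions rejected).
    Assumes the data contains at least one error and one correct output.
    """
    n_errors = sum(1 for ok in is_correct if not ok)
    n_correct = len(is_correct) - n_errors
    if n_errors == 0 or n_correct == 0:
        raise ValueError("need at least one error and one correct output")
    rejected_errors = sum(
        1 for c, ok in zip(confidences, is_correct) if not ok and c < threshold
    )
    rejected_correct = sum(
        1 for c, ok in zip(confidences, is_correct) if ok and c < threshold
    )
    return rejected_errors / n_errors, rejected_correct / n_correct


# Toy recognizer outputs (assumed values for illustration only).
confidences = [0.95, 0.90, 0.40, 0.85, 0.30, 0.20, 0.70, 0.55]
is_correct = [True, True, True, True, False, False, False, True]

error_rejection, false_rejection = rejection_rates(confidences, is_correct, 0.5)
print(error_rejection, false_rejection)
```

With this toy data, a threshold of 0.5 rejects two of the three errors and one of the five correct outputs; raising the threshold rejects more errors at the cost of more false rejections.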
11.
Paulo R. Cavalin Robert Sabourin Ching Y. Suen Alceu S. Britto Jr. 《Pattern recognition》2009,42(12):3241-3253
We present an evaluation of incremental learning algorithms for the estimation of hidden Markov model (HMM) parameters. The main goal is to investigate incremental learning algorithms that perform as well as traditional batch learning techniques while retaining the advantages of incremental learning for designing complex pattern recognition systems. Experiments on handwritten characters have shown that a proposed variant of the ensemble training algorithm, employing ensembles of HMMs, can lead to very promising performance. Furthermore, the use of a validation dataset demonstrated that it is possible to reach better performance than that obtained by batch learning.
12.
M. Kobayashi S. Masaki O. Miyamoto Y. Nakagawa Y. Komiya T. Matsumoto 《International Journal on Document Analysis and Recognition》2001,3(3):181-191
A new algorithm RAV (reparameterized angle variations) is proposed which makes explicit use of trajectory information where the time evolution of the pen coordinates plays a crucial role. The algorithm is robust against stroke connections/abbreviations as well as shape distortions, while maintaining reasonable robustness against stroke-order variations. Preliminary experiments are reported on tests against the Kuchibue_d-96-02 database from the Tokyo University of Agriculture and Technology.
Received July 24, 2000 / Revised October 6, 2000
13.
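Face recognition is performed by combining support vector machines (SVMs) with hidden Markov models (HMMs). The face is first located in the photograph, and independent basis features of each facial organ are extracted from the located region; a hybrid SVM-HMM model then performs face recognition on that region. By exploiting the combined strengths of SVMs and HMMs, the method achieves a high recognition rate.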
14.
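This paper presents an online handwritten Uyghur whole-word recognition system based on a dual-engine recognition model that combines HMMs and GMMs. In the GMM part, the system extracts 8-directional features: it generates 8-directional feature pattern images, locates spatial sampling points, and extracts blurred directional features. After iterative refinement training of the model, a GMM model file is obtained. In the HMM part, the system uses a stroke-segment feature method to obtain the feature sequence of stroke-segment split points; after iterative refinement training, an HMM model file is obtained. The GMM and HMM model files are each packaged separately and then jointly packaged into a dictionary. The system's recognition rate reached 97% in the first phase of experiments and 99% in the second phase.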
15.
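Building on research into hidden Markov models and statistical language models, this work focuses on segmentation techniques, connected-segment classification, and distinctive feature-parameter extraction for online handwritten Kazakh. The system first feeds the main stroke of each connected segment, obtained after the delayed strokes are extracted, to the HMM recognizer; it then looks up the connected-segment classification dictionary using the index of the recognized main stroke and the delayed-stroke markers to find the corresponding connected-segment recognition result. Removing delayed strokes from connected segments effectively reduces the number of models that must be built, which in turn increases recognition speed and avoids the problems caused by character segmentation.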
16.
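Most existing face recognition systems concentrate on improving the performance of the recognition algorithm but lack a mechanism for analyzing and evaluating the data source (the face samples). To address this problem, a method is proposed that post-processes typical face recognition algorithms on the basis of data-source analysis. To reveal how robust typical existing recognition algorithms are in unconstrained environments, a Lambertian reflectance model and a 3D face model are built and used to analyze and evaluate how the recognition performance of the eigenface algorithm changes as the data source varies (changes in face pose and illumination). For the "data-source disaster" problem, a post-processing solution based on hidden Markov models (HMMs) is proposed; it exploits the continuity of video image sequences and a statistical analysis of the training face database to make discriminant analysis methods more robust to unconstrained environments. Experimental results show that the method effectively improves the robustness of recognition algorithms to the "data-source disaster" and raises the recognition rate.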
17.
18.
The aim of this paper is to introduce a novel prototype generation technique for handwriting digit recognition. Prototype generation is approached as a two-stage process. The first stage uses an Adaptive Resonance Theory 1 (ART1) based algorithm to select an effective initial solution, while the second one executes a fine tuning designed to generate the best prototypes.
19.
In this paper we present a multiple classifier system (MCS) for on-line handwriting recognition. The MCS combines several individual recognition systems based on hidden Markov models (HMMs) and bidirectional long short-term memory networks (BLSTM). Besides using two different recognition architectures (HMM and BLSTM), we use various feature sets based on on-line and off-line features to obtain diverse recognizers. Furthermore, we generate a number of different neural network recognizers by changing the initialization parameters. To combine the word sequences output by the recognizers, we incrementally align these sequences using the recognizer output voting error reduction framework (ROVER). For deriving the final decision, different voting strategies are applied. The best combination ensemble has a recognition rate of 84.13%, which is significantly higher than the 83.64% achieved if only one recognition architecture (HMM or BLSTM) is used for the combination, and even remarkably higher than the 81.26% achieved by the best individual classifier. To demonstrate the high performance of the classification system, the results are compared with two widely used commercial recognizers from Microsoft and Vision Objects.
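The voting stage of such a combination can be sketched as a per-position plurality vote. This is a deliberately simplified illustration, not ROVER itself: it assumes the recognizer outputs have already been aligned to equal length (the incremental alignment step is omitted), it uses an empty string as a hypothetical placeholder for "no word at this position", and the function name `plurality_vote` and the sample word sequences are invented for the example.

```python
from collections import Counter
from typing import List, Sequence


def plurality_vote(aligned_outputs: Sequence[Sequence[str]]) -> List[str]:
    """Pick the most frequent word at each aligned position.

    `aligned_outputs` holds one word sequence per recognizer; all sequences
    are assumed to be pre-aligned and of equal length, with "" marking a gap.
    Ties go to the word that appears first among the recognizers.
    """
    combined = []
    for slot in zip(*aligned_outputs):
        word, _count = Counter(slot).most_common(1)[0]
        if word:  # drop positions where the winning entry is a gap
            combined.append(word)
    return combined


# Three hypothetical recognizer outputs, each wrong at a different position.
outputs = [
    ["the", "quick", "brown", "fox"],
    ["the", "quack", "brown", "fox"],
    ["the", "quick", "brawn", "fox"],
]
combined = plurality_vote(outputs)
print(combined)  # ['the', 'quick', 'brown', 'fox']
```

The example shows why diverse recognizers help: each individual output contains at most one error, but because the errors fall at different positions, the vote recovers the full sequence.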
20.
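Based on a study of the recognition results of an HMM-based recognizer and a distance-classifier-based recognizer, two handwriting recognition systems based on ensemble neural networks are proposed: a comparison neural network recognition system and a full-permutation neural network recognition system. Experimental analysis shows that the system achieves a recognition rate of up to 99% on Western handwriting, 10 percentage points higher than using the original recognizers alone, which is a good recognition result.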