Similar literature
1.
2.
In keyword spotting from handwritten documents by text query, the word similarity is usually computed by combining character similarities, which are desired to approximate the logarithm of the character probabilities. In this paper, we propose to directly estimate the posterior probability (also called confidence) of candidate characters based on the N-best paths from the candidate segmentation-recognition lattice. On evaluating the candidate segmentation-recognition paths by combining multiple contexts, the scores of the N-best paths are transformed to posterior probabilities using soft-max. The parameter of soft-max (confidence parameter) is estimated from the character confusion network, which is constructed by aligning different paths using a string matching algorithm. The posterior probability of a candidate character is the summation of the probabilities of the paths that pass through the candidate character. We compare the proposed posterior probability estimation method with some reference methods including the word confidence measure and the text line recognition method. Experimental results of keyword spotting on a large database CASIA-OLHWDB of unconstrained online Chinese handwriting demonstrate the effectiveness of the proposed method.
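
The soft-max step and the summation over paths described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the `(start, end, label)` representation of a candidate character, and the fixed `alpha` value are assumptions (the paper estimates the confidence parameter from a character confusion network, which is not reproduced here).

```python
import math
from collections import defaultdict


def path_posteriors(scores, alpha=1.0):
    """Turn N-best path scores into probabilities with a soft-max.

    alpha plays the role of the confidence parameter; here it is simply
    passed in (assumption), not estimated from a confusion network.
    """
    scaled = [alpha * s for s in scores]
    shift = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def character_posteriors(paths, scores, alpha=1.0):
    """Sum the probabilities of all paths that pass through each candidate.

    Each path is a list of hashable candidates, e.g. (start, end, label).
    """
    posteriors = defaultdict(float)
    for path, prob in zip(paths, path_posteriors(scores, alpha)):
        for candidate in set(path):  # count each candidate once per path
            posteriors[candidate] += prob
    return dict(posteriors)
```

A candidate character shared by every N-best path receives a posterior of 1, while a candidate that appears only on low-scoring paths receives a small one.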

3.
For part I see ibid. vol.8, no. 1 (2000). This paper presents an application of the generalized hidden Markov models to handwritten word recognition. The system represents a word image as an ordered list of observation vectors by encoding features computed from each column in the given word image. Word models are formed by concatenating the state chains of the constituent character hidden Markov models. The novel work presented includes the preprocessing, feature extraction, and the application of the generalized hidden Markov models to handwritten word recognition. Methods for training the classical and generalized (fuzzy) models are described. Experiments were performed on a standard data set of handwritten word images obtained from the US Post Office mail stream, which contains real-world samples of different styles and qualities.

4.
Describes a hidden Markov model-based approach designed to recognize off-line unconstrained handwritten words for large vocabularies. After preprocessing, a word image is segmented into letters or pseudoletters and represented by two feature sequences of equal length, each consisting of an alternating sequence of shape-symbols and segmentation-symbols, which are both explicitly modeled. The word model is made up of the concatenation of appropriate letter models consisting of elementary HMMs, and an HMM-based interpolation technique is used to optimally combine the two feature sets. Two rejection mechanisms are considered depending on whether or not the word image is guaranteed to belong to the lexicon. Experiments carried out on real-life data show that the proposed approach can be successfully used for handwritten word recognition.

5.
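This paper studies the detection and rejection of out-of-vocabulary (OOV) inputs in keyword spotting and practical speech recognition. Using discriminant analysis, it performs utterance verification based on L-best local scores and N-best utterance-hypothesis discriminative scores. Two sets of experiments on the OOV problem were carried out: one on a small-vocabulary speaker-dependent isolated-word recognition system, and one on a small-vocabulary speaker-independent isolated-word recognition system.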

6.
In this paper, we describe a system for rapid verification of unconstrained off-line handwritten phrases using perceptual holistic features of the handwritten phrase image. The system is used to verify handwritten street names automatically extracted from live US mail against recognition results of analytical classifiers. Presented with a binary image of a street name and an ASCII street name, holistic features (reference lines, large gaps and local contour extrema) of the street name hypothesis are “predicted” from the expected features of the constituent characters using heuristic rules. A dynamic programming algorithm is used to match the predicted features with the extracted image features. Classes of holistic features are matched sequentially in increasing order of cost, allowing an ACCEPT/REJECT decision to be arrived at in a time-efficient manner. The system rejects errors with 98 percent accuracy at the 30 percent accept level, while consuming approximately 20 msec per image on average on a 150 MHz SPARC 10.

7.
In this paper, we present a new off-line word recognition system that is able to recognize unconstrained handwritten words using grey-scale images. This is based on structural and relational information in the handwritten word. We use Gabor filters to extract features from the words, and then use an evidence-based approach for word classification. A solution to the Gabor filter parameter estimation problem is given, enabling the Gabor filter to be automatically tuned to the word image properties. We also developed two new methods for correcting the slope of the handwritten words. Our experiments show that the proposed method achieves good recognition rates compared to standard classification methods.

8.
9.
An off-line handwritten word recognition system is described. Images of handwritten words are matched to lexicons of candidate strings. A word image is segmented into primitives. The best match between sequences of unions of primitives and a lexicon string is found using dynamic programming. Neural networks assign match scores between characters and segments. Two distinctive features are that neural networks assign confidence that pairs of segments are compatible with character confidence assignments, and that this confidence is integrated into the dynamic programming. Experimental results are provided on data from the U.S. Postal Service.

10.
In the standard segmentation-based approach to handwritten word recognition, individual character-class confidence scores are combined via averaging to estimate confidences in the hypothesized identities for a word. We describe a methodology for generating optimal linear combinations of order statistics operators for combining character-class confidence scores. Experimental results are provided on over 1000 word images.
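
A linear combination of order statistics can be illustrated with a short sketch. The function name `los_combine` and the example weights are assumptions for illustration only; the abstract does not say how the optimal weights are obtained, so they are supplied by hand here.

```python
def los_combine(scores, weights):
    """Combine character confidence scores with a linear combination of
    order statistics: sort the scores, then take a weighted sum.

    weights[0] multiplies the largest score, weights[-1] the smallest.
    Uniform weights reproduce plain averaging; putting all the weight on
    the last position gives the minimum. The weights are hand-picked
    here (assumption), not optimized.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    ordered = sorted(scores, reverse=True)
    return sum(w * s for w, s in zip(weights, ordered))
```

With three character scores `[0.2, 0.9, 0.5]`, the weights `[1/3, 1/3, 1/3]` give the usual average, whereas `[0, 0, 1]` scores the word by its weakest character.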

11.
In an effort to make object recognition efficient and accurate enough for real applications, we have developed three probabilistic techniques (sensor modeling, probabilistic hypothesis generation, and robust localization) which form the basis of a promising paradigm for object recognition. Our techniques effectively exploit prior knowledge to reduce the number of hypotheses that must be tested during recognition. Our recognition approach utilizes statistical constraints on the matches between image and model features. These statistical constraints are computed using a model of the entire sensing process, resulting in more realistic and tighter constraints on matches. The candidate hypotheses are pruned by probabilistic constraint satisfaction to select likely matches based on the image evidence and prior statistical constraints. The resulting hypotheses are ordered most-likely first for verification, thus minimizing unnecessary verifications. The reliability of the verification decision is significantly increased by the use of a robust localization algorithm.

12.
Fuzzy logic is applied to the problem of locating and reading street numbers in digital images of handwritten mail. A fuzzy rule-based system is defined that uses uncertain information provided by image processing and neural network-based character recognition modules to generate multiple hypotheses with associated confidence values for the location of the street number in an image of a handwritten address. The results of a blind test of the resultant system are presented to demonstrate the value of this new approach. The results are compared to those obtained using a neural network trained with backpropagation. The fuzzy logic system achieved higher performance rates.

13.
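Because handwritten signatures are highly unique, cannot be forgotten, and are natural and reliable, they have become a research focus in identity recognition. This paper analyzes handwritten signature recognition techniques in detail, mainly off-line ones. Based on a study of the characteristics of handwritten signatures and their practicality for identity authentication, it proposes a signature verification scheme based on image processing: the images are processed by scaling, dilation, and thinning, and the similarity of signature images is then evaluated using the mean squared error and the peak signal-to-noise ratio. Experimental data show that the method is simple, feasible, and efficient, and achieves good verification and identification rates.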
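
The two similarity measures named in this abstract, mean squared error and peak signal-to-noise ratio, can be computed as in the sketch below. It assumes two grayscale images of identical size given as nested lists with pixel values in 0..255; the scaling, dilation, and thinning steps of the scheme are not shown, and the function names are illustrative.

```python
import math


def mse(image_a, image_b):
    """Mean squared error between two equally sized grayscale images,
    each given as a list of rows of pixel values."""
    flat_a = [p for row in image_a for p in row]
    flat_b = [p for row in image_b for p in row]
    if not flat_a or len(flat_a) != len(flat_b):
        raise ValueError("images must be non-empty and of equal size")
    return sum((x - y) ** 2 for x, y in zip(flat_a, flat_b)) / len(flat_a)


def psnr(image_a, image_b, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(image_a, image_b)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)
```

A lower MSE, or equivalently a higher PSNR, indicates that the questioned signature image is closer to the reference image; any acceptance threshold would have to be chosen empirically.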

14.
An architecture for handwritten text recognition systems (total citations: 1; self-citations: 1; citations by others: 0)
This paper presents an end-to-end system for reading handwritten page images. Five functional modules included in the system are introduced in this paper: (i) pre-processing, which concerns introducing an image representation for easy manipulation of large page images and image handling procedures using the image representation; (ii) line separation, concerning text line detection and extracting images of lines of text from a page image; (iii) word segmentation, which concerns locating word gaps and isolating words from a line of text image obtained efficiently and in an intelligent manner; (iv) word recognition, concerning handwritten word recognition algorithms; and (v) linguistic post-processing, which concerns the use of linguistic constraints to intelligently parse and recognize text. Key ideas employed in each functional module, which have been developed for dealing with the diversity of handwriting in its various aspects with a goal of system reliability and robustness, are described in this paper. Preliminary experiments show promising results in terms of speed and accuracy. Received October 30, 1998 / Revised January 15, 1999

15.
Large vocabulary recognition of on-line handwritten cursive words (total citations: 1; self-citations: 0; citations by others: 1)
This paper presents a writer-independent system for large vocabulary recognition of on-line handwritten cursive words. The system first uses a filtering module, based on simple letter features, to quickly reduce a large reference dictionary (lexicon) to a more manageable size; the reduced lexicon is subsequently fed to a recognition module. The recognition module uses a temporal representation of the input, instead of a static two-dimensional image, thereby preserving the sequential nature of the data and enabling the use of a Time-Delay Neural Network (TDNN); such networks have been previously successful in the continuous speech recognition domain. Explicit segmentation of the input words into characters is avoided by sequentially presenting the input word representation to the neural network-based recognizer. The outputs of the recognition module are collected and converted into a string of characters that is matched against the reduced lexicon using an extended Damerau-Levenshtein function. Trained on 2,443 unconstrained word images (11k characters) from 55 writers and using a 21k lexicon, we reached a 97.9% and 82.4% top-5 word recognition rate on a writer-dependent and writer-independent test, respectively.
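
The lexicon-matching step can be illustrated with a plain Damerau-Levenshtein distance. The sketch below uses the restricted (optimal string alignment) variant with unit costs; it is not the "extended" function of the paper, whose details the abstract does not give, and the names `osa_distance` and `best_matches` are hypothetical.

```python
def osa_distance(a, b):
    """Restricted Damerau-Levenshtein (optimal string alignment) distance:
    insertions, deletions, substitutions, and transpositions of adjacent
    characters each cost 1."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]


def best_matches(recognized, lexicon, k=5):
    """Return the k lexicon words closest to the recognized string,
    breaking ties alphabetically."""
    return sorted(lexicon, key=lambda w: (osa_distance(recognized, w), w))[:k]
```

Ranking the reduced lexicon this way yields a top-k candidate list of the kind behind the top-5 recognition rates quoted above.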

16.
The segmentation of handwritten digit strings into isolated digits remains a challenging task. The difficulty of recognizing handwritten digit strings is related to several factors such as sloping, overlapping, connecting and unknown length of the digit string. Hence, this paper proposes a segmentation and recognition system for unknown-length handwritten digit strings that combines several explicit segmentation methods depending on the configuration link between digits. Three segmentation methods are combined, based on the histogram of the vertical projection, contour analysis, and the sliding window Radon transform. A recognition and verification module based on support vector machine classifiers analyzes each segmented digit image and decides whether to reject or accept it. Moreover, various submodules are included to enhance the robustness of the proposed system. Experimental results on the benchmark dataset show that the proposed system is effective for segmenting handwritten digit strings without prior knowledge of their length, compared with the state of the art.
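
The first of the three segmentation methods, the histogram of the vertical projection, can be sketched as below. The sketch assumes a binary image given as a list of rows with 1 for ink and 0 for background, and it only separates digits divided by empty columns; touching or overlapping digits are what the contour analysis and Radon transform methods of the paper address, and those are not shown. Function names are illustrative.

```python
def vertical_projection(image):
    """Count ink pixels in every column of a binary image (list of rows)."""
    return [sum(column) for column in zip(*image)]


def split_on_gaps(image):
    """Return (start, end) column ranges, end exclusive, of the runs of
    columns that contain ink, separated by empty columns."""
    segments = []
    start = None
    profile = vertical_projection(image)
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x                      # a new ink run begins
        elif count == 0 and start is not None:
            segments.append((start, x))    # the run ended at an empty column
            start = None
    if start is not None:                  # the run reaches the right edge
        segments.append((start, len(profile)))
    return segments
```

Each returned column range would then be cropped and passed to the recognition and verification stage.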

17.
Offline handwritten Amharic word recognition (total citations: 1; self-citations: 0; citations by others: 1)
This paper describes two approaches for Amharic word recognition in unconstrained handwritten text using HMMs. The first approach builds word models from concatenated features of constituent characters, and in the second method HMMs of constituent characters are concatenated to form the word model. In both cases, the features used for training and recognition are a set of primitive strokes and their spatial relationships. The recognition system does not require segmentation of characters but requires text line detection and extraction of structural features, which is done by making use of the direction field tensor. The performance of the recognition system is tested on a dataset of unconstrained handwritten documents collected from various sources, and promising results are obtained.

18.
Character recognition systems can contribute tremendously to the advancement of the automation process, and can improve the interaction between man and machine in many applications, including office automation, cheque verification and a large variety of banking, business and data entry applications. The main theme of this paper is the automatic recognition of hand-printed Latin characters using artificial neural networks in combination with conventional techniques. This approach has a number of advantages: it combines a rule-based (structural) approach for feature extraction and non-linear classification tests for recognition; it is more efficient for large and complex data sets; feature extraction is inexpensive and execution time is independent of handwriting style and size. The technique can be divided into three major steps. The first step is pre-processing, in which the original image is transformed into a binary image utilising a 300 dpi scanner and then thinned using a parallel thinning algorithm. Second, the image skeleton is traced from left to right in order to build a binary tree. Some primitives, such as straight lines, curves and loops, are extracted from the binary tree. Finally, a three-layer artificial neural network is used for character classification. The system was tested on a sample of handwritten characters from several individuals whose writing ranged from acceptable to poor in quality, and the average correct recognition rate obtained using cross-validation was 86%.

19.
A modular system to recognize handwritten numerical strings is proposed. It uses a segmentation-based recognition approach and a recognition and verification strategy. The approach combines the outputs from different levels such as segmentation, recognition, and postprocessing in a probabilistic model. A new verification scheme which contains two verifiers to deal with the problems of oversegmentation and undersegmentation is presented. A new feature set is also introduced to feed the oversegmentation verifier. A postprocessor based on a deterministic automaton is used and the global decision module makes an accept/reject decision. Finally, experimental results on two databases are presented: numerical amounts on Brazilian bank checks and NIST SD19. The latter aims at validating the concept of modular system and showing the robustness of the system using a well-known database.

20.