Similar literature
1.
In this paper, we implemented a speaker-dependent speech recognition system for 11 standard Arabic isolated words. During the feature extraction phase, several techniques were used, such as Mel frequency cepstral coefficients, perceptual linear prediction, relative perceptual linear prediction and their first order temporal derivatives. Principal component analysis was adopted in order to reduce the feature dimension. The recognition phase is based on the feed forward back-propagation neural network using two learning algorithms: the Levenberg–Marquardt “Trainlm” and the scaled conjugate gradient “Trainscg”. Hybrid approaches were used and compared in terms of computational time and recognition rate, and produced very interesting performance.
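The first order temporal derivatives mentioned in this abstract are commonly computed with a regression over neighbouring frames. The following is a minimal sketch of that common formula, not the authors' implementation; the function name `delta_features`, the `width` value and the edge handling (repeating the first and last frame) are assumptions made for illustration.

```python
def delta_features(frames, width=2):
    """First-order temporal derivatives of per-frame feature vectors.

    frames: list of equal-length lists, one feature vector per frame.
    width:  frames on each side used in the regression (assumed value).
    Edge frames are handled by repeating the first/last frame.
    """
    num_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, width + 1))
    deltas = []
    for t in range(num_frames):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, num_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        deltas.append(row)
    return deltas


# A linear ramp has slope 1 in the interior frames.
ramp = [[float(t)] for t in range(6)]
ramp_deltas = delta_features(ramp)
```

In a pipeline like the one described, the delta vectors would typically be appended to the static features before dimensionality reduction.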

2.
This paper presents a comparative study of two machine learning techniques for recognizing handwritten Arabic words, where hidden Markov models (HMMs) and dynamic Bayesian networks (DBNs) were evaluated. The proposed work is divided into three stages, namely preprocessing, feature extraction and classification. Preprocessing includes baseline estimation and normalization as well as segmentation. In the second stage, features are extracted from each of the normalized words, where a set of new features for handwritten Arabic words is proposed, based on a sliding window approach moving across the mirrored word image. The third stage is for classification and recognition, where machine learning is applied using HMMs and DBNs. In order to validate the techniques, extensive experiments were conducted using the IFN/ENIT database, which contains 32,492 Arabic words. Experimental results and quantitative evaluations showed that HMMs outperform DBNs, with a higher recognition rate and lower complexity.
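The sliding window idea in this abstract can be sketched as follows. The abstract does not say which features are computed per window, so ink density is used here purely as a stand-in; the function name, the window width and the step are likewise hypothetical.

```python
def sliding_window_densities(image, window=2, step=1):
    """Slide a vertical window across the mirrored image; return ink density per position.

    image: list of rows, each a list of 0/1 pixels (1 = ink).
    Ink density is an assumed placeholder feature, not the paper's feature set.
    """
    # Mirror the image horizontally by reversing every row.
    mirrored = [list(reversed(row)) for row in image]
    height = len(mirrored)
    width = len(mirrored[0])
    features = []
    for start in range(0, width - window + 1, step):
        ink = sum(sum(row[start:start + window]) for row in mirrored)
        features.append(ink / (height * window))
    return features


word = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
densities = sliding_window_densities(word)
```

Each window position yields one observation, so a word image becomes a sequence that a sequential model such as an HMM can consume.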

3.
The retrieval of information from scanned handwritten documents is becoming vital with the rapid increase of digitized documents, and word spotting systems have been developed to search for words within documents. These systems can be either template matching algorithms or learning based. This paper presents a coherent learning based Arabic handwritten word spotting system which can adapt to the nature of Arabic handwriting, which can have no clear boundaries between words. Consequently, the system recognizes Pieces of Arabic Words (PAWs), then re-constructs and spots words using language models. The proposed system produced promising results for Arabic handwritten word spotting when tested on the CENPARMI Arabic documents database.

4.
Multimedia Tools and Applications - Handwritten word recognition is one of the hot topics in automatic handwritten text recognition that has received a lot of attention in recent years. Unlike...

5.
Pattern Analysis and Applications - One-class classifiers (OCCs) are used to solve different kinds of problems due to their ability to represent a class distribution regardless of the remaining...

6.
The success of using Hidden Markov Models (HMMs) for speech recognition applications has motivated the adoption of these models for handwriting recognition, especially online handwriting, which has a large similarity with the speech signal as a sequential process. Some languages such as Arabic, Farsi and Urdu include a large number of delayed strokes that are written above or below most letters and are usually written delayed in time. These delayed strokes represent a modeling challenge for the conventional left-right HMM that is commonly used for Automatic Speech Recognition (ASR) systems. In this paper, we introduce a new approach for handling delayed strokes in Arabic online handwriting recognition using HMMs. We also show that several modeling approaches such as context based tri-grapheme models, speaker adaptive training and discriminative training that are currently used in most state-of-the-art ASR systems can provide similar performance improvement for Handwriting Recognition (HWR) systems. Finally, we show that using a multi-pass decoder that uses the computationally less expensive models in the early passes can provide an Arabic large vocabulary HWR system with practical decoding time. We evaluated the performance of our proposed Arabic HWR system using two databases of small and large lexicons. For the small lexicon data set, our system achieved competing results compared to the best reported state-of-the-art Arabic HWR systems. For the large lexicon, our system achieved promising results (accuracy and time) for a vocabulary size of 64k words with the possibility of adapting the models for specific writers to get even better results.

7.
8.
We studied the feedforward network proposed by Dandurand et al. (2010), which maps location-specific letter inputs to location-invariant word outputs, probing the hidden layer to determine the nature of the code. Hidden patterns for words were densely distributed, and K-means clustering on single letter patterns produced evidence that the network had formed semi-location-invariant letter representations during training. The possible confound with superseding bigram representations was ruled out, and linear regressions showed that any word pattern was well approximated by a linear combination of its constituent letter patterns. Emulating this code using overlapping holographic representations (Plate, 1995) uncovered a surprisingly acute and useful correspondence with the network, stemming from a broken symmetry in the connection weight matrix and related to the group-invariance theorem (Minsky & Papert, 1969). These results also explain how the network can reproduce relative and transposition priming effects found in humans.

9.
Multimedia Tools and Applications - Text recognition in the wild is a challenging task in the field of computer vision and machine learning. Existing optical character recognition engines cannot...

10.
This paper proposes a real-time lip reading system (consisting of a lip detector, lip tracker, lip activation detector, and word classifier), which can recognize isolated Korean words. Lip detection is performed in several stages: face detection, eye detection, mouth detection, mouth end-point detection, and active appearance model (AAM) fitting. Lip tracking is then undertaken via a novel two-stage lip tracking method, where the model-based Lucas-Kanade feature tracker is used to track the outer lip, and then a fast block matching algorithm is used to track the inner lip. Lip activation detection is undertaken through a neural network classifier, the input for which is a combination of the lip motion energy function and the first dominant shape feature. In the last step, input words are defined and recognized by three different classifiers: HMM, ANN, and K-NN. We combine the proposed lip reading system with an audio-only automatic speech recognition (ASR) system to improve the word recognition performance in noisy environments. We then demonstrate the potential applicability of the combined system for use within hands-free in-vehicle navigation devices. Results from experiments undertaken on 30 isolated Korean words using the K-NN classifier at a speed of 15 fps demonstrate that the proposed lip reading system achieves a 92.67% word correct rate (WCR) for person-dependent tests, and a 46.50% WCR for person-independent tests. Also, the combined audio-visual ASR system increases the WCR from 0% to 60% in a noisy environment.

11.
This paper presents an artificial neural network (ANN) for speaker-independent isolated word speech recognition. The network consists of three subnets in concatenation. The static information within one frame of speech signal is processed in the probabilistic mapping subnet that converts an input vector of acoustic features into a probability vector whose components are estimated probabilities of the feature vector belonging to the phonetic classes that constitute the words in the vocabulary. The dynamics capturing subnet computes the first-order cross correlation between the components of the probability vectors to serve as the discriminative feature derived from the interframe temporal information of the speech signal. These dynamic features are passed for decision-making to the classification subnet, which is a multilayer perceptron (MLP). The architectures of these three subnets are described, and the associated adaptive learning algorithms are derived. The recognition results for a subset of the DARPA TIMIT speech database are reported. The correct recognition rate of the proposed ANN system is 95.5%, whereas that of the best of the continuous hidden Markov model (HMM)-based systems is only 91.0%.

12.
One of the most common effects among aphasia patients is difficulty recalling names or words. Typically, word retrieval problems can be treated through word naming therapeutic exercises. In fact, the frequency and the intensity of speech therapy are key factors in the recovery of lost communication functionalities. In this sense, speech and language technology can make a relevant contribution to the development of automatic therapy methods. In this work, we present an on-line system designed to behave as a virtual therapist incorporating automatic speech recognition technology that permits aphasia patients to perform word naming training exercises. We focus on the study of the automatic word naming detector module and on its utility for both global evaluation and treatment. For that purpose, a database consisting of word naming therapy sessions of aphasic Portuguese native speakers has been collected. In spite of the different patient characteristics and speech quality conditions of the collected data, encouraging results have been obtained thanks to a calibration method that makes use of the patients’ word naming ability to automatically adapt to the patients’ speech particularities.

13.
14.
In this paper, we fill a gap in the literature by studying the problem of Arabic handwritten digit recognition. The performances of different classification and feature extraction techniques on recognizing Arabic digits are going to be reported to serve as a benchmark for future work on the problem. The performance of well known classifiers and feature extraction techniques will be reported in addition to a novel feature extraction technique we present in this paper that gives a high accuracy and competes with the state-of-the-art techniques. A total of 54 different classifier/features combinations will be evaluated on Arabic digits in terms of accuracy and classification time. The results are analyzed and the problem of the digit ‘0’ is identified with a proposed method to solve it. Moreover, we propose a strategy to select and design an optimal two-stage system out of our study and, hence, we suggest a fast two-stage classification system for Arabic digits which achieves as high accuracy as the highest classifier/features combination but with much less recognition time.

15.
Goraine, H.; Usher, M.; Al-Emami, S. Computer, 1992, 25(7): 71-74
A personal computer-based Arabic character recognition system that performs three preprocessing stages sequentially, thinning, stroke segmentation, and sampling, is described. The eight-direction code used for stroke representation and classification, the character classification done at primary and secondary levels, and the contextual postprocessor used for error detection and correction are described. Experimental results obtained using samples of handwritten and typewritten Arabic words are presented.

16.
We present the latest version of our system Adresy, dedicated to the recognition of words written on a digitizing tablet. Adresy is designed to be used with a large (given) vocabulary. In this context, it achieves very good performance because it is able to automatically learn the writing style of any specific user, directly from a set of a few samples of words. Moreover, Adresy continuously improves its performance in a user-transparent way, thanks to a second, faster learning process called adaptation. This paper describes the main aspects of Adresy. Moreover, the power of our system is demonstrated through four experiments performed on a database of ten thousand handwritten words.

17.
We describe a process of word recognition that has high tolerance for poor image quality, tunability to the lexical content of the documents to which it is applied, and high speed of operation. This process relies on the transformation of text images into character shape codes, and on special lexica that contain information on the shape of words. We rely on the structure of English and the high efficiency of mapping between shape codes and the characters in the words. Remaining ambiguity is reduced by template matching using exemplars derived from surrounding text, taking advantage of the local consistency of font, face and size as well as image quality. This paper describes the effects of lexical content, structure and processing on the performance of a word recognition engine. Word recognition performance is shown to be enhanced by the application of an appropriate lexicon. Recognition speed is shown to be essentially independent of the details of lexical content provided the intersection of the occurrences of words in the document and the lexicon is high. Word recognition accuracy is dependent on both intersection and specificity of the lexicon. Received May 1, 1998 / Revised October 20, 1998

18.
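To address the multivariable, large time-delay, nonlinear and strongly coupled characteristics of the quadruple-tank system, a wavelet neural network generalized predictive control (WNNGPC) strategy is adopted. The good function approximation capability of the wavelet neural network is used to identify the controlled plant and obtain a predictive model of the control system, which is then combined with the good control performance of generalized predictive control to achieve stable control of the quadruple-tank system. During system identification, an improved BP learning algorithm is used; it corrects the network weights and thresholds quickly and smoothly, so that the predicted output smoothly approaches the desired output. To resolve the coupling in the system, the universal approximation property of fuzzy control is exploited to design a fuzzy feedforward compensation decoupler. Control experiments on the quadruple-tank system are carried out using WNNGPC with fuzzy compensation decoupling, and the experimental results are compared and analyzed. The results show that this control strategy achieves good control performance on the quadruple-tank system.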

19.
20.
Because of the large variations involved in handwritten words, the recognition problem is very difficult. Hidden Markov models (HMMs) have been widely and successfully used in speech processing and recognition. Recently, HMMs have also been used with some success in recognizing handwritten words with presegmented letters. In this paper, a complete scheme for totally unconstrained handwritten word recognition based on a single contextual hidden Markov model type stochastic network is presented. Our scheme includes a morphology and heuristics based segmentation algorithm, a training algorithm that can adapt itself to the changing dictionary, and a modified Viterbi algorithm which searches for the (l+1)th globally best path based on the previous l best paths. Detailed experiments are carried out and successful recognition results are reported.
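For orientation, the textbook Viterbi algorithm that the modified version builds on can be sketched as below. This returns only the single best path; the paper's extension to the (l+1)th globally best path is not reproduced here. The toy model (state names, observation symbols and all probabilities) is invented for illustration and does not come from the paper.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (probability, path) of the single most likely state sequence."""
    # best[t][s] = (probability of the best path ending in s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            prob, prev = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p) for p in states
            )
            column[s] = (prob, prev)
        best.append(column)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    prob = best[-1][last][0]
    path = [last]
    for t in range(len(best) - 1, 0, -1):
        path.append(best[t][path[-1]][1])
    path.reverse()
    return prob, path


# Hypothetical two-state model, used only to exercise the function.
states = ["A", "B"]
start_p = {"A": 0.6, "B": 0.4}
trans_p = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
emit_p = {
    "A": {"x": 0.5, "y": 0.4, "z": 0.1},
    "B": {"x": 0.1, "y": 0.3, "z": 0.6},
}
best_prob, best_path = viterbi(["x", "y", "z"], states, start_p, trans_p, emit_p)
```

For long sequences, a practical implementation would work with log probabilities to avoid numerical underflow; plain products are kept here for readability.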
