Similar literature
 Found 6 similar documents (search time: 15 ms)
1.
In this paper we describe the connectionist-based classification engine of an OCR system. The classification engine is based on a new modular connectionist architecture, in which a multilayer perceptron (MLP) acting as a classifier is combined with a set of autoassociators (one for each class) trained to copy the input to the output layer. The MLP-based classifier selects a small group of classes with high scores, which are afterwards verified by the corresponding autoassociators. The learning samples used to train the classifiers are constructed by means of a synthetic noise generator, starting from a few grey-level characters labeled by the user. We report experimental results comparing three neural architectures: an MLP-based classifier, an autoassociator-based classifier, and the proposed combined architecture. The experiments show that the proposed architecture exhibits the best performance without significantly increasing the computational burden. Received March 6, 2000 / Revised July 12, 2000
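The select-then-verify idea in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the autoassociators verify a candidate, so the sketch assumes the candidate with the smallest squared reconstruction error wins. The function name `combined_classify`, the parameter `k`, and the representation of each autoassociator as a plain callable are all hypothetical.

```python
def combined_classify(x, mlp_scores, autoassociators, k=3):
    """Return the class chosen by a two-stage scheme (illustrative sketch).

    x               : input feature vector (list of floats)
    mlp_scores      : one MLP output score per class
    autoassociators : one callable per class, mapping x to its reconstruction
    k               : number of top-scoring classes passed on to verification
    """
    # Stage 1: keep the k classes with the highest MLP scores.
    ranked = sorted(range(len(mlp_scores)),
                    key=lambda c: mlp_scores[c], reverse=True)
    candidates = ranked[:k]

    # Stage 2 (assumed rule): the class whose autoassociator reconstructs
    # x with the smallest squared error is accepted.
    def reconstruction_error(c):
        y = autoassociators[c](x)
        return sum((a - b) ** 2 for a, b in zip(x, y))

    return min(candidates, key=reconstruction_error)


# Toy stand-ins for trained autoassociators (hypothetical).
toy_autoassociators = [
    lambda v: [0.0, 0.0],   # class 0 always reconstructs the origin
    lambda v: [1.0, 1.0],   # class 1 always reconstructs (1, 1)
    lambda v: list(v),      # class 2 copies its input perfectly
]
sample = [1.0, 0.9]
scores = [0.5, 0.4, 0.1]
print(combined_classify(sample, scores, toy_autoassociators, k=2))
```

Only `k` autoassociators are run per input, which is consistent with the abstract's remark that the combination does not significantly increase the computational burden.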

2.
This paper presents an application of the quaternion Fourier transform as a preprocessing step for neural computing. In a new approach, the 1D acoustic signals of French spoken words are represented as 2D signals in the frequency and time domain. These images are then convolved in the quaternion Fourier domain with a quaternion Gabor filter to extract features. This approach greatly reduces the dimension of the feature vector. Two methods of feature extraction are tested. The feature vectors were used to train a simple MLP, a TDNN and a system of neural experts. The improvement in the classification rate of the neural network classifiers is very encouraging and amply justifies the preprocessing in the quaternion frequency domain. This work also suggests applying the quaternion Fourier transform to other image processing tasks.
Michel Naranjo

3.
Tone information is very important to speech recognition in a tonal language such as Thai. In this article, we present a method for isolated Thai tone recognition. First, we define three sets of tone features to capture the characteristics of Thai tones and employ a feedforward neural network to classify tones based on these features. Next, we describe several experiments using the proposed features. The experiments are designed to study the effect of initial consonants, vowels, and final consonants on tone recognition. We find that there are some correlations between tones and other phonemes, and that recognition performance is satisfactory. A human perception test is then conducted to judge the recognition rate. The recognition rate of a human is much lower than that of a machine. Finally, we explore various combination schemes to enhance the recognition rate. Further improvements are found in most experiments.

4.
We address the novel problem of jointly evaluating multiple speech patterns for automatic speech recognition and training. We propose solutions based on both the non-parametric dynamic time warping (DTW) algorithm and the parametric hidden Markov model (HMM). We show that a hybrid approach is quite effective for noisy speech recognition. We extend the concept to HMM training in which some patterns may be noisy or distorted. Utilizing the concept of a “virtual pattern” developed for joint evaluation, we propose selective iterative training of HMMs. When evaluated on burst/transient noisy speech and isolated word recognition, the new algorithms yield a significant improvement in recognition accuracy over those that do not utilize the joint evaluation strategy.
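For readers unfamiliar with DTW, the following is a minimal sketch of the classic pairwise algorithm that the abstract builds on. It is not the paper's joint multi-pattern evaluation, which the abstract does not specify. The function name `dtw_distance`, the scalar-valued sequences, and the absolute-difference local cost are assumptions made for illustration; real speech systems would compare frame-level feature vectors.

```python
def dtw_distance(a, b, dist=lambda u, v: abs(u - v)):
    """Classic dynamic time warping cost between two sequences.

    Uses the standard step pattern (insertion, deletion, match) with no
    slope constraints or windowing.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # D[i][j] = minimal accumulated cost of aligning a[:i] with b[:j].
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = dist(a[i - 1], b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # a[i-1] repeats against b[j-1]
                                 D[i][j - 1],      # b[j-1] repeats against a[i-1]
                                 D[i - 1][j - 1])  # one-to-one match
    return D[n][m]


# A time-stretched copy of the same pattern aligns at zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Because the alignment may stretch or compress either sequence, two utterances of the same word spoken at different rates can still receive a low distance.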

5.
In spite of recent advances in automatic speech recognition, the performance of state-of-the-art speech recognisers fluctuates depending on the speaker. Speaker normalisation aims to reduce the differences between the acoustic space of a new speaker and the training acoustic space of a given speech recogniser, thereby improving performance. Normalisation is based on an acoustic feature transformation, which is estimated from a small amount of speech signal. This paper introduces a mixture of recurrent neural networks as an effective regression technique to approach the problem. A suitable Viterbi-based time alignment procedure is proposed for generating the adaptation set. The mixture is compared with linear regression and single-model connectionist approaches. Speaker-dependent and speaker-independent continuous speech recognition experiments with a large vocabulary, using Hidden Markov Models, are presented. Results show that the mixture improves recognition performance, yielding a 21% relative reduction of the word error rate, which is comparable with that obtained with model-adaptation approaches.

6.
This paper presents an irrelevant variability normalization (IVN) approach to the joint discriminative training of feature transforms and a multi-prototype-based classifier for recognition of online handwritten Chinese characters. A sample separation margin based minimum classification error criterion is adopted in IVN-based training, while an Rprop algorithm is used for optimizing the objective function. For the IVN approach based on piecewise linear transforms, the corresponding recognizer can be made both compact and efficient by using a two-level fast-match tree whose internal nodes coincide with the labels of feature transforms. Furthermore, the IVN system using a weighted sum of linear transforms outperforms that based on piecewise linear transforms. The effectiveness of the proposed approach is first confirmed using an in-house developed online Chinese handwriting corpus with a vocabulary of 9306 characters, and then further verified on a standard benchmark database for an online handwritten character recognition task with a vocabulary of 3755 characters.
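The abstract names Rprop as the optimizer but gives no details, so the following is only a generic sketch of how a sign-based Rprop update works. It assumes the iRprop- variant (no weight backtracking), uses commonly quoted default hyperparameters, and minimizes a toy quadratic rather than the paper's minimum classification error objective. The names `rprop_minimize` and `toy_grad` are hypothetical.

```python
def rprop_minimize(grad, w0, steps=200, eta_plus=1.2, eta_minus=0.5,
                   delta0=0.1, delta_min=1e-6, delta_max=50.0):
    """Minimize a function with Rprop (iRprop- variant, illustrative).

    grad : callable returning the gradient (list of floats) at w
    w0   : initial parameter vector
    Only the sign of each partial derivative is used; every parameter
    keeps its own adaptive step size.
    """
    w = list(w0)
    delta = [delta0] * len(w)   # per-parameter step sizes
    prev = [0.0] * len(w)       # gradient from the previous iteration
    for _ in range(steps):
        g = list(grad(w))
        for i in range(len(w)):
            s = g[i] * prev[i]
            if s > 0:
                # Same sign as before: accelerate.
                delta[i] = min(delta[i] * eta_plus, delta_max)
            elif s < 0:
                # Sign flipped: the last step overshot, so shrink the
                # step and skip this parameter's update once.
                delta[i] = max(delta[i] * eta_minus, delta_min)
                g[i] = 0.0
            if g[i] > 0:
                w[i] -= delta[i]
            elif g[i] < 0:
                w[i] += delta[i]
            prev[i] = g[i]
    return w


def toy_grad(w):
    # Gradient of the toy objective (w[0] - 3)^2 + (w[1] + 1)^2.
    return [2.0 * (w[0] - 3.0), 2.0 * (w[1] + 1.0)]


w_opt = rprop_minimize(toy_grad, [0.0, 0.0])
print(w_opt)
```

Because the update ignores gradient magnitude, it is insensitive to the scale of the objective, which is one common reason for choosing Rprop in batch training.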
