Found 6 similar articles (search time: 15 ms)
1.
E. Francesconi, M. Gori, S. Marinai, G. Soda. International Journal on Document Analysis and Recognition, 2001, 3(3): 160-168
In this paper we describe the connectionist-based classification engine of an OCR system. The classification engine is based
on a new modular connectionist architecture, in which a multilayer perceptron (MLP) acting as a classifier is combined
with a set of autoassociators, one for each class, trained to copy the input to the output layer. The MLP-based classifier
selects a small group of classes with high scores, which are afterwards verified by the corresponding autoassociators. The learning
samples used to train the classifiers are constructed by means of a synthetic noise generator starting from a few grey-level
characters labeled by the user. We report experimental results comparing three neural architectures: an MLP-based classifier,
an autoassociator-based classifier, and the proposed combined architecture. The experiments show that the proposed architecture
exhibits the best performance without significantly increasing the computational burden.
Received March 6, 2000 / Revised July 12, 2000
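The select-then-verify scheme described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `select_candidates` and `verify`, the use of squared reconstruction error as the verification criterion, and the toy projection-based "autoassociators" are all assumptions made for illustration.

```python
import numpy as np

def select_candidates(scores, k=3):
    """Return the indices of the k highest-scoring classes, best first."""
    scores = np.asarray(scores, dtype=float)
    return [int(i) for i in np.argsort(scores)[::-1][:k]]

def verify(x, candidates, autoassociators):
    """Pick the candidate whose autoassociator reconstructs x with the
    smallest squared error (an assumed verification criterion)."""
    x = np.asarray(x, dtype=float)
    errors = {c: float(np.sum((autoassociators[c](x) - x) ** 2))
              for c in candidates}
    return min(errors, key=errors.get), errors

# Toy stand-ins (hypothetical): each "autoassociator" projects the input
# onto a single prototype direction instead of being a trained network.
prototypes = np.eye(4)
autoassociators = [lambda x, p=p: np.dot(x, p) * p for p in prototypes]

x = np.array([0.1, 0.9, 0.0, 0.2])
scores = [0.30, 0.35, 0.05, 0.40]   # hypothetical MLP outputs
candidates = select_candidates(scores, k=2)
label, errors = verify(x, candidates, autoassociators)
```

In this toy run the classifier ranks class 3 first, but the verification step prefers class 1 because its reconstruction error is lower.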
2.
Eduardo Bayro-Corrochano, Noel Trujillo, Michel Naranjo. Journal of Mathematical Imaging and Vision, 2007, 28(2): 179-190
This paper presents an application of the quaternion Fourier transform as a preprocessing step for neural computing. In a new
way, the 1D acoustic signals of French spoken words are represented as 2D signals in the frequency and time domain. These
images are then convolved in the quaternion Fourier domain with a quaternion Gabor filter for the extraction of features.
This approach greatly reduces the dimension of the feature vector. Two methods of feature extraction are tested.
The feature vectors were used for the training of a simple MLP, a TDNN and a system of neural experts. The improvement in
the classification rate of the neural network classifiers is very encouraging and amply justifies the preprocessing in the
quaternion frequency domain. This work also suggests the application of the quaternion Fourier transform to other image processing
tasks.
3.
Nuttakorn Thubthong, Boonserm Kijsirikul, Apirath Pusittrakul. Computational Intelligence, 2002, 18(3): 313-335
Tone information is very important to speech recognition in a tonal language such as Thai. In this article, we present a method for isolated Thai tone recognition. First, we define three sets of tone features to capture the characteristics of Thai tones and employ a feedforward neural network to classify tones based on these features. Next, we describe several experiments using the proposed features. The experiments are designed to study the effect of initial consonants, vowels, and final consonants on tone recognition. We find that there are some correlations between tones and other phonemes, and that the recognition performance is satisfactory. A human perception test is then conducted to judge the recognition rate. The recognition rate of a human is much lower than that of a machine. Finally, we explore various combination schemes to enhance the recognition rate. Further improvements are found in most experiments.
4.
We address the novel problem of jointly evaluating multiple speech patterns for automatic speech recognition and training. We propose solutions based on both the non-parametric dynamic time warping (DTW) algorithm and the parametric hidden Markov model (HMM). We show that a hybrid approach is quite effective for noisy speech recognition. We extend the concept to HMM training in which some patterns may be noisy or distorted. Using the concept of a "virtual pattern" developed for joint evaluation, we propose selective iterative training of HMMs. When evaluated on burst/transient noisy speech and isolated word recognition, the new algorithms yield a significant improvement in recognition accuracy over those that do not use the joint evaluation strategy.
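For readers unfamiliar with DTW, the following is a minimal sketch of the classic pairwise algorithm on 1D sequences. It is not the joint multi-pattern variant proposed in the paper; the function name `dtw_distance` and the absolute-difference local cost are assumptions chosen for illustration.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic pairwise DTW between two 1D sequences.

    D[i, j] holds the cheapest cumulative cost of aligning a[:i] with b[:j].
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])      # local distance (assumed)
            D[i, j] = cost + min(D[i - 1, j],      # a advances, b[j-1] reused
                                 D[i, j - 1],      # b advances, a[i-1] reused
                                 D[i - 1, j - 1])  # both advance
    return float(D[n, m])

same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])  # time-stretched copy
offset = dtw_distance([0, 0], [1, 1])
```

A time-stretched copy of a sequence aligns at zero cost, which is the property that makes DTW useful for comparing utterances spoken at different rates.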
5.
In spite of recent advances in automatic speech recognition, the performance of state-of-the-art speech recognisers fluctuates
depending on the speaker. Speaker normalisation aims at the reduction of differences between the acoustic space of a new speaker
and the training acoustic space of a given speech recogniser, improving performance. Normalisation is based on an acoustic
feature transformation, to be estimated from a small amount of speech signal. This paper introduces a mixture of recurrent
neural networks as an effective regression technique to approach the problem. A suitable Viterbi-based time alignment procedure
is proposed for generating the adaptation set. The mixture is compared with linear regression and single-model connectionist
approaches. Speaker-dependent and speaker-independent continuous speech recognition experiments with a large vocabulary, using
Hidden Markov Models, are presented. Results show that the mixture improves recognition performance, yielding a 21% relative
reduction of the word error rate, i.e. comparable with that obtained with model-adaptation approaches.
6.
This paper presents an irrelevant variability normalization (IVN) approach to joint discriminative training of feature transforms and a multi-prototype based classifier for recognition of online handwritten Chinese characters. A sample separation margin based minimum classification error criterion is adopted in IVN-based training, while an Rprop algorithm is used for optimizing the objective function. For the IVN approach based on piecewise linear transforms, the corresponding recognizer can be made both compact and efficient by using a two-level fast-match tree whose internal nodes coincide with the labels of feature transforms. Furthermore, the IVN system using a weighted sum of linear transforms outperforms that based on piecewise linear transforms. The effectiveness of the proposed approach is first confirmed using an in-house developed online Chinese handwriting corpus with a vocabulary of 9306 characters, and then further verified on a standard benchmark database for an online handwritten character recognition task with a vocabulary of 3755 characters.
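The Rprop optimizer mentioned in this abstract adapts a separate step size per parameter using only the sign of the gradient. The sketch below shows a basic variant without weight backtracking, applied to a toy quadratic; it is not the paper's training setup. The function name `rprop_minimize`, the hyperparameter values, and the objective are assumptions for illustration.

```python
import numpy as np

def rprop_minimize(grad_fn, w0, n_iter=100, step_init=0.1,
                   eta_plus=1.2, eta_minus=0.5,
                   step_min=1e-6, step_max=50.0):
    """Basic Rprop (no weight backtracking): per-parameter step sizes
    grow while the gradient sign is stable and shrink when it flips."""
    w = np.asarray(w0, dtype=float).copy()
    step = np.full_like(w, step_init)
    prev_grad = np.zeros_like(w)
    for _ in range(n_iter):
        grad = grad_fn(w)
        agreement = grad * prev_grad
        # Same sign as last iteration: enlarge the step (capped).
        step = np.where(agreement > 0,
                        np.minimum(step * eta_plus, step_max), step)
        # Sign flipped: a minimum was overshot, so shrink the step.
        step = np.where(agreement < 0,
                        np.maximum(step * eta_minus, step_min), step)
        # Move against the gradient sign; its magnitude is ignored.
        w -= np.sign(grad) * step
        prev_grad = grad
    return w

# Toy objective (assumed): f(w) = sum((w - target)^2), gradient 2 (w - target).
target = np.array([3.0, -2.0])
w = rprop_minimize(lambda w: 2.0 * (w - target), np.zeros(2))
```

Because only the gradient sign is used, the update is insensitive to the scale of the objective, which is one reason Rprop is often chosen for criteria such as minimum classification error.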