Learning stochastic edit distance: Application in handwritten character recognition |
| |
Authors: | Jose Oncina (a), Marc Sebban (b) |
| |
Affiliation: | (a) Departamento de Lenguajes y Sistemas Informaticos, Universidad de Alicante, E-03071 Alicante, Spain; (b) EURISE, Université de Saint-Etienne, 23 rue du Docteur Paul Michelon, 42023 Saint-Etienne, France |
| |
Abstract: | Many pattern recognition algorithms rely on nearest-neighbour search and use the well-known edit distance, whose primitive edit costs are usually fixed in advance. In this article, we aim to learn an unbiased stochastic edit distance, in the form of a finite-state transducer, from a corpus of (input, output) pairs of strings. Unlike other standard methods, which generally use the Expectation Maximisation algorithm, our algorithm learns a transducer independently of the marginal probability distribution of the input strings. This unbiased approach requires optimising the parameters of a conditional transducer rather than a joint one. We apply the new model to handwritten digit recognition. A large series of experiments shows that it always outperforms the standard edit distance. |
| |
Keywords: | Stochastic edit distance; Finite-state transducers; Handwritten character recognition |
This article is indexed in ScienceDirect and other databases. |
|
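
For readers unfamiliar with the baseline mentioned in the abstract, the following is a minimal sketch of the standard edit distance with primitive costs fixed in advance, used inside a nearest-neighbour classifier. It is not the authors' learned conditional transducer. The function names, the unit default costs, and the toy prototype strings are assumptions made for illustration only.

```python
def edit_distance(x, y, ins_cost=1.0, del_cost=1.0, sub_cost=1.0):
    """Standard edit distance with fixed (not learned) primitive costs."""
    m, n = len(x), len(y)
    # d[i][j] holds the cheapest cost of turning x[:i] into y[:j].
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = i * del_cost          # delete every symbol of x[:i]
    for j in range(1, n + 1):
        d[0][j] = j * ins_cost          # insert every symbol of y[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # Matching symbols cost nothing; mismatches pay sub_cost.
            match = 0.0 if x[i - 1] == y[j - 1] else sub_cost
            d[i][j] = min(
                d[i - 1][j] + del_cost,      # deletion
                d[i][j - 1] + ins_cost,      # insertion
                d[i - 1][j - 1] + match,     # substitution or match
            )
    return d[m][n]


def nearest_neighbour(query, prototypes):
    """Return the label of the prototype string closest to the query.

    `prototypes` is a list of (string, label) pairs (hypothetical data).
    """
    return min(prototypes, key=lambda p: edit_distance(query, p[0]))[1]


if __name__ == "__main__":
    # Toy prototypes, invented for this example.
    prototypes = [("0011", "a"), ("2222", "b")]
    print(edit_distance("kitten", "sitting"))
    print(nearest_neighbour("0012", prototypes))
```

In the approach the abstract describes, the three fixed costs would be replaced by edit probabilities learned from (input, output) string pairs; that learning step is not shown here.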