Learning Finite-State Transducers: Evolution Versus Heuristic State Merging
Authors: Lucas, S.M.; Reynolds, T.J.
Affiliation: Department of Computer Science, University of Essex, Colchester
Abstract: Finite-state transducers (FSTs) are finite-state machines (FSMs) that map strings in a source domain into strings in a target domain. While there are many reports in the literature of evolving FSMs, there has been much less work on evolving FSTs. In particular, the fitness functions required for evolving FSTs are generally different from those used for FSMs. In this paper, three string-distance-based fitness functions are evaluated, in order of increasing computational complexity: string equality, Hamming distance, and edit distance. The fitness-distance correlation (FDC) and evolutionary performance of each fitness function are analyzed when used within a random mutation hill-climber (RMHC). Edit distance has the strongest FDC and also provides the best evolutionary performance, in that it is more likely to find the target FST within a given number of fitness-function evaluations. Edit distance is also the most expensive to compute, but in most cases this extra computation is more than justified by its performance. The RMHC was compared with the best-known heuristic method for learning FSTs, the onward subsequential transducer inference algorithm (OSTIA). On noise-free data, the RMHC performs best on problems with sparse training sets and small target machines. The RMHC and OSTIA offer similar performance for large target machines and denser data sets. When noise-corrupted data is used for training, the RMHC still performs well, while OSTIA performs poorly given even small amounts of noise. The RMHC is also shown to outperform a genetic algorithm. Hence, for certain classes of FST induction problems, the RMHC presented in this paper offers the best performance of any known algorithm.
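Since the abstract names the three fitness signals and the search procedure concretely, the sketch below is a minimal, hedged illustration (not the authors' implementation) of how string equality, a Hamming-style distance, and edit distance could be computed, and how any of them would plug into a generic random mutation hill-climber. The `random_genotype` and `mutate` callables are hypothetical placeholders for an FST encoding and its mutation operator.

```python
# Illustrative sketch only: three string-distance fitness signals and a
# bare-bones random mutation hill-climber (RMHC) loop over an abstract genotype.

def equality_score(output: str, target: str) -> int:
    """String equality: 1 if the output matches the target exactly, else 0."""
    return int(output == target)

def hamming_mismatches(output: str, target: str) -> int:
    """Hamming-style distance: mismatches over the shorter string, plus the
    length difference (one simple way to handle unequal lengths)."""
    return sum(a != b for a, b in zip(output, target)) + abs(len(output) - len(target))

def edit_distance(a: str, b: str) -> int:
    """Levenshtein edit distance via dynamic programming, O(len(a) * len(b))."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # delete ca
                            curr[j - 1] + 1,             # insert cb
                            prev[j - 1] + (ca != cb)))   # substitute
        prev = curr
    return prev[-1]

def rmhc(random_genotype, mutate, fitness, evaluations=10_000):
    """Random mutation hill-climber: keep a mutant only if it is no worse.
    The callables are caller-supplied (hypothetical here); in the paper the
    genotype would encode an FST and fitness would compare its outputs on
    the training strings to the target outputs."""
    best = random_genotype()
    best_f = fitness(best)
    for _ in range(evaluations - 1):
        cand = mutate(best)
        cand_f = fitness(cand)
        if cand_f <= best_f:          # lower distance means a better fit
            best, best_f = cand, cand_f
    return best, best_f
```

The quadratic dynamic program for edit distance is the extra computational cost the abstract refers to, relative to the linear-time equality and Hamming checks.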