LARS: A learning algorithm for rewriting systems |
| |
Authors: | Rémi Eyraud Colin de la Higuera Jean-Christophe Janodet |
| |
Affiliation: | (1) EURISE, Université Jean Monnet de Saint-Etienne, 23 rue Paul Michelon, 42023 Saint-Etienne, France |
| |
Abstract: | Whereas there is a number of methods and algorithms to learn regular languages, moving up the Chomsky hierarchy is proving to be a challenging task. Indeed, several theoretical barriers make the class of context-free languages hard to learn. To tackle these barriers, we choose to change the way we represent these languages. Among the formalisms that allow the definition of classes of languages, the one of string-rewriting systems (SRS) has outstanding properties. We introduce a new type of SRS’s, called Delimited SRS (DSRS), that are expressive enough to define, in a uniform way, a noteworthy and non trivial class of languages that contains all the regular languages, , , the parenthesis languages of Dyck, the language of Lukasiewicz, and many others. Moreover, DSRS’s constitute an efficient (often linear) parsing device for strings, and are thus promising candidates in forthcoming applications of grammatical inference. In this paper, we pioneer the problem of their learnability. We propose a novel and sound algorithm (called LARS) which identifies a large subclass of them in polynomial time (but not data). We illustrate the execution of our algorithm through several examples, discuss the position of the class in the Chomsky hierarchy and finally raise some open questions and research directions. This work was supported in part by the IST Program of the European Community, under the PASCAL Network of Excellence, IST-2002-506778. This publication only reflects the authors’ views. Editor: Georgios Paliouras and Yasubumi Sakakibara |
| |
Keywords: | Learning context-free languages Rewriting systems |
本文献已被 SpringerLink 等数据库收录! |
|