RECOGNITION CAN BE HARDER THAN PARSING |
| |
Authors: | Bernard Lang |
| |
Affiliation: | INRIA-Rocquencourt, BP 105, 78153 Le Chesnay Cedex, France |
| |
Abstract: | The work presented here attempts to bring out some fundamental concepts that underlie some known parsing algorithms, usually called chart or dynamic programming parsers, in the hope of guiding the design of similar algorithms for other formalisms that could be considered for describing the "surface" syntax of languages. The key idea is that chart parsing is essentially equivalent to a simple construction of the intersection of the language (represented by its grammar) with a regular set containing only the input sentence to be parsed (represented by a finite state machine). The resulting grammar for that intersection is precisely what is usually called a shared forest: it represents all parses of a syntactically ambiguous sentence. Since most techniques for processing ill-formed input can be modeled by considering a nonsingleton regular set of input sentences, we can expect to generalize these ill-formed input processing techniques to all parsers describable with our approach. |
| |
Keywords: | parsing chart dynamic programming context-free tree adjoining parse tree parse forest sharing ambiguity ill-formed sentences |
|
|