Distributed representations for extended syntactic transformation |
| |
Authors: | Lars Niklasson, Fredrik Linåker |
| |
Affiliation: | University of Skövde, PO Box 408, Skövde, 541 28, Sweden |
| |
Abstract: | This paper shows how the choice of representation substantially affects the generalization performance of connectionist networks. The starting point is Chalmers' simulations involving structure-sensitive processing. Chalmers argued that a connectionist network could handle structure-sensitive processing without the use of syntactically structured representations. He trained a connectionist architecture to encode/decode distributed representations for simple sentences. These distributed representations were then holistically transformed, such that active sentences were transformed into their passive counterparts. However, he noted that the recursive auto-associative memory (RAAM), which was used to encode and decode distributed representations for the structures, exhibited only a limited ability to generalize when trained to encode/decode a randomly selected sample of the total corpus. When the RAAM was trained to encode/decode all sentences, and a separate transformation network was trained on some active-to-passive transformations of the RAAM-encoded sentences, the transformation network demonstrated perfect generalization on the remaining test sentences. It is argued here that the main reason for the limited generalization is not the RAAM architecture per se, but the choice of representation for the tokens used. This paper shows that 100% generalization can be achieved for Chalmers' original setup (i.e. using only 30% of the total corpus for training). The key to this success is to use distributed representations for the tokens, capturing different characteristics for different classes of tokens (e.g. verbs or nouns). |
| |
Keywords: | Constituent similarity; Recursive auto-associative memory; Systematicity; Generalization; Syntactic transformation |
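
The abstract's central claim, that distributed token representations encoding class membership (verb vs. noun) enable generalization where localist tokens do not, can be sketched as follows. This is a minimal illustration, not the paper's code; the token names and feature layout (two class bits plus two identity bits) are assumptions chosen for clarity:

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity between two token vectors."""
    norm = lambda v: sum(x * x for x in v) ** 0.5
    return dot(a, b) / (norm(a) * norm(b))

# Localist coding: every token is orthogonal to every other, so an
# encoder sees no relation between, e.g., two different verbs.
localist = {
    "john":  [1, 0, 0, 0],
    "mary":  [0, 1, 0, 0],
    "loves": [0, 0, 1, 0],
    "hits":  [0, 0, 0, 1],
}

# Distributed coding: shared feature bits (here [noun, verb, id1, id2])
# make tokens of the same class similar, which is the property the
# paper argues lets a RAAM-style encoder generalize from a 30% sample.
distributed = {
    "john":  [1, 0, 1, 0],
    "mary":  [1, 0, 0, 1],
    "loves": [0, 1, 1, 0],
    "hits":  [0, 1, 0, 1],
}

print(cosine(localist["loves"], localist["hits"]))        # orthogonal: 0.0
print(cosine(distributed["loves"], distributed["hits"]))  # same class: 0.5
```

Under the distributed scheme a transformation learned on sentences containing "loves" partially transfers to sentences containing "hits", because the two vectors overlap on the verb feature; under the localist scheme no such transfer is possible.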