Spatial representation of context-dependent sentences and its application to sentence generation |
| |
Authors: | Tomoyuki Maekawa, Wataru Takano |
| |
Affiliation: | 1. Graduate School of Interdisciplinary Information Studies, The University of Tokyo, Tokyo, Japan; 2. Graduate School of Information Science and Technology, The University of Tokyo, Tokyo, Japan |
| |
Abstract: | We propose a novel approach to embedding sentences into a high-dimensional space. The independent words in a sentence are located at points in the space, and the sentence is represented by a curve passing through these words. A set of functions that evaluate a sequence of words is designed over this space and is used to search for words that are likely to follow the observed sentences. More generally, our approach generates sentences sequentially, depending on the context. We simplify Japanese grammar and implement the simplified grammar as a constraint on the simple sentences to be generated. In our experiments, we created a dictionary containing 2877 different independent words and constructed a semantic space from the texts of eight digitally archived books, which comprise 8495 independent words and 161 paragraphs in total. The experiments demonstrated that the approach can generate several meaningful sentences that are likely to follow input sentences not included in the training texts. |
| |
Keywords: | Natural language processing; context; search algorithm; sentence generation
|
|