Learning word meanings and grammar for verbalization of daily life activities using multilayered multimodal latent Dirichlet allocation and Bayesian hidden Markov models |
| |
Authors: | Muhammad Attamimi, Yuji Ando, Tomoaki Nakamura, Takayuki Nagai, Daichi Mochihashi, Ichiro Kobayashi |
| |
Affiliation: | 1. Department of Mechanical Engineering and Intelligent Systems, The University of Electro-Communications, Chofu-shi, Japan; 2. Department of Mathematical Analysis and Statistical Inference, Institute of Statistical Mathematics, Tachikawa, Japan; 3. Department of Information Sciences, Faculty of Sciences, Ochanomizu University, Bunkyo-ku, Japan. |
| |
Abstract: | Intelligent systems need to understand and respond to human words in order to interact with humans in a natural way. Several studies have attempted to realize these abilities by investigating the symbol grounding problem. For example, we proposed multilayered multimodal latent Dirichlet allocation (mMLDA) to enable the formation of various concepts and inference using grounded concepts. We previously reported on connecting words to various hierarchical concepts and proposed a simple preliminary algorithm for generating sentences. This paper proposes a novel method that enables a sensing system to verbalize an everyday scene it observes. The method combines mMLDA with Bayesian hidden Markov models (BHMM), and the proposed algorithm improves on the word inference of our previous work. The advantage of our approach is that grammar learning based on BHMM not only boosts concept selection results but also enables our method to process function words. The proposed verbalization algorithm produces results that are far superior to those of previous methods. Finally, we developed a system to obtain multimodal data from everyday human activities. We evaluate language learning and sentence generation as a complete process under this realistic setting. The results demonstrate the effectiveness of our method. |
| |
Keywords: | Multimodal categorization, unsupervised learning, symbol grounding, language acquisition, sentence generation |