Latent topic model for audio retrieval |
| |
Authors: | Pengfei Hu, Wenju Liu, Wei Jiang, Zhanlei Yang |
| |
Affiliation: | National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences, Intelligence Building 1403, Zhongguancun East Road 95, Haidian District, Beijing 100190, China |
| |
Abstract: | Latent topic models such as Latent Dirichlet Allocation (LDA) were designed for text processing and have also proven successful in audio-related tasks. LDA assumes that the words of each document arise from a mixture of topics, each of which is a multinomial distribution over the vocabulary. To apply the original LDA to continuous data, word-like units must first be generated by vector quantization (VQ), and this discretization usually results in information loss. To overcome this shortcoming, this paper introduces a new topic model, Gaussian-LDA, for audio retrieval. The proposed model uses a continuous emission probability: a Gaussian distribution replaces the multinomial. It therefore skips vector quantization and directly models each topic as a Gaussian distribution over audio features, which avoids discretization and integrates the clustering step into the model. Audio retrieval experiments demonstrate that Gaussian-LDA achieves better performance than the other compared methods. |
| |
Keywords: | Topic model; LDA; Gaussian distribution; Audio retrieval |
This article is indexed in ScienceDirect and other databases. |
|
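
The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the symmetric Dirichlet prior `alpha`, and the per-topic `means` and `covs` are assumptions introduced here for illustration. The first function draws one audio "document" in which each frame is emitted by a Gaussian topic; the second shows the vector quantization step that the original LDA would need, where each continuous frame is collapsed to the index of its nearest codeword.

```python
import numpy as np


def sample_gaussian_lda_document(n_frames, alpha, means, covs, rng):
    """Draw one document from a topic model with Gaussian emissions.

    Hypothetical helper: means has shape (K, D), covs has shape (K, D, D).
    """
    n_topics = means.shape[0]
    # Per-document topic proportions, as in standard LDA.
    theta = rng.dirichlet(np.full(n_topics, alpha))
    # One topic assignment per audio frame.
    topics = rng.choice(n_topics, size=n_frames, p=theta)
    # Each frame is a continuous feature vector drawn from its topic's
    # Gaussian, instead of a discrete word drawn from a multinomial.
    frames = np.stack(
        [rng.multivariate_normal(means[z], covs[z]) for z in topics]
    )
    return theta, topics, frames


def vector_quantize(frames, codebook):
    """Map each frame to the index of its nearest codeword (Euclidean)."""
    sq_dist = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return sq_dist.argmin(axis=1)


rng = np.random.default_rng(0)
means = np.array([[0.0, 0.0], [5.0, 5.0]])
covs = np.stack([np.eye(2), np.eye(2)])
theta, topics, frames = sample_gaussian_lda_document(50, 0.5, means, covs, rng)
# Quantizing against the topic means discards everything about a frame
# except which codeword it is closest to.
words = vector_quantize(frames, codebook=means)
```

After quantization, two frames that fall in the same cell become the same "word", which is the information loss the abstract refers to. The Gaussian emission keeps the original feature vectors in the model.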