Toward semantic indexing and retrieval using hierarchical audio models |
| |
Authors: | Wei-Ta Chu, Wen-Huang Cheng, Jane Yung-Jen Hsu, Ja-Ling Wu |
| |
Affiliation: | (1) Department of Computer Science and Information Engineering, National Taiwan University, No. 1, Sec. 4, Roosevelt Road, Taipei, Taiwan, 106; (2) Graduate Institute of Networking and Multimedia, National Taiwan University, No. 1, Sec. 4, Roosevelt Road, Taipei, Taiwan, 106; (3) Department of Computer Science and Information Engineering; Graduate Institute of Networking and Multimedia, National Taiwan University, No. 1, Sec. 4, Roosevelt Road, Taipei, Taiwan, 106 |
| |
Abstract: | Semantic-level content analysis is a crucial issue in achieving efficient content retrieval and management. We propose a hierarchical
approach that models the statistical characteristics of audio events over a time series to accomplish semantic context detection.
Two stages, audio event and semantic context modeling, are devised to bridge the semantic gap between physical audio features
and semantic concepts. In this work, hidden Markov models (HMMs) are used to model four representative audio events, i.e.,
gunshot, explosion, engine, and car-braking, in action movies. At the semantic-context level, Gaussian mixture models (GMMs)
and ergodic HMMs are investigated to fuse the characteristics and correlations between various audio events. They provide
cues for detecting gunplay and car-chasing scenes, the two semantic contexts we focus on in this work. The promising experimental
results demonstrate the effectiveness of the proposed approach and show that the proposed framework provides a foundation
for semantic indexing and retrieval. Moreover, the two fusion schemes are compared, and the relations between audio events and
semantic contexts are studied. |
| |
Keywords: | Audio event; Semantic context; Semantic gap; Hidden Markov model; Gaussian mixture model
This article is indexed in SpringerLink and other databases. |
|