Found 20 similar documents (search time: 46 ms)
1.
Indexing and Retrieval of Audio: A Survey (cited by 3; self-citations: 0; citations by others: 3)
With more and more audio being captured and stored, there is a growing need for automatic audio indexing and retrieval techniques that can retrieve relevant audio pieces quickly on demand. This paper provides a comprehensive survey of audio indexing and retrieval techniques. We first describe the main audio characteristics and features and discuss techniques for classifying audio into speech and music based on these features. Indexing and retrieval of speech and of music are then described separately. Finally, the significance of audio in multimedia indexing and retrieval is discussed.
3.
An analytical evaluation of search by content and interaction patterns on multimodal meeting records
It has been suggested that combining content-based indexing with automatically generated temporal metadata might help improve
search and browsing of recordings of computer-mediated collaborative activities such as on-line meetings, which are characterised
by extensive multimodal communication. This paper presents an analytical evaluation of the effectiveness of these techniques
as implemented through automatic speech recognition and temporal mapping. In particular, it assesses the extent to which this strategy can help uncover contextual relationships between audio and
text segments in recorded remote meetings. Results show that even simple temporal mapping can effectively support retrieval
of recorded audio segments, improve retrieval performance in situations where speech recognition alone would have exhibited
prohibitively high word error rates, and provide a basic form of semantic adaptation.
The authors are listed in alphabetical order.
4.
M. Liu, C. Wan, L. Wang 《Soft Computing - A Fusion of Foundations, Methodologies and Applications》2002,6(5):357-364
In recent years, available audio corpora have grown rapidly with the fast-growing Internet and digital libraries. How to
classify and retrieve sound files relevant to the user's interest from large databases is crucial for building multimedia
web search engines. In this paper, content-based technology has been applied to classify and retrieve audio clips using a
fuzzy logic system, which is intuitive due to the fuzzy nature of human perception of audio, especially audio clips with mixed
types. Two features selected from various extracted features are used as input to a constructed fuzzy inference system (FIS).
The outputs of the FIS are two types of hierarchical audio classes. The membership functions and rules are derived from the
distributions of extracted audio features. Speech and music can thus be discriminated by the FIS. Furthermore, female and
male speech can be separated by another FIS, and percussion can be distinguished from other musical instruments. In addition,
we can use multiple FISs to form a “fuzzy tree” for retrieval of more types of audio clips. With this approach, we can classify
and retrieve generic audio more accurately, using fewer features and less computation time, than other existing approaches.
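As a rough illustration of the fuzzy-inference idea in the abstract above, the following Python sketch labels a clip as speech or music from two input features. The feature names (`zcr_variance`, `silence_ratio`), the membership breakpoints, and the two rules are hypothetical placeholders chosen for this example; the paper derives its own features, membership functions, and rules from measured feature distributions.

```python
def rising(x, lo, hi):
    """Membership that ramps linearly from 0 at `lo` to 1 at `hi`."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def falling(x, lo, hi):
    """Complement of `rising`: 1 at `lo`, 0 at `hi`."""
    return 1.0 - rising(x, lo, hi)


def classify_clip(zcr_variance, silence_ratio):
    """Return (label, speech_degree, music_degree) for one clip.

    Both features and all breakpoints below are assumptions made
    for illustration, not values taken from the paper.
    """
    high_zcr = rising(zcr_variance, 0.2, 0.8)
    low_zcr = falling(zcr_variance, 0.2, 0.8)
    high_silence = rising(silence_ratio, 0.1, 0.5)
    low_silence = falling(silence_ratio, 0.1, 0.5)

    # Rule 1: IF zcr_variance is high AND silence_ratio is high THEN speech.
    # Rule 2: IF zcr_variance is low AND silence_ratio is low THEN music.
    # min() plays the role of the fuzzy AND.
    speech = min(high_zcr, high_silence)
    music = min(low_zcr, low_silence)

    # Ties fall back to "music"; a real system would handle them explicitly.
    label = "speech" if speech > music else "music"
    return label, speech, music
```

A "fuzzy tree" in the sense of the abstract could then be approximated by calling a second function of the same shape on clips labeled speech (for example, to separate female from male speech), again with its own assumed features and rules.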
5.
Content-Based Audio Retrieval: Concepts and Methods (cited by 38; self-citations: 1; citations by others: 37)
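A great deal of past research has addressed the retrieval of visual media such as images and video. Audio, however, is also a typical medium in multimedia and a common carrier of information. The conventional approach treats digital audio as an unstructured stream. Yet audio carries speech, contains rich auditory features, and has structural information, so it both needs to be and can be accessed on the basis of this content. Drawing on current progress in related research, this paper surveys content-based audio retrieval methods, including retrieval oriented to speech, music, and audio analysis, as well as audio segmentation; it analyzes and summarizes audio content and its…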
7.
A. Tuerk, S.E. Johnson, P. Jourlin, K. Spärck Jones, P.C. Woodland 《International Journal of Speech Technology》2001,4(3-4):241-250
The Cambridge University Multimedia Document Retrieval (CU-MDR) Demo System is a web-based application that allows the user to query a database of radio broadcasts that are available on the Internet. The audio from several radio stations is downloaded and transcribed automatically. This gives a collection of text and audio documents that can be searched by a user. The paper describes how speech recognition and information retrieval techniques are combined in the CU-MDR Demo System and shows how the user can interact with it.
8.
Traditional content-based music retrieval systems retrieve a specific music object which is similar to what a user has requested.
However, the need exists for the development of category search for the retrieval of a specific category of music objects
which share a common semantic concept. The concept of category search in content-based music retrieval is subjective and dynamic.
Therefore, this paper investigates a relevance feedback mechanism for category search of polyphonic symbolic music based on
semantic concept learning. For the consideration of both global and local properties of music objects, a segment-based music
object modeling approach is presented. Furthermore, in order to discover the user semantic concept in terms of discriminative
features of discriminative segments, a concept learning mechanism based on data mining techniques is proposed to find the
discriminative characteristics between relevant and irrelevant objects. Moreover, three strategies, the Most-Positive, the
Most-Informative, and the Hybrid, to return music objects concerning user relevance judgments are investigated. Finally, comparative
experiments are conducted to evaluate the effectiveness of the proposed relevance feedback mechanism. Experimental results
show that, for a database of 215 polyphonic music objects, 60% average precision can be achieved through the use of the proposed
relevance feedback mechanism.
Fang-Fei Kuo
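The three return strategies named in the abstract above can be sketched under one plausible reading, borrowed from active learning. The assumption here is that each music object has a learned relevance score in [0, 1]: Most-Positive returns the highest-scoring objects, Most-Informative returns the objects whose scores are closest to 0.5 (the ones the learner is least sure about), and Hybrid mixes the two. These definitions, the function names, and the score scale are assumptions for illustration and may differ from the paper's.

```python
def most_positive(scores, k):
    """Return the k object ids with the highest relevance scores."""
    return sorted(scores, key=lambda name: -scores[name])[:k]


def most_informative(scores, k):
    """Return the k object ids whose scores are closest to 0.5."""
    return sorted(scores, key=lambda name: abs(scores[name] - 0.5))[:k]


def hybrid(scores, k):
    """Take k // 2 most-positive ids, then fill up with most-informative ones."""
    picked = most_positive(scores, k // 2)
    for name in most_informative(scores, len(scores)):
        if len(picked) >= k:
            break
        if name not in picked:
            picked.append(name)
    return picked
```

For example, with `scores = {"a": 0.95, "b": 0.55, "c": 0.10, "d": 0.48, "e": 0.80}`, the Most-Positive strategy favors `a` and `e`, while the Most-Informative strategy favors `d` and `b`, whose relevance is least certain and whose user judgments would therefore teach the learner the most.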
12.
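1. Introduction. Faced with an ever-growing volume of information, retrieving the content of interest effectively is crucial. Compared with written reports, news video and audio (including television and radio) are more vivid and more expressive, but they also have drawbacks: large data volumes and difficulty of organization, indexing, and retrieval. This shows mainly in two respects. Text has obvious auxiliary markers such as titles and paragraphs, whereas video and audio do not. General browsing tools offer only simple means such as play, fast-forward, rewind, and drag-to-seek. For video and audio databases that hold tens or hundreds of hours of material and keep growing, this is far from adequate.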
17.
Yasuo Ariki 《New Generation Computing》2000,18(4):341-357
Because of media digitization, a large amount of information such as speech, audio and video data is produced every day.
In order to retrieve data from these databases quickly and precisely, multimedia technologies for structuring and retrieving
speech, audio and video data are strongly required. In this paper, we overview the multimedia technologies that exist today
for a TV news database, such as structuring and retrieval of speech, audio and video data, speaker indexing, audio summarization
and cross-media retrieval. The main purpose of structuring is to produce tables of contents and indices from audio and video data
automatically. In order to make these technologies feasible, first, processing units such as words in audio data and shots
in video data are extracted. In a second step, they are meaningfully integrated into topics. Furthermore, the units extracted
from different types of media are integrated for higher functions.
Yasuo Ariki, Ph.D.: He is a Professor in the Department of Electronics and Informatics at the Ryukoku University. He received his B.E., M.E.
and Ph.D. in information science from Kyoto University in 1974, 1976 and 1979, respectively. He was an Assistant at Kyoto
University from 1980 to 1990, and stayed at Edinburgh University as a visiting academic from 1987 to 1990. His research interests
are in speech and image recognition and in information retrieval and databases. He is a member of IPSJ, IEICE, ASJ, Soc. Artif.
Intel. and IEEE.
18.
George Chang, Gunjan Samtani, Marcus Healey, Franz Kurfess, Jason Wang 《Journal of Systems Integration》2001,10(3):253-267
Information retrieval has evolved from searches of references, to abstracts, to documents. Search on the Web involves search engines that promise to parse full-text and other files: audio, video, and multimedia. With the indexable Web at 320 million pages and growing, difficulties with locating relevant information have become apparent. The most prevalent means for information retrieval relies on syntax-based methods: keywords or strings of characters are presented to a search engine, and it returns all the matches in the available documents. This method is satisfactory and easy to implement, but it has some inherent limitations that make it unsuitable for many tasks. Instead of looking for syntactical patterns, the user often is interested in keyword meaning or the location of a particular word in a title or header. This paper describes some precise search approaches in the environmental domain that locate information according to syntactic criteria, augmented by the utilization of information in a certain context. The main emphasis of this paper lies in the treatment of structured knowledge, where essential aspects of the topic of interest are encoded not only by the individual items, but also by their relationships to one another. Examples of such structured knowledge are hypertext documents, diagrams, and logical and chemical formulae. Benefits of this approach are enhanced precision and approximate search in an already focused, context-specific search engine for the environment: EnviroDaemon.
19.
Web image retrieval using majority-based ranking approach (cited by 1; self-citations: 0; citations by others: 1)
Web image retrieval has characteristics different from typical content-based image retrieval; web images have associated textual cues. However, a web image retrieval system often yields undesirable results, because it uses limited text information such as surrounding text, URLs, and image filenames. In this paper, we propose a new approach to retrieval, which uses the image content of retrieved results without relying on assistance from the user. Our basic hypothesis is that more popular images have a higher probability of being the ones that the user wishes to retrieve. According to this hypothesis, we propose a retrieval approach that is based on a majority of the images under consideration. We define four methods for finding the visual features of the majority of images: (1) majority-first method, (2) centroid-of-all method, (3) centroid-of-top K method, and (4) centroid-of-largest-cluster method. In addition, we implement a graph/picture classifier for improving the effectiveness of web image retrieval. We evaluate the retrieval effectiveness of both our methods and conventional ones by using precision and recall graphs. Experimental results show that the proposed methods are more effective than conventional keyword-based retrieval methods.
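To make the majority-based idea concrete, here is a minimal Python sketch of what a centroid-of-all style re-ranking might look like: the images returned by a keyword search are re-ordered by how close their feature vectors lie to the centroid of all returned images. The function names, the use of plain Euclidean distance, and the toy two-dimensional features are assumptions for illustration; the abstract does not specify the visual features or the distance measure.

```python
import math


def centroid(vectors):
    """Component-wise mean of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def rank_by_centroid(features):
    """Re-rank image ids so those nearest the centroid come first.

    `features` maps an image id to its feature vector. Euclidean
    distance is an assumed choice, not one stated in the abstract.
    """
    center = centroid(list(features.values()))
    return sorted(features, key=lambda image_id: math.dist(features[image_id], center))
```

Under the paper's hypothesis that popular images are the ones the user wants, an image whose features are far from the bulk of the results (an outlier) ends up at the bottom of the ranking. A centroid-of-top K variant could be approximated by computing `centroid` over only the first K keyword-ranked images before sorting.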