Similar literature
Found 10 similar documents; search took 125 ms
1.
This work addresses the soundtrack indexing of multimedia documents. Our purpose is to detect and locate sound units in order to structure the audio data flow in broadcast programs (reports). We present two audio classification tools that we have developed. The first one, a speech/music classification tool, is based on three original features: entropy modulation, stationary segment duration (obtained with a Forward–Backward Divergence algorithm) and number of segments. They are merged with the classical 4 Hz modulation energy. The tool is divided into two classifications (speech/non-speech and music/non-music) and provides more than 90% accuracy for speech detection and 89% for music detection. The other system, a jingle identification tool, uses a Euclidean distance in the spectral domain to index the audio data flow. Results show that it is efficient: among 132 jingles to recognize, we have detected 130. The systems are tested on TV and radio corpora (more than 10 h). They are simple, robust and can be improved on every corpus without training or adaptation.
Régine André-Obrecht
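The jingle identification step described in this abstract (a Euclidean distance in the spectral domain) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the function names `frame_distance` and `find_jingle`, the representation of audio as one spectral vector per frame, the use of a mean per-frame distance, and the threshold value are all hypothetical.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two spectral vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_jingle(stream, jingle, threshold):
    """Return start indices in `stream` where `jingle` matches.

    `stream` and `jingle` are lists of spectral vectors, one per frame.
    A window matches when its mean per-frame distance to the reference
    jingle is at most `threshold` (a hypothetical tuning parameter).
    """
    n = len(jingle)
    if n == 0:
        return []
    hits = []
    for start in range(len(stream) - n + 1):
        total = sum(frame_distance(stream[start + i], jingle[i]) for i in range(n))
        if total / n <= threshold:
            hits.append(start)
    return hits


# Toy data: a two-frame reference jingle and a four-frame stream.
jingle = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]
stream = [[0.2, 0.2, 0.2], [1.0, 0.1, 0.5], [0.0, 0.9, 0.5], [0.3, 0.3, 0.3]]
hits = find_jingle(stream, jingle, threshold=0.2)
```

In this toy example only the window starting at frame 1 is close enough to the reference, so it is the only reported position.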

2.
ONTRACK: Dynamically adapting music playback to support navigation   Total citations: 3, self-citations: 3, citations by others: 0
Listening to music on personal, digital devices whilst mobile is an enjoyable, everyday activity. We explore a scheme for exploiting this practice to immerse listeners in navigation cues. Our prototype, ONTRACK, continuously adapts audio, modifying the spatial balance and volume to lead listeners to their target destination. First we report on an initial lab-based evaluation that demonstrated the approach’s efficacy: users were able to complete tasks within a reasonable time and their subjective feedback was positive. Encouraged by these results we constructed a handheld prototype. Here, we discuss this implementation and the results of field-trials. These indicate that even with a low-fidelity realisation of the concept, users can quite effectively navigate complicated routes.
Matt Jones (Corresponding author)
Steve Jones
Gareth Bradley
Nigel Warren
David Bainbridge
Geoff Holmes
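The idea of shifting the spatial balance toward the target can be sketched as a simple panning rule. This is only an illustration under stated assumptions: the abstract does not give the mapping ONTRACK uses, so the linear pan, the clamp at 90 degrees, and the function names `signed_angle` and `stereo_gains` are all hypothetical.

```python
def signed_angle(heading_deg, bearing_deg):
    """Smallest signed angle from heading to bearing, in (-180, 180].

    Positive values mean the target lies to the listener's right.
    """
    diff = (bearing_deg - heading_deg) % 360.0
    if diff > 180.0:
        diff -= 360.0
    return diff


def stereo_gains(heading_deg, bearing_deg):
    """Return (left_gain, right_gain) that pan audio toward the target.

    Assumed mapping: the pan grows linearly with the angle and saturates
    once the target is 90 degrees or more to one side.
    """
    angle = signed_angle(heading_deg, bearing_deg)
    pan = max(-1.0, min(1.0, angle / 90.0))  # -1 = fully left, +1 = fully right
    left = (1.0 - pan) / 2.0
    right = (1.0 + pan) / 2.0
    return left, right
```

With this rule a listener facing the target hears a centred mix, and the balance moves to one ear as the target drifts to that side.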

3.
In the age of speech and voice recognition technologies, sign language recognition is an essential part of ensuring equal access for deaf people. To date, sign language recognition research has mostly ignored facial expressions that arise as part of a natural sign language discourse, even though they carry important grammatical and prosodic information. One reason is that tracking the motion and dynamics of expressions in human faces from video is a hard task, especially with the high number of occlusions from the signers’ hands. This paper presents a 3D deformable model tracking system to address this problem, and applies it to sequences of native signers, taken from the National Center of Sign Language and Gesture Resources (NCSLGR), with a special emphasis on outlier rejection methods to handle occlusions. The experiments conducted in this paper validate the output of the face tracker against expert human annotations of the NCSLGR corpus, demonstrate the promise of the proposed face tracking framework for sign language data, and reveal that the tracking framework picks up properties that ideally complement human annotations for linguistic research.
Christian Vogler (Corresponding author)
Siome Goldenstein

4.
We present an enhancement towards adaptive video training for PhoneGuide, a digital museum guidance system for ordinary camera-equipped mobile phones. It enables museum visitors to identify exhibits by capturing photos of them. In this article, a combined solution of object recognition and pervasive tracking is extended to a client–server-system for improving data acquisition and for supporting scale-invariant object recognition. A static as well as a dynamic training technique are presented that preprocess the collected object data differently and apply two types of neural networks (NN) for classification. Furthermore, the system enables a temporal adaptation for ensuring a continuous data acquisition to improve the recognition rate over time. A formal field experiment reveals current recognition rates and indicates the practicability of both methods under realistic conditions in a museum.
Erich Bruns
Oliver Bimber (Corresponding author)

5.
Finding maximum-length repeating patterns in music databases   Total citations: 1, self-citations: 0, citations by others: 1
This paper introduces the problem of discovering maximum-length repeating patterns in music objects. A novel algorithm is presented for the extraction of this kind of pattern from a melody music object. The proposed algorithm discovers all maximum-length repeating patterns using an “aggressive” accession during searching, by avoiding costly repetition frequency calculation and by examining as few repeating patterns as possible in order to reach the maximum-length repeating pattern(s). Detailed experimental results illustrate the significant performance gains due to the proposed algorithm, compared to an existing baseline algorithm.
Yannis Manolopoulos (Corresponding author)
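To make the problem statement concrete, the sketch below finds the maximum-length repeating patterns of a melody by brute force. It is not the algorithm proposed in the paper, which is designed to avoid exactly this kind of exhaustive counting. The function name `max_length_repeats` and the representation of a melody as a sequence of note names are assumptions, and occurrences are allowed to overlap here.

```python
from collections import Counter


def max_length_repeats(melody):
    """Return (length, patterns) for the longest patterns that occur at
    least twice in `melody`; (0, []) when nothing repeats.

    Brute-force baseline: try lengths from longest to shortest and count
    every window of that length.
    """
    notes = tuple(melody)
    for length in range(len(notes) - 1, 0, -1):
        counts = Counter(
            notes[i:i + length] for i in range(len(notes) - length + 1)
        )
        repeated = sorted(p for p, c in counts.items() if c >= 2)
        if repeated:
            return length, repeated
    return 0, []


result = max_length_repeats(["C", "D", "E", "C", "D", "E", "G"])
```

For the toy melody above, the only maximum-length repeating pattern is C, D, E, which has length 3.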

6.
In this paper, we propose an innovative architecture to segment a news video into so-called “stories” by using both the video and the audio information it contains. Segmentation of news into stories is one of the key issues for achieving efficient treatment of news-based digital libraries. While the relevance of this research problem is widely recognized in the scientific community, there are only a few established solutions in the field. In our approach, the segmentation is performed in two steps: first, shots are classified by combining three different anchor shot detection algorithms using video information only. Then, the shot classification is improved by using a novel anchor shot detection method based on features extracted from the audio track. Tests on a large database confirm that the proposed system outperforms each single video-based method as well as their combination.
Mario Vento

7.
This paper describes security and privacy issues for multimedia database management systems. Multimedia data includes text, images, audio and video. The paper describes access control for multimedia database management systems, together with security policies and security architectures for such systems. Privacy problems that result from multimedia data mining are also discussed.
Bhavani Thuraisingham

8.
In this paper, we study the performance improvement that can be obtained by combining classifiers based on different notions (each trained using a different physicochemical property of amino acids). This multi-classifier has been tested on three problems: HIV-protease; recognition of T-cell epitopes; predictive vaccinology. We propose a multi-classifier that combines a classifier that approaches the problem as a two-class pattern recognition problem and a method based on a one-class classifier. Several classifiers combined with the “sum rule” enable us to obtain improved performance over the best results previously published in the literature.
Loris Nanni
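The “sum rule” mentioned in this abstract is a standard way to fuse classifiers: the per-class scores of all classifiers are added and the class with the largest total wins. The sketch below shows only that fusion step, under stated assumptions: the function name `sum_rule` is hypothetical, the scores are made up for illustration, and each classifier is assumed to output comparable scores (for example posterior estimates) in the same class order.

```python
def sum_rule(score_lists):
    """Fuse classifiers by summing their scores class by class.

    `score_lists` holds one list of per-class scores per classifier.
    Returns (index of the winning class, list of summed scores).
    """
    if not score_lists:
        raise ValueError("at least one classifier is required")
    n_classes = len(score_lists[0])
    totals = [0.0] * n_classes
    for scores in score_lists:
        if len(scores) != n_classes:
            raise ValueError("all classifiers must score the same classes")
        for k, s in enumerate(scores):
            totals[k] += s
    best = max(range(n_classes), key=lambda k: totals[k])
    return best, totals


# Three hypothetical classifiers scoring two classes.
best, totals = sum_rule([[0.6, 0.4], [0.3, 0.7], [0.45, 0.55]])
```

Here the first classifier alone would pick class 0, but the summed scores favour class 1.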

9.
The map-seeking circuit algorithm (MSC) was developed by Arathorn to efficiently solve the combinatorial problem of correspondence maximization, which arises in applications like computer vision, motion estimation, image matching, and automatic speech recognition (Arathorn, D.W. in Map-Seeking Circuits in Visual Cognition: A Computational Mechanism for Biological and Machine Vision, Stanford University Press, Stanford, 2002). Given an input image, a template image, and a discrete set of transformations, the goal is to find a composition of transformations which gives the best fit between the transformed input and the template. We embed the associated combinatorial search problem within a continuous framework by using superposition, and we analyze the resulting constrained optimization problem. We present several numerical schemes to compute local solutions, and we compare their performance on a pair of test problems: an image matching problem and the challenging problem of automatically solving a Rubik’s cube.
C. R. Vogel

10.
Making SVMs Scalable to Large Data Sets using Hierarchical Cluster Indexing   Total citations: 2, self-citations: 0, citations by others: 2
Support vector machines (SVMs) have been promising methods for classification and regression analysis due to their solid mathematical foundations, which include two desirable properties: margin maximization and nonlinear classification using kernels. However, despite these prominent properties, SVMs are usually not chosen for large-scale data mining problems because their training complexity is highly dependent on the data set size. Unlike traditional pattern recognition and machine learning, real-world data mining applications often involve huge numbers of data records. Thus it is too expensive to perform multiple scans of the entire data set, and it is also infeasible to put the data set in memory. This paper presents a method, Clustering-Based SVM (CB-SVM), that maximizes the SVM performance for very large data sets given a limited amount of resources, e.g., memory. CB-SVM applies a hierarchical micro-clustering algorithm that scans the entire data set only once to provide an SVM with high-quality samples. These samples carry statistical summaries of the data and maximize the benefit of learning. Our analyses show that the training complexity of CB-SVM is quadratically dependent on the number of support vectors, which is usually much smaller than the size of the entire data set. Our experiments on synthetic and real-world data sets show that CB-SVM is highly scalable for very large data sets and very accurate in terms of classification. A preliminary version of the paper, “Classifying Large Data Sets Using SVM with Hierarchical Clusters”, by H. Yu, J. Yang, and J. Han, appeared in Proc. 2003 Int. Conf. on Knowledge Discovery in Databases (KDD'03), Washington, DC, August 2003. However, this submission substantially extends the previous paper and contains new, major value-added technical contributions in comparison with the conference publication.
Hwanjo Yu (Corresponding author)
Jiong Yang
Jiawei Han
Xiaolei Li
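The single-scan summarization idea in this abstract can be illustrated with a much simplified sketch. It is not CB-SVM: the clustering below is flat rather than hierarchical, no SVM is trained, and the class `MicroCluster`, the function `summarize`, and the `radius` parameter are hypothetical names chosen for illustration. Each micro-cluster keeps only a count and a linear sum, so the raw points never need to be held in memory together.

```python
import math


class MicroCluster:
    """Statistical summary of a group of points: count and linear sum."""

    def __init__(self, point):
        self.n = 1
        self.linear_sum = list(point)

    def centroid(self):
        return [s / self.n for s in self.linear_sum]

    def add(self, point):
        self.n += 1
        self.linear_sum = [s + x for s, x in zip(self.linear_sum, point)]


def summarize(points, radius):
    """Scan `points` once, absorbing each point into the nearest
    micro-cluster whose centroid is within `radius`, else starting a new one.
    """
    clusters = []
    for p in points:
        nearest, best = None, radius
        for c in clusters:
            d = math.dist(c.centroid(), p)
            if d <= best:
                nearest, best = c, d
        if nearest is None:
            clusters.append(MicroCluster(p))
        else:
            nearest.add(p)
    return clusters


clusters = summarize([(0.0, 0.0), (0.2, 0.0), (5.0, 5.0), (5.2, 5.0)], radius=1.0)
```

In this toy run the four points collapse into two summaries of two points each, and the centroids of such summaries are the kind of representative samples a downstream classifier could be trained on.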
