Similar literature
1.
The rapid growth of digital audio collections, which span the formats, types, durations, and other parameters found in the digital multimedia world, demands a generic framework for robust and efficient indexing and retrieval based on aural content. Moreover, from the content-based multimedia retrieval point of view, the audio information can be even more important than the visual part, as it is mostly unique and significantly stable over the entire duration of the content. A generic and robust audio-based multimedia indexing and retrieval framework, which has been developed and tested under the MUVIS system, is presented. This framework supports the dynamic integration of audio feature extraction modules during the indexing and retrieval phases and therefore provides a test-bed platform for developing robust and efficient aural feature extraction techniques. Furthermore, the proposed framework is built on high-level content classification and segmentation in order to improve the speed and accuracy of aural retrievals. Both theoretical and experimental results are finally presented, including comparative measures of retrieval performance with respect to the visual counterpart.

2.
Models for motion-based video indexing and retrieval   Total citations: 9, self-citations: 0, citations by others: 9
With the rapid proliferation of multimedia applications that require video data management, it is becoming more desirable to provide video data indexing techniques capable of representing the rich semantics in video data. In real-time applications, the need for efficient query processing is another reason to use such techniques. We present models that use object motion information to characterize events and allow subsequent retrieval. Algorithms for different spatiotemporal search cases, in terms of spatial and temporal translation and scale invariance, have been developed using various signal and image processing techniques. We have developed a prototype video search engine, PICTURESQUE (pictorial information and content transformation unified retrieval engine for spatiotemporal queries), to verify the proposed methods. Development of such technology will make possible true multimedia search engines that index and search digital video data based on its true content.

3.
4.
Content based video indexing and retrieval   Total citations: 3, self-citations: 0, citations by others: 3
Video management tools and techniques are based on pixels rather than perceived content. Thus, state-of-the-art video editing systems can easily manipulate such things as time codes and image frames, but they cannot “know,” for example, what a basketball is. Our research addresses four areas of content-based video management

5.
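Content-based audio retrieval has high practical value. This work applies models trained in quiet conditions to real environments with background noise. Various speech enhancement techniques are analyzed and, after a performance comparison, spectral subtraction is selected as the front-end noise-robustness technique of the system. A noise-robust audio retrieval system is presented that cascades audio enhancement with audio retrieval, together with an improved spectral subtraction algorithm suited to the system.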
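
As a rough illustration of the spectral subtraction front end named in item 5, the sketch below applies basic magnitude-domain subtraction to a single frame. It is not the improved algorithm from that paper, whose details the abstract does not give. The function name `spectral_subtract`, the over-subtraction factor `alpha`, and the spectral floor `floor` are all assumptions introduced here for illustration.

```python
def spectral_subtract(noisy_mag, noise_mag, alpha=1.0, floor=0.01):
    """Subtract a noise magnitude estimate from a noisy magnitude spectrum.

    noisy_mag, noise_mag: per-frequency-bin magnitudes of one frame (same length).
    alpha: over-subtraction factor (hypothetical parameter for this sketch).
    floor: spectral floor as a fraction of the noisy magnitude; it keeps
           every bin non-negative after subtraction.
    """
    if len(noisy_mag) != len(noise_mag):
        raise ValueError("spectra must have the same number of bins")
    cleaned = []
    for y, n in zip(noisy_mag, noise_mag):
        # Subtract the scaled noise estimate, but never go below the floor.
        cleaned.append(max(y - alpha * n, floor * y))
    return cleaned


# Toy frame: the noise estimate would normally be averaged over silent frames.
noisy = [4.0, 2.0, 1.0, 0.5]
noise = [1.0, 1.0, 1.0, 1.0]
enhanced = spectral_subtract(noisy, noise)
```

In a cascaded system of the kind the abstract describes, the enhanced magnitudes would be recombined with the noisy phase and passed on to the retrieval stage.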

6.
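Content-based audio retrieval has high practical value. This work applies models trained in quiet conditions to real environments with background noise. Various speech enhancement techniques are analyzed and, after a performance comparison, spectral subtraction is selected as the front-end noise-robustness technique of the system. A noise-robust audio retrieval system is presented that cascades audio enhancement with audio retrieval, together with an improved spectral subtraction algorithm suited to the system.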

7.
Frequency layered color indexing for content-based image retrieval   Total citations: 1, self-citations: 0, citations by others: 1
Image patches of different spatial frequencies are likely to have different perceptual significance and to reflect different physical properties. Incorporating this concept helps the development of more effective image retrieval techniques. We introduce a method that separates an image into layers, each of which retains only pixels in areas with similar spatial frequency characteristics, and uses simple low-level features to index the layers individually. The scheme associates indexing features with perceptual and physical significance, thus implicitly incorporating high-level knowledge into low-level features. We present a computationally efficient implementation of the method, which enhances the power of basic color indexing while retaining its simplicity and elegance. Experimental results are presented to demonstrate the effectiveness of the method.

8.
This paper presents a brief overview of recent approaches to two problems in music information retrieval: query by example and automated source separation. It describes the challenges inherent in musical query by example and the systems that take very different approaches to the problem. The paper also explores approaches to source separation in a musical context, focusing on systems that take distinct approaches to the problem. Finally, it comments on future directions for digital signal processing (DSP) research in the context of music information retrieval.

9.
The ease of capturing and encoding digital images has produced a massive amount of visual information online. As a consequence, grand challenges have emerged in the storage, indexing, and retrieval of visual information in large archives. How does one find a photograph in a large archive that contains millions of pictures? How does a CNN video journalist find a specific clip among the myriad of video tapes, ranging from historical to contemporary, from sports to humanities? Efficient, real-time algorithms and systems are needed to address the needs not only of professionals but also of users who want to find visual information online.

10.
Robust coding schemes for indexing and retrieval from large face databases   Total citations: 10, self-citations: 0, citations by others: 10
This paper introduces two new coding schemes, probabilistic reasoning models (PRM) and enhanced FLD (Fisher linear discriminant) models (EFM), for indexing and retrieval of large image databases with applications to face recognition. The unifying theme of the new schemes is that of lowering the space dimension ("data compression") subject to increased fitness for the discrimination index.

11.
The primary advances in speech and audio signal processing that contributed to the maturing of multimedia applications are discussed in the areas of speech and audio signal compression, speech synthesis, acoustic processing and echo control, and network echo cancellation.

12.
In our previous work, illumination invariant object recognition was achieved by normalizing the three color bands. We further employed the compressed histogram of the chromaticity to arrive at a valuable representation of an object that facilitates high retrieval accuracy. The first shortcoming of this method lies in its use of a uniform quantization scheme to obtain the chromaticity, which does not agree with the perception of the human visual system. In this paper, we develop an approach using the CIE UCS transform to circumvent this problem. Second, instead of using uncompressed images to achieve illumination invariant indexing and retrieval, we carry out our indexing process directly in the DCT domain by using several coefficients from each macro-block. Third, in light of the special properties of the normalized chromaticity histogram frames, which are the foundation of the ensuing low-pass filtering, an additional step is inserted to render each frame smoother, resulting in better data reduction. Fourth, to facilitate efficient retrieval during the data query phase, which is of utmost importance in digital libraries, the 36-dimensional model vectors that index the model images in digital libraries are clustered using vector quantization techniques. This clustering strategy reduces the search space by an order of magnitude. Desirable results have been observed in our experiments using the proposed color-object-indexing/retrieval algorithm.
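
To make the idea of chromaticity in item 12 concrete, the sketch below computes the standard (r, g) chromaticity of one RGB pixel by dividing by the band sum. This is a textbook normalization that removes overall intensity scaling; the abstract does not spell out the exact band normalization the authors use, so the function `chromaticity` and its zero-pixel handling are assumptions made only for this example.

```python
def chromaticity(r, g, b):
    """Return the (r, g) chromaticity of an RGB pixel.

    Each band is divided by the band sum, so scaling all three bands by
    the same factor (a brighter or darker light of the same color) leaves
    the result unchanged. A black pixel is mapped to (0.0, 0.0) by
    convention in this sketch.
    """
    total = r + g + b
    if total == 0:
        return (0.0, 0.0)
    return (r / total, g / total)


dim = chromaticity(50, 100, 50)
bright = chromaticity(100, 200, 100)  # same surface, twice the intensity
```

Both calls yield the same pair, which is the property that lets a histogram over chromaticity values serve as an intensity-independent index.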

13.
In this paper, a novel study on system profiles and the adaptation of parameters for end-users of content-based indexing and retrieval (CBIR) applications is presented. The main objective of the study is to improve overall CBIR application performance on hardware platforms with different technical capabilities and conditions. We define CBIR system profiles in terms of hardware and system platform attributes and propose CBIR parameters for each profile. Hence, the study consists of two main parts: system profiling, and adaptation of indexing and retrieval parameters for each profile. The proposed CBIR parameters are configurations intended for optimal CBIR use on each platform. The proposed parameters for each system profile are assessed over a large set of experiments. Experimental studies show that the proposed parameters for each system profile give satisfactory semantic retrieval performance with reduced computational complexity and storage requirements. An improvement of 45 to 78% in the computational complexity of the retrieval process is achieved, depending on the profile.

14.
This paper describes an original approach for content-based video indexing and retrieval. We aim to provide a global interpretation of the dynamic content of video shots without any prior motion segmentation and without any use of dense optic flow fields. To this end, we exploit the spatio-temporal distribution, within a shot, of appropriate local motion-related measurements derived from the spatio-temporal derivatives of the intensity function. These distributions are then represented by causal Gibbs models. To be independent of camera movement, the motion-related measurements are computed in the image sequence generated by compensating for the estimated dominant image motion in the original sequence. The statistical modeling framework considered makes it feasible to compute exactly the conditional likelihood that a video shot belongs to a given motion class or, more generally, to an activity class. This property allows us to develop a general statistical framework for video indexing and retrieval with query-by-example. We build a hierarchical structure of the processed video database according to motion content similarity. This results in a binary tree in which each node is associated with an estimated causal Gibbs model. We consider a similarity measure inspired by the Kullback-Leibler divergence. Then, retrieval with query-by-example is performed through this binary tree using the maximum a posteriori (MAP) criterion. We have obtained promising results on a set of various real image sequences.
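
The similarity measure in item 14 is described only as inspired by the Kullback-Leibler divergence. As a hedged illustration, the sketch below computes a symmetrized KL divergence between two discrete distributions, such as normalized histograms of motion measurements. The paper applies its measure to causal Gibbs models, which this example does not reproduce. The names `kl_divergence` and `symmetric_kl` and the smoothing constant `eps` are assumptions.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL divergence D(p || q) for two discrete distributions.

    p, q: sequences of probabilities of equal length, each summing to 1.
    eps: small constant (an assumption of this sketch) that avoids
         division by zero when q has empty bins.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    # Bins with zero probability in p contribute nothing to the sum.
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)


def symmetric_kl(p, q):
    """Symmetrized divergence: D(p || q) + D(q || p)."""
    return kl_divergence(p, q) + kl_divergence(q, p)


shot_a = [0.7, 0.2, 0.1]
shot_b = [0.1, 0.2, 0.7]
distance = symmetric_kl(shot_a, shot_b)
```

Symmetrizing matters for retrieval because plain KL divergence depends on argument order, whereas a query-to-model distance is usually expected to be the same in both directions.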

15.
Traditional World Wide Web search engines, such as AltaVista.com, index and recommend individual Web pages to assist users in locating relevant documents. As the Web grows, however, the number of matching pages increases at a tremendous rate. Users are often overwhelmed by the large answer set recommended by the search engines. Also, if a matching document is a hypertext, the document structure is destroyed and the individual pages that compose the document are returned instead. The logical starting point of the hyperdocument is thus hidden among the large basket of matching pages. Users must spend a lot of effort browsing through the pages to locate the starting point, which is a very time-consuming process. This paper studies the anchor point indexing problem. The set of anchor points of a given user query is a small set of key pages from which the larger set of documents relevant to the query can be easily reached. Anchor points help solve the problems of huge answer sets and low precision suffered by most search engines, both by taking the hyperlink structure of the relevant documents into account and by providing a summary view of the result set.

16.
In the future, the world of telecommunications will be vastly different from what it is today. The driving force will be the seamless integration of real-time communications (e.g., voice, video, music) and data into a single network, with ubiquitous access to that network anywhere, anytime, and by a wide range of devices. The only currently available ubiquitous access device to the network is the telephone, and the only ubiquitous user access mode is spoken voice commands and natural language dialogues with machines. In the future, new access devices and modes will augment speech in this role, but they are unlikely to supplant the telephone and access by speech anytime soon. Speech technologies have progressed to the point where they are now viable for a broad range of communications services, including: compression of speech for use over wired and wireless networks; speech synthesis, recognition, and understanding for dialogue access to information, people, and messaging; and speaker verification for secure access to information and services. The paper provides brief overviews of these technologies, discusses some of the unique properties of wireless, plain old telephone service, and Internet protocol networks that make voice communication and control problematic, and describes the types of voice services available in the past and today, as well as those that we foresee becoming available over the next several years.

17.
Today, biomedical media data are being generated at rates unimaginable only years ago. Content-based retrieval of biomedical media from large databases is becoming increasingly important to clinical, research, and educational communities. In this paper, we present the recently developed entropy balanced statistical (EBS) k-d tree and its applications to biomedical media, including a high-resolution computed tomography (HRCT) lung image database and the first real-time protein tertiary structure search engine. Our index utilizes statistical properties inherent in large-scale biomedical media databases for efficient and accurate searches. By applying concepts from pattern recognition and information theory, the EBS k-d tree is built through top-down decision tree induction. Experiments show that similarity searches against a protein structure database of 53 363 structures consistently execute in less than 8.14 ms for the top 100 most similar structures. Additionally, we have shown improved retrieval precision over adaptive and statistical k-d trees. Retrieval precision of the EBS k-d tree is 81.6% for content-based retrieval of HRCT lung images and 94.9% at 10% recall for protein structure similarity search. The EBS k-d tree has enormous potential for use in biomedical applications embedded with ground-truth knowledge and multidimensional signatures.

18.
Image indexing and retrieval using expressive fuzzy description logics   Total citations: 2, self-citations: 0, citations by others: 2
The effective management and exploitation of multimedia documents requires the extraction of the underlying semantics. Multimedia analysis algorithms can produce fairly rich, though imprecise, information about a multimedia document, which most of the time remains unexploited. In this paper we propose a methodology for semantic indexing and retrieval of images, based on techniques of image segmentation and classification combined with fuzzy reasoning. In the proposed knowledge-assisted analysis architecture, a segmentation algorithm first generates a set of over-segmented regions. A region classification process is then employed to assign semantic labels with a confidence degree and simultaneously merge regions based on their semantic similarity. This information comprises the assertional component of a fuzzy knowledge base, which is used to refine mistakenly classified regions and to extract rich implicit knowledge used for global image classification. This knowledge about images is stored in a semantic repository that permits image retrieval and ranking. This research was supported by the European Commission under contract FP6-027026 K-SPACE.

19.
Content-based classification, search, and retrieval of audio   Total citations: 5, self-citations: 0, citations by others: 5
Many audio and multimedia applications would benefit from the ability to classify and search for audio based on its characteristics. The audio analysis, search, and classification engine described here reduces sounds to perceptual and acoustical features. This lets users search or retrieve sounds by any one feature or a combination of them, by specifying previously learned classes based on these features, or by selecting or entering reference sounds and asking the engine to retrieve similar or dissimilar sounds.

20.
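1 Introduction. Electronics has long since entered the digital era, yet power amplifiers still follow the traditional analog design approach. That technology is quite mature, almost at the limit of what analog techniques allow, and seems at risk of falling behind the digital era. However, the extensive theoretical research on digital power amplifiers has still not broken away from the traditional analog approach. Evidently, the traditional analog power amplifier will continue to shine for a considerable time thanks to its mature technology. 2 Voltage amplification in Hi-Fi power amplifiers. The purpose of a power amplifier is to take the signal from the sound source (about 1 V) and, through voltage amplification and current amplification, produce enough output power to drive the loudspeaker. The voltage amplification section of a power amplifier generally consists of 1 to 3 voltage amplifier stages, whose linearity, dynamic characteristics, and so on directly affect the power amplifier's …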
