Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
Knowledge management has become a challenge for almost all e-government applications, where the efficient processing of large amounts of data is still a critical issue. In recent years, semantic techniques have been introduced to improve the fully automatic digitalization of documents, in order to facilitate access to the information embedded in very large document repositories. In this paper, we present a novel model for multimedia digital documents that aims to improve the effectiveness of digitalization activities within an information system supporting e-government organizations. To the best of our knowledge, the proposed model is one of the first attempts to give a single, unified characterization of multimedia documents managed by e-government applications, in which semantic procedures and multimedia facilities are used to transform unstructured documents into structured information. Furthermore, we define an architecture for managing the multimedia document “life cycle”, which provides advanced functionality for information extraction, semantic retrieval, indexing, storage, and presentation, together with long-term preservation. Preliminary experiments concerning an e-health scenario are finally presented and discussed.

2.
In this paper we present a framework for unified, personalized access to heterogeneous multimedia content in distributed repositories. Focusing on semantic analysis of multimedia documents, metadata, user queries and user profiles, it contributes to the bridging of the gap between the semantic nature of user queries and raw multimedia documents. The proposed approach utilizes as input visual content analysis results, as well as analyzes and exploits associated textual annotation, in order to extract the underlying semantics, construct a semantic index and classify documents to topics, based on a unified knowledge and semantics representation model. It may then accept user queries, and, carrying out semantic interpretation and expansion, retrieve documents from the index and rank them according to user preferences, similarly to text retrieval. All processes are based on a novel semantic processing methodology, employing fuzzy algebra and principles of taxonomic knowledge representation. The first part of this work presented in this paper deals with data and knowledge models, manipulation of multimedia content annotations and semantic indexing, while the second part will continue on the use of the extracted semantic information for personalized retrieval.
Stefanos Kollias

3.
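Research Progress on Multimedia Semantic Models   Total citations: 1, self-citations: 0, citations by others: 1
Multimedia semantics research is the core and key problem in the fields of multimedia data processing and multimedia information services. The semantic problem of multimedia data originates in the way multimedia data are acquired; at the application stage, this problem becomes a major bottleneck restricting the use and authoring of multimedia data. Semantic model research is the focus of multimedia semantics research. It summarizes and abstracts the multimedia data processing workflow, and in essence it studies the semantic problems across the whole life cycle of multimedia data. This paper reviews recent research progress on multimedia semantic models in three areas: content description, semantic representation, and data retrieval.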

4.

Since its invention, the Web has evolved into the largest multimedia repository that has ever existed. This evolution is a direct result of the explosion of user-generated content, explained by the wide adoption of social network platforms. The vast amount of multimedia content requires effective management and retrieval techniques. Nevertheless, Web multimedia retrieval is a complex task because users commonly express their information needs in semantic terms, but expect multimedia content in return. This dissociation between semantics and content of multimedia is known as the semantic gap. To solve this, researchers are looking beyond content-based or text-based approaches, integrating novel data sources. New data sources can consist of any type of data extracted from the context of multimedia documents, defined as the data that is not part of the raw content of a multimedia file. The Web is an extraordinary source of context data, which can be found in explicit or implicit relation to multimedia objects, such as surrounding text, tags, hyperlinks, and even in relevance-feedback. Recent advances in Web multimedia retrieval have shown that context data has great potential to bridge the semantic gap. In this article, we present the first comprehensive survey of context-based approaches for multimedia information retrieval on the Web. We introduce a data-driven taxonomy, which we then use in our literature review of the most emblematic and important approaches that use context-based data. In addition, we identify important challenges and opportunities, which had not been previously addressed in this area.


5.
The storage and retrieval of multimedia data is a crucial problem in multimedia information systems due to the huge storage requirements. It is necessary to provide an efficient methodology for indexing multimedia data for rapid retrieval. The aim of this paper is to introduce a methodology to represent, simplify, store, retrieve and reconstruct an image from a repository. An algebraic representation of the spatio-temporal relations present in a document is constructed from an equivalent graph representation and used to index the document. We use this representation to simplify and later reconstruct the complete index. This methodology has been tested by implementing a prototype system called Simplified Modeling to Access and ReTrieve multimedia information (SMART). Experimental results show that the complexity of an index of a 2D document is O(n(n−1)/k) with k ≥ 2, as opposed to the O(n(n−1)/2) known so far. Since k depends on the number of objects in an image, more complex documents have lower overall complexity.
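To make the two bounds in this abstract concrete, here is a small sketch. It assumes that the n(n−1)/2 figure counts the unordered pairs of objects whose spatial relation an index would store; that reading is an interpretation, not something the abstract states. The function names are hypothetical, and how SMART actually derives k is not described here.

```python
from itertools import combinations


def pairwise_relation_count(n_objects: int) -> int:
    """Count unordered object pairs, i.e. n(n-1)/2 (assumed meaning of the baseline bound)."""
    return sum(1 for _ in combinations(range(n_objects), 2))


def reduced_index_bound(n_objects: int, k: int) -> float:
    """Evaluate n(n-1)/k for a given k >= 2 (k is supplied by the caller, not derived)."""
    if k < 2:
        raise ValueError("the abstract states k >= 2")
    return n_objects * (n_objects - 1) / k


if __name__ == "__main__":
    for n, k in [(5, 2), (5, 4), (10, 5)]:
        print(n, k, pairwise_relation_count(n), reduced_index_bound(n, k))
```

With k = 2 the two expressions coincide; any larger k shrinks the bound proportionally.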

6.
We present a system for multimedia event detection. The developed system characterizes complex multimedia events based on a large array of multimodal features, and classifies unseen videos by effectively fusing diverse responses. We present three major technical innovations. First, we explore novel visual and audio features across multiple semantic granularities, including building, often in an unsupervised manner, mid-level and high-level features upon low-level features to enable semantic understanding. Second, we show a novel Latent SVM model which learns and localizes discriminative high-level concepts in cluttered video sequences. In addition to improving detection accuracy beyond existing approaches, it enables a unique summary for every retrieval through its use of high-level concepts and temporal evidence localization. The resulting summary provides some transparency into why the system classified the video as it did. Finally, we present novel fusion learning algorithms and our methodology to improve fusion learning under limited training data conditions. Thorough evaluation on the large TRECVID MED 2011 dataset showcases the benefits of the presented system.

7.
Concept detection is an important problem for efficient indexing and retrieval in large video archives. In this work, the KavTan System, which performs high-level semantic classification in one of the largest TV archives in Turkey, is presented. In this system, concept detection is performed using generalized visual and audio concept detection modules that are supported by video text detection, audio keyword spotting and specialized audio-visual semantic detection components. The performance of the presented framework was assessed objectively over a wide range of semantic concepts (5 high-level, 14 visual, 9 audio, 2 supplementary) using a significant amount of precisely labeled ground truth data. The KavTan System achieves successful high-level concept detection performance in unconstrained TV broadcasts by efficiently utilizing multimodal information that is systematically extracted from both the spatial and temporal extent of multimedia data.

8.
9.
Server-side Web caching has become an important technique for reducing the User Perceived Latency (UPL). In large-scale multimedia systems, there are many Web proxies, connected to a multimedia server, that can cache some of the most popular multimedia objects and respond to requests for them. Multimedia objects have particular characteristics, e.g., strict QoS requirements. Hence, even efficient conventional caching strategies based on cache hit ratio, designed for non-multimedia objects, run into problems when dealing with multimedia objects. If we consider additional proxy resources besides cache space, say bandwidth, we can readily observe that high hit ratios may degrade the performance of the entire system. In this paper, we propose a novel placement model for networked multimedia systems, referred to as the Hk/T model, which considers the combined influence of arrival rate, size, and playback time to select the objects to be cached. Based on this model, we propose an innovative Web caching algorithm, named the ART-Greedy algorithm, which can balance the load among the proxies and achieve a minimum Average Response Time (ART) for the requests. Our experimental results demonstrate that the ART-Greedy algorithm significantly outperforms the popular and commonly used LFU (Least Frequently Used) algorithm, and can achieve better performance than the byte-hit algorithm when the system utilization is medium or high.
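For readers unfamiliar with the LFU baseline that this abstract compares against, the following is a minimal sketch of least-frequently-used eviction. It is not the ART-Greedy algorithm or the Hk/T model, neither of which is specified in the abstract. The class name, the choice to count a `put` as one access, and the tie-breaking rule (evict the oldest-inserted among equally rare keys) are all assumptions made for illustration.

```python
from collections import Counter


class LFUCache:
    """Toy LFU cache: evicts the key with the fewest recorded accesses."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.store = {}        # key -> cached object
        self.hits = Counter()  # key -> access count

    def get(self, key):
        """Return the cached object, or None on a miss."""
        if key not in self.store:
            return None
        self.hits[key] += 1
        return self.store[key]

    def put(self, key, value) -> None:
        """Insert or update an object, evicting the least-used one if full."""
        if key not in self.store and len(self.store) >= self.capacity:
            # min() scans keys in insertion order, so ties go to the oldest key.
            victim = min(self.store, key=lambda k: self.hits[k])
            del self.store[victim]
            del self.hits[victim]
        self.store[key] = value
        self.hits[key] += 1


if __name__ == "__main__":
    cache = LFUCache(capacity=2)
    cache.put("clip-a", "A")
    cache.put("clip-b", "B")
    cache.get("clip-a")       # clip-a is now the more frequently used object
    cache.put("clip-c", "C")  # evicts clip-b
    print(sorted(cache.store))
```

Note that this policy looks only at access counts. The abstract's argument is that for multimedia objects, size, playback time and proxy bandwidth also matter, which a pure frequency count ignores.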

10.
The MPEG-7 Multimedia Database System (MPEG-7 MMDB)   Total citations: 1, self-citations: 0, citations by others: 1
Widely used Database Management Systems (DBMS) offer multimedia extensions, such as Oracle’s Multimedia (formerly interMedia). However, these extensions lack the means to meet the requirements of multimedia data in terms of semantically meaningful querying, advanced indexing, content modeling and multimedia programming libraries. In this context, this paper presents the MPEG-7 Multimedia DataBase System (MPEG-7 MMDB). The innovative parts of our system are our metadata model for multimedia content relying on the XML-based MPEG-7 standard, a new indexing and querying system for MPEG-7, the query optimizer and the supporting internal and external application libraries. The resulting system, which extends Oracle 10g, is verified and demonstrated with two real multimedia applications in the fields of audio recognition and image retrieval.

11.
12.
This paper presents a novel approach to automatic image annotation which combines global, regional, and contextual features in an extended cross-media relevance model. Typical image annotation methods use either global or regional features exclusively and neglect the textual context information among the annotated words. In contrast, the proposed approach incorporates all three kinds of information, each of which helps describe image semantics, and annotates images by estimating their joint probability. Specifically, we describe the global features as a distribution vector of visual topics and model the textual context as a multinomial distribution. The global features provide the global distribution of visual topics over an image, while the textual context relaxes the assumption of mutual independence among annotated words which is commonly adopted in most existing methods. Both the global features and the textual context are learned from the training data by a probabilistic latent semantic analysis approach. Experiments on 5k Corel images show that combining these three kinds of information is beneficial in image annotation.

13.
Advances in geographical tracking, multimedia processing, information extraction, and sensor networks have created a deluge of probabilistic data. While similarity search is an important tool to support the manipulation of probabilistic data, it raises new challenges for traditional relational databases. The problem stems from the limited effectiveness of the distance metrics employed by existing database systems. On the other hand, several more complicated distance operators have proven their value through better distinguishing ability in specific probabilistic domains. In this paper, we discuss the similarity search problem with respect to Earth Mover’s Distance (EMD). EMD is the most successful distance metric for probability distribution comparison but is an expensive operator, as it has cubic time complexity. We present a new database indexing approach to answer EMD-based similarity queries, including range queries and k-nearest neighbor queries on probabilistic data. Our solution utilizes primal-dual theory from linear programming and employs a group of B+ trees for effective candidate pruning. We also apply our filtering technique to the processing of continuous similarity queries, especially with applications to frame copy detection in real-time videos. Extensive experiments show that our proposals dramatically improve the usefulness and scalability of probabilistic data management.
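As a rough illustration of what EMD measures, the sketch below computes it for the simplest special case: two one-dimensional histograms over the same bins, with equal total mass and a ground distance of 1 between adjacent bins. In that case EMD reduces to the sum of absolute differences between the cumulative sums. This is a deliberate simplification and not the paper's method: the general EMD discussed in the abstract is a linear program, and the B+ tree filtering scheme is not reproduced here. The function name is hypothetical.

```python
from itertools import accumulate


def emd_1d(p, q):
    """EMD between two equal-mass 1-D histograms with unit distance between adjacent bins."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    if abs(sum(p) - sum(q)) > 1e-9:
        raise ValueError("histograms must have equal total mass")
    # The running surplus after each bin is the mass that must cross to the next bin.
    surplus = accumulate(a - b for a, b in zip(p, q))
    return sum(abs(s) for s in surplus)


if __name__ == "__main__":
    print(emd_1d([1, 0, 0], [0, 0, 1]))  # all mass moves two bins
    print(emd_1d([1, 0, 0], [0, 1, 0]))  # all mass moves one bin
```

Unlike a bin-by-bin distance, this value grows with how far the mass has to travel, which is the property that makes EMD attractive for comparing distributions.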

14.
15.
Zhang Hong, Huang Yu, Xu Xin, Zhu Ziqi, Deng Chunhua. Multimedia Tools and Applications, 2018, 77(3): 3353-3368

Due to the rapid development of multimedia applications, cross-media semantics learning is becoming increasingly important. One of the most challenging issues in cross-media semantics understanding is how to mine the semantic correlation between different modalities. Most traditional multimedia semantics analysis approaches are based on unimodal data and neglect the semantic consistency between different modalities. In this paper, we propose a novel multimedia representation learning framework via latent semantic factorization (LSF). First, the posterior probability under the learned classifiers serves as the latent semantic representation for different modalities. Moreover, we explore the semantic representation of a multimedia document, which consists of image and text, by latent semantic factorization. In addition, two projection matrices are learned to project images and text into the same semantic space, which is more similar to the multimedia document. Experiments conducted on three real-world datasets for cross-media retrieval demonstrate the effectiveness of our proposed approach compared with state-of-the-art methods.


16.
BilVideo: Design and Implementation of a Video Database Management System   Total citations: 1, self-citations: 1, citations by others: 0
With the advances in information technology, the amount of multimedia data captured, produced, and stored is increasing rapidly. As a consequence, multimedia content is widely used in many applications today, and the need to organize this data and access it from repositories holding vast amounts of information has been a driving stimulus both commercially and academically. In line with this trend, first image and later, especially, video database management systems have attracted a great deal of attention, since traditional database systems are designed to deal with alphanumeric information only and are therefore not suitable for multimedia data. In this paper, a prototype video database management system, which we call BilVideo, is introduced. The system architecture of BilVideo is original in that it provides full support for spatio-temporal queries that contain any combination of spatial, temporal, object-appearance, external-predicate, trajectory-projection, and similarity-based object-trajectory conditions by a rule-based system built on a knowledge-base, while utilizing an object-relational database to respond to semantic (keyword, event/activity, and category-based), color, shape, and texture queries. The parts of BilVideo (Fact-Extractor, Video-Annotator, its Web-based visual query interface, and its SQL-like textual query language) are presented as well. Our query processing strategy is also briefly explained. This work is partially supported by the Scientific and Research Council of Turkey (TÜBİTAK) under Project Code 199E025, Turkish State Planning Organization (DPT) under Grant No. 2004K120720, and European Union under Grant No. FP6-507752 (MUSCLE Network of Excellence Project).

17.
18.
The semantics of multimedia data, which features context-dependency and media-independency, is of vital importance to multimedia applications but inadequately supported by state-of-the-art database technology. In this paper, we address this problem by proposing MediaView as an extended object-oriented view mechanism to bridge the “semantic gap” between conventional databases and semantics-intensive multimedia applications. This mechanism captures the dynamic semantics of multimedia using a modelling construct named media view, which formulates a customized context where heterogeneous media objects with similar or related semantics are characterized by additional properties and user-defined semantic relationships. Due to the complex ingredients and dynamic application requirements of multimedia databases, however, it is difficult for users to define individual media views by themselves in a top-down fashion. To this end, a unique approach to constructing media views logically is devised. In addition, a set of user-level operators is defined and implemented to accommodate the specialization and generalization relationships among the views. The usefulness and elegance of MediaView are demonstrated by its application in a multi-modal information retrieval system. The main part of this work by Qing Li was done while he was on leave from City University of Hong Kong, HKSAR, China.

19.
20.
We introduce a new paradigm for real-time conversion of a real-world event into a rich multimedia database by processing data from multiple sensors observing the event. A real-time analysis of the sensor data, tightly coupled with domain knowledge, results in instant indexing of multimedia data at capture time. This yields semantic information to answer complex queries about the content and the ability to extract portions of data that correspond to complex actions performed in the real world. The power of such an instantly indexed multimedia database system, in content-based retrieval of multimedia data or in semantic analysis and visualization of the data, far exceeds that of systems which index multimedia data only after it is produced. We present LucentVision, an instantly indexed multimedia database system developed for the sport of tennis. This system analyzes video from multiple cameras in real time and captures the activity of the players and the ball in the form of motion trajectories. The system stores these trajectories in a database along with video, 3D models of the environment, scores, and other domain-specific information. LucentVision has been used to enhance live television and Internet broadcasts with game analyses and virtual replays in more than 250 international tennis matches.
