Similar literature
 Found 20 similar documents; search took 31 ms
1.
Zhang Hong, Huang Yu, Xu Xin, Zhu Ziqi, Deng Chunhua. 《Multimedia Tools and Applications》 2018, 77(3): 3353-3368

Due to the rapid development of multimedia applications, cross-media semantics learning is becoming increasingly important. One of the most challenging issues for cross-media semantics understanding is how to mine the semantic correlation between different modalities. Most traditional multimedia semantics analysis approaches are based on unimodal data and neglect the semantic consistency between different modalities. In this paper, we propose a novel multimedia representation learning framework via latent semantic factorization (LSF). First, the posterior probability under the learned classifiers serves as the latent semantic representation for different modalities. Moreover, we explore the semantic representation for a multimedia document, which consists of image and text, by latent semantic factorization. Besides, two projection matrices are learned to project images and text into the same semantic space, which is more similar to the multimedia document. Experiments conducted on three real-world datasets for cross-media retrieval demonstrate the effectiveness of our proposed approach compared with state-of-the-art methods.


2.
Social media networks contain both content and context-specific information. Most existing methods work with either of the two for the purpose of multimedia mining and retrieval. In reality, both content and context information are rich sources of information for mining, and the full power of mining and processing algorithms can be realized only with the use of a combination of the two. This paper proposes a new algorithm which mines both context and content links in social media networks to discover the underlying latent semantic space. This mapping of the multimedia objects into latent feature vectors enables the use of any off-the-shelf multimedia retrieval algorithms. Compared to the state-of-the-art latent methods in multimedia analysis, this algorithm effectively solves the problem of sparse context links by mining the geometric structure underlying the content links between multimedia objects. Specifically for multimedia annotation, we show that an effective algorithm can be developed to directly construct annotation models by simultaneously leveraging both context and content information based on latent structure between correlated semantic concepts. We conduct experiments on the Flickr data set, which contains user tags linked with images. We illustrate the advantages of our approach over the state-of-the-art multimedia retrieval techniques.

3.
In this paper we provide a classification of adaptive systems with respect to the kind of semantic technology they exploit to accomplish or improve specific adaptation and user modeling tasks. This classification is based on a distinction between strong semantic techniques and weak semantic techniques. The former are techniques based on the Semantic Web, while the latter regard technologies that, in different ways, annotate resources, enriching their meaning. This second category includes, in particular, Web 2.0 social annotations and mixed approaches between social annotations and Semantic Web techniques. While the impact of the Semantic Web on adaptive systems has been discussed in several survey papers, the potential of weak semantic technologies has, so far, received little attention. The aim of this analysis is to fill this gap. Therefore, we will discuss contributions and limits of both approaches, but we will focus special attention on weak semantic adaptive systems.

4.

In recent years, the rapid growth of multimedia content has made image retrieval a challenging research task. Content-Based Image Retrieval (CBIR) is a technique that uses image features to search a large image dataset for the images a user requires, with the request given in the form of a query image. Effective feature representation and similarity measures are crucial to the retrieval performance of CBIR. The key challenge has been attributed to the well-known semantic gap issue. Machine learning has been actively investigated as a possible solution to bridge the semantic gap. The recent success of deep learning inspires hope for bridging the semantic gap in CBIR. In this paper, we investigate deep learning approaches used for CBIR tasks under varied settings; from our empirical studies, we find some encouraging conclusions and insights for future research.
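The role of feature representation and similarity measures in CBIR can be illustrated with a minimal sketch. It assumes each image has already been reduced to a fixed-length feature vector and uses cosine similarity as the measure; the abstract names neither a specific feature extractor nor a specific measure, so the toy vectors, the image ids, and the function names below are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as dissimilar
    return dot / (norm_a * norm_b)

def rank_images(query_features, database):
    """Return image ids sorted from most to least similar to the query.

    `database` maps a (hypothetical) image id to its feature vector.
    """
    scored = [(cosine_similarity(query_features, feats), image_id)
              for image_id, feats in database.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [image_id for _, image_id in scored]

# Toy 3-dimensional vectors standing in for learned image features.
toy_database = {
    "img_a": [1.0, 0.0, 0.0],
    "img_b": [0.0, 1.0, 0.0],
    "img_c": [0.9, 0.1, 0.0],
}
ranking = rank_images([1.0, 0.0, 0.0], toy_database)
```

In a deep-learning setting the vectors would typically come from a trained network rather than being written by hand; the ranking step itself stays the same.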


5.
6.
The distributed nature of the Web, as a decentralized system exchanging information between heterogeneous sources, has underlined the need to manage interoperability, i.e., the ability to automatically interpret information in Web documents exchanged between different sources, necessary for efficient information management and search applications. In this context, XML was introduced as a data representation standard that simplifies the tasks of interoperation and integration among heterogeneous data sources, allowing data to be represented in (semi-)structured documents consisting of hierarchically nested elements and atomic attributes. However, while XML has been shown to be most effective in exchanging data, i.e., in syntactic interoperability, it has proven limited when it comes to handling semantics, i.e., semantic interoperability, since it only specifies the syntactic and structural properties of the data without any further semantic meaning. As a result, XML semantic-aware processing has become a motivating challenge in Web data management, requiring dedicated semantic analysis and disambiguation methods to assign well-defined meaning to XML elements and attributes. In this context, most existing approaches: (i) ignore the problem of identifying ambiguous XML elements/nodes, (ii) only partially consider their structural relationships/context, (iii) use syntactic information in processing XML data regardless of the semantics involved, and (iv) are static in adopting fixed disambiguation constraints, thus limiting user involvement. In this paper, we provide a new XML Semantic Disambiguation Framework titled XSDF, designed to address each of the above limitations, taking as input an XML document and producing as output a semantically augmented XML tree made of unambiguous semantic concepts extracted from a reference machine-readable semantic network.
XSDF consists of four main modules for: (i) linguistic pre-processing of simple/compound XML node labels and values, (ii) selecting ambiguous XML nodes as targets for disambiguation, (iii) representing target nodes as special sphere neighborhood vectors including all XML structural relationships within a (user-chosen) range, and (iv) running context vectors through a hybrid disambiguation process, combining two approaches: concept-based and context-based disambiguation, allowing the user to tune disambiguation parameters following her needs. Conducted experiments demonstrate the effectiveness and efficiency of our approach in comparison with alternative methods. We also discuss some practical applications of our method, ranging over semantic-aware query rewriting, semantic document clustering and classification, Mobile and Web services search and discovery, as well as blog analysis and event detection in social networks and tweets.

7.
Multimedia understanding is a fast emerging interdisciplinary research area. There is tremendous potential for effective use of multimedia content through intelligent analysis. Diverse application areas are increasingly relying on multimedia understanding systems. Advances in multimedia understanding are related directly to advances in signal processing, computer vision, pattern recognition, multimedia databases, and smart sensors. We review the state-of-the-art techniques in multimedia retrieval. In particular, we discuss how multimedia retrieval can be viewed as a pattern recognition problem. We discuss how reliance on powerful pattern recognition and machine learning techniques is increasing in the field of multimedia retrieval. We review the state-of-the-art multimedia understanding systems with particular emphasis on a system for semantic video indexing centered around multijects and multinets. We discuss how semantic retrieval is centered around concepts and context and the various mechanisms for modeling concepts and context.

8.
The semantics of multimedia data, which features context-dependency and media-independency, is of vital importance to multimedia applications but inadequately supported by the state-of-the-art database technology. In this paper, we address this problem by proposing MediaView as an extended object-oriented view mechanism to bridge the “semantic gap” between conventional databases and semantics-intensive multimedia applications. This mechanism captures the dynamic semantics of multimedia using a modelling construct named media view, which formulates a customized context where heterogeneous media objects with similar/related semantics are characterized by additional properties and user-defined semantic relationships. Due to the complex ingredients and dynamic application requirements of multimedia databases, it is, however, difficult for users to define individual media views by themselves in a top-down fashion. To this end, a unique approach of constructing media views logically is devised. In addition, a set of user-level operators is defined and implemented to accommodate the specialization and generalization relationships among the views. The usefulness and elegance of MediaView are demonstrated by its application in a multi-modal information retrieval system. The main part of Qing Li's work was done while he was on leave from City University of Hong Kong, HKSAR, China.

9.
Multimedia analysis and reuse of raw un-edited audio visual content known as rushes is gaining acceptance by a large number of research labs and companies. A set of research projects are considering multimedia indexing, annotation, search and retrieval in the context of European funded research, but only the FP6 project RUSHES is focusing on automatic semantic annotation, indexing and retrieval of raw and un-edited audio-visual content. Even professional content creators and providers as well as home-users are dealing with this type of content and therefore novel technologies for semantic search and retrieval are required. In this paper, we present a summary of the most relevant achievements of the RUSHES project, focusing on specific approaches for automatic annotation as well as the main features of the final RUSHES search engine.

10.
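Multimedia semantic mining and cross-media retrieval based on comprehensive reasoning   Total citations: 6, self-citations: 0, citations by others: 6
To perform cross-media retrieval more accurately, the semantic correlations between multimedia objects of different types need to be mined and learned. To this end, a multimedia semantic mining and cross-media retrieval technique based on a comprehensive reasoning model is proposed. First, a reasoning source is constructed from the low-level features of multimedia objects, and an influence source field is constructed from the co-occurrence relationships among multimedia objects; comprehensive reasoning is carried out over these and a multimedia semantic space is built. Then, for each query example, a retrieval method is selected adaptively according to pseudo-relevance feedback to perform cross-media retrieval. To handle query examples that lie outside the training set, a two-stage learning method is proposed to complete the retrieval; a log-based long-term feedback learning algorithm is also proposed to improve system performance. Experimental results show that the technique mines multimedia semantics accurately, and that multimedia document retrieval and cross-media retrieval are accurate and stable.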

11.
We describe a system which supports dynamic user interaction with multimedia information using content-based hypermedia navigation techniques, specialising in a technique for navigation of musical content. The model combines the principles of open hypermedia, whereby hypermedia link information is maintained by a link service, with content-based retrieval techniques in which a database is queried based on a feature of the multimedia content; our approach could be described as ‘content-based retrieval of hypermedia links’. The experimental system focuses on temporal media and consists of a set of component-based navigational hypermedia tools. We propose the use of melodic pitch contours in this context and we present techniques for storing and querying contours, together with experimental results. Techniques for integrating the contour database with open hypermedia systems are also discussed.
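Storing and querying melodic pitch contours can be sketched as follows. The abstract does not say how contours are represented, so this example assumes a simple three-symbol encoding (U for up, D for down, R for repeat) and plain substring matching; the melody ids, note values, and function names are made up for illustration.

```python
def pitch_contour(pitches):
    """Encode a melody as a string over U (up), D (down), R (repeat)."""
    symbols = []
    for previous, current in zip(pitches, pitches[1:]):
        if current > previous:
            symbols.append("U")
        elif current < previous:
            symbols.append("D")
        else:
            symbols.append("R")
    return "".join(symbols)

def find_melodies(query_pitches, contour_index):
    """Return ids of stored melodies whose contour contains the query contour."""
    query = pitch_contour(query_pitches)
    return sorted(melody_id for melody_id, contour in contour_index.items()
                  if query in contour)

# MIDI note numbers for two hypothetical stored melodies.
melodies = {
    "tune_1": [60, 60, 67, 67, 69, 69, 67],
    "tune_2": [64, 62, 60, 62, 64, 64, 64],
}
# "Storing" here means keeping one contour string per melody id.
contour_index = {melody_id: pitch_contour(notes)
                 for melody_id, notes in melodies.items()}

# The query is sung in a different key, yet its contour still matches tune_1.
matches = find_melodies([50, 50, 57, 57], contour_index)
```

Because only the direction of each interval is kept, a query transposed to another key still matches, at the cost of some false positives for short queries.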

12.
13.
Slow access to disk-based multimedia data is a major limiting factor in the performance of modern multimedia Web servers connected over broadband networks. The I/O bottleneck becomes even more pronounced for currently evolving systems handling multimedia data, such as audio and video. Retrieval of complex multimedia documents needs to be handled at two levels: I/O bandwidth management for multiple multimedia streams, and interstream and intrastream synchronization for multimedia objects constituting these documents. In this paper, based on the diverse characteristics of multimedia data, we propose efficient techniques for synchronous retrieval and delivery of such data from the storage system to the main memory of the server. We propose methods to quantify user perceived quality via quality-of-presentation (QoP) parameters. We combine QoP and Object Composition Petri Net (OCPN) multimedia data modeling to develop techniques for efficient synchronous retrieval of multimedia data. Since I/O bandwidth is a precious resource, the proposed techniques have low overhead, which is , where m is the number of logical I/O channels and n is the total number of frames of multimedia data in a scheduling period. We simulate the relative performance of these techniques under diverse I/O conditions and determine the tradeoffs between the system resources, such as memory, bandwidth, and the improvement in QoP for multimedia applications.
Published online: 9 February 2005. Correspondence to: M. Farrukh Khan

14.
In this paper we present a framework for unified, personalized access to heterogeneous multimedia content in distributed repositories. Focusing on semantic analysis of multimedia documents, metadata, user queries and user profiles, it contributes to the bridging of the gap between the semantic nature of user queries and raw multimedia documents. The proposed approach utilizes as input visual content analysis results, as well as analyzes and exploits associated textual annotation, in order to extract the underlying semantics, construct a semantic index and classify documents to topics, based on a unified knowledge and semantics representation model. It may then accept user queries, and, carrying out semantic interpretation and expansion, retrieve documents from the index and rank them according to user preferences, similarly to text retrieval. All processes are based on a novel semantic processing methodology, employing fuzzy algebra and principles of taxonomic knowledge representation. The first part of this work presented in this paper deals with data and knowledge models, manipulation of multimedia content annotations and semantic indexing, while the second part will continue on the use of the extracted semantic information for personalized retrieval.
Stefanos Kollias

15.
Web2.0 has enabled contributions to the Web on an unprecedented scale, through simple interfaces that provide engaging interactions. This wealth of data has spawned countless mashups that integrate heterogeneous information, but using techniques that will not scale beyond a handful of sources. In contrast, the Semantic Web provides the key to large-scale data integration, yet still lacks approachable interfaces allowing contributions from non-specialists. In this paper we present Revyu, a reviewing and rating site in the Web2.0 mould that is built on Semantic Web infrastructure and both publishes and consumes linked RDF data. This combination of approaches affords ease of interaction for regular users and ease of integration with external data sources.

16.
One major challenge in the content-based image retrieval (CBIR) and computer vision research is to bridge the so-called “semantic gap” between low-level visual features and high-level semantic concepts, that is, extracting semantic concepts from a large database of images effectively. In this paper, we tackle the problem by mining the decisive feature patterns (DFPs). Intuitively, a decisive feature pattern is a combination of low-level feature values that are unique and significant for describing a semantic concept. Interesting algorithms are developed to mine the decisive feature patterns and construct a rule base to automatically recognize semantic concepts in images. A systematic performance study on large image databases containing many semantic concepts shows that our method is more effective than some previously proposed methods. Importantly, our method can be generally applied to any domain of semantic concepts and low-level features.
Wei Wang received his Ph.D. degree in Computing Science and Engineering from the State University of New York (SUNY) at Buffalo in 2004, under Dr. Aidong Zhang's supervision. He received the B.Eng. in Electrical Engineering from Xi'an Jiaotong University, China in 1995 and the M.Eng. in Computer Engineering from National University of Singapore in 2000, respectively. He joined Motorola Inc. in 2004, where he is currently a senior research engineer in Multimedia Research Lab, Motorola Applications Research Center. His research interests can be summarized as developing novel techniques for multimedia data analysis applications. He is particularly interested in multimedia information retrieval, multimedia mining and association, multimedia database systems, multimedia processing and pattern recognition. He has published 15 research papers in refereed journals, conferences, and workshops, has served in the organization committees and the program committees of IADIS International Conference e-Society 2005 and 2006, and has been a reviewer for some leading academic journals and conferences. In 2005, his research prototype of “seamless content consumption” was awarded the “most innovative research concept of the year” from the Motorola Applications Research Center.
Dr. Aidong Zhang received her Ph.D. degree in computer science from Purdue University, West Lafayette, Indiana, in 1994. She was an assistant professor from 1994 to 1999, an associate professor from 1999 to 2002, and has been a professor since 2002 in the Department of Computer Science and Engineering at the State University of New York at Buffalo. Her research interests include bioinformatics, data mining, multimedia systems, content-based image retrieval, and database systems. She has authored over 150 research publications in these areas. Dr. Zhang's research has been funded by NSF, NIH, NIMA, and Xerox. Dr. Zhang serves on the editorial boards of International Journal of Bioinformatics Research and Applications (IJBRA), ACM Multimedia Systems, the International Journal of Multimedia Tools and Applications, and International Journal of Distributed and Parallel Databases. She was the editor for ACM SIGMOD DiSC (Digital Symposium Collection) from 2001 to 2003. She was co-chair of the technical program committee for ACM Multimedia 2001. She has also served on various conference program committees. Dr. Zhang is a recipient of the National Science Foundation CAREER Award and SUNY Chancellor's Research Recognition Award.

17.
The MPEG-7 Multimedia Database System (MPEG-7 MMDB)   Total citations: 1, self-citations: 1, citations by others: 0
Broadly used Database Management Systems (DBMS) propose multimedia extensions, like Oracle’s Multimedia (formerly interMedia). However, these extensions lack means for managing the requirements of multimedia data in terms of semantic meaningful querying, advanced indexing, content modeling and multimedia programming libraries. In this context, this paper presents the MPEG-7 Multimedia DataBase System (MPEG-7 MMDB). The innovative parts of our system are our metadata model for multimedia content relying on the XML-based MPEG-7 standard, a new indexing and querying system for MPEG-7, the query optimizer and the supporting internal and external application libraries. The resulting system, extending Oracle 10g, is verified and demonstrated by the use of two real multimedia applications in the field of audio recognition and image retrieval.

18.
19.
Ontologies represent domain concepts and relations in the form of a semantic network. Many research works use ontologies in information matchmaking and retrieval. This trend is further accelerated by the convergence of various information sources supported by ontologies. In this paper, we propose a novel multi-modality ontology model that integrates both low-level image features and high-level text information to represent image contents for image retrieval. By embedding this ontology into an image retrieval system, we are able to realize intelligent image retrieval with high precision. Moreover, benefiting from the soft-coded ontology model, this system has good flexibility and can be easily extended to larger domains. Currently, our experiment is conducted on the canine animal domain. An ontology has been built based on the low-level features and the domain knowledge of canines. A prototype retrieval system is set up to assess the performance. We compare our experimental results with a traditional text-based image search engine and show the advantages of our approach.

20.
Present-day information retrieval systems largely ignore the issues of lexical and compositional semantics, and rely mainly on statistical measures for choosing or evolving an indexing scheme. This has been the reason for the decreasing precision in their responses, given an exponentially increasing number of Web pages. The work reported in this paper addresses this issue from a linguistic point of view. We show that the detection of domain-specific phrases can capture the task-specific semantics of documents. We introduce the notion of n*-gram formalism to characterize the domain-specific phrases and their variants, taking a few sample domains. A method to construct a phrase grammar from a small set of documents is proposed. A method of conceptual indexing based on the phrase grammar is also proposed. To demonstrate the effectiveness of the proposed method, we have designed a versatile system that can perform concept-based retrieval, in addition to several document-processing tasks, such as text classification, extraction-based summarization, context tracking, and semantic tagging. Collectively, the system can address the semantic content of documents. Considering that an average user prefers highly relevant results in the top-ranked subset to an exhaustively retrieved set, it is shown that the proposed system performs better in that it retrieves documents that are more conceptually relevant than those retrieved by Google, at the 95% confidence level.
