Similar literature
 Found 20 similar documents; search took 15 ms
1.
On Similarity Measures for Multimedia Database Applications   Total citations: 1, self-citations: 1, citations by others: 0
A multimedia database query consists of a set of fuzzy and boolean (or crisp) predicates, constants, variables, and conjunction, disjunction, and negation operators. The fuzzy predicates are evaluated based on different media criteria, such as color, shape, layout, and keyword. Since media-based evaluation yields similarity values, the result of such a query is defined as an ordered set. Since many multimedia applications require partial matches, query results also include tuples that do not satisfy all predicates. Hence, any fuzzy semantics that extends the boolean semantics of conjunction in a straightforward manner may not be desirable for multimedia databases. In this paper, we focus on the problem of ‘given a multimedia query which consists of multiple fuzzy and crisp predicates, how to provide the user with a meaningful overall ranking.’ More specifically, we study the problem of merging similarity values in queries with multiple fuzzy predicates. We describe the essential multimedia retrieval semantics, compare these with the known approaches, and propose a semantics that captures the retrieval requirements in multimedia databases. Received 13 August 1999 / Revised 13 May 2000 / Accepted in revised form 26 July 2000
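The partial-match concern in this abstract can be illustrated with a small sketch. The scores, image names, and the choice of the arithmetic mean are assumptions made here for illustration only; the abstract does not state which merge function the paper proposes. The sketch compares the classic fuzzy conjunction (`min`) with an averaging merge on hypothetical per-predicate similarity values.

```python
from statistics import mean

# Hypothetical per-predicate similarity scores in [0, 1] for three images.
scores = {
    "img_a": {"color": 0.9, "shape": 0.8, "keyword": 0.0},  # fails one predicate only
    "img_b": {"color": 0.3, "shape": 0.3, "keyword": 0.3},
    "img_c": {"color": 0.0, "shape": 0.0, "keyword": 0.1},  # fails almost everything
}

def rank(merge):
    """Return (image ids ordered best first, merged score per image)."""
    merged = {img: merge(list(s.values())) for img, s in scores.items()}
    return sorted(merged, key=merged.get, reverse=True), merged

# min extends boolean AND directly: one zero score zeroes the whole tuple.
min_order, min_scores = rank(min)
# The mean is only an illustrative alternative that keeps partial matches ranked.
mean_order, mean_scores = rank(mean)

print(min_scores)
print(mean_order)
```

Under `min`, `img_a` and `img_c` receive the same merged score even though `img_a` matches two of the three predicates well, which is the kind of ranking the abstract describes as undesirable for partial matches.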

2.
Zhang Hong, Huang Yu, Xu Xin, Zhu Ziqi, Deng Chunhua. Multimedia Tools and Applications, 2018, 77(3): 3353-3368

Due to the rapid development of multimedia applications, cross-media semantics learning is becoming increasingly important. One of the most challenging issues in cross-media semantics understanding is how to mine the semantic correlation between different modalities. Most traditional multimedia semantics analysis approaches are based on unimodal data and neglect the semantic consistency between different modalities. In this paper, we propose a novel multimedia representation learning framework via latent semantic factorization (LSF). First, the posterior probability under the learned classifiers serves as the latent semantic representation for the different modalities. Moreover, we explore the semantic representation of a multimedia document, which consists of image and text, by latent semantic factorization. In addition, two projection matrices are learned to project images and text into a common semantic space that is more similar to the multimedia document. Experiments conducted on three real-world datasets for cross-media retrieval demonstrate the effectiveness of the proposed approach compared with state-of-the-art methods.


3.
One major challenge in content-based image retrieval (CBIR) and computer vision research is to bridge the so-called “semantic gap” between low-level visual features and high-level semantic concepts, that is, to extract semantic concepts from a large database of images effectively. In this paper, we tackle the problem by mining decisive feature patterns (DFPs). Intuitively, a decisive feature pattern is a combination of low-level feature values that are unique and significant for describing a semantic concept. Algorithms are developed to mine the decisive feature patterns and construct a rule base to automatically recognize semantic concepts in images. A systematic performance study on large image databases containing many semantic concepts shows that our method is more effective than some previously proposed methods. Importantly, our method can be generally applied to any domain of semantic concepts and low-level features.

Wei Wang received his Ph.D. degree in Computing Science and Engineering from the State University of New York (SUNY) at Buffalo in 2004, under Dr. Aidong Zhang's supervision. He received the B.Eng. in Electrical Engineering from Xi'an Jiaotong University, China in 1995 and the M.Eng. in Computer Engineering from National University of Singapore in 2000. He joined Motorola Inc. in 2004, where he is currently a senior research engineer in Multimedia Research Lab, Motorola Applications Research Center. His research interests can be summarized as developing novel techniques for multimedia data analysis applications. He is particularly interested in multimedia information retrieval, multimedia mining and association, multimedia database systems, multimedia processing and pattern recognition. He has published 15 research papers in refereed journals, conferences, and workshops, has served in the organization committees and the program committees of IADIS International Conference e-Society 2005 and 2006, and has been a reviewer for some leading academic journals and conferences. In 2005, his research prototype of “seamless content consumption” was awarded the “most innovative research concept of the year” from the Motorola Applications Research Center.

Dr. Aidong Zhang received her Ph.D. degree in computer science from Purdue University, West Lafayette, Indiana, in 1994. She was an assistant professor from 1994 to 1999, an associate professor from 1999 to 2002, and has been a professor since 2002 in the Department of Computer Science and Engineering at the State University of New York at Buffalo. Her research interests include bioinformatics, data mining, multimedia systems, content-based image retrieval, and database systems. She has authored over 150 research publications in these areas. Dr. Zhang's research has been funded by NSF, NIH, NIMA, and Xerox. Dr. Zhang serves on the editorial boards of International Journal of Bioinformatics Research and Applications (IJBRA), ACM Multimedia Systems, the International Journal of Multimedia Tools and Applications, and International Journal of Distributed and Parallel Databases. She was the editor for ACM SIGMOD DiSC (Digital Symposium Collection) from 2001 to 2003. She was co-chair of the technical program committee for ACM Multimedia 2001. She has also served on various conference program committees. Dr. Zhang is a recipient of the National Science Foundation CAREER Award and SUNY Chancellor's Research Recognition Award.

4.
5.

Since its invention, the Web has evolved into the largest multimedia repository that has ever existed. This evolution is a direct result of the explosion of user-generated content, explained by the wide adoption of social network platforms. The vast amount of multimedia content requires effective management and retrieval techniques. Nevertheless, Web multimedia retrieval is a complex task because users commonly express their information needs in semantic terms, but expect multimedia content in return. This dissociation between semantics and content of multimedia is known as the semantic gap. To solve this, researchers are looking beyond content-based or text-based approaches, integrating novel data sources. New data sources can consist of any type of data extracted from the context of multimedia documents, defined as the data that is not part of the raw content of a multimedia file. The Web is an extraordinary source of context data, which can be found in explicit or implicit relation to multimedia objects, such as surrounding text, tags, hyperlinks, and even in relevance-feedback. Recent advances in Web multimedia retrieval have shown that context data has great potential to bridge the semantic gap. In this article, we present the first comprehensive survey of context-based approaches for multimedia information retrieval on the Web. We introduce a data-driven taxonomy, which we then use in our literature review of the most emblematic and important approaches that use context-based data. In addition, we identify important challenges and opportunities, which had not been previously addressed in this area.


6.
In this article, we present the theory of Kripke semantics, along with the mathematical framework and applications of Kripke semantics. We take the Kripke‐Sato approach to define the knowledge operator in relation to Hintikka's possible worlds model, which is an application of the semantics of intuitionistic logic and modal logic. The applications are interesting from the viewpoint of agent interactives and process interaction. We propose (i) an application of possible worlds semantics, which enables the evaluation of the truth value of a conditional sentence without explicitly defining the operator “→” (implication), through clustering on the space of events (worlds) using the notion of neighborhood; and (ii) a semantical approach to treat discrete dynamic process using Kripke‐Beth semantics. Starting from the topological approach, we define the measure‐theoretical machinery, in particular, we adopt the methods developed in stochastic process—mainly the martingale—to our semantics; this involves some Boolean algebraic (BA) manipulations. The clustering on the space of events (worlds), using the notion of neighborhood, enables us to define an accessibility relation that is necessary for the evaluation of the conditional sentence. Our approach is by taking the neighborhood as an open set and looking at topological properties using metric space, in particular, the so‐called ε‐ball; then, we can perform the implication by computing Euclidean distance, whenever we introduce a certain enumerative scheme to transform the semantic objects into mathematical objects. Thus, this method provides an approach to quantify semantic notions. Combining with modal operators Ki operating on E set, it provides a more‐computable way to recognize the “indistinguishability” in some applications, e.g., electronic catalogue. Because semantics used in this context is a local matter, we also propose the application of sheaf theory for passing local information to global information. 
By looking at Kripke interpretation as a function with values in an open‐set lattice ??U, which is formed by stepwise verification process, we obtain a topological space structure. Now, using the measure‐theoretical approach by taking the Borel set and Borel function in defining measurable functions, this can be extended to treat the dynamical aspect of processes; from the stochastic process, considered as a family of random variables over a measure space (the probability space triple), we draw two strong parallels between Kripke semantics and stochastic process (mainly martingales): first, the strong affinity of Kripke‐Beth path semantics and time path of the process; and second, the treatment of time as parametrization to the dynamic process using the technique of filtration, adapted process, and progressive process. The technique provides very effective manipulation of BA in the form of random variables and σ‐subalgebra under the cover of measurable functions. This enables us to adopt the computational algorithms obtained for stochastic processes to path semantics. Besides, using the technique of measurable functions, we indeed obtain an intrinsic way to introduce the notion of time sequence. © 2003 Wiley Periodicals, Inc.

7.
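In recent years, social media has gradually become the main channel through which people obtain news, but while it brings convenience it has also facilitated the spread of fake news. As social media becomes increasingly rich in media types, fake news is shifting from a purely textual form to multimodal forms, so multimodal fake news detection is receiving growing attention. Most existing multimodal fake news detection methods rely on surface-level features that are highly correlated with the dataset. They model the semantic-level features of news insufficiently, struggle to understand the deep semantics of text and visual entities, and generalize poorly to new data. This paper proposes a semantics-enhanced multimodal fake news detection method that better understands the deep semantics of multimodal news by exploiting the factual knowledge implicit in pre-trained language models together with explicit visual entity extraction. Visual features at different semantic levels are extracted, and on this basis a text-guided attention mechanism models the semantic interaction between text and images, so that heterogeneous multimodal features are fused more effectively. Experimental results on a real-world dataset based on Weibo news show that the method effectively improves the performance of multimodal fake news detection.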
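As a rough illustration of what text-guided attention over visual features can look like, the sketch below uses plain dot-product attention with a softmax. This is an assumption made for illustration: the abstract does not give the scoring function, feature dimensions, or network layers, and the vectors and the function name `text_guided_attention` are hypothetical.

```python
import math

def text_guided_attention(text_vec, region_vecs):
    """Weight visual region features by their dot-product affinity to the text vector."""
    scores = [sum(t * r for t, r in zip(text_vec, region)) for region in region_vecs]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]  # softmax over regions
    dim = len(region_vecs[0])
    # The attended feature is the attention-weighted sum of the region features.
    attended = [sum(w * region[i] for w, region in zip(weights, region_vecs))
                for i in range(dim)]
    return weights, attended

# Hypothetical 2-dimensional text feature and three visual region features.
text_vec = [1.0, 0.0]
region_vecs = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
weights, attended = text_guided_attention(text_vec, region_vecs)
print(weights, attended)
```

The region most aligned with the text vector receives the largest weight, so the fused visual representation is dominated by image content that the text refers to.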

8.
9.
This paper presents methods and principles for knowledge elicitation and semantics definitions for images and text, respectively, and introduces a semantic representation scheme that fuses the semantic information extracted from image and text to facilitate intelligent indexing and retrieval for multimedia collections as well as media transformation through their semantic meanings. The method can be deployed for WWW applications such as telemedicine or virtual galleries.

10.
In this paper we study a problem motivated by the management of changes in databases. It turns out that several such change scenarios, e.g., the separately studied problems of view maintenance (propagation of data changes) and view adaptation (propagation of view definition changes) can be unified as instances of query reformulation using views provided that support for the relational difference operator exists in the context of query reformulation. Exact query reformulation using views in positive relational languages is well understood, and has a variety of applications in query optimization and data sharing. Unfortunately, most questions about queries become undecidable in the presence of difference (or negation), whether we use the foundational set semantics or the more practical bag semantics.

11.
Recent developments in the field of digital media technology have resulted in the generation of a huge number of images. Consequently, content-based image retrieval has emerged as an important area in multimedia computing. Research in human perception of image content suggests that semantic cues play an important role in image retrieval. In this paper, we present a new paradigm to establish the semantics in image databases based on multi-user relevance feedback. The relevance feedback mechanism is one way to incorporate the users’ perception during image retrieval. By treating each feedback as a weak classifier and combining them together, we are able to capture the categories in the users’ mind and build a user-centered semantic hierarchy in the database to support semantic browsing and searching. We present an image retrieval system based on a city-landscape image database comprising 3,009 images. We also compare our approach with other typical methods to organize an image database. Superior results have been achieved by the proposed framework.

12.
Multimedia data mining refers to pattern discovery, rule extraction and knowledge acquisition from multimedia databases. Two typical tasks in multimedia data mining are visual data classification and clustering in terms of semantics. The performance of such classification or clustering systems is often unsatisfactory because low-level features are used for image representation and because improper similarity metrics are used to measure the closeness between multimedia objects. This paper considers the problem of modeling similarity for semantic image clustering. A collection of semantic images and feed-forward neural networks are used to approximate a characteristic function of equivalence classes, which is termed a learning pseudo metric (LPM). Empirical criteria for evaluating the goodness of the LPM are established. An LPM-based k-means rule is then employed for semantic image clustering, where two impurity indices, classification performance and robustness are used for performance evaluation. An artificial image database with 11 semantics is employed for our simulation studies. Results demonstrate the merits and usefulness of the proposed techniques for multimedia data mining.

13.
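A Definition and Specification Language Supporting Heterogeneous Database Integration   Total citations: 3, self-citations: 0, citations by others: 3
Xie Xingsheng, Fang Xiang, Zhuang Zhenquan. 《计算机应用》 (Journal of Computer Applications), 2006, 26(6): 1392-1395
This paper proposes a definition and specification language for the data integration middleware Mediator. By providing a set of abstract language components, the language supports expressing complex data integration semantics at a highly abstract level, effectively resolves the structural and semantic heterogeneity conflicts that arise when integrating heterogeneous data sources, and automatically generates the data integration middleware Mediator. The paper introduces the syntax and semantics of the main language components of the data integration definition and specification language (DISL), the method for constructing and generating a Mediator, and the application architecture, and discusses the internal architecture of the DISL-Mediator and its static and dynamic characteristics.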

14.
The impetus behind Semantic Web research remains the vision of supplementing availability with utility; that is, the World Wide Web provides availability of digital media, but the Semantic Web will allow presently available digital media to be used in unseen ways. An example of such an application is multimedia retrieval. At present, there are vast amounts of digital media available on the web. Once this media gets associated with machine-understandable metadata, the web can serve as a potentially unlimited supplier for multimedia web services, which could populate themselves by searching for keywords and subsequently retrieving images or articles, which is precisely the type of system that is proposed in this paper. Such a system requires solid interoperability, a central ontology, semantic agent search capabilities, and standards. Specifically, this paper explores this cross-section of image annotation and Semantic Web services, models the web service components that constitute such a system, discusses the sequential, cooperative execution of these Semantic Web services, and introduces intelligent storage of image semantics as part of a semantic link space.

15.
In order to provide audiences with a proper universal multimedia experience, all classes of media consumption devices, from high definition displays to mobile media players, must receive a product that is not only adapted to their capabilities and usage environments, but also conveys the semantics and cinematography behind the narrative in an optimal way. This paper introduces a semantic video adaptation system that incorporates the media adaptation process in the center of the drama production process. Producers, directors and other creative staff instruct the semantic adaptation system using common cinematographic terminology and vocabulary, thereby seamlessly extending the drama production process into the realm of content adaptation. The multitude of production metadata obtained from various steps in the production process provides a valuable context of narrative semantics that is exploited by the adaptation process. As such, high definition imagery can be intelligently adapted to smaller resolutions while optimally fulfilling the filmmaker’s dramatic intentions with respect to the original narrative and obeying various rules of cinematographic grammar.

16.
Given a query workload, a database and a set of constraints, the view-selection problem is to select views to materialize so that the constraints are satisfied and the views can be used to compute the queries in the workload efficiently. A typical constraint, which we consider in the present work, is to require that the views can be stored in a given amount of disk space. Depending on features of SQL queries (e.g., the DISTINCT keyword) and on whether the database relations on which the queries are applied are sets or bags, the queries may be computed under set semantics, bag-set semantics, or bag semantics. In this paper we study the complexity of the view-selection problem for conjunctive queries and views under these semantics. We show that bag semantics is the “easiest to handle” (we show that in this case the decision version of view selection is in NP), whereas under set and bag-set semantics we assume further restrictions on the query workload (we only allow queries without self-joins in the workload) to achieve the same complexity. Moreover, while under bag and bag-set semantics filtering views (i.e., subgoals that can be dropped from the rewriting without impacting equivalence to the query) are practically not needed, under set semantics filtering views can reduce significantly the query-evaluation costs. We show that under set semantics the decision version of the view-selection problem remains in NP only if filtering views are not allowed in the rewritings. Finally, we investigate whether the cgalg algorithm for view selection introduced in Chirkova and Genesereth (Linearly bounded reformulations of conjunctive databases, pp. 987–1001, 2000) is suitable in our setting. We prove that this algorithm is sound for all cases we examine here, and that it is complete under bag semantics for workloads of arbitrary conjunctive queries and under bag-set semantics for workloads of conjunctive queries without self-joins. 
Rada Chirkova’s work on this material has been supported by the National Science Foundation under Grant No. 0307072. The project is co-funded by the European Social Fund (75%) and National Resources (25%)- Operational Program for Educational and Vocational Training II (EPEAEK II) and particularly the program PYTHAGORAS. A preliminary version of this paper appears in F. Afrati, R. Chirkova, M. Gergatsoulis, V. Pavlaki. Designing Views to Efficiently Answer Real SQL Queries. In Proc. of SARA 2005, LNAI Vol. 3607, pages 332-346, Springer-Verlag, 2005.
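The three query semantics named in this abstract can be contrasted on a toy example. The sketch below is only an illustration under assumed data: the relation `R(a, b)`, its tuples, and the projection query `Q(a) :- R(a, b)` are hypothetical and do not come from the paper. It follows the usual definitions, where set semantics uses set inputs and set outputs, bag-set semantics uses set inputs and bag outputs, and bag semantics uses bag inputs and bag outputs.

```python
from collections import Counter

def project_first(rows):
    """Evaluate Q(a) :- R(a, b), keeping duplicate answers as a bag (multiset)."""
    return Counter(a for a, _ in rows)

# Hypothetical stored relation R(a, b), given as a bag with one repeated tuple.
r_bag = [(1, "x"), (1, "x"), (1, "y"), (2, "z")]
r_set = set(r_bag)  # the same relation with duplicate tuples removed

bag_answer = project_first(r_bag)        # bag semantics: bag input, bag output
bag_set_answer = project_first(r_set)    # bag-set semantics: set input, bag output
set_answer = set(project_first(r_set))   # set semantics: like SELECT DISTINCT

print(bag_answer, bag_set_answer, set_answer)
```

The answer `1` appears three times under bag semantics, twice under bag-set semantics, and once under set semantics, which is why equivalence of a query and its rewriting over views has to be checked separately under each semantics.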

17.
Due to the fuzziness of query specification and media matching, multimedia retrieval is conducted by way of exploration. It is essential to provide feedback so that users can visualize query reformulation alternatives and database content distribution. Since media matching is an expensive task, another issue is how to efficiently support exploration so that the system is not overloaded by perpetual query reformulation. In this paper, we present a uniform framework to represent statistical information on both semantic and visual metadata for images in the databases. We propose the concept of query verification, which evaluates queries using statistics and provides users with feedback, including the strictness and reformulation alternatives of each query condition as well as estimated numbers of matches. With query verification, multimedia database exploration becomes more efficient for both users and the system. Such statistical information is also utilized to support progressive query processing and query relaxation. Received: 9 June 1998 / Accepted: 21 July 2000 / Published online: 4 May 2001

18.
An important research issue in multimedia databases is the retrieval of similar objects. For most applications in multimedia databases, an exact search is not meaningful. Thus, much effort has been devoted to developing efficient and effective similarity search techniques. A recent approach that has been shown to improve the effectiveness of similarity search in multimedia databases resorts to the use of combinations of metrics (i.e., a search on a multi-metric space). In this approach, the desirable contribution (weight) of each metric is chosen at query time. It follows that standard metric indexes cannot be directly used to improve the efficiency of dynamically weighted queries, because they assume that there is only one fixed distance function at indexing and query time. This paper presents a methodology for adapting metric indexes to multi-metric indexes, that is, to support similarity queries with dynamic combinations of metric functions. The adapted indexes are built with a single distance function and store partial distances to estimate the dynamically weighted distances. We present two novel indexes for multi-metric space indexing, which are the result of the application of the proposed methodology.
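One way to see how stored partial distances can serve query-time weights is the generic pivot-based sketch below. It is not the indexes presented in the paper, whose construction the abstract does not describe. It assumes a nonnegative weighted sum of metrics, a single hypothetical pivot, and toy 2-D points, and it relies only on the triangle inequality: for each component metric, |d(q, p) - d(o, p)| is a lower bound on d(q, o), so the weighted sum of these terms is a lower bound on the weighted distance.

```python
import math

def l1(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def l2(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

METRICS = (l1, l2)  # hypothetical component metrics

def combined(weights, x, y):
    """Weighted sum of the component metrics; weights are chosen at query time."""
    return sum(w * d(x, y) for w, d in zip(weights, METRICS))

def lower_bound(weights, query_to_pivot, object_to_pivot):
    """Lower-bound the combined distance from per-metric pivot distances."""
    return sum(w * abs(q - o)
               for w, q, o in zip(weights, query_to_pivot, object_to_pivot))

pivot = (0.0, 0.0)
objects = [(1.0, 1.0), (5.0, 4.0), (0.5, 0.2)]
# Index time: store one partial distance per metric, independent of any weights.
partial = [[d(o, pivot) for d in METRICS] for o in objects]

def range_query(query, weights, radius):
    q_to_pivot = [d(query, pivot) for d in METRICS]
    hits = []
    for obj, stored in zip(objects, partial):
        if lower_bound(weights, q_to_pivot, stored) > radius:
            continue  # pruned without computing the combined distance
        if combined(weights, query, obj) <= radius:
            hits.append(obj)
    return hits

result = range_query((1.0, 0.5), weights=(0.7, 0.3), radius=1.0)
print(result)
```

Because the stored values do not depend on the weights, the same index can answer queries with different weightings; only the cheap lower-bound computation changes per query.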

19.
20.
In this paper, we consider the problem of multimedia document (MMD) semantics understanding and content-based cross-media retrieval. An MMD is a set of media objects of different modalities that carry the same semantics, and content-based cross-media retrieval is a new kind of retrieval method in which the query examples and search results can be of different modalities. Two levels of manifolds are learned to explore the relationships among all the data, at the level of the MMD and at the level of the media object, respectively. We first construct a Laplacian media object space for the media object representation of each modality and an MMD semantic graph to learn the MMD semantic correlations. The characteristics of media objects propagate along the MMD semantic graph, and an MMD semantic space is constructed to perform cross-media retrieval. Different methods are proposed to utilize relevance feedback, and experiments show that the proposed approaches are effective.
