1.
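With the rapid development of the Internet, more and more videos are being uploaded and downloaded. A large proportion of these videos, however, are near-duplicates, which affect copyright control and the accuracy of video retrieval and also increase operators' storage and processing costs. Finding near-duplicate videos in large-scale video collections is therefore becoming increasingly important. This paper surveys in depth the work and research results of recent years on near-duplicate video retrieval, discusses in detail the current state and key techniques of near-duplicate video retrieval, and gives an outlook on its future development.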
3.
Semantic filtering and retrieval of multimedia content is crucial for efficient use of multimedia data repositories. Video query by semantic keywords is one of the most difficult problems in multimedia data retrieval. The difficulty lies in the mapping between low-level video representation and high-level semantics. We therefore formulate the multimedia content access problem as a multimedia pattern recognition problem. We propose a probabilistic framework for semantic video indexing, which can support filtering and retrieval and facilitate efficient content-based access. To map low-level features to high-level semantics we propose probabilistic multimedia objects (multijects). Examples of multijects in movies include explosion, mountain, beach, outdoor, music, etc. Semantic concepts in videos interact, and to model this interaction explicitly we propose a network of multijects (multinet). Using probabilistic models for six site multijects (rocks, sky, snow, water-body, forestry/greenery, and outdoor) and using a Bayesian belief network as the multinet, we demonstrate the application of this framework to semantic indexing. We demonstrate how detection performance can be significantly improved using the multinet to take interconceptual relationships into account. We also show how the multinet can fuse heterogeneous features to support detection based on inference and reasoning.
4.
As one of the key technologies in content-based near-duplicate detection and video retrieval, video sequence matching can be used to judge whether two videos contain duplicate or near-duplicate segments. Despite considerable research effort in recent years, precisely and efficiently matching sequences among videos (which may be subject to complex audio-visual transformations) from a large-scale database remains a challenging task. To address this problem, this paper proposes a multiscale video sequence matching (MS-VSM) method, which gradually detects and locates the similar segments between videos from coarse to fine scales. At the coarse scale, it uses the Maximum Weight Matching (MWM) algorithm to rapidly select several candidate reference videos from the database for a given query. Then, for each candidate video, its most similar segment with respect to the given query is obtained at the middle scale by the Constrained Longest Ascending Matching Subsequence (CLAMS) algorithm, and is used to judge whether that candidate contains a near-duplicate segment. If so, the precise locations of the near-duplicate segments in both query and reference videos are determined at the fine scale by bi-directional scanning, which checks the matching similarity at the segments' boundaries. As such, the MS-VSM method can achieve excellent near-duplicate detection accuracy and localization precision with very high processing efficiency. Extensive experiments show that it markedly outperforms several state-of-the-art methods on several benchmarks.
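The middle-scale step above looks for an ascending run of frame matches. The following is a minimal sketch of that general idea only: it finds a plain longest strictly ascending subsequence of matched frame pairs and omits whatever constraints the paper's CLAMS algorithm applies. The function name and the `(query_frame, ref_frame)` pair format are assumptions made for illustration.

```python
from bisect import bisect_left


def longest_ascending_matches(matches):
    """Return the longest run of (query_frame, ref_frame) pairs in which
    both indices strictly increase. Hypothetical helper, not the paper's CLAMS."""
    # Sort by query frame; break ties by descending reference frame so two
    # matches that share a query frame can never both be selected.
    ordered = sorted(matches, key=lambda m: (m[0], -m[1]))
    tails = []       # tails[k]: position in `ordered` ending the best run of length k + 1
    tail_refs = []   # reference-frame values parallel to `tails`, kept sorted
    prev = [None] * len(ordered)  # back-pointers for reconstruction
    for i, (_, ref) in enumerate(ordered):
        k = bisect_left(tail_refs, ref)
        prev[i] = tails[k - 1] if k > 0 else None
        if k == len(tails):
            tails.append(i)
            tail_refs.append(ref)
        else:
            tails[k] = i
            tail_refs[k] = ref
    # Walk the back-pointers from the end of the longest run.
    result = []
    i = tails[-1] if tails else None
    while i is not None:
        result.append(ordered[i])
        i = prev[i]
    return result[::-1]
```

The run it returns marks a candidate aligned segment; a real system would still need to score that segment against a similarity threshold before declaring a near-duplicate.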
5.
In this paper, a method for indexing cross-language databases for conceptual query matching is presented. Two languages (Greek and English) are combined by appending a small portion of documents from one language to the identical documents in the other language. The proposed merging strategy duplicates less than 7% of the entire database (made up of different translations of the Gospels). Previous strategies duplicated up to 34% of the initial database in order to perform the merger. The proposed method retrieves a larger number of relevant documents for both languages with higher cosine rankings when Latent Semantic Indexing (LSI) is employed. Using the proposed merge strategies, LSI is shown to be effective in retrieving documents from either language (Greek or English) without requiring any translation of a user's query. An effective Bible search product needs to allow the use of natural language for searching (queries). LSI enables the user to form queries using natural expressions in the user's own native language. The merging strategy proposed in this study enables LSI to retrieve relevant documents effectively using a minimum of the database in a foreign language. Michael W. Berry is an Assistant Professor in the Department of Computer Science at the University of Tennessee, Knoxville. He received a Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign in 1990, and an M.S. in Applied Mathematics from North Carolina State University at Raleigh in 1983. His current interests include scientific computing, parallel algorithms, information retrieval applications, and computer performance evaluation. He is a member of the ACM, SIAM, and the IEEE Computer Society. Paul G. Young is now employed as an Associate Consultant with Oracle Government Services in Knoxville, TN. In 1984 he graduated from the Gordon-Conwell Theological Seminary in S. Hamilton, MA and became an Ordained Presbyterian Minister (PCUSA). He later received an M.S. in Computer Science from the University of Tennessee in 1994.
6.
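Targeting the characteristics of interactive multimedia learning systems, this paper proposes a natural-language-based approach to content-based video retrieval. Users can interact with the system in natural language and thus quickly and conveniently find the video clips they want. The method integrates natural language processing, named entity extraction, frame-based indexing, and information retrieval techniques, enabling the system to process natural-language questions posed by users, build concise question templates from them, and match those templates against the templates already built in the system to describe the videos. This reduces the complexity of the video retrieval problem and improves the system's usability.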
7.
As the amount of electronic information increases, traditional lexical (or Boolean) information retrieval techniques will become less useful. Large, heterogeneous collections will be difficult to search since the sheer volume of unranked documents returned in response to a query will overwhelm the user. Vector-space approaches to information retrieval, on the other hand, allow the user to search for concepts rather than specific words, and rank the results of the search according to their relative similarity to the query. One vector-space approach, Latent Semantic Indexing (LSI), has achieved up to 30% better retrieval performance than lexical searching techniques by employing a reduced-rank model of the term-document space. However, the original implementation of LSI lacked the execution efficiency required to make LSI useful for large data sets. A new implementation of LSI, LSI++, seeks to make LSI efficient, extensible, portable, and maintainable. The LSI++ Application Programming Interface (API) allows applications to immediately use LSI without knowing the implementation details of the underlying system. LSI++ supports both serial and distributed searching of large data sets, providing the same programming interface regardless of the implementation actually executing. In addition, a World Wide Web interface was created to allow simple, intuitive searching of document collections using LSI++. Timing results indicate that the serial implementation of LSI++ searches up to six times faster than the original implementation of LSI, while the parallel implementation searches nearly 180 times faster on large document collections.
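The ranking behaviour that vector-space retrieval provides can be illustrated with a small sketch. It ranks documents by cosine similarity over raw term counts only; it does not perform the reduced-rank decomposition that defines LSI, and it is unrelated to the LSI++ API. The function names and the dictionary-of-strings corpus format are assumptions made for this example.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, docs):
    """Return ids of documents with non-zero similarity, best match first.
    `docs` maps a document id to its text (hypothetical corpus format)."""
    q = Counter(query.lower().split())
    scored = [(cosine(q, Counter(text.lower().split())), doc_id)
              for doc_id, text in docs.items()]
    # Sort by descending score, then by id so ties are deterministic.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [doc_id for score, doc_id in scored if score > 0]
```

Unlike a Boolean search, every returned document carries a score, so the user sees an ordered list instead of an unranked set. LSI would apply the same cosine comparison after projecting the vectors into a lower-rank space.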
8.
In this paper, we present an ontology-based information extraction and retrieval system and its application in the soccer domain. In general, we deal with three issues in semantic search, namely, usability, scalability and retrieval performance. We propose a keyword-based semantic retrieval approach. The performance of the system is improved considerably using domain-specific information extraction, inferencing and rules. Scalability is achieved by adapting a semantic indexing approach and representing the whole world as small independent models. The system is implemented using state-of-the-art Semantic Web technologies, and its performance is evaluated against traditional systems as well as query expansion methods. Furthermore, a detailed evaluation is provided to observe the performance gain due to domain-specific information extraction and inferencing. Finally, we show how we use semantic indexing to solve simple structural ambiguities.
9.
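With the explosive growth of information resources on the Internet, a wide variety of information-gathering tools have emerged to provide information services to users. Current information-gathering systems, however, find it hard to deliver personalized services based on a user's personal information: different users issuing the same query receive the same results, which is a considerable inconvenience. Latent Semantic Indexing (LSI), by contrast, can use the semantic information among keywords to carry out information search. This paper proposes several ways of implementing a personalized information retrieval algorithm using LSI and verifies the effectiveness of the algorithm experimentally.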
10.
The proliferation of digital media has led to a huge interest in classifying and indexing media objects for generic search and usage. In particular, we are witnessing a colossal growth in digital image repositories that are difficult to navigate using free-text search mechanisms, which often return inaccurate matches as they in principle rely on statistical analysis of query keyword recurrence in the image annotation or surrounding text. In this paper we present a semantically-enabled image annotation and retrieval engine that is designed to satisfy the requirements of the commercial image collections market in terms of both accuracy and efficiency of the retrieval process. Our search engine relies on methodically structured ontologies for image annotation, thus allowing for more intelligent reasoning about the image content and subsequently obtaining a more accurate set of results and a richer set of alternatives matching the original query. We also show how our well-analysed and designed domain ontology contributes to the implicit expansion of user queries as well as the exploitation of lexical databases for explicit semantic-based query expansion.
11.
Semantic-level content analysis is a crucial issue in achieving efficient content retrieval and management. We propose a hierarchical approach that models the statistical characteristics of audio events over a time series to accomplish semantic context detection. Two stages, audio event and semantic context modeling, are devised to bridge the semantic gap between physical audio features and semantic concepts. In this work, hidden Markov models (HMMs) are used to model four representative audio events, i.e., gunshot, explosion, engine, and car-braking, in action movies. At the semantic-context level, Gaussian mixture models (GMMs) and ergodic HMMs are investigated to fuse the characteristics and correlations between various audio events. They provide cues for detecting gunplay and car-chasing scenes, two semantic contexts we focus on in this work. The promising experimental results demonstrate the effectiveness of the proposed approach and show that the proposed framework provides a foundation for semantic indexing and retrieval. Moreover, the two fusion schemes are compared, and the relations between audio event and semantic context are studied.
13.
Selecting an instructive story from a video case base is an information retrieval problem, but standard indexing and retrieval techniques [1] were not developed with such applications in mind. The classical model assumes a passive retrieval system queried by interested and well-informed users. In educational situations, students cannot be expected to form appropriate queries or to identify their own ignorance. Systems that teach must, therefore, be active retrievers that formulate their own retrieval cues and reason about the appropriateness of intervention. The Story Producer for InteractivE Learning (SPIEL) is an active retrieval system for recalling stories to tell to students who are learning social skills in a simulated environment [2, 3]. SPIEL is a component of the Guided Social Simulation (GuSS) architecture [4] used to build YELLO, a program that teaches account executives the fine points of selling Yellow Pages advertising. SPIEL uses structured, conceptual indices derived from research in case-based reasoning [5, 6]. SPIEL's manually-created indices are detailed representations of what stories are about, and they are needed to make precise assessments of stories' relevance. SPIEL's opportunistic retrieval architecture operates in two phases. During the storage phase, the system uses its educational knowledge encapsulated in a library of “storytelling strategies” to determine, for each story, what an opportunity to tell that story would look like. During the retrieval phase, the system tries to recognize those opportunities while the student interacts with the simulation. This design is similar to “opportunistic memory” architectures proposed for opportunistic planning [7, 8].
14.
We are witnessing significant growth in the number of smartphone users and advances in phone hardware and sensor technology. In conjunction with the popularity of video applications such as YouTube, an unprecedented number of user-generated videos (UGVs) are being generated and consumed by the public, which leads to a Big Data challenge in social media. In a very large video repository, it is difficult to index and search videos in their unstructured form. However, owing to recent developments, videos can be geo-tagged (e.g., with locations from a GPS receiver and viewing directions from a digital compass) at acquisition time, which offers potential for efficient management of video data. Ideally, each video frame can be tagged with the spatial extent of its coverage area, termed the Field-Of-View (FOV). This effectively converts a challenging video management problem into a spatial database problem. This paper attacks the challenges of large-scale video data management using spatial indexing and querying of FOVs, especially by maximally harnessing the geographical properties of FOVs. Since FOVs are shaped like slices of pie and contain both location and orientation information, conventional spatial indexes, such as the R-tree, cannot index them efficiently. The distribution of UGVs' locations is non-uniform (e.g., more FOVs in popular locations). Consequently, even multilevel grid-based indexes, which can handle both location and orientation, have limitations in managing the skewed distribution. Additionally, since UGVs are usually captured in a casual way with diverse setups and movements, no a priori assumption can be made to condense them in an index structure. To overcome these challenges, we propose a class of new R-tree-based index structures that effectively harness FOVs' camera locations, orientations and view-distances, in tandem, for both filtering and optimization. We also present novel search strategies and algorithms for efficient range and directional queries on our indexes. Our experiments using both real-world and large synthetic video datasets (over 30 years' worth of videos) demonstrate the scalability and efficiency of our proposed indexes and search algorithms.
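The pie-slice FOV model described above can be illustrated with a minimal containment test. The sketch assumes planar coordinates, a heading measured in degrees clockwise from north (the +y axis), and a symmetric viewing angle. The function name and parameters are hypothetical and do not come from the paper, whose contribution is the index structure around such tests.

```python
import math


def fov_contains(camera, heading_deg, view_angle_deg, view_distance, point):
    """Return True if `point` lies inside the sector-shaped field of view.

    camera, point: (x, y) in the same planar units (assumed, not geodetic).
    heading_deg: viewing direction, degrees clockwise from north (+y).
    view_angle_deg: full angular width of the sector.
    view_distance: radius of the sector.
    """
    dx = point[0] - camera[0]
    dy = point[1] - camera[1]
    dist = math.hypot(dx, dy)
    if dist > view_distance:
        return False
    if dist == 0:
        return True  # the camera position itself counts as covered
    # Bearing of the point, clockwise from north, in [0, 360).
    bearing = math.degrees(math.atan2(dx, dy)) % 360.0
    # Smallest absolute angular difference, handling wrap-around at 0/360.
    diff = abs((bearing - heading_deg + 180.0) % 360.0 - 180.0)
    return diff <= view_angle_deg / 2.0
```

An index would typically run a cheap bounding-box filter first and apply this exact test only to the surviving candidates. The test depends on both location and orientation, which is why a location-only index filters FOVs poorly.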
15.
We present word spatial arrangement (WSA), an approach to represent the spatial arrangement of visual words under the bag-of-visual-words model. It rests on a simple idea: encoding the relative position of visual words by splitting the image space into quadrants using each detected point as origin. WSA generates compact feature vectors and is flexible: it can be used for image retrieval and classification, works with hard or soft assignment, and requires no pre- or post-processing for spatial verification. Experiments in the retrieval scenario show the superiority of WSA in relation to Spatial Pyramids. Experiments in the classification scenario show a reasonable compromise between those methods, with Spatial Pyramids generating larger feature vectors, while WSA provides adequate performance with much more compact features. As WSA encodes only the spatial information of visual words and not their frequency of occurrence, the results indicate the importance of such information for visual categorization.
17.
The problem of video classification can be viewed as discovering the signature patterns in the elemental features of a video class. In order to solve this problem, a large and diverse set of video features is proposed in this paper. The contributions of the paper further lie in dealing with the high dimensionality induced by the feature space and in presenting an algorithm based on two-phase grid searching for automatic parameter selection for support vector machines (SVMs). The framework is thus directed at bridging the gap between low-level features and semantic video classes. The experimental results and comparison with state-of-the-art learning tools on more than 5000 video segments show the effectiveness of our approach.
18.
As a powerful and expressive nontextual medium that can capture and present information, instructional videos are extensively used in e-learning (Web-based distance learning). Since each video may cover many subjects, it is critical for an e-learning environment to have content-based video searching capabilities to meet diverse individual learning needs. In this paper, we present an interactive multimedia-based e-learning environment that enables users to interact with it to obtain knowledge in the form of logically segmented video clips. We propose a natural language approach to content-based video indexing and retrieval to identify appropriate video clips that can address users' needs. The method integrates natural language processing, named entity extraction, frame-based indexing, and information retrieval techniques to explore knowledge-on-demand in a video-based interactive e-learning environment. A preliminary evaluation shows that the precision and recall of this approach are better than those of the traditional keyword-based approach.
20.
The increasing performance and more widespread use of automated semantic annotation and entity linking platforms have made it possible to use semantic information in information retrieval. While keyword-based information retrieval techniques have shown impressive performance, the addition of semantic information can increase retrieval performance by allowing for more accurate sense disambiguation, intent determination, and instance identification, to name just a few. Researchers have already explored integrating semantic information into practical search engines using a combination of techniques such as graph databases, hybrid indices and adapted inverted indices, among others. One of the challenges in efficiently designing a search engine capable of considering semantic information is that it would need to index information beyond the traditional information stored in inverted indices, including entity mentions and type relationships. The objective of our work in this paper is to investigate various ways in which different data structure types can be adopted to integrate three types of information: keywords, entities and types. We systematically compare the performance of the different data structures for scenarios where (i) the same data structure types are adopted for the three types of information, and (ii) different data structure types are integrated for storing and retrieving the three different information types. We report our findings in terms of the performance of various query processing tasks, such as Boolean and ranked intersection, for the different indices and discuss which index type would be appropriate under different conditions for semantic search.
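As a minimal sketch of the Boolean intersection task mentioned above, the following merges sorted posting lists from a single inverted index that holds keywords, entities and types. Keying the dictionary by `(kind, value)` tuples is an assumption chosen for brevity, not one of the data structures the paper evaluates.

```python
def intersect(a, b):
    """Intersect two sorted lists of document ids with a linear merge."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def boolean_and(index, terms):
    """Return ids of documents containing every term.

    index: dict mapping (kind, value) -> sorted list of document ids,
           where kind is "keyword", "entity" or "type" (hypothetical layout).
    terms: iterable of (kind, value) keys.
    """
    # Start from the shortest posting list so intermediate results stay small.
    postings = sorted((index.get(t, []) for t in terms), key=len)
    if not postings:
        return []
    result = postings[0]
    for p in postings[1:]:
        result = intersect(result, p)
        if not result:
            break  # nothing can match once the running result is empty
    return result
```

For example, a query combining the keyword "retrieval" with an entity mention and a type constraint reduces to three posting-list lookups followed by two merges. The cost of those lookups is where the choice of underlying data structure matters.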