Similar articles
1.
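One problem that must be solved to realize computerized government office work is the retrieval and browsing of video document information. This paper presents a design for a government video information browsing and retrieval system that supports browsing video information at the abstraction level and retrieving video information by example image. The system has two main parts: (1) structuring and abstraction of video content, and (2) video retrieval based on an example image. For part (1), an unsupervised method for video structuring and content abstraction is given, based on color features and on changes in object contours. In part (2), the features of the images in the image library are first reduced in dimensionality and then clustered, which shortens the similarity matching performed when retrieving video content by example image and improves efficiency.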
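
The clustering step in part (2) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the feature vectors have already been reduced to a low dimension, uses plain k-means with a deterministic initialization, and all function names are hypothetical. The point it shows is that a query is compared only against the members of the nearest cluster rather than against the whole library.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def assign(vectors, centroids):
    """Group vector indices by their nearest centroid."""
    clusters = [[] for _ in centroids]
    for index, vector in enumerate(vectors):
        nearest = min(range(len(centroids)),
                      key=lambda c: distance(vector, centroids[c]))
        clusters[nearest].append(index)
    return clusters


def kmeans(vectors, k, iterations=10):
    """Plain k-means; initial centroids are evenly spaced picks (an assumption)."""
    centroids = [list(vectors[i * len(vectors) // k]) for i in range(k)]
    for _ in range(iterations):
        clusters = assign(vectors, centroids)
        centroids = [
            centroid([vectors[i] for i in members]) if members else centroids[c]
            for c, members in enumerate(clusters)
        ]
    return centroids, assign(vectors, centroids)


def search(query, vectors, centroids, clusters):
    """Return (index of best match, number of library vectors compared)."""
    nearest = min(range(len(centroids)),
                  key=lambda c: distance(query, centroids[c]))
    # Fall back to a full scan if the chosen cluster happens to be empty.
    candidates = clusters[nearest] or list(range(len(vectors)))
    best = min(candidates, key=lambda i: distance(query, vectors[i]))
    return best, len(candidates)
```

With a library of six vectors in two well-separated groups, a query near the second group is matched after three comparisons instead of six; the saving grows with the library size and the number of clusters.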

2.
Personalization is one of the most important mechanisms for making multimedia systems easy to use. In video applications, it means tailoring video content to a particular viewer. For this purpose, we are developing a system for retrieving and browsing video segments, called video portal with personalization (VIPP). VIPP is characterized by 1) supporting the viewer's access to video content and producing a summarized video clip that takes his/her preferences into account and 2) acquiring the viewer's profile automatically from his/her operations. In this paper, we propose a method for learning to personalize from viewer operations such as retrieval and browsing, and describe how personalized retrieval and summarization of videos can be realized. Through experiments, we clarify the effect of personalization on the retrieval and summarization of baseball videos on VIPP.

3.
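An Agent-Based Internet Information Acquisition System   Total citations: 4, self-citations: 0, citations by others: 4
Based on an analysis of the strengths and weaknesses of traditional search engines and offline browsing software, this paper proposes an agent-based scheme for acquiring Internet information and implements the WebClone system, which integrates automatic downloading, full-text retrieval, and offline browsing, making it more convenient for users to acquire and share Internet information.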

4.
Hierarchical video browsing and feature-based video retrieval are two standard methods for accessing video content. Very little research, however, has addressed the benefits of integrating these two methods for more effective and efficient video content access. In this paper, we introduce InsightVideo, a video analysis and retrieval system that joins video content hierarchy, hierarchical browsing, and retrieval for efficient video access. We propose several video processing techniques to organize the content hierarchy of the video. We first apply a camera motion classification and key-frame extraction strategy that operates in the compressed domain to extract video features. Then, shot grouping, scene detection, and pairwise scene clustering strategies are applied to construct the video content hierarchy. We introduce a video similarity evaluation scheme at different levels (key-frame, shot, group, scene, and video). By combining the video content hierarchy with the video similarity evaluation scheme, hierarchical video browsing and retrieval are seamlessly integrated for efficient content access. We construct a progressive video retrieval scheme that refines user queries through the interaction of browsing and retrieval. Experimental results and comparisons of camera motion classification, key-frame extraction, scene detection, and video retrieval are presented to validate the effectiveness and efficiency of the proposed algorithms and the performance of the system.

5.
Spoken content retrieval will be very important for retrieving and browsing multimedia content over the Internet, and spoken term detection (STD) is one of the key technologies for spoken content retrieval. In this paper, we show that acoustic feature similarity between spoken segments, used with pseudo-relevance feedback and graph-based re-ranking, can improve the performance of STD. This is based on the concept that spoken segments similar in acoustic feature vector sequences to those with higher/lower relevance scores should have higher/lower scores, while graph-based re-ranking further uses a graph to consider the similarity structure among all the segments retrieved in the first pass. These approaches are formulated on both word and subword lattices, and a complete framework for using them in open vocabulary retrieval of spoken content is presented. Significant improvements for these approaches with both in-vocabulary and out-of-vocabulary queries were observed in preliminary experiments.
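
The idea behind graph-based re-ranking can be sketched as a simple score propagation over a similarity graph. The abstract does not give the exact formulation, so the following is a generic illustration under stated assumptions: `similarity` is a hypothetical precomputed matrix of acoustic similarities between first-pass segments, and `alpha` is an assumed interpolation weight between a segment's first-pass score and the similarity-weighted average of its neighbours' scores.

```python
def rerank(initial_scores, similarity, alpha=0.5, iterations=50):
    """Propagate relevance scores over a segment similarity graph.

    Each round, a segment's score becomes a mix of its first-pass score
    and the similarity-weighted average of the other segments' scores.
    """
    n = len(initial_scores)
    # Row-normalise the similarities, ignoring self-similarity.
    weights = []
    for i in range(n):
        row = [similarity[i][j] if j != i else 0.0 for j in range(n)]
        total = sum(row)
        weights.append([w / total if total else 0.0 for w in row])

    scores = list(initial_scores)
    for _ in range(iterations):
        scores = [
            (1 - alpha) * initial_scores[i]
            + alpha * sum(weights[i][j] * scores[j] for j in range(n))
            for i in range(n)
        ]
    return scores
```

In this sketch, a segment with a modest first-pass score that is acoustically close to a high-scoring segment ends up with a higher score than it started with, which is the behaviour the abstract describes.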

6.
Detecting events of interest from video sequences, and searching and retrieving events from video databases, are important and challenging problems. "Event of interest" is a very general term, since events of interest can vary significantly among different applications and users. A system that can only detect and/or retrieve a finite set of predefined events will find limited use. Thus, the event detection and retrieval problems introduce additional challenges, including giving the user the flexibility to specify customized events of varying complexity and communicating user-defined events to a system in a generic way. This paper presents a spatio-temporal event detection system that lets users specify semantically high-level and composite events, and then detects their occurrences automatically. Events can be defined on a single camera view or across multiple camera views. In addition to extracting information from videos, detecting customized events, and generating real-time alerts, the proposed system uses the extracted information in the search, retrieval, data management, and investigation context. Generated event meta-data is mapped into tables in a relational database against which queries may be launched. It is therefore possible to retrieve events based on various attributes. Moreover, a variety of statistics can be computed on the event data. Thus, the presented system provides the capabilities of a fully integrated smart system.
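
Mapping event meta-data into relational tables and querying it by attribute can be sketched as follows. The schema, column names, and helper functions are assumptions made for illustration only; the abstract does not describe the actual tables. The sketch uses Python's standard `sqlite3` module with an in-memory database.

```python
import sqlite3


def build_event_store(events):
    """Create an in-memory table and load (camera, type, start, end) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events ("
        "id INTEGER PRIMARY KEY, camera TEXT, event_type TEXT, "
        "start_s REAL, end_s REAL)"
    )
    conn.executemany(
        "INSERT INTO events (camera, event_type, start_s, end_s) "
        "VALUES (?, ?, ?, ?)",
        events,
    )
    return conn


def events_by_type(conn, event_type):
    """Retrieve events by one attribute, ordered by start time."""
    cursor = conn.execute(
        "SELECT camera, start_s, end_s FROM events "
        "WHERE event_type = ? ORDER BY start_s",
        (event_type,),
    )
    return cursor.fetchall()


def count_per_camera(conn):
    """A simple statistic over the event data: number of events per camera."""
    cursor = conn.execute("SELECT camera, COUNT(*) FROM events GROUP BY camera")
    return dict(cursor.fetchall())
```

Once events live in a table like this, retrieval by camera, type, or time range and aggregate statistics reduce to ordinary SQL queries.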

7.
The dramatic growth of video content over modern media channels (such as the Internet and mobile phone platforms) directs the interest of media broadcasters towards the topics of video retrieval and content browsing. Several video retrieval systems benefit from the use of semantic indexing based on content, since it allows an intuitive categorization of videos. However, indexing is usually performed through manual annotation, thus introducing potential problems such as ambiguity, lack of information, and non-relevance of index terms. In this paper, we present SHIATSU, a complete system for video retrieval which is based on the (semi-)automatic hierarchical semantic annotation of videos exploiting the analysis of visual content; videos can then be searched by means of attached tags and/or visual features. We experimentally evaluate the performance of SHIATSU on two different real video benchmarks, proving its accuracy and efficiency.

8.
In this paper, we describe a unique new paradigm for video database management known as ViBE (video indexing and browsing environment). ViBE is a browseable/searchable paradigm for organizing video data containing a large number of sequences. The system first segments video sequences into shots by using a new feature vector known as the Generalized Trace obtained from the DC-sequence of the compressed data. Each video shot is then represented by a hierarchical structure known as the shot tree. The shots are then classified into pseudo-semantic classes that describe the shot content. Finally, the results are presented to the user in an active browsing environment using a similarity pyramid data structure. The similarity pyramid allows the user to view the video database at various levels of detail. The user can also define semantic classes and reorganize the browsing environment based on relevance feedback. We describe how ViBE performs on a database of MPEG sequences.

9.
Automatic text segmentation and text recognition for video indexing   Total citations: 13, self-citations: 0, citations by others: 13
Efficient indexing and retrieval of digital video is an important function of video databases. One powerful index for retrieval is the text appearing in the videos, which enables content-based browsing. We present our new methods for automatic segmentation of text in digital videos. The algorithms we propose make use of typical characteristics of text in videos in order to enable and enhance segmentation performance. The unique features of our approach are the tracking of characters and words over their complete duration of occurrence in a video and the integration of the multiple bitmaps of a character over time into a single bitmap. The output of the text segmentation step is then passed directly to a standard OCR software package in order to translate the segmented text into ASCII. A straightforward indexing and retrieval scheme is also introduced. It is used in the experiments to demonstrate that the proposed text segmentation algorithms, together with existing text recognition algorithms, are suitable for indexing and retrieval of relevant video sequences in and from a video database. Our experimental results are very encouraging and suggest that these algorithms can be used in video retrieval applications as well as to recognize higher level semantics in videos.

10.
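A Hierarchical Method for Generating Movie Video Summaries   Total citations: 1, self-citations: 0, citations by others: 1
Organizing video data sensibly is important for content-based video analysis and retrieval. This paper proposes a method for generating movie video summaries based on a motion attention model. First, a clustering algorithm based on a sliding shot window groups similar shots into shot classes. Then, following the way scene content develops in movies, three temporal relations between two shot classes are defined, and on that basis a scene detection method is proposed that uses the spatio-temporal constraints between shot classes. Finally, the motion attention model is used to select the important shots and representative frames in each scene, and a hierarchical video summary (scene level and shot level) is built from the set of selected representative frames and the set of key frames of the important shots. The method covers the video content fairly comprehensively while highlighting its important parts, and is well suited to fast browsing and retrieval of movie videos.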

11.
A new, effective mechanism is proposed for browsing large compressed images over the Internet. The image is compressed with JPEG 2000 into a single bitstream and placed on the server. During browsing, the user specifies a region of interest (ROI) with certain spatial and resolution constraints. The browser downloads only the portion of the compressed bitstream that covers the current ROI, and the download is progressive, so that a coarse view of the ROI can be rendered very quickly and then gradually refined as more of the bitstream arrives. When the ROI changes, e.g., when zooming in/out or panning, the browser uses the compressed bitstream already in its cache to quickly render a coarse view of the new ROI and, at the same time, requests the additional compressed data corresponding to the updated view. The system greatly improves the experience of browsing large images over slow networks.
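
The request planning described above can be sketched with a deliberately simplified model. A real JPEG 2000 codestream organizes data by tile, resolution level, precinct, and quality layer; the sketch below instead assumes a hypothetical fixed grid of square tiles at each resolution level, where level `n` is the image downsampled by a factor of `2**n`. All names are illustrative. It shows the two ideas in the abstract: coarse levels are requested before fine ones, and pieces already in the cache are not requested again.

```python
def tiles_for_roi(roi, tile_size, level):
    """List the (level, row, col) tiles that cover an ROI at one level.

    roi is (x0, y0, x1, y1) in full-resolution pixels, with x1 and y1 exclusive.
    """
    x0, y0, x1, y1 = roi
    scale = 2 ** level
    # Map the ROI onto the reduced-resolution grid (floor start, ceiling end).
    sx0, sy0 = x0 // scale, y0 // scale
    sx1, sy1 = -(-x1 // scale), -(-y1 // scale)
    first_col, first_row = sx0 // tile_size, sy0 // tile_size
    last_col, last_row = (sx1 - 1) // tile_size, (sy1 - 1) // tile_size
    return [
        (level, row, col)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]


def plan_requests(roi, tile_size, coarsest_level, target_level, cache):
    """Order the missing tiles from coarse to fine, skipping cached ones."""
    requests = []
    for level in range(coarsest_level, target_level - 1, -1):
        for tile in tiles_for_roi(roi, tile_size, level):
            if tile not in cache:
                requests.append(tile)
    return requests
```

After a pan or zoom, calling `plan_requests` with the new ROI and the current cache yields only the data that still has to be fetched, while cached coarse tiles can be rendered immediately.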

12.
A video server needs a storage system with large bandwidth in order to serve real-time retrieval requests for video streams from more users concurrently. The storage system therefore generally takes the form of a disk array consisting of multiple disks. When the storage system serves multiple video stream requests, its bottlenecks come from the seek delay caused by random movement of the disk heads and from unbalanced disk access caused by load imbalance among the disks. This paper presents a novel placement and retrieval policy. The new policy retrieves the requested data through sequential movement of the disk heads while maintaining disk load balance, so that it reduces the retrieval bottlenecks and can provide concurrent real-time retrieval services to more users. In addition, the novel policy reduces the startup latency of requests. The correctness of the policy is analyzed theoretically, and its performance is evaluated through simulations.
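
The abstract does not specify the proposed policy, so the following sketch shows only a generic technique for the load-balance goal it mentions: round-robin striping with a per-stream starting offset. The function names and the offset rule are assumptions, not the paper's method. Block `i` of stream `s` is placed on disk `(s + i) mod D`, so consecutive blocks of one stream visit the disks in turn and different streams start on different disks.

```python
def disk_for_block(stream_id, block_index, num_disks):
    """Round-robin striping with a per-stream starting offset (assumed rule)."""
    return (stream_id + block_index) % num_disks


def load_in_round(stream_positions, num_disks):
    """Count how many block reads land on each disk in one service round.

    stream_positions maps stream_id -> index of the block read this round.
    """
    load = [0] * num_disks
    for stream_id, block_index in stream_positions.items():
        load[disk_for_block(stream_id, block_index, num_disks)] += 1
    return load
```

For example, eight streams that all read their fifth block in the same round on a four-disk array produce two reads per disk, whereas placing every stream's block `i` on disk `i mod D` would send all eight reads to a single disk.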

13.
14.
WebClip (on-line demo at http://www.ctr.columbia.edu/webclip) is a compressed video searching and editing system operating over the World Wide Web. WebClip uses a distributed client-server model that includes a server engine for content analysis/editing and clients for interactive control of video browsing/editing. It offers several unique features, including compressed-domain video feature extraction and manipulation, multi-resolution video access, content-based video browsing/retrieval, and a distributed network architecture.

15.
In the digital world, secure data communication has an important role in mass media and Internet technology. With the increase in modern malicious technologies, confidential data are exposed to greater risk during data communication. For secure communication, recent technologies and the Internet have introduced steganography, a new way to hide data. Steganography is the growing practice of concealing data in multimedia files for secure data transfer. Nowadays, videos are more commonly chosen as cover media than other multimedia files because of the moving sequence of images and audio files. Despite its popularity, video steganography faces a significant challenge, which is the lack of a fast retrieval system for the hidden data. This study proposes a novel video steganography technique in which an enhanced hidden Markov model (EHMM) is employed to improve the speed of retrieving hidden data. EHMM mathematical formulations are used to enhance the speed of embedding and extracting secret data. The data embedding and retrieving operations were performed using the conditional states and the state transition dynamics between the video frames. The proposed EHMM is extensively evaluated using three benchmark functions, and experimental evaluations are conducted to test the speed of data retrieval using differently sized cover-videos. Results indicate that the proposed EHMM yields better results by reducing the data hiding time by 3–50%, improving the data retrieval rate by 22–77% with a minimum computational cost of 20–91%, and improving the security by 4–77% compared with state-of-the-art methods.

16.
17.
Traditional browsing of large multimedia documents (e.g., video, audio) is primarily sequential. In the absence of an index structure, browsing and searching for relevant information in a long video, audio, or other multimedia document becomes difficult. Manual annotation can be used to mark various segments of such documents. Different segments can be combined to create new annotated segments, thus creating hierarchical annotation structures. Given the lack of structure in media data, it is natural for different users to have different views of the same media data. Therefore, different users can create different annotation structures. Users may also share some or all of each other's annotation structures. The annotation structure can be browsed or played back as a composed video consisting of different segments. Finally, the annotation structures can be manipulated dynamically by different users to alter views of a document. BRAHMA is a multimedia environment for browsing and retrieval of multimedia documents based on such hierarchical annotation structures.
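
A hierarchical annotation structure of this kind can be sketched as a small tree type. This is an illustration only, not BRAHMA's data model: the class name, its fields, and the `segments` method are hypothetical. Leaf annotations mark a time segment of the document, composite annotations group other annotations, and flattening a node yields the segment list for playback as a composed video.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """A labelled node: a leaf marks a time segment, a composite groups nodes."""
    label: str
    start_s: float = 0.0
    end_s: float = 0.0
    children: list = field(default_factory=list)

    def segments(self):
        """Flatten the subtree into an ordered list of (start, end) segments."""
        if not self.children:
            return [(self.start_s, self.end_s)]
        result = []
        for child in self.children:
            result.extend(child.segments())
        return result
```

Because composites hold references to other annotations, two users can build different trees over the same leaves, or share a subtree, without copying the underlying media.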

18.
Recent advances in cameras and image processing technology have generated a paradigm shift from traditional 2D and 3D video to multi-view video (MVV) technology, while at the same time improving video quality and compression through standards such as high efficiency video coding (HEVC). In multi-view video, cameras are placed in predetermined positions to capture the video from various views. Delivering such views with high quality over the Internet is challenging, because MVV consists of multiple video sequences, each captured from a different angle, so its traffic is several times larger than that of traditional video and it requires more bandwidth than single-view video. In addition, the Internet is known to be prone to packet loss, delay, and bandwidth variation, which adversely affect MVV transmission. Another challenge is that end users' devices have different capabilities in terms of computing power, display, and access link capacity, requiring MVV to be adapted to each user's context. In this paper, we propose an HEVC multi-view system using Dynamic Adaptive Streaming over HTTP to overcome the above-mentioned challenges. Our system uses an adaptive mechanism to adjust the video bit rate to the variations of bandwidth in best-effort networks. We also propose a novel scheme that makes the multi-view video and depth content for 3D video scalable in terms of the number of transmitted views. Our objective measurements show that our method of transmitting MVV content can maximize the perceptual quality of virtual views after rendering and hence increase the user's quality of experience.
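
Bit rate adaptation in an HTTP adaptive streaming client can be sketched as a throughput-based rule. The abstract does not describe the authors' mechanism, so this is a generic heuristic under stated assumptions: throughput is estimated as the harmonic mean of recent segment download rates, and the hypothetical `safety` factor leaves headroom for bandwidth variation. The client then picks the highest available bit rate that fits the budget.

```python
def estimate_throughput(samples_kbps):
    """Harmonic mean of recent download rates; dampens short spikes."""
    return len(samples_kbps) / sum(1.0 / s for s in samples_kbps)


def choose_representation(bitrates_kbps, samples_kbps, safety=0.8):
    """Pick the highest bit rate within a safety margin of the estimate.

    Falls back to the lowest available bit rate when nothing fits.
    """
    budget = safety * estimate_throughput(samples_kbps)
    affordable = [b for b in sorted(bitrates_kbps) if b <= budget]
    return affordable[-1] if affordable else min(bitrates_kbps)
```

The same selection could in principle be applied per view, or to the number of views transmitted, but how the proposed system trades views against bit rate is not stated in the abstract.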

19.
Information retrieval from the Internet is becoming commonplace. Users and consumers browse websites and seek various kinds of information for personal use. Retrieving quality information from the Internet can be challenging even for the computer-savvy. There are several search engines, some of them personalized, to help users search for information on the Internet. In spite of all the claims about search engines, users still have a difficult time retrieving relevant information quickly. This paper proposes a general conceptual model for user-centered quality information retrieval (UCQIR) from the Internet. The UCQIR conceptual model is presented in an architectural form. The UCQIR architectural model uses the concept of “Task-performer” to present various aspects of an information retrieval system at the knowledge level. Task-performer is an abstract construct used to conceptualize the idea of an entity that is competent in doing its tasks. The UCQIR architectural model can be used to easily design and develop domain-specific, user-centered quality information retrieval systems. The proposed UCQIR conceptual model is unique and comprehensive. The use of the conceptual model is illustrated through the design of a patient-centered quality medical information retrieval system for the medical domain. We also present an experimental evaluation of a UCQIR prototype based upon real user experiences. The experimental results are very positive.

20.
Information summarization and retrieval are significant research topics associated with recent advancements in sensor devices, data compression and storage techniques, and high-speed internet. As a result of these advances, it is possible for people to collect huge life-logs. Video is one of the most important life information sources. This paper describes a method of summarizing video life-logs in an office environment with a multi-camera system. Previously, multi-camera systems have been used to track moving objects or to cover a wide area. This paper focuses on capturing diverse views of each office event using a multi-camera system with several cameras observing the same area. The summarization process includes camera view selection and event sequence summarization. View selection produces a single event sequence from multiple event sequences by selecting an optimal view at each time, for which domain knowledge based on the elements of the office environment and rules from questionnaire surveys have been used. Summarization creates a summary sequence from whole sequences by using a fuzzy rule-based system to approximate human decision making. The user-entered degrees of interest in objects, persons, and events are used for a personalized summarization. We confirmed experimentally that the proposed method provides promising results.
