Found 20 similar documents (search time: 15 ms)
1.
Video in digital format is now commonplace and widespread, both in professional use and in domestic consumer products from camcorders to mobile phones. Video content is growing in volume, and while we can capture, compress, store, transmit and display video with great facility, editing videos and manipulating them based on their content is still a non-trivial activity. In this paper, we give a brief review of the state of the art of video analysis, indexing and retrieval, and we point to research directions which we think are promising and could make searching and browsing of video archives based on video content as easy as searching and browsing (text) web pages. We conclude the paper with a list of grand challenges for researchers working in the area.
2.
Automatic partitioning of full-motion video (cited 95 times in total; self-citations: 0; citations by others: 95)
Partitioning a video source into meaningful segments is an important step for video indexing. We present a comprehensive study of a partitioning system that detects segment boundaries. The system is based on a set of difference metrics and it measures the content changes between video frames. A twin-comparison approach has been developed to solve the problem of detecting transitions implemented by special effects. To eliminate the false interpretation of camera movements as transitions, a motion analysis algorithm is applied to determine whether an actual transition has occurred. A technique for determining the threshold for a difference metric and a multi-pass approach to improve the computation speed and accuracy have also been developed.
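The twin-comparison idea can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not spell out: the two-threshold form commonly attributed to this approach (a high threshold for abrupt cuts, a low threshold that opens a candidate gradual transition), a bin-wise histogram difference as the difference metric, and a simplification in which consecutive differences are summed instead of comparing the start frame against the current frame. The names `histogram_difference`, `twin_comparison`, `t_cut` and `t_grad` are hypothetical.

```python
from typing import List, Sequence, Tuple


def histogram_difference(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Bin-wise absolute difference between two frame histograms (assumed metric)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def twin_comparison(diffs: Sequence[float], t_cut: float,
                    t_grad: float) -> List[Tuple[str, int, int]]:
    """Label transitions given diffs[i], the difference between frames i and i+1.

    t_cut  (hypothetical name): high threshold; a single difference at or
           above it is reported as an abrupt cut.
    t_grad (hypothetical name): low threshold; a difference at or above it
           (but below t_cut) opens a candidate gradual transition.
    Returns (kind, start_frame, end_frame) tuples.
    """
    transitions = []
    i, n = 0, len(diffs)
    while i < n:
        if diffs[i] >= t_cut:
            transitions.append(("cut", i, i + 1))
            i += 1
        elif diffs[i] >= t_grad:
            start, accumulated = i, 0.0
            # Simplification: sum consecutive differences while they stay
            # between the two thresholds.
            while i < n and t_grad <= diffs[i] < t_cut:
                accumulated += diffs[i]
                i += 1
            # Accept the candidate only if the accumulated change is as
            # large as an abrupt cut would be.
            if accumulated >= t_cut:
                transitions.append(("gradual", start, i))
        else:
            i += 1
    return transitions


if __name__ == "__main__":
    example = [1, 1, 10, 1, 3, 4, 3, 1, 3, 1]
    print(twin_comparison(example, t_cut=8, t_grad=2))
    # [('cut', 2, 3), ('gradual', 4, 7)]
```

In this toy run the single large difference is reported as a cut, the run of moderate differences accumulates past the high threshold and is reported as a gradual transition, and the isolated moderate difference near the end is discarded. The motion analysis step mentioned in the abstract is not modeled here.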
3.
Xiaodong Wen Theodore D. Huffmire Helen H. Hu Adam Finkelstein 《Multimedia Systems》1999,7(5):350-358
We present several algorithms suitable for analysis of broadcast video. First, we show how wavelet analysis of frames of
video can be used to detect transitions between shots in a video stream, thereby dividing the stream into segments. Next we
describe how each segment can be inserted into a video database using an indexing scheme that involves a wavelet-based “signature.”
Finally, we show that during a subsequent broadcast of a similar or identical video clip, the segment can be found in the
database by quickly searching for the relevant signature. The method is robust against noise and typical variations in the
video stream, even global changes in brightness that can fool histogram-based techniques. In the paper, we compare experimentally
our shot transition mechanism to a color histogram implementation, and also evaluate the effectiveness of our database-searching
scheme. Our algorithms are very efficient and run in realtime on a desktop computer. We describe how this technology could
be employed to construct a “smart VCR” that was capable of alerting the viewer to the beginning of a specific program or identifying
4.
Hirobumi Nishida 《Pattern recognition》2002,35(1):55-67
Efficient and robust information retrieval from large image databases is an essential functionality for the reuse, manipulation, and editing of multimedia documents. Structural feature indexing is a potential approach to efficient shape retrieval from large image databases, but the indexing is sensitive to noise, scales of observation, and local shape deformations. It has now been confirmed that efficiency of classification and robustness against noise and local shape transformations can be improved by the feature indexing approach incorporating shape feature generation techniques (Nishida, Comput. Vision Image Understanding 73 (1) (1999) 121-136). In this paper, based on this approach, an efficient, robust method is presented for retrieval of model shapes that have parts similar to the query shape presented to the image database. The effectiveness is confirmed by experimental trials with a large database of boundary contours obtained from real images, and is validated by systematically designed experiments with a large number of synthetic data.
5.
To effectively utilize information stored in a digital image library, effective image indexing and retrieval techniques are essential. This paper proposes an image indexing and retrieval technique based on the compressed image data using vector quantization (VQ). By harnessing the characteristics of VQ, the proposed technique is able to capture the spatial relationships of pixels when indexing the image. Experimental results illustrate the robustness of the proposed technique and also show that its retrieval performance is higher than that of existing color-based techniques.
6.
Manolis Delakis Guillaume Gravier Patrick Gros 《Computer Vision and Image Understanding》2008,111(2):142-154
Automatic video content analysis is an emerging research subject with numerous applications to large video databases and personal video recording systems. The aim of this study is to fuse multimodal information in order to automatically parse the underlying structure of tennis broadcasts. The frame-based observation distributions of Hidden Markov Models are too strict in modeling heterogeneous audiovisual data. We propose instead the use of segmental features, within the framework of Segment Models, to overcome this limitation and extend the synchronization points to the segment boundaries. Considering each segment as a video scene, auditory and visual features collected inside the scene boundaries can thus be sampled and modeled with their native sampling rates and models. Experimental results on a corpus of 15 h of tennis video demonstrated the superior performance of Segment Models with synchronous audiovisual fusion over Hidden Markov Models. Results with asynchronous fusion, however, are less encouraging.
7.
Automatic text segmentation and text recognition for video indexing (cited 13 times in total; self-citations: 0; citations by others: 13)
Efficient indexing and retrieval of digital video is an important function of video databases. One powerful index for retrieval
is the text appearing in them. It enables content-based browsing. We present our new methods for automatic segmentation of
text in digital videos. The algorithms we propose make use of typical characteristics of text in videos in order to enable
and enhance segmentation performance. The unique features of our approach are the tracking of characters and words over their
complete duration of occurrence in a video and the integration of the multiple bitmaps of a character over time into a single
bitmap. The output of the text segmentation step is then directly passed to a standard OCR software package in order to translate
the segmented text into ASCII. Also, a straightforward indexing and retrieval scheme is introduced. It is used in the experiments
to demonstrate that the proposed text segmentation algorithms together with existing text recognition algorithms are suitable
for indexing and retrieval of relevant video sequences in and from a video database. Our experimental results are very encouraging
and suggest that these algorithms can be used in video retrieval applications as well as to recognize higher level semantics
in videos.
8.
Personalized retrieval of sports video based on multi-modal analysis and user preference acquisition
In this paper, we present a novel framework for personalized retrieval of sports video, which includes two research tasks:
semantic annotation and user preference acquisition. For semantic annotation, web-casting texts corresponding to
sports videos are first captured from webpages using data region segmentation and labeling. Incorporating the text,
we detect events in the sports video and generate video event clips. These video clips are annotated with the semantics extracted
from the web-casting texts and indexed in a sports video database. Based on the annotation, these video clips can be retrieved
by different semantic attributes according to the user preference. For user preference acquisition, we utilize click-through
data as feedback from the user. Relevance feedback is applied to text annotation and visual features to infer the intention
and points of interest of the user. A user preference model is learned to re-rank the initial results. Experiments are conducted
on broadcast soccer and basketball videos and show encouraging performance of the proposed method.
Yi-Fan Zhang received the B.E. degree from Southeast University, Nanjing, China, in 2004. He is currently pursuing the Ph.D. degree at National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China. In 2007, he was an intern student in Institute for Infocomm Research, Singapore. Currently he is an intern student in China-Singapore Institute of Digital Media. His research interests include multimedia, video analysis and pattern recognition. Changsheng Xu (M’97–SM’99) received the Ph.D. degree from Tsinghua University, Beijing, China in 1996. Currently he is Professor of Institute of Automation, Chinese Academy of Sciences and Executive Director of China-Singapore Institute of Digital Media. He was with Institute for Infocomm Research, Singapore from 1998 to 2008. He was with the National Lab of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences from 1996 to 1998. His research interests include multimedia content analysis, indexing and retrieval, digital watermarking, computer vision and pattern recognition. He published over 150 papers in those areas. Dr. Xu is an Associate Editor of ACM/Springer Multimedia Systems Journal. He served as Short Paper Co-Chair of ACM Multimedia 2008, General Co-Chair of 2008 Pacific-Rim Conference on Multimedia (PCM2008) and 2007 Asia-Pacific Workshop on Visual Information Processing (VIP2007), Program Co-Chair of VIP2006, Industry Track Chair and Area Chair of 2007 International Conference on Multimedia Modeling (MMM2007). He also served as Technical Program Committee Member of major international multimedia conferences, including ACM Multimedia Conference, International Conference on Multimedia & Expo, Pacific-Rim Conference on Multimedia, and International Conference on Multimedia Modeling. Xiaoyu Zhang received the B.S. degree in computer science from Nanjing University of Science and Technology in 2005. He is a Ph.D. 
candidate of National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences. He is currently a student in China-Singapore Institute of Digital Media. His research interests include image retrieval, video analysis, and machine learning. Hanqing Lu (M’05–SM’06) received the Ph.D. degree from Huazhong University of Science and Technology, Wuhan, China in 1992. Currently he is Professor of Institute of Automation, Chinese Academy of Sciences. His research interests include image similarity measure, video analysis, object recognition and tracking. He published more than 100 papers in those areas.
9.
The need for content-based access to image and video information from media archives has captured the attention of researchers in recent years. Research efforts have led to the development of methods that provide access to image and video data. These methods have their roots in pattern recognition. The methods are used to determine the similarity in the visual information content extracted from low level features. These features are then clustered for generation of database indices. This paper presents a comprehensive survey on the use of these pattern recognition methods which enable image and video retrieval by content.
10.
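Targeting the characteristics of interactive multimedia learning systems, this paper proposes a natural-language-based method for content-based video retrieval. Users can interact with the system in natural language and thereby quickly and conveniently find the video segments they want. The method integrates natural language processing, named entity extraction, frame-based indexing, and information retrieval, so that the system can process natural-language questions posed by users, build a concise question template from each question, and match that template against the templates already built in the system to describe videos. This reduces the complexity of the video retrieval problem and improves the usability of the system.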
11.
W. A. C. Fernando 《Multimedia Tools and Applications》2006,28(3):301-320
This paper addresses an important area in video processing, namely compressed domain processing. For video indexing, video
scene transition detection is an essential step to segment the video. Current techniques for scene change detection tend to
suffer from a major limitation as most of them cannot identify scene transitions in the compressed domain. Since most video
is expected to be stored in the compressed domain, scene transition detection in this domain is highly desirable. In this
paper an algorithm for video scene change detection is proposed to overcome this limitation. In this scheme, properties of
the B-frames are used, as they are capable of measuring the correlation between two adjacent reference frames. The results show
that this scheme performs better than schemes based on P-frames. The proposed scheme can be applied directly to compressed data
with minimum decompression; hence it is computationally efficient and makes real-time implementations possible. Results
show that video scene transitions can be identified satisfactorily with the proposed scheme.
12.
13.
For efficient image retrieval, the image database should be processed to extract a representative feature vector for each member image in the database. A reliable and robust statistical image indexing technique based on a stochastic model of an image's color content has been developed. Based on the developed stochastic model, a compact 12-dimensional feature vector was defined to tag images in the database system. The entries of the defined feature vector are the mean, variance, and skewness of the image color histogram distributions as well as correlation factors between color components of the RGB color space. It was shown using statistical analysis that the feature vector provides sufficient knowledge about the histogram distribution. The reliability and robustness of the proposed technique against common intensity artifacts and noise were validated through several experiments conducted for that purpose. The proposed technique outperforms traditional and other histogram-based techniques in terms of feature vector size and properties, as well as performance.
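A 12-dimensional vector of this kind can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the moments are computed directly from pixel values (equivalent to moments of the normalized histogram), skewness is taken as the standardized third central moment, the correlation factors are taken to be Pearson correlations for the pairs R-G, G-B and R-B, and the ordering of the entries is arbitrary. The function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def channel_moments(values: Sequence[float]) -> Tuple[float, float, float]:
    """Mean, variance and skewness of one color channel."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0:
        return mean, var, 0.0  # constant channel: skewness defined as 0 here
    skew = (sum((v - mean) ** 3 for v in values) / n) / var ** 1.5
    return mean, var, skew


def correlation(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two channels (assumed 'correlation factor')."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    if sx == 0 or sy == 0:
        return 0.0  # undefined for a constant channel; 0 is a convention here
    return cov / (sx * sy)


def color_feature_vector(pixels: Sequence[Tuple[int, int, int]]) -> List[float]:
    """Build 9 moment entries (3 per channel) plus 3 cross-channel correlations."""
    r = [p[0] for p in pixels]
    g = [p[1] for p in pixels]
    b = [p[2] for p in pixels]
    features: List[float] = []
    for channel in (r, g, b):
        features.extend(channel_moments(channel))
    features.extend([correlation(r, g), correlation(g, b), correlation(r, b)])
    return features


if __name__ == "__main__":
    tiny_image = [(0, 0, 255), (10, 20, 255), (20, 40, 255), (30, 60, 255)]
    vector = color_feature_vector(tiny_image)
    print(len(vector))  # 12
```

Two images could then be compared by a distance between their vectors; the abstract does not state which distance the authors use, so that choice is left open here.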
14.
As the majority of content-based image retrieval systems operate on full images in pixel domain, decompression is a prerequisite for the retrieval of compressed images. To provide a possible on-line indexing and retrieval technique for such JPEG image files, we propose a novel pseudo-pixel extraction algorithm to bridge the gap between the existing image indexing technology, developed in the pixel domain, and the fact that an increasing number of images stored on the Web are already compressed by JPEG at the source. Further, we describe our Web-based image retrieval system, WEBimager, which uses the proposed algorithm to provide a prototype visual information system toward automatic management, indexing, and retrieval of compressed images available on the Internet. This provides users with efficient tools to search the Web for compressed images and establish a database or a collection of images of special interest to them. Experiments using texture- and colour-based indexing techniques support the idea that the proposed algorithm achieves significantly better results in terms of computing cost than their full decompression or partial decompression counterparts. This technology will help control the explosion of media-rich content by offering users a powerful automated image indexing and retrieval tool for compressed images on the Web. J. Jiang: Contacting author
15.
Sharlee Climer Sanjiv K. Bhatia 《Pattern recognition》2002,35(11):2479-2488
Image database indexing is used for efficient retrieval of images in response to a query expressed as an example image. The query image is processed to extract information that is matched against the index to provide pointers to similar images. We present a technique that facilitates content similarity-based retrieval of JPEG-compressed images without first having to uncompress them. The technique is based on an index developed from a subset of JPEG coefficients and a similarity measure to determine the difference between the query image and the images in the database. This method offers substantial efficiency as images are processed in compressed format, information that was derived during the original compression of the images is reused, and extensive early pruning is possible. Initial experiments with the index have provided encouraging results. The system outputs a set of ranked images in the database with respect to the query using the similarity measure, and can be limited to output a specified number of matched images by changing the threshold match.
16.
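Research on automatic indexing for small and medium-sized competitive intelligence systems based on post-control technology (cited 2 times in total; self-citations: 0; citations by others: 2)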
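Designing and building a competitive intelligence system plays an important role in helping an enterprise make timely and appropriate decisions. On this basis, an Internet-based competitive intelligence system is designed for small and medium-sized enterprises, providing intelligent retrieval, personalized services, and more. To raise the retrieval efficiency of the system and improve its functionality, the relationship between post-controlled vocabularies and ontologies is analyzed, and a method for compiling a post-controlled vocabulary from an ontology is proposed. A comparison of retrieval results on the documents in the competitive intelligence system shows that retrieval with post-control significantly improves recall.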
17.
This paper discusses a video cut detection method. Cut detection is an important technique for making videos easier to handle. First, this paper analyzes the distribution of the image difference V to clarify the characteristics that make V suitable for cut detection. We propose a cut detection method that uses a projection (an isolated sharp peak) detecting filter. A motion sensitive V is used to stabilize V projections at cuts, and cuts are detected more reliably with this filter. The method can achieve high detection rates without increasing the rate of misdetection. Experimental results confirm the effectiveness of the filter.
18.
Spatial relationships are important issues for similarity-based retrieval in many image database applications. With the popularity of digital cameras and the related image processing software, a sequence of images are often rotated or flipped. That is, those images are transformed in the rotation orientation or the reflection direction. However, many iconic indexing strategies based on symbolic projection are sensitive to rotation or reflection. Therefore, these strategies may miss the qualified images, when the query is issued in the orientation different from the orientation of the database images. To solve this problem, some researchers proposed a function to map the spatial relationship to its transformed one. However, this mapping consists of several conditional statements, which is time-consuming. Thus, in this paper, we propose an efficient iconic indexing strategy, in which we carefully assign a unique bit pattern to each spatial relationship and record the spatial information based on the bit patterns in a matrix. Without generating the rotated or flipped image, we can directly derive the index of the rotated or flipped image from the index of the original one by bit operations and matrix manipulation. In our performance study, we analyze the time complexity of our proposed strategy and show the efficiency of our proposed strategy according to the simulation results. Moreover, we implement a prototype to validate our proposed strategy.
19.
20.
Automatic video logo detection and removal (cited 1 time in total; self-citations: 0; citations by others: 1)
Most commercial television channels use video logos, which can be considered a form of visible watermark, as a declaration
of intellectual property ownership. They are also used as a symbol of authorization to rebroadcast when original logos are
used in conjunction with newer logos. An unfortunate side effect of such logos is the concomitant decrease in viewing pleasure.
In this paper, we use the temporal correlation of video frames to detect and remove video logos. In the video-logo-detection
part, as an initial step, the logo boundary box is first located by using a distance threshold of video frames and is further
refined by employing a comparison of edge lengths. Second, our proposed Bayesian classifier framework locates fragments of
logos called logo-lets. In this framework, we systematically integrate the prior knowledge about the location of the video logos and their intrinsic
local features to achieve a robust detection result. In our logo-removal part, after the logo region is marked, a matching
technique is used to find the best replacement patch for the marked region within that video shot. This technique is found
to be useful for small logos. Furthermore, we extend the image inpainting technique to videos. Unlike the use of 2D gradients in the image inpainting technique, we inpaint the logo region of video
frames by using 3D gradients exploiting the temporal correlations in video. The advantage of this algorithm is that the inpainted
regions are consistent with the surrounding texture and hence the result is perceptually pleasing. We present the results
of our implementation and demonstrate the utility of our method for logo removal.