1.

Searching and editing MPEG-compressed video in a distributed online environment

Horace J. Meng Di Zhong Shih-Fu Chang 《Multimedia Systems》1999,7(4):282-293

WebClip (on-line demo at http://www.ctr.columbia.edu/webclip) is a compressed video searching and editing system operating over the World Wide Web. WebClip uses a distributed client-server model including a server engine for content analysis/editing, and clients for interactive controls of video browsing/editing. It offers several unique features, including compressed-domain video feature extraction and manipulation, multi-resolution video access, content-based video browsing/retrieval, and a distributed network architecture.

2.

Detection of human faces in a compressed domain for video stratification Total citations: 5, self-citations: 0, citations by others: 5

Tat-Seng Chua Yunlong Zhao Mohan S. Kankanhalli 《The Visual computer》2002,18(2):121-133

Published online: 15 March 2002

3.

Automatic text segmentation and text recognition for video indexing Total citations: 13, self-citations: 0, citations by others: 13

Rainer Lienhart Wolfgang Effelsberg 《Multimedia Systems》2000,8(1):69-81

Efficient indexing and retrieval of digital video is an important function of video databases. One powerful index for retrieval is the text appearing in them. It enables content-based browsing. We present our new methods for automatic segmentation of text in digital videos. The algorithms we propose make use of typical characteristics of text in videos in order to enable and enhance segmentation performance. The unique features of our approach are the tracking of characters and words over their complete duration of occurrence in a video and the integration of the multiple bitmaps of a character over time into a single bitmap. The output of the text segmentation step is then directly passed to a standard OCR software package in order to translate the segmented text into ASCII. Also, a straightforward indexing and retrieval scheme is introduced. It is used in the experiments to demonstrate that the proposed text segmentation algorithms together with existing text recognition algorithms are suitable for indexing and retrieval of relevant video sequences in and from a video database. Our experimental results are very encouraging and suggest that these algorithms can be used in video retrieval applications as well as to recognize higher level semantics in videos.

4.

Wavelet-based video indexing and querying

Xiaodong Wen Theodore D. Huffmire Helen H. Hu Adam Finkelstein 《Multimedia Systems》1999,7(5):350-358

We present several algorithms suitable for analysis of broadcast video. First, we show how wavelet analysis of frames of video can be used to detect transitions between shots in a video stream, thereby dividing the stream into segments. Next we describe how each segment can be inserted into a video database using an indexing scheme that involves a wavelet-based “signature.” Finally, we show that during a subsequent broadcast of a similar or identical video clip, the segment can be found in the database by quickly searching for the relevant signature. The method is robust against noise and typical variations in the video stream, even global changes in brightness that can fool histogram-based techniques. In the paper, we compare experimentally our shot transition mechanism to a color histogram implementation, and also evaluate the effectiveness of our database-searching scheme. Our algorithms are very efficient and run in real time on a desktop computer. We describe how this technology could be employed to construct a “smart VCR” that was capable of alerting the viewer to the beginning of a specific program or identifying

5.

An experimental analysis of digital video library servers

Michael Kozuch Wayne Wolf Andrew Wolfe 《Multimedia Systems》2000,8(2):135-145

Much work on video servers has concentrated on movies on demand, in which a relatively small number of titles are viewed and users are given basic VCR-style controls. This paper concentrates on analyzing video server performance for non-linear access applications. In particular, we study two non-linear video applications: video libraries, in which users select from a large collection of videos and may be interested in viewing only a small part of the title; and video walk-throughs, in which users can move through an image-mapped representation of a space. We present a characterization of the workloads of these applications. Our simulation studies show that video server architectures developed for movies on demand can be adapted to video library usage, though caching is less effective and the server can support a smaller user population for non-linear video applications. We also show that video walk-throughs require extremely large amounts of RAM buffering to provide adequate performance for even a small number of users.

6.

Query by video clip Total citations: 15, self-citations: 0, citations by others: 15

Anil K. Jain Aditya Vailaya Xiong Wei 《Multimedia Systems》1999,7(5):369-384

Typical digital video search is based on queries involving a single shot. We generalize this problem by allowing queries that involve a video clip (say, a 10-s video segment). We propose two schemes: (i) retrieval based on key frames follows the traditional approach of identifying shots, computing key frames from a video, and then extracting image features around the key frames. For each key frame in the query, a similarity value (using color, texture, and motion) is obtained with respect to the key frames in the database video. Consecutive key frames in the database video that are highly similar to the query key frames are then used to generate the set of retrieved video clips. (ii) In retrieval using sub-sampled frames, we uniformly sub-sample the query clip as well as the database video. Retrieval is based on matching color and texture features of the sub-sampled frames. Initial experiments on two video databases (basketball video with approximately 16,000 frames and a CNN news video with approximately 20,000 frames) show promising results. Additional experiments using segments from one basketball video as query and a different basketball video as the database show the effectiveness of feature representation and matching schemes.

7.

Scene change detection techniques for video database systems Total citations: 1, self-citations: 0, citations by others: 1

Haitao Jiang Abdelsalam Helal Ahmed K. Elmagarmid Anupam Joshi 《Multimedia Systems》1998,6(3):186-195

Scene change detection (SCD) is one of several fundamental problems in the design of a video database management system (VDBMS). It is the first step towards the automatic segmentation, annotation, and indexing of video data. SCD is also used in other aspects of VDBMS, e.g., hierarchical representation and efficient browsing of the video data. In this paper, we provide a taxonomy that classifies existing SCD algorithms into three categories: full-video-image-based, compressed-video-based, and model-based algorithms. The capabilities and limitations of the SCD algorithms are discussed in detail. The paper also proposes a set of criteria for measuring and comparing the performance of various SCD algorithms. We conclude by discussing some important research directions.

8.

Fast techniques for the optimal smoothing of stored video Total citations: 3, self-citations: 0, citations by others: 3

Sanjay G. Rao S.V. Raghavan 《Multimedia Systems》1999,7(3):222-233

Work-ahead smoothing is a technique whereby a server, transmitting stored compressed video to a client, utilizes client buffer space to reduce the rate variability of the transmitted stream. The technique requires the server to compute a schedule of transfer under the constraints that the client buffer neither overflows nor underflows. Recent work established an optimal off-line algorithm (which minimizes peak, variance and rate variability of the transmitted stream) under the assumptions of fixed client buffer size, known worst case network jitter, and strict playback of the client video. In this paper, we examine the practical considerations of heterogeneous and dynamically variable client buffer sizes, variable worst case network jitter estimates, and client interactivity. These conditions require on-line computation of the optimal transfer schedule. We focus on techniques for reducing on-line computation time. Specifically, (i) we present an algorithm for precomputing and storing the optimal schedules for all possible client buffer sizes in a compact manner; (ii) we show that it is theoretically possible to precompute and store compactly the optimal schedules for all possible estimates of worst case network jitter; (iii) in the context of playback resumption after client interactivity, we show convergence of the recomputed schedule with the original schedule, implying greatly reduced on-line computation time; and (iv) we propose and empirically evaluate an “approximation scheme” that produces a schedule close to optimal but takes much less computation time.
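
The buffer constraint described in this abstract can be illustrated with a small feasibility check. The sketch below is not the optimal smoothing algorithm the abstract refers to; it only tests whether a given transfer schedule avoids client buffer underflow and overflow. The function name `is_feasible`, the per-slot timing convention (data for a slot arrives before that slot's frame is played), and the sample numbers are all assumptions made for illustration.

```python
from itertools import accumulate


def is_feasible(frame_sizes, schedule, buffer_size):
    """Return True if `schedule` never underflows or overflows the client buffer.

    Assumed convention (not taken from the paper): in slot t the server's data
    arrives first, then the client removes frame t from its buffer for playback.
    """
    if len(frame_sizes) != len(schedule):
        raise ValueError("frame_sizes and schedule must have the same length")

    sent = list(accumulate(schedule))       # cumulative amount sent through slot t
    played = list(accumulate(frame_sizes))  # cumulative amount played through slot t

    for t in range(len(schedule)):
        played_before = played[t - 1] if t > 0 else 0
        occupancy = sent[t] - played_before  # buffer level just after arrival
        if occupancy > buffer_size:          # overflow: more data than the buffer holds
            return False
        if sent[t] < played[t]:              # underflow: frame t is not fully available
            return False
    return True


frames = [4, 1, 1, 4]  # hypothetical frame sizes
print(is_feasible(frames, [4, 2, 2, 2], buffer_size=4))  # smoothed schedule
print(is_feasible(frames, [3, 3, 2, 2], buffer_size=4))  # first frame arrives late
```

A smoothing algorithm would search among the schedules that pass this check for one with low peak rate and low variability; the check itself only encodes the two constraints.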

9.

Zodiac: A history-based interactive video authoring system

Tzi-cker Chiueh Tulika Mitra Anindya Neogi Chuan-Kai Yang 《Multimedia Systems》2000,8(3):201-211

Easy-to-use audio/video authoring tools play a crucial role in moving multimedia software from research curiosity to mainstream applications. However, research in multimedia authoring systems has rarely been documented in the literature. This paper describes the design and implementation of an interactive video authoring system called Zodiac, which employs an innovative edit history abstraction to support several unique editing features not found in existing commercial and research video editing systems. Zodiac provides users a conceptually clean and semantically powerful branching history model of edit operations to organize the authoring process, and to navigate among versions of authored documents. In addition, by analyzing the edit history, Zodiac is able to reliably detect a composed video stream's shot and scene boundaries, which facilitates interactive video browsing. Zodiac also features a video object annotation capability that allows users to associate annotations to moving objects in a video sequence. The annotations themselves could be text, image, audio, or video. Zodiac is built on top of MMFS, a file system specifically designed for interactive multimedia development environments, and implements an internal buffer manager that supports transparent lossless compression/decompression. Shot/scene detection, video object annotation, and buffer management all exploit the edit history information for performance optimization.

10.

Packet video transport in ATM networks with single-bit feedback

Hemant Kanakia Partho P. Mishra 《Multimedia Systems》1996,4(6):370-380

The promise of a broadband integrated service digital network has led to the design of mechanisms for efficient transport of real-time compressed video in packet switching networks. We examine feedback control for video transport in ATM networks where the available feedback is a single bit of information carried in the cell header. We investigate the performance of three single-bit schemes for source rate adaptation. Two were originally for congestion control of bursty data traffic and are modified to control video traffic. The third scheme conveys more information about the state of queue(s) at the bottleneck. The simulation results show that all three schemes for feedback control of VBR video streams work remarkably well. During severe network congestion, the signal quality degrades gracefully, but not uniformly across all connections. Based on insights from the initial simulations, we propose a scheme to improve the fairness of service and demonstrate its effectiveness.

11.

An optimal bandwidth allocation strategy for the delivery of compressed prerecorded video Total citations: 1, self-citations: 0, citations by others: 1

Wu-chi Feng Farnam Jahanian Stuart Sechrest 《Multimedia Systems》1997,5(5):297-309

The transportation of prerecorded, compressed video data without loss of picture quality requires the network and video servers to support large fluctuations in bandwidth requirements. Fully utilizing a client-side buffer for smoothing bandwidth requirements can limit the fluctuations in bandwidth required from the underlying network and the video-on-demand servers. This paper shows that, for a fixed-size buffer constraint, the critical bandwidth allocation technique results in plans for continuous playback of stored video that have (1) the minimum number of bandwidth increases, (2) the smallest peak bandwidth requirements, and (3) the largest minimum bandwidth requirements. In addition, this paper introduces an optimal bandwidth allocation algorithm which, in addition to the three critical bandwidth allocation properties, minimizes the total number of bandwidth changes necessary for continuous playback. A comparison between the optimal bandwidth allocation algorithm and other critical bandwidth-based algorithms using 17 full-length movie videos and 3 seminar videos is also presented.

12.

Extraction of special effects caption text events from digital video Total citations: 1, self-citations: 1, citations by others: 1

David Crandall Sameer Antani Rangachar Kasturi 《International Journal on Document Analysis and Recognition》2003,5(2-3):138-157

Abstract. The popularity of digital video is increasing rapidly. To help users navigate libraries of video, algorithms that automatically index video based on content are needed. One approach is to extract text appearing in video, which often reflects a scene's semantic content. This is a difficult problem due to the unconstrained nature of general-purpose video. Text can have arbitrary color, size, and orientation. Backgrounds may be complex and changing. Most work so far has made restrictive assumptions about the nature of text occurring in video. Such work is therefore not directly applicable to unconstrained, general-purpose video. In addition, most work so far has focused only on detecting the spatial extent of text in individual video frames. However, text occurring in video usually persists for several seconds. This constitutes a text event that should be entered only once in the video index. Therefore it is also necessary to determine the temporal extent of text events. This is a non-trivial problem because text may move, rotate, grow, shrink, or otherwise change over time. Such text effects are common in television programs and commercials but so far have received little attention in the literature. This paper discusses detecting, binarizing, and tracking caption text in general-purpose MPEG-1 video. Solutions are proposed for each of these problems and compared with existing work found in the literature. Received: January 29, 2002 / Accepted: September 13, 2002 D. Crandall is now with Eastman Kodak Company, 1700 Dewey Avenue, Rochester, NY 14650-1816, USA; e-mail: david.crandall@kodak.com S. Antani is now with the National Library of Medicine, 8600 Rockville Pike, Bethesda, MD 20894, USA; e-mail: antani@nlm.nih.gov Correspondence to: David Crandall

13.

Evaluation of statistical and multiple-hypothesis tracking for video traffic surveillance

Jeffrey E. Boyd Jean Meloche 《Machine Vision and Applications》2003,13(5-6):344-351

Abstract. Conventional tracking methods encounter difficulties as the number of objects, clutter, and sensors increase, because of the requirement for data association. Statistical tracking, based on the concept of network tomography, is an alternative that avoids data association. It estimates the number of trips made from one region to another in a scene based on interregion boundary traffic counts accumulated over time. It is not necessary to track an object through a scene to determine when an object crosses a boundary. This paper describes statistical tracking and presents an evaluation based on the estimation of pedestrian and vehicular traffic intensities at an intersection over a period of 1 month. We compare the results with those from a multiple-hypothesis tracker and manually counted ground-truth estimates. Received: 30 August 2001 / Accepted: 28 May 2002 Correspondence to: J.E. Boyd

14.

Lexically-generated subject hierarchies for browsing large collections

Craig G. Nevill-Manning Ian H. Witten Gordon W. Paynter 《International Journal on Digital Libraries》1999,2(2-3):111-123

Developing intuition for the content of a digital collection is difficult. Hierarchies of subject terms allow users to explore the space of topics that a collection covers, to form and specialize useful query terms, and to directly identify interesting documents. We describe two interfaces for navigating such hierarchies, and present a technique for inferring hierarchies automatically from large corpora. We also discuss scalability issues for the techniques involved, and our solutions to these problems. Received: 15 December 1997 / Revised: June 1999

15.

A multi-level abstraction and modeling in video databases

Young Francis Day Ashfaq Khokhar Serhan Dagtas Arif Ghafoor 《Multimedia Systems》1999,7(5):409-423

In this paper, we propose a multi-level abstraction mechanism for capturing the spatial and temporal semantics associated with various objects in an input image or in a sequence of video frames. This abstraction can manifest itself effectively in conceptualizing events and views in multimedia data as perceived by individual users. The objective is to provide an efficient mechanism for handling content-based queries, with the minimum amount of processing performed on raw data during query evaluation. We introduce a multi-level architecture for video data management at different levels of abstraction. The architecture facilitates a multi-level indexing/searching mechanism. At the finest level of granularity, video data can be indexed based on mere appearance of objects and faces. For management of information at higher levels of abstractions, an object-oriented paradigm is proposed which is capable of supporting domain specific views.

16.

2PSM: an efficient framework for searching video information in a limited-bandwidth environment

Kien A. Hua Wallapak Tavanapong James Z. Wang 《Multimedia Systems》1999,7(5):396-408

We present a novel technique, called 2-Phase Service Model, for streaming videos to home users in a limited-bandwidth environment. This scheme first delivers some number of non-adjacent data fragments to the client in Phase 1. The missing fragments are then transmitted in Phase 2 as the client is playing back the video. This approach offers many benefits. The isochronous bandwidth required for Phase 2 can be controlled within the capability of the transport medium. The data fragments received during Phase 1 can be used to provide an excellent preview of the video. They can also be used to facilitate VCR-style operations such as fast-forward and fast-reverse. Systems designed based on this method are less expensive because the fast-forward and fast-reverse versions of the video files are no longer needed. Eliminating these files also improves system performance because mapping between the regular files and their fast-forward and fast-reverse versions is no longer part of the VCR operations. Furthermore, since each client machine handles its own VCR-style interaction, this technique is very scalable. We provide simulation results to show that 2-Phase Service Model is able to handle VCR functions efficiently. We also implement a video player called FRVplayer. With this prototype, we are able to judge that the visual quality of the previews and VCR-style operations is excellent. These features are essential to many important applications. We discuss the application of FRVplayer in the design of a video management system, called VideoCenter. This system is intended for Internet applications such as digital video libraries.
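
The two-phase delivery idea in this abstract can be sketched as a partition of fragment indices. The abstract does not say how the Phase 1 fragments are chosen, so the rule used here (every `stride`-th fragment) is an assumption, as are the function names `split_two_phase` and `merge_for_playback`.

```python
def split_two_phase(num_fragments, stride):
    """Partition fragment indices into Phase 1 (preview) and Phase 2 (remainder).

    Assumption for illustration: Phase 1 takes every `stride`-th fragment, which
    keeps the Phase 1 fragments non-adjacent whenever stride >= 2.
    """
    if stride < 2:
        raise ValueError("stride must be at least 2 so Phase 1 fragments are non-adjacent")
    phase1 = list(range(0, num_fragments, stride))
    phase1_set = set(phase1)
    phase2 = [i for i in range(num_fragments) if i not in phase1_set]
    return phase1, phase2


def merge_for_playback(phase1, phase2):
    """Combine both phases back into playback order."""
    return sorted(phase1 + phase2)


phase1, phase2 = split_two_phase(num_fragments=10, stride=3)
print(phase1)  # [0, 3, 6, 9]
print(phase2)  # [1, 2, 4, 5, 7, 8]
```

Playing only the `phase1` fragments in order gives a coarse preview or fast-forward view, while `merge_for_playback` restores the full sequence once Phase 2 data has arrived.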

17.

Metrics for shot boundary detection in digital video sequences Total citations: 5, self-citations: 0, citations by others: 5

Ralph M. Ford Craig Robson Daniel Temple Michael Gerlach 《Multimedia Systems》2000,8(1):37-46

The detection of shot boundaries in video sequences is an important task for generating indexed video databases. This paper provides a comprehensive quantitative comparison of the metrics that have been applied to shot boundary detection. In addition, several standardized statistical tests that have not been applied to this problem, as well as three new metrics, are considered. A mathematical framework for quantitatively comparing metrics is supplied. Experimental results based on a video database containing 39,000 frames are included.
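
As a concrete example of what a shot boundary metric looks like, the sketch below uses a bin-wise histogram difference with a fixed threshold. This is one widely used kind of metric and is offered only as an illustration; the abstract does not list the metrics it compares, and the bin count, threshold, and function names here are assumptions. Frames are represented as flat, non-empty lists of grayscale values in the range 0 to 255.

```python
def histogram(frame, bins=4, max_value=256):
    """Count pixel intensities of a flat list `frame` into equal-width bins."""
    counts = [0] * bins
    width = max_value / bins
    for pixel in frame:
        counts[min(int(pixel / width), bins - 1)] += 1
    return counts


def histogram_difference(frame_a, frame_b, bins=4):
    """Sum of absolute bin-wise differences, scaled to the range [0, 1]."""
    ha, hb = histogram(frame_a, bins), histogram(frame_b, bins)
    total = len(frame_a) + len(frame_b)
    return sum(abs(a - b) for a, b in zip(ha, hb)) / total


def detect_boundaries(frames, threshold=0.5):
    """Return each index i where a boundary is declared between frames i-1 and i."""
    return [
        i for i in range(1, len(frames))
        if histogram_difference(frames[i - 1], frames[i]) > threshold
    ]


dark = [10, 20, 30, 40]        # hypothetical 4-pixel frames
bright = [200, 210, 220, 230]
print(detect_boundaries([dark, dark, bright, bright]))  # [2]
```

A fixed threshold is the simplest decision rule; comparing metrics quantitatively, as the paper sets out to do, would involve measuring how detections change as the threshold varies.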

18.

A survey of statistical source models for variable-bit-rate compressed video Total citations: 5, self-citations: 0, citations by others: 5

Michael R. Izquierdo Douglas S. Reeves 《Multimedia Systems》1999,7(3):199-213

19.

Disk placement for arbitrary-rate playback in an interactive video server

Taeck-Geun Kwon Yanghee Choi Sukho Lee 《Multimedia Systems》1997,5(4):271-281

Multimedia data, especially continuous media including video and audio objects, represent a rich and natural stimulus for humans, but require a large amount of storage capacity and real-time processing. In this paper, we describe how to organize video data efficiently on multiple disks in order to support arbitrary-rate playback requested by different users independently. Our approach is to segment and decluster video objects and to place the segments in multiple disks using a restricted round-robin scheme, called prime round-robin (PRR). Its placement scheme provides uniform load balance of disks for arbitrary retrieval rate as well as normal playback, since it eliminates hot spots. Moreover, it does not require any additional disk bandwidth to support VCR-like operations such as fast-forward and rewind. We have studied the various effects of placement and retrieval schemes in a storage server by simulation. The results show that PRR offers even disk accesses, and that failures to read a segment by its deadline occur only at the beginning of new operations. In addition, the number of users admitted is not decreased, regardless of arbitrary-rate playback requests.
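
The abstract does not spell out how PRR assigns segments to disks, so the sketch below does not implement PRR. It only illustrates the hot-spot problem that arises with plain round-robin placement during fast-forward, and why a prime number of disks avoids it for any skip rate that is not a multiple of that number. The placement rule (segment i on disk i mod N) and the function names are assumptions made for illustration.

```python
from math import gcd


def disks_touched(num_disks, rate, num_segments):
    """Disks accessed when every `rate`-th segment is read.

    Assumes plain round-robin placement: segment i is stored on disk i % num_disks.
    """
    return sorted({i % num_disks for i in range(0, num_segments, rate)})


def expected_disk_count(num_disks, rate):
    """Number of distinct disks visited over a long enough playback run."""
    return num_disks // gcd(rate, num_disks)


# 2x fast-forward over 64 segments
print(disks_touched(num_disks=8, rate=2, num_segments=64))  # [0, 2, 4, 6]
print(disks_touched(num_disks=7, rate=2, num_segments=64))  # [0, 1, 2, 3, 4, 5, 6]
```

With 8 disks, 2x playback loads only half of them, so those four become hot spots. With 7 disks the same access pattern spreads over all of them, because 2 and 7 share no common factor.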

20.

Impact of QOS requirements on video coding for ATM networks

Shree Murthy Hari Lalgudi 《Multimedia Systems》1996,4(6):316-327

The broadband integrated services digital networks (B-ISDN) based on asynchronous transfer mode (ATM) technology can support a wide range of applications such as voice, video, still images, and data. Compression techniques increase the effective bandwidth utilization, but the bursty and asynchronous nature of the traffic can still lead to congestion in the network, and degradation of image quality and quality of service (QOS). Some of the features to provide better coding schemes for ATM networks are layered coding, resynchronization, buffering, interleaved schemes, constrained bit rate due to buffers, encapsulation with the RTP or AAL1 for clock recovery, lapped transforms, motion compensation, and optimal bit allocation for coders based on wavelet transforms. We review various techniques for image and video coding such as transforms, motion compensation, vector quantization, and subband coding. We outline the impact of the cell loss ratio (CLR), delay and cell delay variation (CDV) on video coding: blocking effects, loss of frame synchronization, motion vectors, and vector quantization codewords. The open problems include tuning coding parameters to the available QOS provided by the network.