Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Pattern Analysis and Applications - This paper presents a novel compressed domain saliency estimation method based on analyzing block motion vectors and transform residuals extracted from the...

2.
As the majority of content-based image retrieval systems operate on full images in the pixel domain, decompression is a prerequisite for the retrieval of compressed images. To provide an on-line indexing and retrieval technique for JPEG image files, we propose a novel pseudo-pixel extraction algorithm to bridge the gap between existing image indexing technology, developed in the pixel domain, and the fact that an increasing number of images stored on the Web are already compressed by JPEG at the source. Further, we describe our Web-based image retrieval system, WEBimager, which uses the proposed algorithm to provide a prototype visual information system for automatic management, indexing, and retrieval of compressed images available on the Internet. This gives users efficient tools to search the Web for compressed images and to build a database or collection of images matching their interests. Experiments using texture- and colour-based indexing techniques support the idea that the proposed algorithm achieves significantly lower computing cost than its full-decompression or partial-decompression counterparts. This technology will help control the explosion of media-rich content by offering users a powerful automated image indexing and retrieval tool for compressed images on the Web. J. Jiang: Contacting author

3.
Low power is a de facto criterion in many signal-processing system designs, particularly in multimedia cellular applications and multimedia system-on-chip design. Many approaches to this design goal exist at different implementation levels, ranging from very-large-scale-integration fabrication technology to system design. In this paper, the multirate low-power design technique is used along with other methods, such as look-ahead and pipelining, to design cost-effective low-power architectures for a compressed-domain video coding co-processor. Our emphasis is on optimizing power consumption by minimizing computational units along the data path. We demonstrate that both low power and high speed can be accomplished at the algorithm/architecture level. Based on calculation and simulation results, the design can achieve power savings in the range of 60%-80% or a speedup factor of two, depending on the needs of users.

4.
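Objective: Tracking algorithms based on deep models usually require large-scale, high-quality annotated training datasets, yet annotating video manually frame by frame costs a great deal of labor and time. This paper proposes a lightweight Transformer-based video annotation algorithm, the Transformer-based label network (TLNet), for efficient frame-by-frame annotation of large-scale, sparsely annotated video datasets. Method: The algorithm uses a Transformer model to process temporal target appearance and motion information and fuses forward and backward tracking results. A quality-evaluation subnetwork screens out frames where tracking failed so that they can be annotated manually, while a regression subnetwork refines the initial annotations of the remaining frames and outputs more accurate bounding-box annotations. The algorithm generalizes well and is decoupled from any specific tracker, so any existing lightweight tracking algorithm can be used for efficient automatic video annotation. Results: Annotations were generated on two large-scale tracking datasets. On the LaSOT (large-scale single object tracking) dataset, automatic annotation takes only about 43 h, and the mean intersection over union (mIoU) with the ground-truth annotations rises from 0.824 to 0.871. On the TrackingNet dataset, three tracking algorithms were retrained with the automatic annotations and tested on three datasets; models trained with these annotations outperform models trained with the original TrackingNet annotations. Conclusion: TLNet exploits temporal target appearance and motion information, performs frame-level quality evaluation of forward and backward tracking results, and further refines the bounding boxes. The method is decoupled from specific tracking algorithms, generalizes well, saves more than 90% of manual annotation cost, and efficiently produces high-quality video annotations.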
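
The mIoU figure quoted in this abstract is the average, over frames, of the overlap between a predicted box and its ground-truth box. A minimal sketch of that metric, assuming axis-aligned boxes in `(x1, y1, x2, y2)` form; the function names are illustrative and are not taken from TLNet:

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def mean_iou(predicted, ground_truth):
    """Average IoU over paired per-frame boxes (assumes equal, non-zero length)."""
    pairs = list(zip(predicted, ground_truth))
    return sum(iou(p, g) for p, g in pairs) / len(pairs)
```

Under this definition, the reported change from 0.824 to 0.871 means the refined boxes overlap the ground truth more closely on average across frames.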

5.
Multimedia Tools and Applications - Compressed Sensing, an emerging framework for signal processing, can be used in image and video application, especially when available resources at the...

6.
Image and video analysis requires rich features that can characterize various aspects of visual information. These features are typically extracted from the pixel values of images and videos, which requires a huge amount of computation and is seldom practical for real-time analysis. In contrast, compressed domain analysis offers relevant information about the visual content in the form of transform coefficients, motion vectors, quantization steps, and coded block patterns, with minimal computational burden. Relatively little work has been done in the compressed domain compared with the pixel domain. This paper surveys video analysis efforts published during the last decade across the spectrum of video compression standards. The survey covers only the analysis part and excludes the processing aspect of the compressed domain. The analysis spans computer vision applications such as moving object segmentation, human action recognition, indexing, retrieval, face detection, video classification, and object tracking in compressed videos.

7.
8.
Nearest neighbor (NN) search is emerging as an important search paradigm in a variety of applications in which objects are represented as vectors of d numeric features. However, despite decades of effort, current solutions for finding exact kNNs, apart from filtering approaches such as the VA-file, are far from satisfactory for large d. The filtering approach represents vectors as compact approximations; by first scanning these smaller approximations, only a small fraction of the real vectors need to be visited. In this paper, we introduce the local polar coordinate file (LPC-file), which uses the filtering approach for nearest-neighbor searches in high-dimensional image databases. The basic idea is to partition the vector space into rectangular cells and then to approximate vectors by polar coordinates on the partitioned local cells. The LPC information significantly enhances the discriminatory power of the approximation. To demonstrate the effectiveness of the LPC-file, we conducted extensive experiments and compared its performance with the VA-file and the sequential scan using synthetic and real data sets. The experimental results demonstrate that the LPC-file outperforms both the VA-file and the sequential scan in total elapsed time and in the number of disk accesses, and that the LPC-file is robust in both "good" distributions (such as random) and "bad" distributions (such as skewed and clustered).
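
The filtering approach described in this abstract can be sketched as a filter-and-refine scan. The code below is a simplified VA-file-style illustration, not the LPC-file itself: it uses only rectangular cell indices as approximations and omits the polar-coordinate information the paper adds. It assumes vectors with coordinates in [0, 1) and Euclidean distance; all function names are hypothetical.

```python
import heapq
import math
import random  # used only to generate sample data when trying the sketch


def quantize(vec, bits):
    """Approximate a vector by its grid cell: one index in [0, 2**bits) per dimension."""
    cells = 1 << bits
    return tuple(min(int(x * cells), cells - 1) for x in vec)


def lower_bound(query, cell, bits):
    """Smallest possible Euclidean distance from the query to any point in the cell."""
    width = 1.0 / (1 << bits)
    total = 0.0
    for q, c in zip(query, cell):
        lo, hi = c * width, (c + 1) * width
        if q < lo:
            total += (lo - q) ** 2
        elif q > hi:
            total += (q - hi) ** 2
    return math.sqrt(total)


def knn_filter_refine(vectors, approximations, query, k, bits):
    """Exact kNN. Returns (neighbor indices sorted by distance, full vectors visited)."""
    heap = []  # max-heap of the k best so far, stored as (-distance, index)
    visited = 0
    for i, cell in enumerate(approximations):
        # Filter step: skip the full vector if its cell cannot beat the k-th best.
        if len(heap) == k and lower_bound(query, cell, bits) >= -heap[0][0]:
            continue
        # Refine step: read the real vector and compute the exact distance.
        visited += 1
        d = math.dist(query, vectors[i])
        if len(heap) < k:
            heapq.heappush(heap, (-d, i))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, i))
    ordered = sorted((-neg_d, i) for neg_d, i in heap)
    return [i for _, i in ordered], visited
```

The `visited` counter stands in for disk accesses to full vectors, the cost the abstract reports. A tighter approximation, which is what the LPC information is said to provide, would let the filter step skip more vectors.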

9.
Multimedia Tools and Applications - Analyzing multimedia data in mobile devices is often constrained by limited computing capacity and power storage. Therefore, more and more studies are trying to...

10.
Song Yun, Yang Gaobo, Xie Hongtao, Zhang Dengyong, Sun Xingming. Multimedia Tools and Applications, 2017, 76(7): 10083-10096
Multimedia Tools and Applications - For compressed sensing (CS) recovery, the reconstruction quality is highly dependent on the sparsity level of the representation for the signal. Motivated by the...

11.
Efficiently simulating large deformations of flexible objects is a challenging problem in computer graphics. In this paper, we present a physically based approach to this problem, using the linear elasticity model and the finite element method. To handle large deformations in the linear elasticity model, we exploit the domain decomposition method, based on the observation that each sub-domain undergoes a relatively small local deformation together with a global rigid transformation. To solve the deformation efficiently at each simulation time step, we pre-compute the object responses, in terms of displacement accelerations, to the forces acting on each node, yielding a force-displacement matrix. However, the force-displacement matrix can be too large to handle for densely tessellated objects. To address this problem, we present two methods. The first exploits spatial coherence to compress the force-displacement matrix using clustered principal component analysis. The second pre-computes only the force-displacement vectors for the boundary vertices of the sub-domains and uses Cholesky factorization to solve for the acceleration of the internal vertices of the sub-domains. Finally, we present experimental results showing large deformation effects and fast performance on complex large-scale objects under interactive user manipulation.

12.
In this study the authors propose a real-time video object segmentation algorithm that works in the H.264 compressed domain. The algorithm uses the motion information from the H.264 compressed bit stream to identify the background motion model and moving objects. To preserve the spatial and temporal continuity of objects, a Markov random field (MRF) is used to model the foreground field. Quantised transform coefficients of the residual frame are also used to improve the segmentation result. Experimental results show that the proposed algorithm can effectively extract moving objects from different kinds of sequences. The computation time of the segmentation process is only about 16 ms per frame for CIF-size frames, allowing the algorithm to be used in real-time applications.

13.
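Zhang Wenyin, Zeng Zhenbing. Journal of Computer Applications (《计算机应用》), 2006, 26(5): 1004-1005
Two indexing methods for compressed images are presented within the JPEG2000 compression framework. They do not require full decompression, which reduces the amount of data to be processed. Experimental results show that the proposed indexes characterize images well and that using them for image retrieval improves retrieval efficiency.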

14.
Environmental monitoring applications require seamless registration of optical data into large area mosaics that are geographically referenced to the world frame. Using frame-by-frame image registration alone, we can obtain seamless mosaics, but they will not be geographically accurate because of frame-to-frame error accumulation. On the other hand, the 3D geo-data from GPS, a laser profiler, and an INS system provide a globally correct track of the motion without error propagation. However, the inherent (absolute) errors in the instrumentation are too large for seamless mosaicing. The paper describes an effective two-track method for combining two different sources of data to achieve a seamless and geo-referenced mosaic, without 3D reconstruction or complex global registration. Experiments with real airborne video images show that the proposed algorithms are practical in important environmental applications.
Zhigang Zhu received his B.E., M.E. and Ph.D. degrees, all in computer science, from Tsinghua University, Beijing, in 1988, 1991 and 1997, respectively. He is currently an associate professor in the Department of Computer Science, the City College of the City University of New York. Previously, he was an associate professor at Tsinghua University, and a senior research fellow at the University of Massachusetts, Amherst. His research interests include 3D computer vision, HCI, virtual/augmented reality, video representation, and various applications in education, environment, robotics, surveillance and transportation. He has published over 90 technical papers in the related fields. He is a member of IEEE and ACM.
Edward M. Riseman received his B.S. degree from Clarkson College of Technology in 1964 and his M.S. and Ph.D. degrees in electrical engineering from Cornell University in 1966 and 1969, respectively. He joined the Computer Science Department at UMass-Amherst as assistant professor in 1969, has been a professor since 1978, and served as chairman of the department from 1981 to 1985. Professor Riseman has conducted research in computer vision, artificial intelligence, learning, and pattern recognition, and has more than 200 publications. He has co-directed the Computer Vision Laboratory since its inception in 1975. Professor Riseman has been on the editorial boards of Computer Vision and Image Understanding (CVIU) from 1992 to 1997 and of the International Journal of Computer Vision (IJCV) from 1987 to the present. He is a senior member of IEEE, and a fellow of AAAI.
Allen R. Hanson received his B.S. degree from Clarkson College of Technology in 1964 and his M.S. and Ph.D. degrees in electrical engineering from Cornell University in 1966 and 1969, respectively. He joined the Computer Science Department at UMass-Amherst as an associate professor in 1981, and has been a professor there since 1989. Professor Hanson has conducted research in computer vision, artificial intelligence, learning, and pattern recognition, and has more than 150 publications. He is co-director of the Computer Vision Laboratory at UMass-Amherst, and has been on the editorial boards of the following journals: Computer Vision, Graphics and Image Processing 1983–1990, Computer Vision, Graphics, and Image Processing: Image Understanding 1991–1994, and Computer Vision and Image Understanding 1995–present.
Howard Schultz received an M.S. degree in physics from UCLA in 1974 and a Ph.D. in physical oceanography from the University of Michigan in 1982. Currently, he is a senior research fellow with the Computer Science Department at the University of Massachusetts, Amherst. His research interests include quantitative methods for image understanding and remote sensing. The current focus of his research activities is on developing automatic techniques for generating complex 3D models from sequences of images. This research has found application in a variety of programs including real-time terrain modeling and video aided navigation. He is a member of the IEEE, the American Geophysical Union, and the American Society of Photogrammetry and Remote Sensing.

15.
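Video retrieval is a computation in high-dimensional space. To address the heavy computational cost in high dimensions, an algorithm that constructs a kernel vector is proposed. It maps the high-dimensional space to a low-dimensional space, where dissimilar data are filtered out dimension by dimension, narrowing the search range and speeding up retrieval.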

16.
17.
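Converting a video sequence into still images through key-frame extraction and then analyzing them with image-processing techniques is an effective approach to video processing. This paper first reviews recent compressed-domain key-frame extraction techniques, then analyzes and discusses key-frame extraction methods for sensitive-video recognition, and presents a fast and effective key-frame extraction scheme.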

18.
Detection of human faces in a compressed domain for video stratification   Cited by: 5 (self-citations: 0; cited by others: 5)
Published online: 15 March 2002

19.
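Hu Xintao, Guo Lei, Ren Jianfeng. Journal of Computer Applications (《计算机应用》), 2005, 25(6): 1302-1304
Detecting shot cuts in the compressed domain has long been a difficult problem in automatic video indexing and retrieval. A multi-scale shot-cut detection algorithm for the MPEG compressed domain is proposed that analyzes the MPEG video stream at three scales: GOP, slot, and B-frame. Examining adjacent I-frames determines whether a GOP contains a shot cut; analyzing slots determines the region of the GOP in which the cut lies; and examining B-frames determines the exact position where the cut occurs.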
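
The three-scale search in this abstract (GOP, then slot, then frame) can be sketched as a coarse-to-fine scan. The sketch below is only an illustration of that control flow under strong assumptions: each frame is reduced to one scalar signature (a hypothetical stand-in for whatever compressed-domain features the paper actually uses), a cut is declared when two signatures differ by more than `threshold`, and `gop_size` is a multiple of `slot_size`. None of these names or choices come from the paper.

```python
def find_cuts(sig, gop_size, slot_size, threshold):
    """Return indices of the first frame of each new shot, searching coarse to fine.

    sig       -- one scalar signature per frame (hypothetical feature)
    gop_size  -- frames per GOP; frames at multiples of gop_size play the I-frame role
    slot_size -- frames per slot; must divide gop_size
    threshold -- signature difference above which two frames belong to different shots
    """
    cuts = []
    for g in range(0, len(sig) - gop_size, gop_size):
        # Scale 1: compare adjacent I-frames to decide whether this GOP holds a cut.
        if abs(sig[g + gop_size] - sig[g]) <= threshold:
            continue
        for s in range(g, g + gop_size, slot_size):
            # Scale 2: compare slot boundaries to find the region containing the cut.
            if abs(sig[s + slot_size] - sig[s]) <= threshold:
                continue
            for f in range(s, s + slot_size):
                # Scale 3: compare consecutive frames to pin down the exact position.
                if abs(sig[f + 1] - sig[f]) > threshold:
                    cuts.append(f + 1)
    return cuts
```

The point of the hierarchy is that most GOPs are rejected by a single comparison, so frame-level checks run only inside the few slots that can contain a cut.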

20.
Every day, we encounter high-quality multimedia content from HDTV broadcasting, DVD, and high-speed Internet services. Unfortunately, this content is processed and distributed without protection. This paper proposes a practical compressed-domain video watermarking technique that runs in real time and is robust against video processing attacks. In particular, we focus on video processing that is commonly used in practice, such as resolution downscaling, frame-rate changing, and transcoding. Most previous watermarking algorithms fail to survive when such processing is strong or composite. We quickly extract the low-frequency coefficients of frames by partly decoding videos, and apply a quantization index modulation scheme to embed and detect the watermark. We implement a prototype system on an Intel architecture computer and measure its performance against video processing attacks that frequently occur in the real world. Simulation results show that our video watermarking system satisfies real-time requirements and is robust enough to protect the copyright of HD video content.
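
Quantization index modulation, which this abstract names as its embedding scheme, hides a bit by snapping a coefficient onto one of two interleaved quantizer lattices. A minimal scalar sketch follows, assuming one bit per coefficient and a step size `step` chosen by the embedder. The paper's actual coefficient selection and parameters are not given in the abstract, so the function names and values here are illustrative only.

```python
def qim_embed(coeff, bit, step):
    """Quantize coeff onto the lattice for `bit`: multiples of step, shifted by step/2 for bit 1."""
    offset = bit * step / 2.0
    return round((coeff - offset) / step) * step + offset


def qim_detect(coeff, step):
    """Recover the bit whose lattice lies closest to the (possibly distorted) coefficient."""
    dist0 = abs(coeff - qim_embed(coeff, 0, step))
    dist1 = abs(coeff - qim_embed(coeff, 1, step))
    return 0 if dist0 <= dist1 else 1
```

Because the two lattices are `step / 2` apart, detection in this sketch still returns the embedded bit when the coefficient is perturbed by less than `step / 4`. A larger step tolerates stronger processing at the cost of a larger change to the coefficient.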
