The analysis of the depth coordinates of objects in a visual scene is of vital importance for animals as well as in technological applications such as autonomous robot navigation or product quality control. In this article we describe a phase-based algorithm for stereoscopic depth analysis which utilizes IIR filters. This algorithm is especially well suited to implementation in dedicated VLSI hardware and can therefore also be used as a fast real-time front end in any more general image processing system. Example movies which demonstrate the real-time capabilities of this algorithm can be found at: http://www.cn.stir.ac.uk/Real-Time-Stereo.
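
The general principle behind phase-based disparity estimation can be illustrated with a small sketch. It is not the algorithm of the article: the abstract gives no filter design, so a truncated Gabor (FIR) kernel is swapped in for the IIR filters, and the function names, filter width and test signal are all assumptions chosen for illustration. The sketch filters the left and right scan lines with the same complex band-pass kernel and converts the local phase difference into a shift by dividing by the filter's centre frequency.

```python
import cmath
import math


def gabor_phase(signal, x, omega, sigma):
    """Local phase of `signal` at index x from a complex Gabor kernel.

    Assumption: a truncated Gaussian-windowed FIR kernel stands in for the
    IIR filters mentioned in the abstract.
    """
    half = int(3 * sigma)  # truncate the Gaussian window at 3 sigma
    acc = 0j
    for k in range(-half, half + 1):
        weight = math.exp(-(k * k) / (2.0 * sigma * sigma))
        acc += signal[x + k] * weight * cmath.exp(-1j * omega * k)
    return cmath.phase(acc)


def phase_disparity(left, right, x, omega, sigma=8.0):
    """Estimate the shift between two scan lines at index x.

    The phase difference is wrapped into [-pi, pi) and divided by the
    centre frequency, so only shifts below half a period are unambiguous.
    """
    dphi = gabor_phase(left, x, omega, sigma) - gabor_phase(right, x, omega, sigma)
    dphi = (dphi + math.pi) % (2.0 * math.pi) - math.pi
    return dphi / omega


# Synthetic scan lines: the right line is the left line shifted by 2.5 samples.
omega = 2.0 * math.pi / 16.0  # hypothetical centre frequency (period of 16 samples)
true_shift = 2.5
left = [math.cos(omega * i) for i in range(128)]
right = [math.cos(omega * (i - true_shift)) for i in range(128)]
estimate = phase_disparity(left, right, 64, omega)
```

On this idealized sinusoidal input the estimate should lie close to the true shift; real images would need several frequency channels and a confidence measure, which the sketch leaves out.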

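Binocular stereoscopic imaging based on computer 3D stereo disparity mapping involves many challenging research problems in computer vision, pattern recognition, computer graphics and related fields. The main problems are that the resulting stereo images are still not sufficiently realistic and natural, that the function of the human eye and models of binocular stereoscopic imaging are not yet thoroughly understood, and that some difficulties in acquiring stereo image pairs have not been solved well. This paper proposes a method for generating stereo images when a 3D model has already been built. It describes how camera objects in 3D software can be used to generate binocular stereo images, and studies several important factors that affect the stereo effect, including the positional relationship between the target camera and the 3D model, the lens distance, and control of the imaging position. This work advances binocular stereoscopic imaging based on computer 3D stereo disparity mapping, and provides theoretical and technical support for its application in visual stereoscopic display.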

Panorama stitching for video sequences
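A method for stitching a panorama from a video sequence is proposed. The discussion focuses on sequences that contain large non-rigid moving objects, although the method applies equally to pure-background sequences with no moving objects. To compute the projective relationship between frames, camera motion is described by an affine model whose parameters are computed by feature-point matching. Because matches computed by correlation have relatively low accuracy, RANSAC (Random Sample Consensus) is used to filter the matches, so that the camera motion parameters can be obtained accurately. The frames are projected using the motion parameters; multiple frames are then differenced and the differences intersected to estimate the region occupied by moving objects in each frame, and finally the panorama is computed. Comparison with earlier results shows that the method produces high-quality panoramas.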
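
The match-filtering step can be sketched as follows. This is a minimal illustration of RANSAC applied to a six-parameter affine model, not the authors' implementation: the abstract gives no iteration count, inlier tolerance or sampling details, so `iterations`, `tolerance`, `seed` and all function names here are assumptions, and the demonstration data are synthetic.

```python
import math
import random


def solve3(a, b):
    """Solve the 3x3 system a @ x = b by Cramer's rule; None if singular."""
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    d = det(a)
    if abs(d) < 1e-9:
        return None  # the three source points are (nearly) collinear
    solution = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]
        solution.append(det(m) / d)
    return solution


def fit_affine(sample):
    """Exact affine model from three matches ((x, y), (u, v))."""
    a = [[x, y, 1.0] for (x, y), _ in sample]
    row_u = solve3(a, [dst[0] for _, dst in sample])
    row_v = solve3(a, [dst[1] for _, dst in sample])
    if row_u is None or row_v is None:
        return None
    return row_u, row_v


def apply_affine(model, point):
    """Map (x, y) to (a*x + b*y + c, d*x + e*y + f)."""
    (a, b, c), (d, e, f) = model
    x, y = point
    return a * x + b * y + c, d * x + e * y + f


def ransac_affine(matches, iterations=200, tolerance=1.0, seed=0):
    """Keep the minimal-sample model that agrees with the most matches."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        model = fit_affine(rng.sample(matches, 3))
        if model is None:
            continue  # degenerate sample, draw again
        inliers = [m for m in matches
                   if math.dist(apply_affine(model, m[0]), m[1]) <= tolerance]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


# Synthetic demonstration: 20 correct matches plus 5 gross mismatches.
true_model = ([0.9, -0.1, 5.0], [0.1, 0.9, -3.0])
grid = [(10.0 * i, 10.0 * j) for i in range(5) for j in range(4)]
good = [(p, apply_affine(true_model, p)) for p in grid]
bad = [((5.0 + 7 * k, 3.0 + 5 * k), (90.0 - 11 * k, 60.0 + 13 * k)) for k in range(5)]
model, inliers = ransac_affine(good + bad)
```

In practice the winning model is usually refit by least squares over all of its inliers before the frames are projected; the sketch stops at the consensus step.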

Wei Feng, Wang Wencheng, Wu Enhua. Chinese Journal of Computers, 2006, 29(12): 2086-2095
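A visualization method based on line-drawing video is proposed. Importance measures based on the volume data and the viewpoint allow the video to observe the volume data from more effective viewpoints, so that every frame strongly reflects the content of the data field. At the same time, the viewpoint changes continuously and efficiently, which helps the user form a complete impression of the data field. Compared with state-of-the-art interactive visualization methods based on line drawing, the new method achieves higher imaging speed and stronger visual expressiveness, and conveys a more comprehensive understanding of the data field with fewer images.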

Jiang Yong, Zhang Haitao. Computer Science, 2016, 43(11): 19-23, 60
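To process video data effectively, a method for identifying a representative subset of a massive data set, called representative selection, is proposed; the small set of selected representatives can fully stand for the structural characteristics of the original large data set. For a given large data set, a corresponding 1-norm non-negative sparse graph is first generated, and a spectral clustering algorithm then repeatedly partitions the data on the basis of this sparse graph until clusters are formed. During representative selection, each cluster is treated as a point on the Grassmann manifold, the distances between these points are measured by geodesic distance, and a min-max algorithm analyzes the distances to extract a better subset of clusters. Finally, a sparse subgraph of the selected clusters is analyzed and a principal component centrality method is used to detect the data representatives. The whole procedure is called a representative selection framework based on the non-negative sparse graph and Grassmann-manifold geodesic distance. To validate the framework, it is applied to video analysis to identify a few key frames from a long video stream. The results are assessed by human judgment and by standard evaluation methods and are compared with several existing methods; they show that the proposed framework is feasible and more effective.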

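With the rapid development of the Internet and big data, data are growing in both scale and variety. Video is an important form of information, and with the recent rise of short video its share keeps increasing. How to understand and analyze such large-scale video has become a focus of academic attention. Entity linking, as a way of completing background knowledge, can provide rich external knowledge. Entity linking on video can effectively help in understanding video content, which enables classification, retrieval and recommendation of video content. However, existing video entity linking datasets and methods are too coarse-grained. Fine-grained entity linking for video is therefore proposed, and a fine-grained video entity linking dataset is built for the live-streaming scenario. In addition, to address the difficulties of the fine-grained video linking task, a large model is used to extract entities and their attributes from video, and contrastive learning is used to obtain better representations of videos and their corresponding entities. Experimental results show that the method handles the fine-grained entity linking task on video effectively.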

Sheng Bin, Wu Enhua. Journal of Software, 2008, 19(7): 1806-1816
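The transformation rules of the pixel depth field under three-dimensional image warping are first derived and summarized, and a pixel visibility determination method based on the depth field and the epipolar principle is proposed. Building on this theory and method, an image-based modeling and rendering (IBMR) technique using depth images, called virtual plane mapping, is proposed. The technique can render a scene from an arbitrary viewpoint in image space. During rendering, several virtual planes are first set up in the scene according to the viewing direction, and the pixels of the source depth images are transferred onto the virtual planes. An intermediate transformation of the pixels on each virtual plane then turns it into a planar texture, and the virtual planes are stitched together so that the view is formed by planar texture mapping. The new method can also quickly obtain a panorama for the current viewpoint inside the depth images, which enables real-time walkthrough. It offers a large viewpoint motion space and low storage requirements, can exploit the texture mapping capability of graphics hardware, and can reproduce three-dimensional surface relief detail and parallax effects, overcoming the limitations of earlier similar algorithms.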

This paper addresses the problem of detecting objectionable videos, which has never been carefully studied before. Our method can be used to filter objectionable videos on the Internet efficiently. A tensor-based key-frame selection algorithm, a cube-based color model and an objectionable-video estimation algorithm are presented. Key-frame selection is based on motion analysis using the three-dimensional structure tensor. The cube-based color model is then employed to detect skin color in each key frame. Finally, the video estimation algorithm is applied to estimate the degree to which a video is objectionable. Experimental results on a variety of real-world videos downloaded from the Internet show that this method is promising.

Yang proposed the concept of borrow-and-return (BR) to leverage unused server bandwidth when a group of popular videos is broadcast with FSFC (first segment on the first channel) broadcasting schemes, improving the mean waiting time (MWT) of viewers with the help of additional receiving bandwidth available at high-end clients. The BR model borrows the bandwidth of videos that have no newly arriving viewers during a timeslot to speed up transmission of the first segments of some of the remaining videos. In this paper, we first address the relative advantages of the various possible BR schemes by developing a parametric generic BR (GBR) scheme controlled externally by independent borrow parameters. We then propose a new BR (NBR) model that incorporates an efficient transmission strategy to reduce the MWT further. Finally, an optimal NBR scheme is developed by using the optimal borrow parameters; it significantly outperforms the existing and new BR schemes in terms of overall MWT.

Stereoscopic ray-tracing
In this paper, we describe a method to create an approximate ray-traced stereoscopic pair by transforming a fully ray-traced left-eye view into an inferred right-eye view. We present performance results from several random scenes, which indicate that the second view in a stereoscopic image can be computed with as little as 5% of the effort required to fully ray-trace the first view. We also discuss the worst-case performance of our algorithm and demonstrate that our technique is always at least as efficient as two passes of a standard ray-tracer.

This article addresses the use of stereoscopic images in teleoperated tasks. Depth perception is key to skillful manipulation in remote environments. Displaying three-dimensional images is a complex process, but it is possible to design a teleoperation interface that displays stereoscopic images to assist in manipulation tasks. The appropriate interface for image viewing must be chosen and the stereoscopic video cameras must be calibrated so that the image disparity is natural for the observer. Attention is given to the calculation of stereoscopic image disparity, and suggestions are made as to the limits within which adequate stereoscopic image perception takes place. The authors have designed equipment for image visualization in teleoperated systems. These devices are described and their performance evaluated. Finally, an architecture for the transmission of stereoscopic video images via network is proposed, which in the future will substitute for current image processing devices. © 2005 Wiley Periodicals, Inc.
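
As a rough illustration of how image disparity depends on the camera setup, the sketch below uses the textbook relation for a parallel-axis pinhole stereo rig, d = f * B / Z. The abstract does not state which camera model or comfort limits the authors use, so the model, the function names and every numeric value here are assumptions for illustration only.

```python
def image_disparity(focal_px, baseline_m, depth_m):
    """Disparity in pixels for a point at depth_m.

    Assumes two parallel pinhole cameras with focal length focal_px
    (in pixels) separated by baseline_m (in metres).
    """
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_px * baseline_m / depth_m


def nearest_depth_for_limit(focal_px, baseline_m, max_disparity_px):
    """Closest depth whose disparity stays within max_disparity_px."""
    if max_disparity_px <= 0:
        raise ValueError("disparity limit must be positive")
    return focal_px * baseline_m / max_disparity_px


# Hypothetical rig: 800 px focal length, 65 mm baseline.
disparity_at_2m = image_disparity(800.0, 0.065, 2.0)
closest_depth = nearest_depth_for_limit(800.0, 0.065, 40.0)
```

Under this model, halving the object distance doubles the disparity, which is why a disparity limit translates directly into a minimum working distance for a given baseline.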

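In recent years, as the volume of video data has kept growing, near-duplicate videos have kept emerging and video data quality problems have become increasingly prominent. Near-duplicate video cleaning helps improve the data quality of a video collection. However, little research has addressed near-duplicate video cleaning; existing work concentrates mainly on near-duplicate video retrieval. Although existing methods can identify near-duplicate videos effectively, they struggle to clean near-duplicate video data automatically while preserving data integrity. To solve this problem, a near-duplicate video cleaning method that combines the VGG-16 deep network with FD-means (feature distance-means) clustering is proposed. The method uses the MOG2 model and a median filter to perform background segmentation and foreground denoising on the video, extracts deep spatial features of the video with the VGG-16 deep network model, and constructs a new FD-means clustering model that updates the cluster centers using the near-duplicate video clusters produced by iteration and finally deletes the near-duplicate videos in each cluster other than the center. Experimental results show that the method effectively solves the near-duplicate video cleaning problem and improves video data quality.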

We cast the problem of multiframe stereo reconstruction of a smooth shape as the global region segmentation of a collection of images of the scene. Dually, the problem of segmenting multiple calibrated images of an object becomes that of estimating the solid shape that gives rise to such images. We assume that the radiance of the scene results in piecewise homogeneous image statistics. This simplifying assumption covers Lambertian scenes with constant albedo as well as fine homogeneous textures, which are known challenges to stereo algorithms based on local correspondence. We pose the segmentation problem within a variational framework, and use fast level set methods to find the optimal solution numerically. Our algorithm does not work in the presence of strong photometric features, where traditional reconstruction algorithms do. It enjoys significant robustness to noise under the assumptions it is designed for.

Most studies in the literature on video quality assessment have focused on the evaluation of quantized video sequences at fixed, high spatial and temporal resolutions. Only limited work has been reported on assessing video quality under different spatial and temporal resolutions. In this paper, we consider a wider scope of video quality assessment that takes multiple dimensions into account. In particular, we address the problem of evaluating the perceptual visual quality of low bit-rate videos under different settings and requirements. Extensive subjective viewing tests for assessing the perceptual quality of low bit-rate videos have been conducted, which cover 150 test scenarios and include five distinct dimensions: encoder type, video content, bit rate, frame size, and frame rate. Based on the subjective test results, we perform a thorough statistical analysis to study the influence of the different dimensions on perceptual quality, and we point out some interesting observations. We believe such a study brings new knowledge to the topic of cross-dimensional video quality assessment and has immediate applications in perceptual video adaptation for scalable video over mobile networks.

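This paper designs and implements a salient face retrieval system for news video. The news video is first segmented into a sequence of shots. A trained CascadeAdaboost face detector detects a number of candidate faces in each shot, and those with high confidence are selected by a set of rules as samples for extracting a skin-color model for that shot. The regions obtained by skin-color segmentation are then analyzed for position and size and matched against templates to discard non-face regions and determine the list of objects to track. For accurate tracking and recognition, the system builds a finer skin-color model for each tracked object. During tracking, face detection is rerun every fixed number of frames to reduce error accumulation and to detect whether new faces have appeared. Finally, the images best suited to face recognition are selected from each face sequence to build its eigenface space, and skin-color information is combined with the PCA algorithm to decide whether it is the target face being retrieved.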

