1.

A two-level generative model for cloth representation and shape from shading

Han F Zhu SC 《IEEE transactions on pattern analysis and machine intelligence》2007,29(7):1230-1243

In this paper, we present a two-level generative model for representing the images and surface depth maps of drapery and clothes. The upper level consists of a number of folds which generate the high-contrast (ridge) areas with a dictionary of shading primitives (for 2D images) and fold primitives (for 3D depth maps). These primitives are represented in parametric forms and are learned in a supervised learning phase using 3D surfaces of clothes acquired through photometric stereo. The lower level consists of the remaining flat areas which fill between the folds with a smoothness prior (Markov random field). We show that the classical ill-posed problem of shape from shading (SFS) can be much improved by this two-level model because of its reduced dimensionality and its incorporation of middle-level visual knowledge, i.e., the dictionary of primitives. Given an input image, we first infer the folds and compute a sketch graph using a sketch pursuit algorithm as in the primal sketch (Guo et al., 2003). The 3D folds are estimated by parameter fitting using the fold dictionary, and they form the "skeleton" of the drapery/cloth surfaces. Then, the lower level is computed by a conventional SFS method using the fold areas as boundary conditions. The two levels interact at the final stage by optimizing a joint Bayesian posterior probability on the depth map. We show a number of experiments which demonstrate more robust results in comparison with state-of-the-art work. In a broader scope, our representation can be viewed as a two-level inhomogeneous MRF model which is applicable to general shape-from-X problems. Our study is an attempt to revisit Marr's idea (Marr and Freeman, 1982) of computing the 2½D sketch from the primal sketch. In a companion paper (Barbu and Zhu, 2005), we study shape from stereo based on a similar two-level generative sketch representation.

2.

Perceptual Scale-Space and Its Applications

Yizhou Wang Song-Chun Zhu 《International Journal of Computer Vision》2008,80(1):143-165

When an image is viewed at varying resolutions, it is known to create discrete perceptual jumps or transitions amid the continuous intensity changes. In this paper, we study a perceptual scale-space theory which differs from the traditional image scale-space theory in two aspects. (i) In representation, the perceptual scale-space adopts a full generative model. From a Gaussian pyramid it computes a sketch pyramid where each layer is a primal sketch representation (Guo et al. in Comput. Vis. Image Underst. 106(1):5–19, 2007), an attribute graph whose elements are image primitives for the image structures. Each primal sketch graph generates the image in the Gaussian pyramid, and the changes between the primal sketch graphs in adjacent layers are represented by a set of basic and composite graph operators to account for the perceptual transitions. (ii) In computation, the sketch pyramid and graph operators are inferred, as hidden variables, from the images through Bayesian inference by a stochastic algorithm, in contrast to deterministic transforms or feature extraction, such as computing zero-crossings, extremal points, and inflection points in the image scale-space. Studying the perceptual transitions under the Bayesian framework makes it convenient to use statistical modeling and learning tools for (a) modeling the Gestalt properties of the sketch graph, such as continuity and parallelism; (b) learning the most frequent graph operators, i.e., perceptual transitions, in image scaling; and (c) learning the prior probabilities of the graph operators conditioned on their local neighboring sketch graph structures. In experiments, we learn the parameters and decision thresholds through human experiments, and we show that the sketch pyramid is a more parsimonious representation than a multi-resolution Gaussian/wavelet pyramid.
We also demonstrate an application to adaptive image display: showing a large image on a small screen (say, a PDA) through a selective tour of its image pyramid. In this application, the sketch pyramid provides a means of calculating the information gain from zooming into different areas of an image by counting the number of operators that expand the primal sketches, such that the maximum information is displayed in a given number of frames. A short version was published in ICCV05 (Wang et al. 2005).

3.

Multiple piecewise constant with geodesic active contours (MPC-GAC) framework for interactive image segmentation using graph cut optimization

Wenbing Tao Xue-Cheng Tai 《Image and vision computing》2011,29(8):499-508

This paper proposes an improved variational model, the multiple piecewise constant with geodesic active contour (MPC-GAC) model, which generalizes the region-based active contour model by Chan and Vese, 2001 [11] and merges in the edge-based active contour by Caselles et al., 1997 [7] to inherit the advantages of region-based and edge-based image segmentation models. We show that the new MPC-GAC energy functional can be iteratively minimized by graph cut algorithms with high computational efficiency compared with the level set framework. This iterative algorithm alternates between learning the piecewise constant functional and updating the foreground and background, so that the energy value gradually decreases toward the minimum of the energy functional. The k-means method is used to compute the piecewise constant values of the foreground and background of the image. We use a graph cut method to detect and update the foreground and background. Numerical experiments show that the proposed interactive segmentation method based on the MPC-GAC model with graph cut optimization can effectively segment images with inhomogeneous objects and background.
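
The alternation described in this abstract can be sketched roughly as follows. This is only an illustration under strong simplifying assumptions, not the authors' implementation: pixels are scalar intensities in a flat list, and the graph cut update is replaced by a per-pixel nearest-constant rule, so the geodesic active contour (edge) term is not modeled at all. The names `kmeans_1d`, `region_cost`, and `alternate_segmentation` are hypothetical.

```python
def kmeans_1d(values, k, n_iter=10):
    """Plain 1D k-means; centers start at evenly spaced order statistics."""
    srt = sorted(values)
    centers = [srt[(2 * j + 1) * len(srt) // (2 * k)] for j in range(k)]
    for _ in range(n_iter):
        buckets = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: (v - centers[c]) ** 2)
            buckets[nearest].append(v)
        # An empty cluster keeps its previous center.
        centers = [sum(b) / len(b) if b else centers[j]
                   for j, b in enumerate(buckets)]
    return centers


def region_cost(value, centers):
    """Data cost of a pixel for a region modeled by several constants."""
    return min((value - c) ** 2 for c in centers)


def alternate_segmentation(pixels, labels, k=2, n_rounds=5):
    """Alternate constant estimation (k-means) and relabeling.

    ASSUMPTION: relabeling is a per-pixel argmin of the data cost, standing
    in for the graph cut step; no smoothness or edge term is included.
    """
    labels = list(labels)
    for _ in range(n_rounds):
        fg = [p for p, l in zip(pixels, labels) if l == 1]
        bg = [p for p, l in zip(pixels, labels) if l == 0]
        if not fg or not bg:
            break
        fg_centers = kmeans_1d(fg, min(k, len(fg)))
        bg_centers = kmeans_1d(bg, min(k, len(bg)))
        new_labels = [1 if region_cost(p, fg_centers) < region_cost(p, bg_centers) else 0
                      for p in pixels]
        if new_labels == labels:
            break
        labels = new_labels
    return labels


# A two-tone foreground (about 0.1 and 0.9) against a mid-gray background;
# the initial labeling wrongly marks the pixel 0.55 as foreground.
pixels = [0.1, 0.12, 0.9, 0.88, 0.55, 0.52, 0.47, 0.5]
initial = [1, 1, 1, 1, 1, 0, 0, 0]
result = alternate_segmentation(pixels, initial)
```

Using several constants per region is what lets the dark and bright foreground pixels share one label here; a single constant per region would place the foreground mean near the background gray.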

4.

Automatic Segmentation of Low Depth-of-Field Images Based on Graph Cuts

刘毅陈圣磊冯国富黄兵夏德深《自动化学报》2015,41(8):1471-1481

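Building on the graph cut algorithm, this paper proposes an automatic segmentation model for low depth-of-field (DOF) images. First, an improved point sharpness algorithm produces a point sharpness map of the image, which is combined with the image's color features to form a four-dimensional feature vector. Second, strong edges are computed on the point sharpness map; because edges are fairly continuous in the sharp parts of the image and weak and poorly connected in the blurred parts, this yields preliminary foreground/background regions. Then, the color and point sharpness features of the foreground and background are modeled with Gaussian mixture models (GMM), and globally and locally adaptive λ values are used to mitigate the shrinking bias of the graph cut algorithm. Finally, an iterative graph cut algorithm refines the foreground/background regions. Experimental results show that the model is robust and produces more accurate segmentations.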

5.

A Spatially Constrained Generative Model and an EM Algorithm for Image Segmentation (total citations: 3; self-citations: 0; citations by others: 3)

Diplaros A. Vlassis N. Gevers T. 《Neural Networks, IEEE Transactions on》2007,18(3):798-808

In this paper, we present a novel spatially constrained generative model and an expectation-maximization (EM) algorithm for model-based image segmentation. The generative model assumes that the unobserved class labels of neighboring pixels in the image are generated by prior distributions with similar parameters, where similarity is defined by entropic quantities relating to the neighboring priors. In order to estimate model parameters from observations, we derive a spatially constrained EM algorithm that iteratively maximizes a lower bound on the data log-likelihood, where the penalty term is data-dependent. Our algorithm is very easy to implement and is similar to the standard EM algorithm for Gaussian mixtures, with the main difference that the label posteriors are "smoothed" over pixels between each E- and M-step by a standard image filter. Experiments on synthetic and real images show that our algorithm achieves competitive segmentation results compared to other Markov-based methods, and is in general faster.
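
The idea of smoothing label posteriors between the E- and M-step can be illustrated with a small sketch. It is a simplified stand-in, not the algorithm derived in the paper: it uses a 1D signal instead of an image, two Gaussian classes, global mixing weights rather than per-pixel priors, and a box (mean) filter as the "standard image filter". The function names are hypothetical.

```python
import math


def gaussian(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def box_smooth(values, radius=1):
    """Mean filter with a window clipped at the signal borders."""
    n = len(values)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def smoothed_em_segment(signal, n_iter=20, radius=1):
    """Two-class Gaussian mixture EM with posterior smoothing (ASSUMED setup)."""
    n = len(signal)
    means = [min(signal), max(signal)]
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    post = [[0.5] * n, [0.5] * n]
    for _ in range(n_iter):
        # E-step: per-pixel class posteriors.
        for i, x in enumerate(signal):
            p = [weights[k] * gaussian(x, means[k], variances[k]) + 1e-300
                 for k in range(2)]
            total = p[0] + p[1]
            post[0][i], post[1][i] = p[0] / total, p[1] / total
        # Smoothing step: filter each posterior map over neighboring pixels.
        post = [box_smooth(post[k], radius) for k in range(2)]
        # M-step: standard weighted updates using the smoothed posteriors.
        for k in range(2):
            mass = sum(post[k])
            means[k] = sum(r * x for r, x in zip(post[k], signal)) / mass
            spread = sum(r * (x - means[k]) ** 2 for r, x in zip(post[k], signal))
            variances[k] = max(spread / mass, 1e-6)
            weights[k] = mass / n
    labels = [0 if post[0][i] >= post[1][i] else 1 for i in range(n)]
    return labels, means


signal = [0.1, -0.2, 0.0, 0.2, -0.1, 0.1, 10.0, 9.8, 10.1, 10.2, 9.9, 10.0]
labels, means = smoothed_em_segment(signal)
```

Because the same window is applied to both posterior maps, the smoothed posteriors still sum to one at every pixel, so no renormalization is needed before the M-step.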

6.

What are Textons? (total citations: 2; self-citations: 0; citations by others: 2)

Zhu Song-Chun Guo Cheng-en Wang Yizhou Xu Zijian 《International Journal of Computer Vision》2005,62(1-2):121-143

7.

Image-based rendering of range data with estimated depth uncertainty (total citations: 2; self-citations: 0; citations by others: 2)

Hofsetz C. Ng K. Chen G. McGuinness P. Max N. Liu Y. 《Computer Graphics and Applications, IEEE》2004,24(4):34-41

Image-based rendering (IBR) involves constructing an image from a new viewpoint, using several input images from different viewpoints. Our approach is to acquire or estimate the depth for each pixel of each input image. We then reconstruct the new view from the resulting collection of 3D points. When rendering images from photographs, acquiring and registering data is far from perfect. Accuracy can fluctuate, depending on the choice of geometry reconstruction technique. Our image-rendering approach involves three steps: depth extraction, uncertainty estimation, and rendering. That is, we first compute a depth map for every input image. Then we calculate the uncertainty information using the estimated depth maps as starting points. Finally, we perform the actual rendering, which renders the uncertainty estimated in the previous step as ellipsoidal Gaussian splats.

8.

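Simplification of Comic Rough Sketches Based on Generative Adversarial Networks (total citations: 2; self-citations: 0; citations by others: 2)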

卢倩雯陶青川赵娅琳刘蔓霄《自动化学报》2018,44(5):840-854

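In comic drawing, producing clean line art from a rough sketch is an important step. Existing sketch simplification methods can already simplify lines to some degree, but because sketches are drawn in many different ways and vary in complexity, these methods have a limited range of application and give unsatisfactory results. This paper proposes a novel sketch simplification method that builds a deep convolutional neural network model for sketch simplification from conditional random field (CRF) and least squares generative adversarial network (LSGAN) theory. Through the zero-sum game between the network's generator and discriminator, together with conditional constraints, the model produces simplified line art that is closer to real line art. In addition, to train the adversarial model's simplification ability, the paper builds a training dataset of sketch and simplified line art pairs covering more drawing styles and more varied content. Experiments show that, for sketches in complex cases, the proposed algorithm simplifies better than current methods.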

9.

Learning Active Basis Model for Object Detection and Recognition (total citations: 1; self-citations: 0; citations by others: 1)

Ying Nian Wu Zhangzhang Si Haifeng Gong Song-Chun Zhu 《International Journal of Computer Vision》2010,90(2):198-235

This article proposes an active basis model, a shared sketch algorithm, and a computational architecture of sum-max maps for representing, learning, and recognizing deformable templates. In our generative model, a deformable template is in the form of an active basis, which consists of a small number of Gabor wavelet elements at selected locations and orientations. These elements are allowed to slightly perturb their locations and orientations before they are linearly combined to generate the observed image. The active basis model, in particular, the locations and the orientations of the basis elements, can be learned from training images by the shared sketch algorithm. The algorithm selects the elements of the active basis sequentially from a dictionary of Gabor wavelets. When an element is selected at each step, the element is shared by all the training images, and the element is perturbed to encode or sketch a nearby edge segment in each training image. The recognition of the deformable template from an image can be accomplished by a computational architecture that alternates the sum maps and the max maps. The computation of the max maps deforms the active basis to match the image data, and the computation of the sum maps scores the template matching by the log-likelihood of the deformed active basis.

10.

Stable Morse Decompositions for Piecewise Constant Vector Fields on Surfaces

Andrzej Szymczak 《Computer Graphics Forum》2011,30(3):851-860

Numerical simulations and experimental observations are inherently imprecise. Therefore, most vector fields of interest in scientific visualization are known only up to an error. In such cases, some topological features, especially those not stable enough, may be artifacts of the imprecision of the input. This paper introduces a technique to compute topological features of user-prescribed stability with respect to perturbation of the input vector field. In order to make our approach simple and efficient, we develop our algorithms for the case of piecewise constant (PC) vector fields. Our approach is based on a super-transition graph, a common graph representation of all PC vector fields whose vector value in a mesh triangle is contained in a convex set of vectors associated with that triangle. The graph is used to compute a Morse decomposition that is coarse enough to be correct for all vector fields satisfying the constraint. Apart from computing stable Morse decompositions, our technique can also be used to estimate the stability of Morse sets with respect to perturbation of the vector field or to compute topological features of continuous vector fields using the PC framework.

11.

A hierarchical compositional model for face representation and sketching

Xu Z Chen H Zhu SC Luo J 《IEEE transactions on pattern analysis and machine intelligence》2008,30(6):955-969

12.

Analysis and synthesis of textured motion: particles and waves

Wang Y Zhu SC 《IEEE transactions on pattern analysis and machine intelligence》2004,26(10):1348-1363

Natural scenes contain a wide range of textured motion phenomena which are characterized by the movement of a large number of particle and wave elements, such as falling snow, wavy water, and dancing grass. In this paper, we present a generative model for representing these motion patterns and study a Markov chain Monte Carlo algorithm for inferring the generative representation from observed video sequences. Our generative model consists of three components. The first is a photometric model which represents an image as a linear superposition of image bases selected from a generic and overcomplete dictionary. The dictionary contains Gabor and LoG bases for point/particle elements and Fourier bases for wave elements. These bases compete to explain the input images and transfer them to a token (base) representation with an O(10^2)-fold dimension reduction. The second component is a geometric model which groups spatially adjacent tokens (bases) and their motion trajectories into a number of moving elements called "motons." A moton is a deformable template in time-space representing a moving element, such as a falling snowflake or a flying bird. The third component is a dynamic model which characterizes the motion of particles, waves, and their interactions. For example, the motion of particle objects floating in a river, such as leaves and balls, should be coupled with the motion of waves. The trajectories of these moving elements are represented by coupled Markov chains. The dynamic model also includes probabilistic representations for the birth/death (source/sink) of the motons. We adopt a stochastic gradient algorithm for learning and inference. Given an input video sequence, the algorithm iterates two steps: 1) computing the motons and their trajectories by a number of reversible Markov chain jumps, and 2) learning the parameters that govern the geometric deformations and motion dynamics.
Novel video sequences are synthesized from the learned models and, by editing the model parameters, we demonstrate the controllability of the generative model.

13.

Blind 3D Filtering Denoising Algorithm Based on Local Mean Noise Estimation

徐少平张兴强姜尹楠唐祎玲江顺亮《中国图象图形学报》2017,22(4):422-434

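Objective: Images are easily corrupted by noise during acquisition and transmission, and image denoising, as a preprocessing module of many image processing systems, has been studied extensively over the past decades. Existing denoising algorithms usually model noise as additive white Gaussian noise (AWGN), with the noise level (severity) controlled by a variance parameter. The classical BM3D 3D filtering algorithm is a non-blind denoising algorithm: in practice the image noise level must be assessed manually and set as a parameter, and the high variability of such manual estimates prevents the best denoising result from being obtained. To address this, a new local mean noise estimation (LME) algorithm is proposed as a preprocessing module placed in front of BM3D. Method: The paper focuses on building an image noise level predictor from image quality-aware features based on natural scene statistics (NSS) and a local mean estimation technique, and uses it to obtain an accurate noise level for a noisy image efficiently. Research on natural scene statistics shows that undistorted natural scene images exhibit pronounced statistical regularities in the spatial or frequency domain, and that noise shifts these regularities in a systematic way, so the corresponding feature values can be extracted as quality-aware features that reflect image quality. Local mean estimation is adopted because it is a simple and efficient predictor. In the implementation, Gaussian noise at different noise levels is added to a broadly representative set of noise-free images to build a set of distorted images; each distorted image is decomposed by a wavelet transform at different scales and orientations; a generalized Gaussian distribution (GGD) model is then used to extract statistics of the subband coefficients, which form a feature vector describing the degree of distortion; finally, the feature vector extracted from each distorted image, together with the Gaussian noise level applied to it, is stored in a distortion feature vector library. In the denoising stage, the same feature extraction is applied to the image to be denoised, several similar feature vectors and their corresponding noise levels are retrieved from the library, and the local mean method estimates the Gaussian noise level of the image, which is passed to the classical BM3D algorithm as its input parameter. Result: With this modification BM3D becomes a blind denoising algorithm, called BM3D-LME (block-matching and 3D filtering based on local means estimation). Accurate noise estimation is important for image processing tasks such as image denoising, image super-resolution, and image segmentation. The accuracy, robustness, and effectiveness of the proposed noise level estimation algorithm have been verified. Conclusion: Compared with manual noise estimation, the LME algorithm estimates the noise level of an arbitrary image to be denoised accurately and quickly. Used together with BM3D, it effectively improves BM3D's practical denoising performance and broadens its range of application.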
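
The retrieval-and-average step at the end of the method can be sketched as a nearest-neighbor local mean. This is a minimal illustration under assumptions, not the paper's implementation: the feature vectors below are made-up placeholders rather than wavelet/GGD statistics, Euclidean distance is assumed as the similarity measure, and the name `estimate_noise_level` and the neighbor count `k` are hypothetical.

```python
import math


def estimate_noise_level(query, library, k=3):
    """Average the noise levels of the k library entries nearest to `query`.

    `library` is a list of (feature_vector, noise_level) pairs.
    ASSUMPTION: Euclidean distance between feature vectors.
    """
    if not library:
        raise ValueError("feature library is empty")
    ranked = sorted(library, key=lambda entry: math.dist(query, entry[0]))
    nearest = ranked[:k]
    return sum(level for _, level in nearest) / len(nearest)


# Placeholder library: (feature vector, Gaussian noise level used to distort).
library = [
    ((0.0, 0.0), 5.0),
    ((1.0, 1.0), 10.0),
    ((2.0, 2.0), 15.0),
    ((10.0, 10.0), 50.0),
]
sigma = estimate_noise_level((1.0, 1.1), library, k=3)
```

In the pipeline the abstract describes, the resulting estimate would be handed to BM3D as its noise level parameter.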

14.

Rendering the image of glare effect based on paired and unpaired dual generative adversarial network

《Displays》2023

Rendering glare on images is a great challenge because current rendering algorithms do not adequately account for refraction in the human eye; as a result, in some critical applications such as vehicle headlamps, the rendered effect is not realistic and may affect safety evaluation. Traditional glare rendering algorithms rely on a large number of hand-designed wave optics processing operators; they cannot complete rendering online in real time, nor can they cope with the complex and changeable imaging conditions found in practice. Mainstream generative adversarial network based algorithms from the field of image style translation have been introduced to generate glare effects and can render online in real time; however, they still fall short in some respects, such as detail distortion. In this work, we present a novel glare simulation and generation method, which is the first algorithm to apply a generative model based style transfer method to glare rendering. In a nutshell, a new method named Glare Generation Network is proposed to aggregate the benefits of content diversity and style consistency by combining paired and unpaired branches in a dual generative adversarial network. Our approach increases the structural similarity index measure by at least 0.039 on a custom darkroom vehicle headlamp dataset. We further show that our method significantly improves inference speed.

15.

Image Classification with the Fisher Vector: Theory and Practice

Jorge Sánchez Florent Perronnin Thomas Mensink Jakob Verbeek 《International Journal of Computer Vision》2013,105(3):222-245

16.

Hair Rendering and Appearance Editing under Environment Lighting

徐昆马里千任博胡事民《计算机辅助设计与图形学学报》2012,24(2):143-145

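This paper proposes an algorithm for interactive hair rendering and appearance editing under complex environment lighting, in which the light sources are represented by spherical radial basis functions (SRBF). A compact one-dimensional circular Gaussian representation is derived that accurately expresses the Marschner hair scattering function (Marschner S R, Jensen H W, Cammarano M, et al. Light scattering from human hair fibers. ACM Transactions on Graphics, 2003, 22(3): 780-791) [2]. With this representation, the integral of each SRBF light with the hair scattering function can be evaluated efficiently in analytic form, which supports efficient single scattering and multiple scattering computation. Unlike previous work, the algorithm performs all computation at run time and requires no expensive precomputation, so the scattering parameters of the hair can be changed dynamically. Analysis shows that the proposed approximation is accurate and compact. The algorithm can also handle hair with elliptical cross sections. Implemented on the GPU, it achieves interactive frame rates.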

17.

Motion Occlusion Model for Epipolar Plane Images and a Motion Texture Orientation Detection Algorithm

朱志刚林学訚徐光祐《计算机学报》1999,22(3):283-289

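Motion estimation at motion occlusion boundaries is a difficult problem. The epipolar plane image method turns motion estimation into the detection of trajectory lines. The trajectory lines of man-made objects are easy to obtain by edge tracking, but for natural scenes with complex texture, trajectory tracking is more difficult.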

18.

Sketch Retrieval Based on Multi-Scale HOG

李思思陈曦肖建《计算机工程与科学》2016,38(3):520-527

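Sketch retrieval is an important research topic in image processing. This paper proposes an improved feature extraction method that fuses a Gaussian pyramid with local HOG features and applies it to sketch retrieval. The image is decomposed into a multi-scale space with a Gaussian pyramid, interest points are extracted at every scale, and multi-scale HOG features based on these interest points are obtained. A visual vocabulary is generated from the set of multi-scale HOG features of the images, feature description vectors relative to the visual vocabulary are then formed, and sketch retrieval is carried out by similarity matching. The algorithm is compared with single-scale HOG and several other algorithms, and the experimental results demonstrate its feasibility and effectiveness.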

19.

3D Human Body Modeling Method Based on 2D Point Cloud Images

张广翩计忠平《计算机工程与应用》2020,56(19):205-215

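Three-dimensional modeling from 2D images has advanced rapidly in recent years. For human body modeling, however, 2D images of people captured by a camera contain a large amount of texture information such as clothing and hair, whereas applications such as virtual try-on require texture such as clothing wrinkles on the body surface to be removed, and collecting nude data infringes on user privacy. A new modeling method that goes from a 2D point cloud image to a 3D human body model is therefore proposed. Instead of collecting a 2D image dataset with cameras or other auxiliary devices, the algorithm takes as input a 2D point cloud rendering drawn in vertex mode from a 3D human point cloud model. The main work is to build a dataset of 2D point cloud images paired with the corresponding black-and-white binary images of the body, and to train a generative adversarial network that generates the latter from the former. The model converts a 2D point cloud image into the corresponding binary image, which is then fed to a trained convolutional neural network to evaluate how well a 3D human body model can be constructed from the 2D image. Because reconstructing a complete 3D human mesh model from incomplete 3D point cloud data is a challenging problem, damaged and incomplete 2D point clouds are simulated so that the algorithm can handle incomplete 2D point cloud images. Extensive experiments show that the reconstructed 3D human body models are visually realistic. For a quantitative analysis of reconstruction accuracy, waist circumference, a representative body feature, is used for error evaluation. To increase the diversity of body shapes in the 3D human model library, a convenient data augmentation technique for 3D human body models is also introduced. The experimental results show that the algorithm needs only a single 2D point cloud image as input to quickly create the corresponding digital human body model.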

20.

Image segmentation by iterative optimization of multiphase multiple piecewise constant model and Four-Color relabeling

Liman Liu Wenbing Tao 《Pattern recognition》2011,44(12):2819-2833

In this paper, an iterative unsupervised image segmentation algorithm is developed, which is based on our proposed multiphase multiple piecewise constant (MMPC) model and its graph cuts optimization. The MMPC model uses multiple constants to model each phase, instead of the single constant used in the Chan and Vese (CV) model and cartoon limit, so that heterogeneous image object segmentation can be dealt with effectively. We show that the multiphase optimization problem based on our proposed model can be approximately solved by graph cuts methods. The Four-Color theorem is used to relabel the regions of the image after every iteration, which makes it possible to represent and segment an arbitrary number of regions in an image with only four phases. Therefore, the computational cost and memory usage are greatly reduced. A comparison with some typical unsupervised image segmentation methods using a large number of images from the Berkeley Segmentation Dataset demonstrates that the proposed algorithm can effectively segment natural images with good performance and acceptable computational time.