Similar literature
Found 20 similar documents (search time: 24 ms)
1.
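Light field imaging technology and its applications in computer vision   Total citations: 2, self-citations: 1, other citations: 1
Objective: Light field imaging has only recently begun to be applied in computer vision research, and the related work is scattered and lacks a systematic treatment. This paper aims to systematically introduce the development of light field imaging and representative work that applies it to computer vision research. Method: From the perspective of solving computer vision problems, the research of the last ten years on light field imaging is discussed at four levels: 1) mainstream light field imaging devices and their strengths and weaknesses as computer vision sensors; 2) calibration, decoding, and preprocessing methods for light field cameras used as vision sensors; 3) image rendering and reconstruction techniques based on the 4D light field, and how they advance computer vision research; 4) feature representation methods based on 4D light field data. Result: The advantages and limitations of light field imaging in solving vision problems are worked out level by level, the underlying principles and constraints are analyzed, and an attempt is made to summarize the key problems that urgently need solving and the future development trends. Conclusion: As a promising new sensor technology for computer vision, light field imaging will be studied ever more broadly and deeply. Research on light field imaging applied to computer vision will strongly guide and promote the joint development of computer vision and light field imaging.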

2.
Human beings can become experts in performing specific vision tasks, for example, doctors analysing medical images, or botanists studying leaves. With sufficient knowledge and experience, people can become very efficient at such tasks. When attempting to perform these tasks with a machine vision system, it would be highly beneficial to be able to replicate the process which the expert undergoes. Advances in eye-tracking technology can provide data to allow us to discover the manner in which an expert studies an image. This paper presents a first step towards utilizing these data for computer vision purposes. A growing-neural-gas algorithm is used to learn a set of Gabor filters which give high responses to image regions which a human expert fixated on. These filters can then be used to identify regions in other images which are likely to be useful for a given vision task. The algorithm is evaluated by learning filters for locating specific areas of plant leaves.

3.
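Moving human detection method based on color video images   Total citations: 2, self-citations: 0, other citations: 2
Detecting moving human bodies in video images is a basic yet critical research step for many computer vision tasks. Its purpose is to detect moving people in video images so that subsequent work, such as tracking people and understanding their behavior in intelligent surveillance, can be carried out. Because color images carry more visual information than grayscale images, they have received increasing attention. This paper studies a moving human detection algorithm that works directly in color space and combines temporal and spatial information: temporal segmentation and spatial segmentation are combined to obtain moving human bodies with accurate edges, and the shadows of the moving bodies are eliminated. Temporal segmentation uses a dual-threshold background subtraction method based on RGB color images. Spatial segmentation uses a region growing method in RGB color space. Experimental results show that the algorithm can effectively detect moving human bodies from color image sequences in real time and eliminate their shadows, and that the detected moving bodies are in color.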
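
The abstract above names dual-threshold background subtraction on RGB images and region growing but gives no details, so the following is only a minimal sketch of one common reading, not the authors' algorithm. It assumes a static background image, a per-pixel colour distance taken as the largest channel difference, and hysteresis-style growth from strong pixels into weaker neighbours. The function names, the distance measure, and the thresholds `low` and `high` are all hypothetical.

```python
from collections import deque
from typing import List, Tuple

Pixel = Tuple[int, int, int]
RGBImage = List[List[Pixel]]


def color_distance(p: Pixel, q: Pixel) -> int:
    """Largest absolute per-channel difference (an assumed distance measure)."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]), abs(p[2] - q[2]))


def dual_threshold_foreground(frame: RGBImage, background: RGBImage,
                              low: int, high: int) -> List[List[bool]]:
    """Mark pixels that differ strongly from the background, then grow the
    mask into 4-connected neighbours whose difference is at least `low`."""
    h, w = len(frame), len(frame[0])
    diff = [[color_distance(frame[y][x], background[y][x]) for x in range(w)]
            for y in range(h)]
    mask = [[False] * w for _ in range(h)]
    queue = deque()
    # Seeds: pixels whose difference reaches the high threshold.
    for y in range(h):
        for x in range(w):
            if diff[y][x] >= high:
                mask[y][x] = True
                queue.append((y, x))
    # Growth: accept weaker pixels only when connected to an accepted pixel.
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and not mask[ny][nx] and diff[ny][nx] >= low:
                mask[ny][nx] = True
                queue.append((ny, nx))
    return mask
```

Under these assumptions, isolated weak differences (typical of noise) are rejected, while weak differences next to a strong one are kept, which tends to give more complete object outlines than a single threshold.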

4.
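Research progress and prospects of image texture classification methods   Total citations: 4, self-citations: 0, other citations: 4
Texture classification is an important fundamental problem in computer vision and pattern recognition, and is the basis for other vision tasks such as image segmentation, object recognition, and scene understanding. Starting from the basic definition of the texture classification problem, this paper first describes the difficulties and challenges in texture classification research; next, it comprehensively reviews and summarizes the typical databases for texture classification; then, it categorizes and summarizes the development and current state of recent texture feature extraction methods, and describes and comments on the mainstream methods in detail; finally, it reflects on and discusses directions for the development of texture classification.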

5.
吴晓婷  冯晓毅  黄安  张雪毅  董晶  刘丽 《自动化学报》2022,48(12):2886-2910
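Facial kinship verification, that is, judging from two given face images of different people whether they have a parent-child relationship, is an important research problem in computer vision and machine learning, with broad application value in areas such as searching for missing children, social media analysis, and automatic image annotation. As facial kinship verification has attracted more and more attention, it has developed in many respects; this paper surveys facial kinship verification methods. First, the research status of facial kinship verification over the last ten years is briefly introduced, then the problem is defined and its challenges are discussed. Next, commonly used kinship databases are collected, and their attributes are summarized and compared in detail. Then, facial kinship verification methods are classified, summarized, and compared, along with the performance of different methods. Finally, possible future research directions for facial kinship verification are discussed.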

6.
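Medical image diagnosis is the basis of many clinical decisions, and intelligent analysis of medical images is an important part of medical artificial intelligence. Meanwhile, with the rise and spread of more and more 3D spatial sensors, 3D computer vision is becoming increasingly important. This paper focuses on the intersection of medical image analysis and 3D computer vision, namely medical 3D computer vision, or medical 3D vision. It divides medical 3D computer vision systems into three levels (tasks, data, and representation) and presents research progress at these three levels with reference to the latest literature. At the task level, it introduces classification, segmentation, detection, registration, and imaging reconstruction in medical 3D computer vision, and the role and characteristics of these tasks in clinical diagnosis and medical image analysis. At the data level, it briefly introduces the most important data modalities in medical 3D data, including computed tomography (CT), magnetic resonance imaging (MRI), and positron emission tomography (PET), as well as other data formats proposed in emerging research. On this basis, it collects the important research datasets in medical 3D computer vision and annotates their data modalities and main vision tasks. At the representation level, it introduces and discusses the advantages and disadvantages of 2D networks, 3D networks, and hybrid networks for representation learning on medical 3D data. In addition, in view of the small-data problem common in medical imaging, it discusses in depth the pre-training problem in representation learning for medical 3D data. Finally, it summarizes the current state of research in medical 3D computer vision and points out the research challenges, problems, and directions that remain open.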

7.
The boundaries of objects in an image are often considered a nuisance to be “handled” due to the occlusion they exhibit. Since most, if not all, computer vision techniques aggregate information spatially within a scene, information spanning these boundaries, and therefore from different physical surfaces, is invariably and erroneously considered together. In addition, these boundaries convey important perceptual information about 3D scene structure and shape. Consequently, their identification can benefit many different computer vision pursuits, from low-level processing techniques to high-level reasoning tasks. While much focus in computer vision is placed on the processing of individual, static images, many applications actually offer video, or sequences of images, as input. The extra temporal dimension of the data allows the motion of the camera or the scene to be used in processing. In this paper, we focus on the exploitation of subtle relative-motion cues present at occlusion boundaries. When combined with more standard appearance information, we demonstrate these cues’ utility in detecting occlusion boundaries locally. We also present a novel, mid-level model for reasoning more globally about object boundaries and propagating such local information to extract improved, extended boundaries.

8.
Computing the convex hull of a set of points is a fundamental operation in many research fields, including geometric computing, computer graphics, computer vision, robotics, and so forth. This problem is particularly challenging when the number of points goes beyond some millions. In this article, we describe a very fast algorithm that copes with millions of points in a short period of time without using any kind of parallel computing. This has been made possible because the algorithm reduces to a sorting problem of the input point set, which dramatically reduces the geometric computations (e.g., angles, distances, and so forth) that are typical in other algorithms. When compared with popular convex hull algorithms (namely, Graham’s scan, Andrew’s monotone chain, Jarvis’ gift wrapping, Chan’s, and Quickhull), our algorithm is capable of generating the convex hull of a point set in the plane much faster than those five algorithms without penalties in memory space.
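
For orientation, here is a minimal sketch of Andrew's monotone chain, one of the five baseline algorithms the abstract names. It is not the paper's new algorithm, whose details the abstract does not give. Like that algorithm, the baseline starts by sorting the points; it then needs only a cross-product sign test per step. The function names `cross` and `convex_hull` are illustrative.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Return the hull vertices in counter-clockwise order, starting from the
    lexicographically smallest point. Collinear points are dropped."""
    pts = sorted(set(points))  # the sort dominates the cost: O(n log n)
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        # Pop while the last two points and p do not make a left turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]
```

After the sort, each point is pushed and popped at most once per chain, so the two scans take linear time.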

9.
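Image super-resolution reconstruction is the process of reconstructing the corresponding high-resolution image from a low-resolution image. Image super-resolution technology has been successfully applied in computer vision and image processing. In recent years, deep learning has been widely used in image super-resolution because it can automatically learn features from large amounts of data. This paper introduces the background of image super-resolution reconstruction, summarizes in detail the deep learning models used for image super-resolution, and describes applications of image super-resolution in satellite remote sensing imagery, medical imaging, video surveillance, and industrial inspection. It also summarizes the current state of research on image super-resolution algorithms and future development directions.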

10.

Analyzing videos and images captured by unmanned aerial vehicles or aerial drones is an emerging application attracting significant attention from researchers in various areas of computer vision. Currently, the major challenge is the development of autonomous operations to complete missions and replace human operators. In this paper, based on the type of analyzing videos and images captured by drones in computer vision, we have reviewed these applications by categorizing them into three groups. The first group is related to remote sensing with challenges such as camera calibration, image matching, and aerial triangulation. The second group is related to drone-autonomous navigation, in which computer vision methods are designed to explore challenges such as flight control, visual localization and mapping, and target tracking and obstacle detection. The third group is dedicated to using images and videos captured by drones in various applications, such as surveillance, agriculture and forestry, animal detection, disaster detection, and face recognition. Since most of the computer vision methods related to the three categories have been designed for real-world conditions, providing real conditions based on drones is impossible. We aim to explore papers that provide a database for these purposes. In the first two groups, some survey papers presented are current. However, the surveys have not been aimed at exploring any databases. This paper presents a complete review of databases in the first two groups and works that used the databases to apply their methods. Vision-based intelligent applications and their databases are explored in the third group, and we discuss open problems and avenues for future research.

11.
《Real》1998,4(6):417-428
The spatial transformation of images, commonly known as image warping, is fundamental to many applications, e.g. remote sensing, medical imaging, computer vision, and computer graphics. Computational demands in image warping are high, requiring a geometric transformation, address and coefficient generation, and some form of interpolation. However, unlike most image processing algorithms, the data flow for image warping can be highly irregular, which makes any efficient implementation challenging. This paper describes an efficient algorithm which addresses these challenges by making use of the capabilities of a single-chip multiprocessing microprocessor, the Texas Instruments TMS320C80 MVP (multimedia video processor). The MVP's advanced digital signal processors (ADSPs) offer tremendous computational power through instruction-level parallelism and several key features designed for image processing. The MVP's intelligent input/output interface via the transfer controller (TC) permits efficient irregular memory accesses. Affine and perspective warps have been implemented for 8-bit, 16-bit and RGB color data using bilinear interpolation. The affine warp can generate 512 × 512 warped output images faster than real-time video rates require. For 8-bit images, the performance is 14.1 ms. Although the amount of computation necessary is the same for 16-bit images, the execution time increases to 15.2 ms since twice as many bytes need to be transferred. For RGB color images, it takes 28.0 ms. The perspective warp requires 46.3 ms for 8-bit and 16-bit images, and 60.4 ms for RGB color images. This unprecedented performance for software-based image warping exceeds many hardware approaches reported in the literature.
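
The three stages the abstract lists (geometric transformation, address and coefficient generation, interpolation) can be illustrated with a small affine warp using bilinear interpolation. This is a plain-Python sketch for a single-channel image stored as nested lists, assuming inverse mapping from output to source pixels. It is unrelated to the TMS320C80 implementation, and the names `bilinear_sample` and `affine_warp` and the `fill` parameter are hypothetical.

```python
import math
from typing import List, Sequence

Image = List[List[float]]


def bilinear_sample(img: Image, x: float, y: float, fill: float = 0.0) -> float:
    """Interpolate img at the real-valued position (x, y); x indexes columns."""
    h, w = len(img), len(img[0])
    if x < 0 or y < 0 or x > w - 1 or y > h - 1:
        return fill  # source position lies outside the image
    # Address generation: the four neighbouring pixel coordinates.
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    # Coefficient generation: fractional offsets used as blend weights.
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def affine_warp(img: Image, inv: Sequence[Sequence[float]],
                out_w: int, out_h: int, fill: float = 0.0) -> Image:
    """Warp img with the 2x3 matrix `inv`, which maps each OUTPUT pixel (u, v)
    to a SOURCE position: x = a*u + b*v + tx, y = c*u + d*v + ty."""
    (a, b, tx), (c, d, ty) = inv
    out: Image = []
    for v in range(out_h):
        row = []
        for u in range(out_w):
            row.append(bilinear_sample(img, a * u + b * v + tx,
                                       c * u + d * v + ty, fill))
        out.append(row)
    return out
```

Mapping from output to source guarantees that every output pixel receives a value. It is also why the source reads can be scattered irregularly through memory, the data-flow difficulty the abstract describes.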

12.
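Visual question answering and dialogue is an important research task in artificial intelligence and one of the representative problems at the intersection of computer vision and natural language processing. The task requires a machine to answer single-turn or multi-turn natural language questions based on the content of a given visual image. It places high demands on a machine's perception, cognition, and reasoning abilities, and has practical prospects in cross-modal human-computer interaction applications. Regarding research on visual question answering and dialogue in recent years, this paper...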

13.
蒋峰岭  孔斌  钱晶  王灿  杨静 《测控技术》2021,40(1):1-15
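The human visual system can quickly and selectively detect targets of interest or objects with salient features in a visual scene, and process and understand them according to the purpose of higher-level vision tasks, thereby producing the corresponding behavior or decision. Introducing this selective visual attention mechanism into information processing in computer vision can effectively reduce the amount of data that visual computation must handle, speed up the whole process, and further facilitate higher-level vision tasks, and therefore...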

14.
The goal of human image generation (HIG) is to synthesize a human image in a novel pose. HIG can potentially benefit various computer vision applications and engineering tasks. Recently developed CNN-based approaches apply the attention architecture to vision tasks. However, owing to the locality of CNNs, extracting and maintaining the long-range pixel interactions of input images is difficult. Thus, existing human image generation methods face limited content representation. In this paper, we propose a novel human image generation framework called HIGSA that can utilize the position information from the input source image. The proposed HIGSA contains two complementary self-attention blocks to generate photo-realistic human images, named the stripe self-attention block (SSAB) and the content attention block (CAB), respectively. In SSAB, this paper establishes global dependencies of human images and computes the attention map for each pixel based on its relative spatial positions concerning other pixels. In CAB, this paper introduces an effective feature extraction module to interactively enhance both the person’s appearance and shape feature representations. Therefore, the HIGSA framework inherently better preserves appearance consistency and shape consistency, with sharper details. Extensive experiments on mainstream datasets demonstrate that HIGSA achieves state-of-the-art (SOTA) results.

15.
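Research progress of Transformers in computer vision   Total citations: 1, self-citations: 0, other citations: 1
The Transformer is a deep neural network that is based on the self-attention mechanism and processes data in parallel. In recent years, Transformer-based models have become an important research direction for computer vision tasks. To address the current lack of Transformer survey articles in China, this paper gives an overview of their applications in computer vision. It reviews the basic principles of the Transformer and focuses on its use in image classification, object detection, image segmentation...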

16.
3D digital content has become popular as emerging media that can be created, edited and shared by users in a collaborative environment, like images and videos. The popularity of 3D media is not confined to the leisure sphere but has grown in many fields ranging from the entertainment market to industrial product modelling, health, biology, art, virtual tourism, and more. While problems related to the representation of the geometry of 3D shapes have been largely solved by the CG community, tools for coding, extracting, sharing, and retrieving the semantic content of 3D media are still far from satisfactory: interdisciplinary research efforts are needed to foster the development of the 3D Internet and its applications. The purpose of this paper is thus to motivate research in this direction, presenting our vision of the future and, without offering any off-the-shelf solution, giving an overview of the various aspects of semantics required to optimise tasks and processes related to 3D content in different application domains. We identified four grand challenges which synthesise the open issues common to the considered fields and represent a roadmap towards semantic 3D media.

17.
18.
Automatic Radial Distortion Estimation from a Single Image   Total citations: 1, self-citations: 0, other citations: 1
Many computer vision algorithms rely on the assumptions of the pinhole camera model, but lens distortion with off-the-shelf cameras is usually significant enough to violate these assumptions. Many methods for radial distortion estimation have been proposed, but they all have limitations. Robust automatic radial distortion estimation from a single natural image would be extremely useful for many applications, particularly those in human-made environments containing abundant lines. For example, it could be used in place of an extensive calibration procedure to get a mobile robot or quadrotor experiment up and running quickly in an indoor environment. We propose a new method for automatic radial distortion estimation based on the plumb-line approach. The method works from a single image and does not require a special calibration pattern. It is based on Fitzgibbon’s division model, robust estimation of circular arcs, and robust estimation of distortion parameters. We perform an extensive empirical study of the method on synthetic images. We include a comparative statistical analysis of how different circle fitting methods contribute to accurate distortion parameter estimation. We finally provide qualitative results on a wide variety of challenging real images. The experiments demonstrate the method’s ability to accurately identify distortion parameters and remove distortion from images.
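
As background for the division model named in the abstract: in its single-parameter form, a distorted point p_d is mapped to its undistorted position by p_u = c + (p_d - c) / (1 + λ·r²), where c is the distortion center and r is the distance from p_d to c. The sketch below applies only this mapping. It does not cover the paper's arc fitting or parameter estimation, and the function name, the default center, and the choice of coordinate units are assumptions made for illustration.

```python
from typing import Tuple

Point = Tuple[float, float]


def undistort_point(p_d: Point, lam: float, center: Point = (0.0, 0.0)) -> Point:
    """Map a distorted point to its undistorted position using the
    single-parameter division model: p_u = c + (p_d - c) / (1 + lam * r^2)."""
    dx, dy = p_d[0] - center[0], p_d[1] - center[1]
    r2 = dx * dx + dy * dy  # squared distance from the distortion center
    denom = 1.0 + lam * r2
    if denom == 0.0:
        raise ValueError("point lies on the model's singular radius")
    return (center[0] + dx / denom, center[1] + dy / denom)
```

With `lam = 0` the mapping is the identity. A negative `lam` pushes points outward, which corrects barrel distortion. Under this model straight scene lines appear as circular arcs in the distorted image, which is why the abstract pairs it with circular arc estimation.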

19.
张宇  温光照  米思娅  张敏灵  耿新 《软件学报》2022,33(11):4173-4191
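Human pose estimation is a fundamental and challenging task in computer vision. It is essential for describing human posture and behavior, and is the basis for computer vision tasks such as action recognition and action detection. In recent years, with the development of deep learning, deep-learning-based human pose estimation algorithms have shown excellent results. This paper reviews the recent development of deep-learning-based 2D human pose estimation algorithms across the three mainstream approaches, namely single-person pose estimation, top-down multi-person pose estimation, and bottom-up multi-person pose estimation, and discusses the difficulties and challenges that 2D human pose estimation currently faces. Finally, it gives an outlook on the future development of human pose estimation.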

20.
Many computer vision applications can benefit from omnidirectional vision sensing, rather than depending solely on conventional cameras that have constrained fields of view. For example, mobile robots often require a full 360° view of their environment in order to perform navigational tasks such as identifying landmarks, localizing within the environment, and determining free paths in which to move. There has been much research interest in omnidirectional vision in the past decade and many techniques have been developed. These techniques include: (i) catadioptric methods which can provide rapid image acquisition, but lack image resolution; and (ii) mosaicing and linear scanning techniques which have high image resolution but typically have slow image acquisition speed. In this paper, we introduce a novel linear scanning panoramic vision system that can acquire panoramic images quickly with little loss of image resolution. The system makes use of a fast line-scan camera, instead of a slower, conventional area-scan camera. In addition, a unique coarse-to-fine panoramic imaging technique has been developed that is based on smart sensing principles. Using the active vision paradigm, we control the motion of the rotating camera using feedback from the images. This results in high acquisition speeds and proportionally low storage requirements. Experimentation has been carried out, and results are given. Correspondence to: M.J. Barth (e-mail: barth@ee.ucr.edu)
