Similar literature
 Found 20 similar documents; search time 31 ms
1.
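Single-image 3D object reconstruction is an important problem in computer vision and has received wide attention over recent decades. With the continued development of deep learning, it has made remarkable progress in recent years. This paper surveys the research progress and specific applications of deep learning in single-image 3D object reconstruction. It first introduces the research background of single-image 3D reconstruction and the state of research on traditional methods; it then briefly introduces deep learning and reviews in detail its application to single-image 3D object reconstruction; next it outlines the public datasets commonly used for 3D object reconstruction; finally, it offers analysis and a summary, pointing out open problems and future research directions.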

2.
Wang Xing-Gang, Wang Jia-Si, Tang Peng, Liu Wen-Yu 《Journal of Computer Science and Technology》2019,34(6):1269-1278

Learning an effective object detector with little supervision is an essential but challenging problem in computer vision applications. In this paper, we consider the problem of learning a deep convolutional neural network (CNN) based object detector using weakly-supervised and semi-supervised information in the framework of fast region-based CNN (Fast R-CNN). The goal is to obtain an object detector as accurate as the fully-supervised Fast R-CNN while requiring less image annotation effort. To solve this problem, we use weakly-supervised training images (i.e., only the image-level annotation is given) together with a small proportion of fully-supervised training images (i.e., the bounding-box-level annotation is given); this is a weakly- and semi-supervised (WASS) object detection setting. The proposed solution, termed WASS R-CNN, has two main components. First, a weakly-supervised R-CNN is trained; then the semi-supervised data are used to fine-tune the weakly-supervised detector. We perform object detection experiments on the PASCAL VOC 2007 dataset. The proposed WASS R-CNN achieves more than 85% of a fully-supervised Fast R-CNN’s performance (measured using mean average precision) with only 10% of fully-supervised annotations together with weak supervision for all training images. The results show that the proposed learning framework can significantly reduce the labeling effort needed to obtain reliable object detectors.


3.

Deep learning has proved its efficiency in many fields of computer science, such as computer vision, image classification, object detection, image segmentation, and more. Deep learning models depend primarily on the availability of huge datasets. Without many images in the datasets, deep learning models cannot learn and produce accurate results. Unfortunately, several fields, such as medical image processing, do not have access to large amounts of data. For example, the world is suffering from a lack of COVID-19 virus datasets, and there has been no benchmark dataset since the beginning of 2020. This pandemic was the main motivation for this survey, which presents and discusses current image data augmentation techniques that can be used to increase the number of images. In this paper, a survey of data augmentation for digital images in deep learning is presented. The study begins with the introduction section, which reflects the importance of data augmentation in general. The classical image data augmentation taxonomy and photometric transformation are presented in the second section. The third section illustrates deep learning image data augmentation. Finally, the fourth section surveys the state of the art in using image data augmentation techniques in different deep learning research and applications.
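
As a rough illustration of the classical augmentations mentioned in this abstract, the sketch below produces one geometric variant (a horizontal flip) and two photometric variants (brightness shifts) of a tiny grayscale image. It is not taken from the surveyed paper: the function names `hflip`, `adjust_brightness`, and `augment`, the list-of-rows image layout, and the shift of 40 grey levels are all assumptions chosen for illustration.

```python
def hflip(image):
    """Geometric augmentation: mirror each row of a grayscale image (list of rows)."""
    return [row[::-1] for row in image]


def adjust_brightness(image, delta):
    """Photometric augmentation: shift every pixel by `delta`, clamped to 0..255."""
    return [[min(255, max(0, pixel + delta)) for pixel in row] for row in image]


def augment(image):
    """Return the original image plus three augmented copies (hypothetical recipe)."""
    return [
        image,
        hflip(image),
        adjust_brightness(image, 40),
        adjust_brightness(image, -40),
    ]


image = [[0, 100], [200, 250]]
variants = augment(image)
```

Each source image here yields four training samples; a real pipeline would typically apply such transformations at random during training rather than enumerating them.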


4.
Liu Liying, Si Yain-Whar 《The Journal of Supercomputing》2022,78(12):14191-14214

This paper proposes a novel deep learning-based approach for financial chart pattern classification. Convolutional neural networks (CNNs) have made notable achievements in image recognition and computer vision applications. These networks are usually based on two-dimensional convolutional neural networks (2D CNNs). In this paper, we describe the design and implementation of one-dimensional convolutional neural networks (1D CNNs) for the classification of chart patterns from financial time series. The proposed 1D CNN model is compared against support vector machine, extreme learning machine, long short-term memory, rule-based, and dynamic time warping methods. Experimental results on synthetic datasets reveal that the accuracy of the 1D CNN is the highest among all the methods evaluated. Results on real datasets reveal that chart patterns identified by the 1D CNN are also the most recognized instances when compared to those classified by other methods.
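
To make the core operation of a 1D CNN concrete, the following minimal sketch slides a single kernel over a short series and applies a ReLU. It is not the authors' model: the function names `conv1d_valid` and `relu`, the toy price series, and the hand-picked "peak" kernel are assumptions for illustration only (in a trained network the kernel weights are learned).

```python
def conv1d_valid(series, kernel):
    """Slide `kernel` over `series` with stride 1 and no padding.

    As in most CNN libraries, this is a cross-correlation: the kernel is not flipped.
    """
    k = len(kernel)
    if k == 0 or k > len(series):
        raise ValueError("kernel must be non-empty and no longer than the series")
    return [
        sum(series[i + j] * kernel[j] for j in range(k))
        for i in range(len(series) - k + 1)
    ]


def relu(values):
    """Keep positive responses and zero out the rest."""
    return [max(0.0, v) for v in values]


prices = [1.0, 2.0, 3.0, 2.0, 1.0]   # toy time series with a single peak
peak_kernel = [-1.0, 2.0, -1.0]      # responds where the middle value exceeds its neighbours
features = relu(conv1d_valid(prices, peak_kernel))
```

The only non-zero response falls on the window centred on the peak, which is the kind of local shape feature a 1D CNN can stack and combine to recognise longer chart patterns.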


5.
The computer graphics and computer vision communities have been working closely together in recent years, and a variety of algorithms and applications have been developed to analyze and manipulate the visual media around us. There are three major driving forces behind this phenomenon: 1) the availability of big data from the Internet has created a demand for dealing with the ever-increasing, vast amount of resources; 2) powerful processing tools, such as deep neural networks, provide effective ways of learning how to deal with heterogeneous visual data; 3) new data capture devices, such as the Kinect, bridge the gap between algorithms for 2D image understanding and 3D model analysis. These driving forces have emerged only recently, and we believe that the computer graphics and computer vision communities are still at the beginning of their honeymoon phase. In this work we survey recent research on how computer vision techniques benefit computer graphics techniques and vice versa, and cover research on analysis, manipulation, synthesis, and interaction. We also discuss existing problems and suggest possible further research directions.

6.
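Advances and Prospects of Deep Learning in Visual Object Detection   Total citations: 2, self-citations: 0, citations by others: 2
Zhang Hui, Wang Kunfeng, Wang Feiyue 《Acta Automatica Sinica》2017,43(8):1289-1305
Visual object detection is an important problem in computer vision, with significant research and application value in video surveillance, autonomous driving, human-computer interaction, and other areas. In recent years, deep learning has achieved breakthroughs in image classification, which has in turn driven rapid progress in visual object detection. This paper reviews the application progress and prospects of deep learning in visual object detection. It first summarizes the basic pipeline of visual object detection and introduces the public datasets commonly used in this research; it then focuses on the latest applications of rapidly developing deep learning methods to visual object detection; finally, it discusses the difficulties and challenges of applying deep learning methods to visual object detection and looks ahead to future trends.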

7.
Object tracking is one of the most important processes for object recognition in the field of computer vision. The aim is to accurately locate a target object in every frame of a video sequence. In this paper we propose a technique that combines two algorithms well known among machine learning practitioners. Firstly, we propose a deep learning approach to automatically extract the features that will be used to represent the original images. Deep learning has been successfully applied in different computer vision applications. Secondly, object tracking can be seen as a ranking problem, since the regions of an image can be ranked according to their level of overlap with the target object (the ground truth in each video frame). During object tracking, the target position and size can change, so the algorithms have to propose several candidate regions in which the target can be found. We propose to use a preference learning approach to build a ranking function that selects the highest-ranked bounding box, i.e., the one most likely to enclose the target object. The experimental results obtained by our method, called \(DPL^{2}\) (Deep and Preference Learning), are competitive with those of other algorithms.
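
The idea of ranking candidate regions by their overlap with the target can be sketched as follows. This is only the overlap-based ordering that could serve as a training signal when the ground truth is known; it is not the paper's learned ranking function. The function names `iou` and `rank_candidates`, the `(x1, y1, x2, y2)` box format, and the example boxes are assumptions for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def rank_candidates(candidates, target):
    """Order candidate boxes from highest to lowest overlap with the target box."""
    return sorted(candidates, key=lambda box: iou(box, target), reverse=True)


target = (0, 0, 10, 10)
candidates = [(20, 20, 30, 30), (0, 0, 10, 5), (0, 0, 10, 10)]
ranked = rank_candidates(candidates, target)
```

At tracking time the ground truth is unavailable, which is why a ranking function has to be learned from image features; pairs of candidates ordered this way are one plausible source of preference judgments for that learning step.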

8.
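Computer vision plays an important role in detection, recognition, localization, and analysis for industrial inspection in intelligent manufacturing, and has contributed greatly to improving inspection speed, accuracy, and the degree of intelligent automation. However, its application has long faced technical difficulties, of which the three major bottlenecks are: computer vision applications are easily affected by illumination; sample data are insufficient to support deep learning; and prior knowledge is difficult to incorporate into evolutionary algorithms. These bottlenecks prevent computer vision from achieving its best performance in intelligent manufacturing, so they need to be analyzed and addressed systematically. This paper summarizes the concepts and importance of intelligent manufacturing and computer vision, and analyzes the development status and requirements of computer vision in industrial inspection for intelligent manufacturing. For the three bottlenecks, it summarizes and analyzes the current state of each problem and the existing solutions. The analysis finds that the illumination problem can be addressed at two stages, the algorithm and image acquisition; the insufficient sample data problem can be addressed through small-sample data processing algorithms and methods for balancing the distribution of sample quantities; and the prior knowledge problem can be addressed through machine learning and reinforcement learning. The methods in these solutions differ, each with its own strengths and weaknesses, and need to be studied and improved in light of specific applications in intelligent manufacturing.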

9.
In reinforcement learning, an agent may explore ineffectively when dealing with sparse reward tasks where finding a reward point is difficult. To solve this problem, we propose an algorithm called hierarchical deep reinforcement learning with automatic sub-goal identification via computer vision (HADS), which takes advantage of hierarchical reinforcement learning to alleviate the sparse reward problem and improves the efficiency of exploration by utilizing a sub-goal mechanism. HADS uses a computer vision method to identify sub-goals automatically for hierarchical deep reinforcement learning. Because not all sub-goal points are reachable, a mechanism is proposed to remove unreachable sub-goal points so as to further improve the performance of the algorithm. HADS uses contour recognition to identify sub-goals from the state image: some salient states in the state image may be recognized as sub-goals, while those that are not are removed based on prior knowledge. Our experiments verified the effectiveness of the algorithm.

10.
In recent years, the success and capabilities of embedded vision have become evident in embedded applications. The embedding of vision into electronic devices, such as embedded medical applications, is being driven by the availability of high-performance processors integrated with deep learning algorithms, as well as by advances in image processing technology. However, including image processing in embedded vision systems requires a huge amount of computational capability even to process a single image to detect an object, and it is extremely challenging to implement in embedded systems. Implementing deep learning algorithms and testing them on a task-specific dataset could provide enhanced results. In this paper, an approach for enhancing image processing architecture using deep learning for embedded vision systems is proposed and analyzed. Implementing deep learning algorithms and testing them on embedded vision yielded effective results.

11.
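This paper first introduces a knowledge learning method recently proposed in computer vision research, the multi-event learning model, and briefly describes its research approach, research progress, computational method, and application to image retrieval. Then, after briefly reviewing current progress and difficulties in 3D object recognition, it proposes an improved computational method for the multi-event learning model and introduces it into 3D object recognition research, in order to effectively simplify the feature representation of 3D objects and improve recognition efficiency.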

12.
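Diagnosis from medical images underlies many clinical decisions, and intelligent analysis of medical images is an important component of medical artificial intelligence. At the same time, with the rise and spread of more and more 3D spatial sensors, 3D computer vision is becoming increasingly important. This paper focuses on the intersection of medical image analysis and 3D computer vision, namely medical 3D computer vision, or medical 3D vision. It divides medical 3D computer vision systems into three levels (tasks, data, and representations) and presents research progress at each level with reference to the latest literature. At the task level, it introduces classification, segmentation, detection, registration, and imaging reconstruction in medical 3D computer vision, together with the roles and characteristics of these tasks in clinical diagnosis and medical image analysis. At the data level, it briefly introduces the most important modalities of medical 3D data, including computed tomography (CT), magnetic resonance imaging (MRI), and positron emission tomography (PET), as well as other data formats proposed in emerging research. On this basis, it compiles the important research datasets in medical 3D computer vision and annotates their data modalities and main vision tasks. At the representation level, it introduces and discusses the advantages and disadvantages of 2D networks, 3D networks, and hybrid networks for representation learning on medical 3D data. In addition, in view of the small-data problem that is common in medical imaging, it gives particular attention to pre-training in representation learning for medical 3D data. Finally, it summarizes the current state of research in medical 3D computer vision and points out open research challenges, problems, and directions.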

13.

Due to the rapid development of high-speed wired and wireless Internet, image content containing objects that expose personal information is being distributed freely, which is a social problem. In this paper, we introduce a method that uses skin color and a deep learning algorithm to robustly detect a target object with an exposed facial region in rapidly arriving input images, and that effectively covers the detected object through prediction. The proposed method accurately detects the target object containing the exposed facial region in the input image by applying an image-adaptive skin color model and a CNN-based deep learning algorithm. Subsequently, a location prediction algorithm is used to quickly track the detected object. A mosaic is overlaid on the target object area to effectively protect the area where the facial region is exposed. The experimental results show that the proposed approach accurately detects the target object including the exposed facial region in continuously input video, and efficiently covers the detected object through mosaic processing while quickly tracking it using a prediction-based tracking algorithm. The tracking-based target covering method proposed in this study is expected to be useful in various practical applications related to pattern recognition and image security, such as content-based image retrieval, real-time surveillance, human–computer interaction, and face detection.
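
The mosaic step described above can be sketched as block averaging over the detected region. This is a minimal illustration, not the authors' implementation: the function name `mosaic`, the list-of-rows grayscale image layout, the `(top, left, height, width)` region format, and the default block size are all assumptions, and the detection and tracking stages that would supply the region are omitted.

```python
def mosaic(image, top, left, height, width, block=2):
    """Return a copy of a non-empty grayscale image with one rectangular region pixelated.

    Each `block` x `block` tile inside the region is replaced by its integer mean.
    """
    out = [row[:] for row in image]
    bottom = min(top + height, len(image))
    right = min(left + width, len(image[0]))
    for by in range(top, bottom, block):
        for bx in range(left, right, block):
            ys = range(by, min(by + block, bottom))
            xs = range(bx, min(bx + block, right))
            pixels = [image[y][x] for y in ys for x in xs]
            mean = sum(pixels) // len(pixels)
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out


frame = [
    [0, 10, 20, 30],
    [40, 50, 60, 70],
]
covered = mosaic(frame, top=0, left=0, height=2, width=2)
```

Only the pixels inside the given region change, so the rest of the frame stays intact; a larger block size hides more detail at the cost of a coarser appearance.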


14.
Stratified 3D reconstruction, a layer-by-layer 3D reconstruction upgraded from projective to affine and then to the final metric reconstruction, is a well-known 3D reconstruction method in computer vision. It is also a key supporting technology for various well-known applications, such as streetview, smart3D, and oblique photogrammetry. Generally speaking, the existing computer vision methods in the literature can be roughly classified into either geometry-based approaches for spatial vision or learning-based approaches for object vision. Although deep learning has demonstrated tremendous success in object vision in recent years, learning 3D scene reconstruction from multiple images is still rare, or even nonexistent, apart from work on depth learning from single images. This study explores the feasibility of learning stratified 3D reconstruction from putative point correspondences across images, and assesses whether it can be as robust to matching outliers as the traditional geometry-based methods are. In this study, a special parsimonious neural network is designed for the learning. Our results show that it is indeed possible to learn a stratified 3D reconstruction from noisy image point correspondences, and the learnt reconstruction results appear satisfactory, although they are still not on a par with the state of the art in the structure-from-motion community, largely because the network lacks an explicit robust outlier detector such as random sample consensus (RANSAC). To the best of our knowledge, our study is the first attempt in the literature to learn 3D scene reconstruction from multiple images. Our results also show that how to implicitly or explicitly integrate an outlier detector into learning methods is a key problem to solve in order to learn 3D scene structures comparable to those obtained by the current geometry-based state of the art. Otherwise, any significant advancement in learning 3D structures from multiple images seems difficult, if not impossible. Besides, we even speculate that deep learning might be, by nature, unsuitable for learning 3D structure from multiple images or, more generally, for solving spatial vision problems.

15.
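A Survey of Applications of Deep Convolutional Neural Networks in Computer Vision   Total citations: 13, self-citations: 0, citations by others: 13
With the arrival of the big data era, deep convolutional neural networks (CNNs) with more hidden layers have more complex network structures and, compared with traditional machine learning methods, stronger feature learning and feature representation capabilities. Since they were proposed, CNN models trained with deep learning algorithms have achieved remarkable results on several large-scale recognition tasks in computer vision. This paper first briefly introduces the rise and development of deep learning and convolutional neural networks, and outlines the basic model structure of CNNs, convolutional feature extraction, and pooling operations. It then reviews the research status and development trends of deep-learning-based CNN models in several computer vision application areas, including image classification, object detection, pose estimation, image segmentation, and face recognition, mainly from three aspects: construction of typical network structures, training methods, and performance. Finally, it briefly summarizes and discusses some problems in current research and looks ahead to new directions for future development.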

16.
The process of identifying and bringing to the fore people’s unsafe behaviour is a core function of implementing a behaviour-based safety (BBS) program in construction. This can be a labour-intensive and challenging process, but it is needed to enable people to reflect on and learn how their unsafe actions can jeopardise not only their own safety but also that of their co-workers. With advances being made in computer vision, the capability exists to automatically capture and identify unsafe behaviour and hazards in real time from two-dimensional (2D) digital images/videos. The corollary developments in computer vision have stimulated a wealth of research in construction to examine its potential application to practice. Hindering the application of computer vision in construction has been its inability to detect objects accurately and to generalise that detection. To address this shortcoming, developments in deep learning have improved the accuracy, reliability, and generalisability of object detection in computer vision, and therefore its usability in construction. In this paper we review the developments of computer vision studies that have been used to identify, from 2D images, unsafe behaviour that arises on construction sites. Then, in light of advances made with deep learning, we examine and discuss its integration with computer vision to support BBS. We also suggest that future computer-vision research should aim to support BBS by being able to: (1) observe and record unsafe behaviour; (2) understand why people behave unsafely; (3) learn from unsafe behaviour; and (4) predict unsafe behaviour.

17.
《Advanced Robotics》2012,26(24):1264-1280
ABSTRACT

Collecting a human-annotated dataset for training deep convolutional neural networks is a very time-consuming and laborious process. To reduce this burden, we previously proposed an automated annotation method that places one visual marker above the detection target object in the training phase. However, in this approach the marker occasionally hides the object surface. To avoid this issue, we propose placing a pedestal with multiple markers at the bottom of the object. With multiple markers, the object can be annotated even when it hides some of the markers. In addition, the simple modification of placing the markers at the bottom allows the use of simple background masking to prevent the neural network from learning the remaining markers in the training image as a feature of the object. Background masking can completely remove the markers during the training process. Experiments showed that the proposed vision system using our automatic object annotation outperformed the vision system using manual annotation in terms of object detection, orientation estimation, and 2D position estimation, while reducing the time required for dataset collection from 16.1 hours to 7.30 hours.

18.

Deep neural networks increasingly pervade many computer vision applications, in particular image classification. Nevertheless, recent works have demonstrated that it is quite easy to create adversarial examples, i.e., images maliciously modified to cause deep neural networks to fail. Such images contain changes unnoticeable to the human eye but sufficient to mislead the network. This represents a serious threat to machine learning methods. In this paper, we investigate the robustness of the representations learned by the fooled neural network by analyzing the activations of its hidden layers. Specifically, we tested scoring approaches used for kNN classification in order to distinguish between correctly classified authentic images and adversarial examples. These scores are obtained by searching only among the very same images used for training the network. The results show that hidden-layer activations can be used to reveal incorrect classifications caused by adversarial attacks.
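
One plausible form of such a kNN score is the fraction of the nearest training activations whose label agrees with the network's prediction; a low value suggests the prediction is not supported by the training data. The sketch below is an assumption-laden illustration, not the scoring functions evaluated in the paper: the function name `knn_agreement_score`, the Euclidean distance, the value of `k`, and the toy 2D "activations" are all hypothetical.

```python
import math
from collections import Counter


def knn_agreement_score(query, train_feats, train_labels, predicted, k=3):
    """Fraction of the k nearest training activations labelled as `predicted`."""
    if not train_feats:
        raise ValueError("training set must not be empty")
    order = sorted(
        range(len(train_feats)),
        key=lambda i: math.dist(query, train_feats[i]),
    )
    neighbours = [train_labels[i] for i in order[:k]]
    return Counter(neighbours)[predicted] / len(neighbours)


# Toy hidden-layer activations of training images and their labels.
train_feats = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_labels = ["cat", "cat", "cat", "dog", "dog", "dog"]

query = (0.2, 0.1)  # activation of a new input
consistent = knn_agreement_score(query, train_feats, train_labels, predicted="cat")
suspicious = knn_agreement_score(query, train_feats, train_labels, predicted="dog")
```

Here the query lies among "cat" activations, so a "cat" prediction scores high while a "dog" prediction scores zero; thresholding such a score is one way to flag a classification as possibly caused by an adversarial input.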


19.

Computer vision techniques enhanced by the advent of deep learning have become a quintessential part of our day-to-day life. The application of such computer vision techniques to image retrieval can be termed query-based image retrieval. Conventional methods have limitations such as increased dimensionality, reduced accuracy, high time consumption, and dependence on indexing for retrieval. To overcome these limitations, this research work aims to develop a new image retrieval system by developing an image preprocessing mechanism based on a target prediction technique, which isolates the object from the background. Further, a Micro-structure based Pattern Extraction (MPE) technique is implemented to extract patterns from the preprocessed image, where diagonal patterns are generated to increase the accuracy of the retrieval process. Subsequently, a Convolutional Neural Network (CNN) is utilized to reduce the dimensionality of the features, and a similarity learning approach is utilized to map the selected features to the trained features based on a distance metric. The performance of the proposed system is evaluated using various measures, and the efficiency of the proposed technique is ascertained by comparing it with existing techniques.


20.
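Object detection algorithms are widely applied and have long been a closely watched research focus in computer vision. In recent years, with the development of deep learning, research on object detection in 3D images has achieved major breakthroughs. Compared with 2D object detection, 3D object detection incorporates depth information and can provide spatial scene information such as the position, orientation, and size of objects; it is developing rapidly in autonomous driving and robotics. This paper first gives an overview of deep-learning-based 2D object detection algorithms; it then analyzes representative and pioneering 3D object detection algorithms according to the data acquisition method (images, LiDAR, or multiple sensors); with reference to autonomous driving scenarios, it compares the performance, advantages, and limitations of different 3D object detection algorithms; finally, it summarizes the practical significance of 3D object detection and the problems remaining to be solved, and discusses and looks ahead to its development directions and new challenges.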
