Similar Documents
20 similar records found (search time: 31 ms)
1.
ColorCheckers are reference standards that professional photographers and filmmakers use to ensure predictable results under every lighting condition. The objective of this work is to propose a new fast and robust method for automatic ColorChecker detection. The process is divided into two steps: (1) ColorChecker localization and (2) ColorChecker patch recognition. For localization, we trained a detection convolutional neural network on synthetic images, created from 3D models of the ColorChecker composited over different background images. The output of the network is a bounding box for each possible ColorChecker candidate in the input image. Each bounding box defines a cropped image that is evaluated by a recognition system, and each crop is canonicalized with respect to color and dimensions. Subsequently, all possible color patches are extracted and grouped by the distance between their centers. Each group is evaluated as a candidate ColorChecker part, and its position in the scene is estimated. Finally, a cost function evaluates the accuracy of the estimate. The method is tested on real and synthetic images. It is fast, robust to overlaps, and invariant to affine projections, and it also performs well when multiple ColorCheckers are present.

2.
3.
We suggest a method to directly deep-learn light transport, i.e., the mapping from a 3D geometry-illumination-material configuration to a shaded 2D image. While many previous learning methods have applied 2D convolutional neural networks to images, we show for the first time that light transport can be learned directly in 3D. The benefit of 3D over 2D is that the former can also correctly capture illumination effects related to occluded and/or semi-transparent geometry. To learn 3D light transport, we represent the 3D scene as an unstructured 3D point cloud, which is later, during rendering, projected to the 2D output image. Thus, we suggest a two-stage operator: a 3D network first transforms the point cloud into a latent representation, which a dedicated 3D-2D network then projects to the 2D output image. We show that our approach improves temporal coherence while retaining most of the computational efficiency of common 2D methods. As a consequence, the proposed two-stage operator serves as a valuable extension to modern deferred shading approaches.

4.
Aspect-Graph-Based Recognition of 3D Moving Targets
A 3D object presents different poses when seen from different viewpoints, and its 2D views differ accordingly, which makes 3D object recognition a rather complex problem. This paper therefore proposes a method that recognizes 3D targets through image sequences and the transition relations between them, following a winner-take-all principle. The method combines a polar-exponential grid with the Fourier transform to obtain contour invariants of the target, then uses a neural network together with aspect-graph techniques to recognize 3D moving targets by recognizing their image sequences, and a complete target recognition system has been implemented. Experimental results show that the method can be used effectively for 3D moving target recognition.
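The polar-exponential grid plus Fourier transform yields rotation-invariant contour signatures because rotating the object circularly shifts the angularly sampled contour, and the Fourier magnitude is invariant to circular shifts. A minimal numpy sketch of that last step (the contour signal and shift amount are illustrative, not from the paper):

```python
import numpy as np

# A contour sampled at N equally spaced angles (illustrative signal).
N = 64
theta = 2 * np.pi * np.arange(N) / N
contour = 3.0 + np.cos(2 * theta) + 0.5 * np.sin(5 * theta)

# Rotating the object by k angular steps circularly shifts the samples.
k = 11
rotated = np.roll(contour, k)

# The Fourier magnitude spectrum is unchanged by the circular shift,
# so it can serve as a rotation-invariant contour descriptor.
descriptor = np.abs(np.fft.fft(contour))
descriptor_rot = np.abs(np.fft.fft(rotated))
assert np.allclose(descriptor, descriptor_rot)
```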

5.
Image segmentation is one of the most critical steps in automatic image analysis; its goal can be regarded as finding which objects are present in an image. Artificial neural networks are by now well developed, and the first two generations have many successful applications. Spiking neural networks (SNNs) are often referred to as the third generation of neural networks; they have the potential to solve problems related to biological stimuli. They derive their strength and interest from an accurate modeling of synaptic interactions between neurons that takes the timing of spike emission into account. SNNs exceed the computational power of neural networks made of threshold or sigmoidal units. Based on dynamic, event-driven processing, they open up new horizons for developing models with an exponential capacity of memorization and a strong ability to adapt quickly. Moreover, SNNs add a new dimension, the temporal axis, to the representational capacity and processing abilities of neural networks. In this paper, we show how SNNs can be applied effectively to image segmentation and edge detection. The results obtained confirm the validity of the approach.
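As a concrete illustration of the spiking units the abstract contrasts with threshold or sigmoidal ones, here is a minimal leaky integrate-and-fire neuron sketch; all constants are illustrative, not taken from the paper:

```python
import numpy as np

def lif_spike_train(input_current, v_thresh=1.0, leak=0.9, v_reset=0.0):
    """Leaky integrate-and-fire: the membrane potential decays by `leak`,
    integrates the input, and emits a spike (then resets) when it crosses
    `v_thresh`. Returns the binary spike train."""
    v = 0.0
    spikes = []
    for i_t in input_current:
        v = leak * v + i_t          # leaky integration
        if v >= v_thresh:
            spikes.append(1)
            v = v_reset             # reset after spiking
        else:
            spikes.append(0)
    return np.array(spikes)

weak = lif_spike_train(np.full(100, 0.05))
strong = lif_spike_train(np.full(100, 0.30))
assert strong.sum() > weak.sum()    # stronger stimulus -> more spikes
```

The spike timing, not just the rate, is what gives SNNs the extra temporal axis the abstract refers to.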

6.
Model-based recognition of 3D objects from single images
In this work, we treat major problems of object recognition which have received relatively little attention lately. Among them are the loss of depth information in the projection from a 3D object to a single 2D image, and the complexity of finding feature correspondences between images. We use geometric invariants to reduce the complexity of these problems. There are no geometric invariants of a projection from 3D to 2D. However, given certain modeling assumptions about the 3D object, such invariants can be found. The modeling assumptions can be either a particular model or a generic assumption about a class of models. Here, we use such assumptions for single-view recognition. We find algebraic relations between the invariants of a 3D model and those of its 2D image under general projective projection. These relations can be described geometrically as invariant models in a 3D invariant space, illuminated by invariant “light rays,” and projected onto an invariant version of the given image. We apply the method to real images.
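A classical example of the kind of invariant relation such methods exploit is the cross-ratio of four collinear points, which is preserved by any 1D projective transformation. A small numpy sketch (the points and transform coefficients are illustrative):

```python
import numpy as np

def cross_ratio(x1, x2, x3, x4):
    """Cross-ratio of four collinear points, a projective invariant."""
    return ((x1 - x3) * (x2 - x4)) / ((x1 - x4) * (x2 - x3))

def projective_1d(x, a, b, c, d):
    """1D projective map x -> (a*x + b) / (c*x + d), with a*d - b*c != 0."""
    return (a * x + b) / (c * x + d)

pts = np.array([0.0, 1.0, 2.0, 5.0])
mapped = projective_1d(pts, a=2.0, b=1.0, c=0.5, d=3.0)

# The cross-ratio is unchanged by the projective map.
assert np.isclose(cross_ratio(*pts), cross_ratio(*mapped))
```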

7.
Similarity and affine invariant distances between 2D point sets
We develop expressions for measuring the distance between 2D point sets which are invariant to either 2D affine transformations or 2D similarity transformations of the sets, assuming a known correspondence between the point sets. We discuss the image normalization to be applied to the images before their comparison so that the computed distance is symmetric with respect to the two images. We then give a general (metric) definition of the distance between images, which leads to the same expressions for the similarity and affine cases. This definition avoids ad hoc decisions about normalization. Moreover, it makes it possible to compute the distance between images under different conditions, including cases where the images are treated asymmetrically. We demonstrate these results with real and simulated images.
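One simple way to realize an affine-invariant distance with known correspondences is to solve for the best least-squares affine map between the two point sets and measure the residual, which vanishes exactly when the sets are affinely equivalent. A numpy sketch under those assumptions (not the paper's symmetric, normalized formulation):

```python
import numpy as np

def affine_residual_distance(P, Q):
    """RMS residual after the best least-squares affine map from P to Q.
    P, Q: (n, 2) arrays of corresponding 2D points, n >= 3."""
    n = P.shape[0]
    # Homogeneous coordinates so the fit includes a translation.
    Ph = np.hstack([P, np.ones((n, 1))])
    A, *_ = np.linalg.lstsq(Ph, Q, rcond=None)   # (3, 2) affine parameters
    residual = Ph @ A - Q
    return np.sqrt((residual ** 2).sum() / n)

rng = np.random.default_rng(0)
P = rng.standard_normal((10, 2))
M = np.array([[1.2, 0.3], [-0.4, 0.9]])          # illustrative affine part
Q = P @ M.T + np.array([2.0, -1.0])              # affine image of P

# The distance vanishes for an exact affine transform of the same set...
assert affine_residual_distance(P, Q) < 1e-9
# ...and is positive once the correspondences are perturbed.
assert affine_residual_distance(P, Q + rng.standard_normal((10, 2))) > 1e-3
```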

8.
In this paper, a neural tree-based approach for classifying range images into a set of nonoverlapping regions is presented. An innovative procedure is applied to extract invariant surface features from each pixel of the range image. These features are: 1) robust to noise, and 2) invariant to scale, shift, rotations, curvature variations, and direction of the normal. Then, a generalized neural tree is used to classify each image point as belonging to one of the six surface models of differential geometry, i.e., peak, ridge, valley, saddle, pit, and flat. Comparisons with other methods and experiments on both synthetic and real three-dimensional range images are presented.
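The six surface types named above are conventionally distinguished by the signs of the mean curvature H and the Gaussian curvature K (the classical HK classification). The neural tree replaces that hand-written rule, which for reference looks roughly like the following textbook sketch (sign conventions vary between authors; this is not the paper's classifier):

```python
def hk_surface_type(H, K, eps=1e-6):
    """Classify a surface point by the signs of mean (H) and
    Gaussian (K) curvature -- the classical HK classification."""
    if abs(K) < eps:
        if abs(H) < eps:
            return "flat"
        return "ridge" if H < 0 else "valley"
    if K > 0:
        return "peak" if H < 0 else "pit"
    return "saddle"

assert hk_surface_type(0.0, 0.0) == "flat"
assert hk_surface_type(-1.0, 0.5) == "peak"
assert hk_surface_type(1.0, 0.5) == "pit"
assert hk_surface_type(-1.0, 0.0) == "ridge"
assert hk_surface_type(1.0, 0.0) == "valley"
assert hk_surface_type(0.3, -0.5) == "saddle"
```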

9.
Building on studies of existing locally connected neural network models, in this brief we present a new spiking cortical neural network model and find that the time matrix of the model can be regarded as a human subjective sense of stimulus intensity. The series of output pulse images of the proposed model represents the segment, edge, and texture features of the original image; several efficient measures computed over these pulse images form a sequence that serves as the feature of the original image. We characterize texture images by this sequence for invariant texture retrieval. The experimental results show that the retrieval scheme is effective at extracting rotation- and scale-invariant features. The new model also obtains good results when used in other image processing applications.

10.
In this paper, a robust position-, scale-, and rotation-invariant system for the recognition of closed 2-D noise-corrupted images is presented, using the bispectral features of a contour sequence and a weighted fuzzy classifier. The higher-order spectrum based on the third-order moment, called the bispectrum, is applied to the contour sequences of an image to extract a 15-dimensional feature vector for each 2-D image. This bispectral feature vector, which is invariant to translation, scaling, and rotation of the shape, can be used to represent a 2-D planar image and is fed into a weighted fuzzy classifier for recognition. Experiments with eight different shapes of aircraft images illustrate the high performance of the proposed system even when the image is significantly corrupted by noise.
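The property that makes bispectral contour features translation-invariant is that in the bispectrum B(f1, f2) = X(f1) X(f2) X*(f1 + f2) the linear phase introduced by a circular shift of the sequence cancels in the triple product. A numpy sketch of that property (the contour signal is illustrative):

```python
import numpy as np

def bispectrum(x):
    """Bispectrum B(f1, f2) = X(f1) X(f2) conj(X(f1 + f2)) of a
    1D sequence, with frequency indices taken modulo N."""
    X = np.fft.fft(x)
    N = len(x)
    f = np.arange(N)
    return X[f, None] * X[None, f] * np.conj(X[(f[:, None] + f[None, :]) % N])

rng = np.random.default_rng(1)
contour = rng.standard_normal(32)

B = bispectrum(contour)
B_shifted = bispectrum(np.roll(contour, 7))   # shifted contour start point

# The linear phases e^{-i 2 pi k f / N} cancel in the triple product,
# so the bispectrum is unchanged by the circular shift.
assert np.allclose(B, B_shifted)
```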

11.
A Stereo Matching Algorithm Based on a Radial Basis Function Neural Network
To address the stereo matching problem in binocular vision, a stereo matching algorithm based on a radial basis function (RBF) neural network is proposed. The algorithm extracts scale-invariant feature transform (SIFT) features from the images to build a feature matching matrix, reduces the feature matching vectors, and finally feeds the reduced vectors into an RBF neural network for recognition. Experiments on simulated and real images show that the algorithm achieves a higher correct matching rate than standard SIFT.
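To make the RBF-network step concrete, here is a minimal radial basis function network in numpy: Gaussian activations around fixed centers with output weights solved by least squares. The centers, kernel width, and the XOR toy data are illustrative, not from the paper:

```python
import numpy as np

def rbf_design(X, centers, sigma=1.0):
    """Gaussian RBF activations of the inputs X w.r.t. the centers."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

# XOR: not linearly separable, but easy for an RBF network whose
# centers sit on the four training points.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0., 1., 1., 0.])

Phi = rbf_design(X, centers=X, sigma=0.5)
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)    # linear output layer

pred = rbf_design(X, centers=X, sigma=0.5) @ w
assert np.allclose(pred, y, atol=1e-6)
```

With centers on the training points the Gaussian kernel matrix is positive definite, so the network interpolates the targets exactly.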

12.
The COVID-19 pandemic has disrupted people’s daily lives and damaged several economies around the world, killing millions of people thus far. To fight this disease, it is essential to screen affected patients in a timely and cost-effective manner. This paper presents the prediction of COVID-19 from chest X-ray images and the implementation of an image processing system based on deep learning and neural networks. A convolutional neural network approach for classifying chest X-ray images as COVID-19-positive or normal is proposed. TensorFlow was used for building and training the neural networks, and scikit-learn for the end-to-end machine learning pipeline. Standard deep learning layers such as Conv2D, Dense, Dropout, and MaxPooling2D were used to create the model. After training and testing on the X-ray images, the proposed approach achieved a classification accuracy of 96.43% and a validation accuracy of 98.33%. Finally, a web application and a GUI for the prediction framework were developed for general users: medical personnel or the general public can browse to a chest X-ray image and feed it into the program, which labels it as COVID-19 or normal.
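The abstract names the standard convolutional building blocks. As a library-independent illustration of what Conv2D and MaxPooling2D do to an image tensor, here is a minimal numpy forward pass (the kernel values and sizes are illustrative; the paper's actual model is built in TensorFlow):

```python
import numpy as np

def conv2d_valid(img, kernel):
    """'Valid' 2D cross-correlation of a single-channel image."""
    kh, kw = kernel.shape
    H, W = img.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (img[i:i + kh, j:j + kw] * kernel).sum()
    return out

def maxpool2d(img, size=2):
    """Non-overlapping max pooling; image dims must divide by `size`."""
    H, W = img.shape
    return img.reshape(H // size, size, W // size, size).max(axis=(1, 3))

x = np.arange(36, dtype=float).reshape(6, 6)     # toy 6x6 "X-ray"
k = np.array([[1., 0.], [0., -1.]])              # illustrative 2x2 kernel

feat = conv2d_valid(x, k)         # (5, 5) feature map
pooled = maxpool2d(feat[:4, :4])  # crop to 4x4, pool down to (2, 2)
assert feat.shape == (5, 5)
assert pooled.shape == (2, 2)
```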

13.
This paper proposes a semantic segmentation algorithm for text regions in images based on a fully convolutional network (FCN), together with a new method for building and augmenting the dataset. The algorithm first performs a preliminary segmentation of the text regions with an improved FCN, then binarizes the result with Otsu's method to delimit the approximate target region, and finally refines it with a fully connected conditional random field (CRF) to obtain the final result. On the test set the algorithm reaches an accuracy of 85.7% at 0.181 s per image, laying the groundwork for further analysis of the target regions.
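The Otsu binarization step between the FCN and the CRF picks the threshold that maximizes the between-class variance of the gray-level histogram. A compact numpy version of that rule (in practice `cv2.threshold` or `skimage.filters.threshold_otsu` provide the same thing):

```python
import numpy as np

def otsu_threshold(gray, bins=256):
    """Otsu's method: the threshold maximizing between-class variance."""
    hist, edges = np.histogram(gray, bins=bins)
    p = hist / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(p)                      # class-0 probability mass
    w1 = 1 - w0
    mu = np.cumsum(p * centers)            # cumulative mean
    mu_total = mu[-1]
    valid = (w0 > 0) & (w1 > 0)
    # Between-class variance for every candidate cut point.
    sigma_b = np.zeros_like(w0)
    sigma_b[valid] = (mu_total * w0[valid] - mu[valid]) ** 2 / (
        w0[valid] * w1[valid])
    return centers[np.argmax(sigma_b)]

# Bimodal toy image: dark background around 50, bright text around 200.
rng = np.random.default_rng(2)
img = np.concatenate([rng.normal(50, 10, 5000), rng.normal(200, 10, 5000)])
t = otsu_threshold(img)
assert 80 < t < 170   # the threshold falls between the two modes
```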

14.
Objective: Semantic segmentation of remote sensing images classifies each pixel by land-cover type and is an important research direction in remote sensing image processing. Because the ground objects in remote sensing imagery vary greatly in scale and have complex boundaries, accurately extracting image features, and hence precisely segmenting such images, is difficult. Convolutional neural networks, which learn hierarchical image features automatically, have become the mainstream approach in image processing. This paper applies a convolutional neural network based on a residual dense spatial pyramid to urban remote sensing image segmentation, in order to improve the semantic segmentation accuracy of high-resolution urban imagery. Method: The model introduces dilated (atrous) convolutions into the residual network in place of downsampling, enlarging the receptive field of the feature maps while keeping their size unchanged; it cascades the branches of the spatial pyramid structure with dense connections, so that each branch's output carries denser receptive-field information; and it fuses features across layers with skip connections, combining the network's high-level semantic features with its low-level texture features to recover spatial information. Result: In extensive experiments on the ISPRS (International Society for Photogrammetry and Remote Sensing) Vaihingen dataset, the model achieves a mean intersection-over-union of 69.88% and a mean F1 score of 81.39% over six land-cover classes, outperforming SegNet, pix2pix, Res-shuffling-Net, and SDFCN (symmetrical dense-shortcut fully convolutional network) both numerically and visually. Conclusion: Applying a densely connected improved spatial pyramid pooling network to high-resolution remote sensing image segmentation exploits features at different scales, high-level semantic information, and low-level texture information, and effectively improves segmentation accuracy for urban remote sensing imagery.
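The dilated (atrous) convolutions that replace downsampling enlarge the receptive field without shrinking the feature map by inserting zeros between kernel taps. A 1D numpy sketch of the receptive-field arithmetic (the kernel is illustrative):

```python
import numpy as np

def dilate_kernel(kernel, rate):
    """Insert rate-1 zeros between kernel taps: the mechanism behind
    the enlarged receptive field of dilated (atrous) convolution."""
    k = len(kernel)
    out = np.zeros(k + (k - 1) * (rate - 1))
    out[::rate] = kernel
    return out

kernel = np.array([1., 2., 3.])
for rate in (1, 2, 4):
    dk = dilate_kernel(kernel, rate)
    # The effective receptive field grows as k + (k-1)(rate-1)
    # while the number of nonzero taps (parameters) stays at 3.
    assert len(dk) == 3 + 2 * (rate - 1)
    assert np.count_nonzero(dk) == 3
```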

15.
Image compression is applied in many fields such as television broadcasting, remote sensing, and image storage. Digitized images are compressed by methods that exploit their redundancy, so that the number of bits required to represent an image can be reduced with acceptable degradation of the decoded image. How much degradation of image quality is tolerable depends on the application, and there are various biomedical applications where accuracy is of major concern. To improve performance over existing image compression techniques with respect to decoded picture quality and compression ratio, an effective image coding technique is proposed that transforms the image into another domain with the ridgelet function and then quantizes the coefficients with a hybrid neural network combining two different learning networks, an auto-associative multilayer perceptron and a self-organizing feature map. Ridge functions are effective at representing functions that have discontinuities along straight lines, which ordinary wavelet transforms fail to represent effectively. The results obtained by combining the finite ridgelet transform with hybrid neural networks were found to be much better than those obtained with the JPEG2000 image compression system.

16.
As many different 3D volumes could produce the same 2D X-ray image, inverting this process is challenging. We show that recent deep-learning-based convolutional neural networks can solve this task. As the main challenge in learning is the sheer amount of data created when extending a 2D image into a 3D volume, we suggest first learning a coarse, fixed-resolution volume, which is then fused in a second step with the input X-ray into a high-resolution volume. To train and validate our approach we introduce a new dataset that comprises close to half a million computer-simulated 2D X-ray images of 3D volumes scanned from 175 mammalian species. Future applications of our approach include stereoscopic rendering of legacy X-ray images and re-rendering of X-rays with changes of illumination, view pose, or geometry. Our evaluation includes a comparison to previous tomography work, previous learning methods applied to our data, a user study, and application to a set of real X-rays.

17.
Objective: Satellite images often contain complex targets and backgrounds as well as noise, which makes classification with hand-picked features very difficult. This paper proposes a new scheme for satellite image classification using convolutional neural networks (CNNs), which can extract high-level features of satellite images and thereby improve recognition accuracy. Method: First, a new satellite image dataset containing six image classes is proposed to address the shortage of labeled training samples for CNNs. Then one directly trained CNN model and three pre-trained CNN models are used for satellite image classification. The directly trained model is trained on the proposed dataset; the pre-trained models are first trained on the ILSVRC (the ImageNet large scale visual recognition challenge)-2012 dataset and then fine-tuned on the proposed satellite image dataset, after which the fine-tuned models are used for classification. Result: The fine-tuned deep pre-trained CNN achieves the highest classification accuracy: 99.50% on the proposed dataset and 96.44% on the UC Merced Land Use dataset. Conclusion: The proposed dataset is general and representative, and the deep CNN used has strong feature extraction and classification ability; it is an end-to-end classification model that requires no additional stacked models or classifiers. On high-resolution satellite image classification, the model achieves more convincing results than the compared models.

18.
A color face detection method combining a neural network with a hidden Markov model is proposed. The normalized color image is first coarsely segmented into several binary images according to the histogram of its chrominance components; multi-resolution, rotation-invariant face detection is then performed on the luminance image using these binary images as masks. Face detection proceeds in two steps: a neural network first determines the rotation angle of the face, and the rotation-corrected image is then verified with a hidden Markov model that recognizes singular-value features of faces. Experimental results show that the proposed algorithm is effective.

19.
This paper introduces an approach to cosmetic surface flaw identification that is essentially invariant to changes in workpiece orientation and position while being efficient in its use of computer memory. Binary images of workpieces are characterized by the number of pixels in progressive subskeleton iterations. The subskeletons are constructed using a modified Zhou skeleton transform with disk-shaped structuring elements. Two coding schemes are proposed to record the pixel counts of successive subskeletons, with and without lowpass filtering. The coded pixel counts are fed on-line to a supervised neural network previously trained by backpropagation on flawed and unflawed simulated patterns. The test workpiece is then identified as flawed or unflawed by comparing its coded pixel counts to the associated training patterns. Such off-line training on simulated patterns avoids the problem of collecting flawed samples. Since both coding schemes drastically reduce the representative skeleton image data, significant run time per epoch is saved in the application of the neural networks. Experimental results on six different shapes of workpieces corroborate the proposed approach.
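The subskeleton pixel-count signature rests on a basic morphological step: repeatedly eroding the binary shape with a small structuring element and counting the surviving pixels of each iterate. A plain-numpy sketch with a 3x3 square element (not the paper's modified Zhou transform with disk elements):

```python
import numpy as np

def erode(img):
    """Binary erosion by a 3x3 square structuring element,
    implemented as the minimum over each pixel's 8-neighborhood."""
    padded = np.pad(img, 1, constant_values=0)
    stack = [padded[i:i + img.shape[0], j:j + img.shape[1]]
             for i in range(3) for j in range(3)]
    return np.min(stack, axis=0)

def erosion_pixel_counts(img, iters=4):
    """Pixel counts of successive erosions: a compact,
    position-invariant shape signature."""
    counts = []
    for _ in range(iters):
        counts.append(int(img.sum()))
        img = erode(img)
    return counts

blob = np.zeros((12, 12), dtype=int)
blob[2:10, 2:10] = 1                       # 8x8 square "workpiece"
counts = erosion_pixel_counts(blob)
assert counts == [64, 36, 16, 4]           # 8x8 -> 6x6 -> 4x4 -> 2x2

# The signature is unchanged when the shape is translated.
shifted = np.roll(blob, (1, 1), axis=(0, 1))
assert erosion_pixel_counts(shifted) == counts
```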

20.
Objective: Binocular vision is a good solution to the target distance estimation problem. Existing binocular distance estimation methods either have low accuracy or require cumbersome data preparation, so an algorithm that balances accuracy and convenience of data preparation is needed. Method: A network based on the R-CNN (region convolutional neural network) architecture is proposed that performs target detection and target distance estimation simultaneously. After the binocular image pair is fed into the network, a backbone extracts features, a binocular region proposal network produces the bounding boxes of the same target in the left and right images simultaneously, and the local features within each pair of boxes are fed into a target disparity estimation branch to estimate the target's distance. To obtain the boxes of the same target in both images, the binocular proposal network replaces the original proposal network, and a binocular bounding-box branch is proposed to regress both boxes jointly; to improve disparity accuracy, a disparity estimation branch based on group-wise correlation and 3D convolution is proposed, drawing on the structure of binocular disparity-map estimation networks. Result: In validation experiments on the KITTI (Karlsruhe Institute of Technology and Toyota Technological Institute) dataset, the mean relative error of the proposed algorithm is about 3.2%, far below that of disparity-map-based estimation (11.3%) and close to that of 3D-detection-based methods (about 3.9%). The proposed disparity branch clearly improves accuracy, reducing the mean relative error from 5.1% to 3.2%. Similar experiments on a separately collected and annotated pedestrian surveillance dataset yield a mean relative error of about 4.6%, showing that the method can be applied effectively to surveillance scenes. Conclusion: The proposed binocular distance estimation network combines the strengths of target detection and binocular disparity estimation and achieves high accuracy. It can be deployed with vehicle-mounted cameras and in surveillance scenes, and is promising for other settings equipped with binocular cameras.
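The disparity such a network regresses maps to metric distance through the standard stereo relation Z = f·B/d: focal length times baseline over disparity. A small sketch of that conversion (the camera parameters are illustrative, not KITTI's calibration):

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Pinhole stereo: distance Z = f * B / d, with the focal length
    in pixels, the baseline in meters, and the disparity in pixels."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# Illustrative rig: 720 px focal length, 0.54 m baseline.
z = depth_from_disparity(disparity_px=20.0, focal_px=720.0, baseline_m=0.54)
assert abs(z - 19.44) < 1e-9          # 720 * 0.54 / 20 = 19.44 m

# Halving the disparity doubles the estimated distance.
assert abs(depth_from_disparity(10.0, 720.0, 0.54) - 2 * z) < 1e-9
```

The inverse relation also explains why relative error grows for distant targets: a fixed disparity error costs more depth as d shrinks.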


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号