Similar literature
 20 similar articles found; search took 46 ms
1.
Light field (LF) cameras record the intensity and direction of light simultaneously. However, because LF cameras sacrifice spatial resolution for higher angular resolution, the images they acquire tend to have low spatial resolution. LF image super-resolution (SR) has therefore become an integral part of LF studies. Many existing LF image SR methods fail to fully utilize angular and spatial information because they use only a subset of the sub-aperture images (SAIs). In this paper, we propose a progressive spatial-angular feature enhancement network (PSAFENet) to deal with the problem of missing information in LF image SR. Specifically, we first use three different feature extraction modules to extract the spatial features of the SAIs and the spatial and angular features contained in the macro-pixel images (MacPIs). These features are then fed into a spatial-angular feature enhancement (SAFE) module, which enhances the spatial-angular information of the SAIs. To improve reconstruction accuracy, we also use the information multi-distillation block (IMDB) to remove redundant information before upsampling. Our network merges the angular and spatial information into each SAI, which facilitates the reconstruction of the LF images. Experimental results on five public datasets show that the proposed PSAFENet outperforms existing methods in both qualitative and quantitative comparisons.

2.
Recent advances in remote sensing techniques allow for the collection of hyperspectral images with enhanced spatial and spectral resolution. In many applications, these images need to be processed and interpreted in real-time, since analysis results need to be obtained almost instantaneously. However, the large amount of data that these images comprise introduces significant processing challenges. This also complicates the analysis performed by traditional machine learning algorithms. To address this issue, dimensionality reduction techniques aim at reducing the complexity of data while retaining the relevant information for the analysis, removing noise and redundant information. In this paper, we present a new real-time method for dimensionality reduction and classification of hyperspectral images. The newly proposed method exploits artificial neural networks, which are used to develop a fast compressor based on the extreme learning machine. The obtained experimental results indicate that the proposed method has the ability to compress and classify high-dimensional images fast enough for practical use in real-time applications.

3.
Wireless multimedia sensor networks (WMSNs) are interconnected devices that allow retrieving video and audio streams, still images, and scalar data from the environment. In a densely deployed WMSN, the visual information observed by cameras with overlapping fields of view is correlated. This paper proposes a novel spatial correlation model for visual information in WMSNs. By studying the sensing model and deployments of cameras, a spatial correlation function is derived to describe the correlation characteristics of visual information observed by cameras with overlapping fields of view. The joint effect of multiple correlated cameras is also studied. An entropy-based analytical framework is developed to measure the amount of visual information provided by multiple cameras in the network. Furthermore, according to the proposed correlation function and entropy-based framework, a correlation-based camera selection algorithm is designed. Experimental results show that the proposed spatial correlation function can model the correlation characteristics of visual information in WMSNs at low computation and communication cost. Further simulations show that, given a distortion bound at the sink, the correlation-based camera selection algorithm requires fewer cameras to report to the sink than the random selection algorithm.

4.
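The color imaging sensor in a digital camera acquires three color components by spatially subsampling the scene through a color filter array (CFA), and the final image is obtained by demosaicing, denoising, and color-correcting those components. The algorithms in this imaging pipeline are fairly complex, and the nonlinearity of some stages and the influence of noise make the pipeline more complex still. This paper studies the noise introduced by demosaicing algorithms and their effect on noise propagation, and considers how the order of demosaicing and denoising affects image quality. To better understand how each stage of the pipeline affects and propagates noise, the image and the noise are monitored and analyzed, noise characteristics are measured with MSE and S-CIELAB, and conclusions are drawn.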

5.
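To address tracking under human occlusion with two cameras, a tracking method based on the 3D position information of the human body is proposed. The method first samples human-body pixels in the video image of one camera, then finds the matching points of the sampled pixels in the video image of the other camera and computes the 3D point in the world coordinate system that corresponds to each pair of matched points. The 3D points are then clustered according to their 3D positions, the set of image pixels corresponding to the 3D points in each cluster is found, and a Gaussian-smoothed histogram model is built for it. On this basis, mutually occluding people are segmented according to the histogram models, and finally the correspondence of the same person across the cameras is determined from the matching relations of the human-body pixels. Experimental results show that the method can effectively track people under occlusion.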

6.
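Objective: Human action recognition from depth map sequences is an important research area in machine vision and artificial intelligence. Existing work suffers from excessive redundant information in depth map sequences and from the loss of temporal information in the generated feature maps. To address the redundancy, a key-frame algorithm is proposed that improves the computational efficiency of human action recognition algorithms. To address the loss of temporal information, a new feature representation for depth map sequences is proposed, the depth spatial-temporal energy map (DSTEM), which highlights the temporal nature of human action features. Method: The key-frame algorithm removes redundant frames from the depth map sequence according to the redundancy coefficient of the difference image sequence, yielding a key-frame sequence sufficient to describe the action. The DSTEM algorithm builds an energy field from the shape and motion characteristics of the human body to obtain the body's energy information, and then projects this energy information onto three orthogonal axes to obtain the DSTEM. Results: Experiments on the MSR_Action3D dataset show that the key-frame algorithm reduces redundancy and that the computational efficiency of each algorithm improves by 20% to 30% after key-frame processing. Histogram of oriented gradient (HOG) features extracted from the DSTEM achieve a recognition accuracy of 95.54% on a database containing only forward-order actions and maintain 82.14% on a database containing both forward-order and reverse-order actions. Conclusion: The key-frame algorithm reduces redundant information in depth map sequences and speeds up feature map extraction. The DSTEM not only preserves the spatial information of human actions highlighted by the energy field but also fully records their temporal information, and it maintains high recognition accuracy on action data that carries temporal information.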
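
The key-frame step in the abstract above can be illustrated with a minimal sketch. The abstract does not define its redundancy coefficient, so this example swaps in a plain stand-in: a frame is treated as redundant when its mean absolute difference from the last kept frame is at or below a threshold. The function names, the nested-list frame format, and the threshold value are all assumptions made for illustration, not the authors' algorithm.

```python
def mean_abs_diff(frame_a, frame_b):
    """Mean absolute per-pixel difference between two equally sized frames."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count


def select_key_frames(frames, threshold):
    """Return indices of frames kept as key frames.

    Assumption: a frame is redundant when it differs from the most recently
    kept frame by no more than `threshold` (a hypothetical stand-in for the
    redundancy coefficient mentioned in the abstract).
    """
    if not frames:
        return []
    kept = [0]  # always keep the first frame as the initial reference
    for index in range(1, len(frames)):
        if mean_abs_diff(frames[kept[-1]], frames[index]) > threshold:
            kept.append(index)
    return kept


# Four tiny 2x2 "depth frames": frames 1 and 3 barely differ from 0 and 2.
frames = [
    [[0, 0], [0, 0]],
    [[0, 0], [0, 1]],
    [[4, 4], [4, 4]],
    [[4, 4], [4, 5]],
]
key_indices = select_key_frames(frames, threshold=1.0)
```

Comparing each frame against the last kept frame, rather than against its immediate predecessor, prevents slow drift from being discarded one small step at a time.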

7.
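With the continued growth of the network, data volume is increasing at an astonishing rate, redundant information is abundant, and the data are linked by complex relationships. Existing ranking methods therefore face a serious problem: redundant information degrades ranking results. Based on heterogeneous information networks, this work seeks a multi-objective ranking model that offers both authority and diversity. The model represents the data as a heterogeneous information network, uses MutualRank to better model the authority of objects through a random walk directly on the heterogeneous information network, and uses PDRank to combine each object's authority with the diversity among objects, producing a ranked sequence that is both authoritative and diverse. Because the model uses the heterogeneous relationships in the data directly to model object authority, it resolves the data redundancy problem. Experiments show that MutualRank learns authority better than the traditional PageRank, and that the rankings produced by the two-stage ranking model are better than those of existing baseline methods.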
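
For context on the baseline named in the abstract above, the following is a minimal sketch of standard PageRank by power iteration on a small homogeneous graph. It is not MutualRank or PDRank, whose details the abstract does not give. The graph, the damping factor of 0.85, and the iteration count are illustrative assumptions.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Standard PageRank by power iteration.

    `links` maps each node to the list of nodes it points to. Dangling nodes
    (no outgoing links) spread their rank uniformly over all nodes.
    """
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives the teleportation share first.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical three-node graph: both "a" and "b" point to "c", "c" points to "a".
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
```

In this toy graph, "c" collects rank from two in-links and "b" has none, so the random walk places "c" first and "b" last.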

8.
Visual information must be represented digitally to allow its processing by computer. This representation by means of a finite amount of digital data conveys an enormous amount of superfluous information that can be eliminated. We can consider two kinds of superfluous information: statistically redundant and subjectively redundant. Discrete transform coding takes blocks of pixels and transforms them into another domain, the transform domain, prior to coding and transmission. An important property is that not all coefficients need to be transmitted in order to obtain good-quality reconstructions. We have applied a special nonorthogonal transform in order to compress radiographic images.
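
The property that only some coefficients need to be transmitted can be shown with a short sketch. The abstract's special nonorthogonal transform is not specified, so this example swaps in a one-dimensional orthonormal DCT-II on a single block of eight samples. The block values and the helper names are illustrative assumptions.

```python
import math


def _scale(k, n):
    """Normalization factor that makes the DCT-II orthonormal."""
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct(block):
    """Orthonormal 1-D DCT-II of a list of samples."""
    n = len(block)
    return [
        _scale(k, n)
        * sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(block))
        for k in range(n)
    ]


def idct(coeffs):
    """Inverse of `dct` (orthonormal DCT-III)."""
    n = len(coeffs)
    return [
        sum(
            _scale(k, n) * c * math.cos(math.pi * (i + 0.5) * k / n)
            for k, c in enumerate(coeffs)
        )
        for i in range(n)
    ]


def compress_block(block, keep):
    """Keep only the first `keep` coefficients, zero the rest, reconstruct."""
    coeffs = dct(block)
    truncated = coeffs[:keep] + [0.0] * (len(coeffs) - keep)
    return idct(truncated)


def squared_error(original, approx):
    return sum((a - b) ** 2 for a, b in zip(original, approx))


block = [52, 55, 61, 66, 70, 61, 64, 73]  # hypothetical row of pixel values
error_keep_2 = squared_error(block, compress_block(block, 2))
error_keep_4 = squared_error(block, compress_block(block, 4))
```

Because this stand-in transform is orthonormal, the reconstruction error equals the energy of the discarded coefficients, so keeping more coefficients never increases it. A nonorthogonal transform would not offer that guarantee in the same form.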

9.
Stereo-pair images obtained from two cameras can be used to compute three-dimensional (3D) world coordinates of a point using triangulation. However, to apply this method, camera calibration parameters for each camera need to be experimentally obtained. Camera calibration is a rigorous experimental procedure in which typically 12 parameters are to be evaluated for each camera. The general camera model is often such that the system becomes nonlinear and requires good initial estimates to converge to a solution. We propose that, for stereo vision applications in which real-world coordinates are to be evaluated, artificial neural networks be used to train the system such that the need for camera calibration is eliminated. The training set for our neural network consists of a variety of stereo-pair images and corresponding 3D world coordinates. We present the results obtained on our prototype mobile robot that employs two cameras as its sole sensors and navigates through simple regular obstacles in a high-contrast environment. We observe that the percentage errors obtained from our set-up are comparable with those obtained through standard camera calibration techniques and that the system is accurate enough for most machine-vision applications.
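
To make concrete what the calibration-based approach computes, here is a minimal sketch of triangulation for an idealized, rectified stereo pair. It assumes two identical pinhole cameras with parallel optical axes, a known focal length in pixels, a known baseline, and a known principal point. These are exactly the kinds of parameters that calibration would supply and that the proposed neural network is meant to make unnecessary. All names and numbers are illustrative assumptions.

```python
def triangulate(x_left, x_right, y, focal_px, baseline, cx, cy):
    """Recover (X, Y, Z) in the left camera frame from a rectified stereo match.

    x_left, x_right: column of the same point in the left and right images
    y:               row of the point (identical in both rectified images)
    focal_px:        focal length in pixels (assumed equal for both cameras)
    baseline:        distance between the camera centers, in world units
    cx, cy:          principal point in pixels
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    depth = focal_px * baseline / disparity
    x_world = (x_left - cx) * depth / focal_px
    y_world = (y - cy) * depth / focal_px
    return x_world, y_world, depth


# Hypothetical rig: 800 px focal length, 0.1 m baseline, 640x480 images.
point = triangulate(x_left=420, x_right=400, y=240,
                    focal_px=800, baseline=0.1, cx=320, cy=240)
```

Depth is inversely proportional to disparity, so small matching errors on distant points produce large depth errors. That sensitivity is one reason accurate parameters matter in the classical approach.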

10.
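Because the resources of wireless sensor networks are limited, and node energy in particular is constrained, this paper proposes an improved routing protocol algorithm model based on a BP neural network in order to minimize the energy consumed in collecting and transmitting information and to extend network lifetime. The model combines the layered structure of the BP neural network with the clustering structure of wireless sensor network routing protocols: a three-layer BP neural network is designed and applied in each cluster, and the large volume of raw data collected is passed through it to obtain a small amount of data that reflects the features of the raw data. Only the fused feature data need to be sent to the sink node, which reduces the amount of data transmitted, lowers the communication energy consumption, and extends network lifetime. Simulation results show that the improved algorithm outperforms the LEACH protocol in balancing node energy and extending network lifetime.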

11.
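Large-scale disease screening in impoverished areas relies mainly on portable handheld cameras and remote diagnosis, and the fundus images obtained this way are often of poor quality and low resolution. To address this problem, a lightweight super-resolution network based on information distillation and heterogeneous upsampling is designed. Taking into account the differences between fundus images and natural images, the network uses distilled features to supplement the coarse features, then upsamples the coarse features and the deep features separately in a heterogeneous manner, and finally integrates the two upsampled features to produce a high-resolution fundus image. Compared with state-of-the-art methods in image quality, parameter memory, and running time, the network achieves the highest image quality while performing very well in parameter memory and running time, which suggests a way to embed super-resolution algorithms in handheld fundus cameras and ordinary medical equipment.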

12.
Video surveillance systems are consolidated techniques for monitoring eruptive phenomena in volcanic areas. Along with these systems, which use standard video cameras, people working in this field sometimes make use of infrared cameras providing useful information about the thermal evolution of eruptions. Real-time analysis of the acquired frames is required, along with image storing, to analyze and classify the activity of volcanoes. Human effort and large storing capabilities are hence required to perform monitoring tasks. In this paper we present a new strategy aimed at improving the performance of video surveillance systems in terms of human-independent image processing and storing optimization. The proposed methodology is based on real-time thermographic analysis of the area considered. The analysis is performed by processing images acquired with an IR camera and extracting information about meaningful volcanic events. Two software tools were developed. The first provides information about the activity being monitored and automatically adapts the image storing rate. The second tool automatically produces useful information about the eruptive activity encompassed by a selected frame sequence. The software developed includes a suitable user interface allowing for convenient management of the acquired images and easy access to information about the volcanic activity monitored.

13.
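Video stitching is an important branch of computer graphics and computer vision. It builds on still-image stitching, but because video information is more complex, video stitching also differs from image stitching. To meet the need for real-time stitching in practical applications, this paper proposes a control-frame-based video stitching method for fixed cameras. First, control-frame images are captured, the cameras are calibrated to obtain their intrinsic parameters and optical-center coordinates, and an improved distortion correction method is used to remove the image distortion introduced by the cameras. Next, SIFT features are extracted from the control-frame images and coarsely matched, and RANSAC is used to reject mismatched points and fit the homography matrix of the image transformation. Finally, a lookup-table method projects the images from all cameras synchronously onto the large-scene picture, and brightness compensation and multi-band blending are applied to the overlapping regions. The result is real-time video stitching at up to 25 frames per second.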

14.
We present a practical system which can provide a textured full-body avatar within 3 s. It uses sixteen RGB-depth (RGB-D) cameras, ten of which are arranged to capture the body, while six target the important head region. The configuration of the multiple cameras is formulated as a constraint-based minimum set space-covering problem, which is approximately solved by a heuristic algorithm. The camera layout determined can cover the full-body surface of an adult, with geometric errors of less than 5 mm. After arranging the cameras, they are calibrated using a mannequin before scanning real humans. The 16 RGB-D images are all captured within 1 s, which both avoids the need for the subject to attempt to remain still for an uncomfortable period, and helps to keep pose changes between different cameras small. All scans are combined and processed to reconstruct the photorealistic textured mesh in 2 s. During both system calibration and working capture of a real subject, the high-quality RGB information is exploited to assist geometric reconstruction and texture stitching optimization.

15.
This article proposes multiple self-organizing maps (SOMs) for control of a visuo-motor system that consists of a redundant manipulator and multiple cameras in an unstructured environment. The maps control the manipulator so that its end-effector reaches targets given in the camera images. The maps also make the manipulator take obstacle-free poses. Multiple cameras are introduced to avoid occlusions, and multiple SOMs are introduced to deal with multiple camera images. Some simulation results are shown.

16.
Probabilistic-topological calibration of widely distributed camera networks   Total citations: 1, self-citations: 0, citations by others: 1
We propose a method for estimating the topology of distributed cameras, which can provide useful information for multi-target tracking in a wide area, without object identification among the fields of view (FOVs) of the cameras. In our method, each camera first detects objects in its observed images independently in order to obtain the positions and times at which the objects enter or exit its FOV. Each observation is tentatively paired with every observation detected before it. The transit time between each pair of observations and their xy coordinates are then computed. Based on classifying the distribution of the transit times and the xy coordinates, object routes between FOVs can be detected. The classification is achieved by simple and robust vector quantization. The detected routes are then categorized to acquire the probabilistic-topological information of distributed cameras. In addition, offline tracking of observed objects can be realized by means of the calibration process. Experiments demonstrated that our method could automatically estimate the topological relationships of the distributed cameras and the object transits among them.

17.
A large portion of digital images available today are acquired using digital cameras or scanners. While cameras provide digital reproduction of natural scenes, scanners are often used to capture hard-copy art in a more controlled environment. In this paper, new techniques for nonintrusive scanner forensics that utilize intrinsic sensor noise features are proposed to verify the source and integrity of digital scanned images. Scanning noise is analyzed from several aspects using only scanned image samples, including image denoising, wavelet analysis, and neighborhood prediction, and statistical features are obtained from each characterization. Based on the proposed statistical features of scanning noise, a robust scanner identifier is constructed to determine the model/brand of the scanner used to capture a scanned image. Utilizing these noise features, we extend the scope of acquisition forensics to differentiating scanned images from camera-taken photographs and computer-generated graphics. The proposed noise features also enable tampering forensics to detect postprocessing operations on scanned images. Experimental results are presented to demonstrate the effectiveness of employing the proposed noise features for performing various forensic analyses on scanners and scanned images.

18.
In this article, an image segmentation method based on the SOLNN self-organising logic neural network is studied. The input image is initially processed using the TCS texture-highlighting technique and is then presented to the SOLNN network, which segments it. The SOLNN is characterised by a variable sensitivity, which enables it to be fine-tuned to detect different sub-textures within each texture to the desired degree of detail. The experimental results reported here show that the SOLNN clusters the textural information accurately, so that each cluster represents a single texture even for images that are objectively very difficult to segment. The results thus support the view that the proposed approach leads to the design of an effective texture-based image-segmentation system.

19.
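Transformer winding deformation is a common fault, and traditional diagnostic methods perform poorly because they involve too many parameters and are disturbed by noise. A feature extraction method based on grayscale conversion is proposed, which converts vibration signals into grayscale images to extract features effectively. Because the strength of the characteristic information in a single-source signal varies with distance, vibration information is collected through multi-source channels, and image fusion is used to deal with the large amount of redundant information and the low signal-to-noise ratio in multi-source images. A power transformer winding fault diagnosis model based on a multi-source Mallat-NIN-CNN network is proposed: the Mallat algorithm decomposes the grayscale images of the multi-source vibration signals, the high-frequency and low-frequency components of each decomposition level are fused by region-characteristic measurement and weighted averaging, respectively, and the reconstructed grayscale image is fed into the NIN-CNN network for fault diagnosis. Experiments verify that the method effectively suppresses noise in multi-source signals, improves the completeness of feature information, reduces computation, and improves fault diagnosis accuracy.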
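
The grayscale conversion step mentioned in the abstract above can be sketched as follows. The abstract does not state the exact mapping, so this example assumes one common choice: min-max normalize the samples to the range 0 to 255 and fill the image row by row. The function name, the image width, and the sample values are hypothetical.

```python
def signal_to_gray_image(signal, width):
    """Map a 1-D signal to rows of 8-bit gray levels.

    Assumptions (not taken from the abstract): min-max normalization to
    0..255, row-major filling, and trailing samples that do not fill a
    complete row are dropped.
    """
    usable = len(signal) - len(signal) % width
    if usable == 0:
        raise ValueError("signal is shorter than one image row")
    samples = signal[:usable]
    low, high = min(samples), max(samples)
    span = high - low
    # A constant signal has no contrast, so it maps to an all-black image.
    pixels = [0 if span == 0 else round(255 * (s - low) / span) for s in samples]
    return [pixels[start:start + width] for start in range(0, usable, width)]


# Five hypothetical vibration samples; the last one does not fill a row.
image = signal_to_gray_image([0.0, 1.0, 3.0, 4.0, 9.9], width=2)
```

Normalizing per signal keeps every image within the full gray range, but it also discards absolute amplitude. Whether that matters for fault diagnosis depends on the features the downstream network relies on.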

20.
Multi-view image sensing is currently gaining momentum, fostered by new applications such as autonomous vehicles and self-propelled robots. In this paper, we prototype and evaluate a multi-view smart vision system for object recognition. The system exploits an optimized Multi-View Convolutional Neural Network (MVCNN) in which the processing is distributed among several sensors (heads) and a camera body. The main challenge in designing such a system comes from the computationally expensive workload of real-time MVCNNs, which is difficult to support with embedded processing and high frame rates. This paper focuses on the decisions to be taken for distributing an MVCNN on the camera heads, each camera head embedding a Field-Programmable Gate Array (FPGA) for processing images on the stream. In particular, we show that the first layer of the AlexNet network can be processed close to the sensors, by performing a Direct Hardware Mapping (DHM) using a dataflow model of computation. The feature maps produced by the first layers are merged and processed by a camera central processing node that executes the remaining layers. The proposed system exploits state-of-the-art deep learning optimization methods, such as parameter removal and data quantization. We demonstrate that accuracy drops caused by these optimizations can be compensated by the multi-view nature of the captured information. Experiments conducted with the AlexNet CNN show that the proposed partitioning and resulting optimizations can fit the first layer of the multi-view network in low-end FPGAs. Among all the tested configurations, we propose 2 setups with an equivalent accuracy compared to the original network on the ModelNet40 dataset. The first one is composed of 4 cameras based on a Cyclone III E120 FPGA to embed the least expensive version in terms of logic resources, while the second version requires 2 cameras based on a Cyclone 10 GX220 FPGA. This distributed computing with workload reduction is demonstrated to be a practical solution when building a real-time multi-view smart camera processing several frames per second.
