Similar Literature
20 similar documents found (search time: 15 ms)
1.
Infrared and visible image fusion aims to synthesize a single fused image containing salient targets and abundant texture details even under extreme illumination conditions. However, existing image fusion algorithms fail to take the illumination factor into account in the modeling process. In this paper, we propose a progressive image fusion network based on illumination awareness, termed PIAFusion, which adaptively maintains the intensity distribution of salient targets and preserves texture information in the background. Specifically, we design an illumination-aware sub-network to estimate the illumination distribution and calculate the illumination probability. Moreover, we utilize the illumination probability to construct an illumination-aware loss to guide the training of the fusion network. The cross-modality differential aware fusion module and halfway fusion strategy completely integrate common and complementary information under the constraint of illumination-aware loss. In addition, a new benchmark dataset for infrared and visible image fusion, i.e., Multi-Spectral Road Scenarios (available at https://github.com/Linfeng-Tang/MSRS), is released to support network training and comprehensive evaluation. Extensive experiments demonstrate the superiority of our method over state-of-the-art alternatives in terms of target maintenance and texture preservation. In particular, our progressive fusion framework can integrate meaningful information from source images round the clock according to illumination conditions. Furthermore, the application to semantic segmentation demonstrates the potential of our PIAFusion for high-level vision tasks. Our codes will be available at https://github.com/Linfeng-Tang/PIAFusion.
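The illumination-aware loss is the core idea of this abstract. As a loose numerical sketch (not the paper's actual network or loss: `illumination_probability`, the brighter-means-daytime heuristic, and the L1 weighting are all illustrative assumptions), one can weight per-modality intensity terms by an estimated day/night probability:

```python
import numpy as np

def illumination_probability(visible):
    # crude stand-in for the paper's illumination sub-network:
    # brighter visible images are treated as more likely "daytime"
    p_day = float(np.clip(visible.mean(), 0.0, 1.0))
    return p_day, 1.0 - p_day

def illumination_aware_loss(fused, visible, infrared):
    # weight the intensity term toward whichever modality is more
    # reliable under the estimated illumination condition
    p_day, p_night = illumination_probability(visible)
    l_vis = np.abs(fused - visible).mean()   # L1 to the visible image
    l_ir = np.abs(fused - infrared).mean()   # L1 to the infrared image
    return p_day * l_vis + p_night * l_ir
```

Under this sketch, a fused image that matches the visible modality is penalized less in bright (daytime) scenes, and vice versa at night.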

2.
Multispectral pedestrian detection has received much attention in recent years due to its superiority in detecting targets under adverse lighting/weather conditions. In this paper, we aim to generate highly discriminative multi-modal features by aggregating the human-related clues based on all available samples presented in multispectral images. To this end, we present a novel multispectral pedestrian detector performing locality guided cross-modal feature aggregation and pixel-level detection fusion. Given a number of single bounding boxes covering pedestrians in both modalities, we deploy two segmentation sub-branches to predict the existence of pedestrians on visible and thermal channels. By referring to the important locality information in the reference modality, we perform locality guided cross-modal feature aggregation to learn highly discriminative human-related features in the complementary modality by exploring the clues of all available pedestrians. Moreover, we utilize the obtained spatial locality maps to provide prediction confidence scores in visible and thermal channels and conduct pixel-wise adaptive fusion of detection results in complementary modalities. Extensive experiments demonstrate the effectiveness of our proposed method, outperforming the current state-of-the-art detectors on both KAIST and CVC-14 multispectral pedestrian detection datasets.

3.
Objective: Mainstream object detection algorithms rely on pre-defined default (anchor) boxes, which are filtered and pruned to obtain object boxes. To guarantee sufficient recall, the default boxes must be dense and multi-scale, so every image region is examined repeatedly, wasting considerable computation. We propose a multi-task deep learning model (FCDN) that performs fully end-to-end semantic segmentation and object detection without default boxes, improving detection speed while preserving accuracy. Method: We first analyze why mainstream detectors need pre-defined default boxes: the number of objects in an image is unknown in advance. Because current deep detectors are extended from image classification models, this unknown count makes it impossible to fix the model's output, so dense, multi-scale default boxes must be classified to guarantee recall. Object detection needs category information to recognize different classes and boundary information to separate and localize individual objects. Semantic segmentation extracts rich category information, so object classes can be read from the segmentation map; following the same idea, we design a module that extracts object boundary keypoints, and combine the segmentation map with the boundary-keypoint map to recognize and localize objects. Results: To validate the feasibility of segmentation-based detection, we trained the model and tested it on the VOC (visual object classes) 2007 test set against mainstream detectors. The results show that the new model performs semantic segmentation and object detection simultaneously; trained on the same samples, its detection accuracy exceeds that of classic detection models, and it runs 8 ms faster than FCN, approaching fast detectors such as YOLO (you only look once). Conclusion: We present a new approach to object detection that is no longer built on image classification and needs no classification of dense, multi-scale default boxes. Experiments show that detection from the semantic segmentation map and boundary keypoints is feasible and avoids repeated detection and wasted computation; detection efficiency is further improved by reducing the number of pixels predicted by segmentation, and experiments confirm that the simplified segmentation output remains sufficient for the detection task.
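The method above derives boxes from a segmentation map rather than from default boxes. As a loose illustration of that post-processing step (not the paper's FCDN network), the sketch below pulls bounding boxes out of a semantic segmentation map via connected components; note that connected components alone cannot split touching instances, which is precisely why the paper adds boundary keypoints. Function names are hypothetical:

```python
from collections import deque
import numpy as np

def boxes_from_segmentation(seg, cls):
    # derive object boxes from a semantic segmentation map by finding
    # 4-connected regions of one class (BFS flood fill)
    mask = (seg == cls)
    seen = np.zeros_like(mask, dtype=bool)
    boxes = []
    h, w = mask.shape
    for sy in range(h):
        for sx in range(w):
            if mask[sy, sx] and not seen[sy, sx]:
                q = deque([(sy, sx)])
                seen[sy, sx] = True
                ys, xs = [sy], [sx]
                while q:
                    y, x = q.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            ys.append(ny)
                            xs.append(nx)
                            q.append((ny, nx))
                # (x_min, y_min, x_max, y_max) for this connected region
                boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return boxes
```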

4.
Multispectral pedestrian detection is an important functionality in various computer vision applications such as robot sensing, security surveillance, and autonomous driving. In this paper, our motivation is to automatically adapt a generic pedestrian detector trained in a visible source domain to a new multispectral target domain without any manual annotation efforts. For this purpose, we present an auto-annotation framework to iteratively label pedestrian instances in visible and thermal channels by leveraging the complementary information of multispectral data. A distinct target is temporally tracked through image sequences to generate more confident labels. The predicted pedestrians in two individual channels are merged through a label fusion scheme to generate multispectral pedestrian annotations. The obtained annotations are then fed to a two-stream region proposal network (TS-RPN) to learn the multispectral features on both visible and thermal images for robust pedestrian detection. Experimental results on the KAIST multispectral dataset show that our proposed unsupervised approach using auto-annotated training data can achieve performance comparable to state-of-the-art deep neural networks (DNNs) based pedestrian detectors trained using manual labels.

5.
Chu Jun, Shu Wen, Zhou Zibo, Miao Jun, Leng Lu. Acta Automatica Sinica, 2022, 48(1): 282-291
Occlusion and interference from similar objects in the background are the main causes of low pedestrian detection accuracy. To address this problem, a pedestrian detection algorithm combining semantics with multi-level feature fusion (CSMFF) is proposed. First, features from multiple convolutional layers are fused, and a semantic segmentation branch is added on the fusion layer; the resulting semantic features are concatenated with the corresponding convolutional layers as prior information on pedestrian locations, sharpening the distinction between pedestrians and background. Then, a pedestrian secondary detection module (PSDM) is built on top of the preliminary regression to further eliminate falsely detected objects. Experimental results show that the proposed algorithm achieves miss rates (MR) of 7.06% on the Caltech dataset and 11.2% on CityPersons. The algorithm is robust to occluded pedestrians and can be conveniently embedded into other detection frameworks.

6.

Semantic edge detection (SED), which aims at jointly extracting edges as well as their category information, has far-reaching applications in domains such as semantic segmentation, object proposal generation, and object recognition. SED naturally requires achieving two distinct supervision targets: locating fine detailed edges and identifying high-level semantics. Our motivation comes from the hypothesis that such distinct targets prevent state-of-the-art SED methods from effectively using deep supervision to improve results. To this end, we propose a novel fully convolutional neural network using diverse deep supervision within a multi-task framework where bottom layers aim at generating category-agnostic edges, while top layers are responsible for the detection of category-aware semantic edges. To overcome the hypothesized supervision challenge, a novel information converter unit is introduced, whose effectiveness has been extensively evaluated on SBD and Cityscapes datasets.


7.
Ye Jianfeng, Xu Ke, Xiong Junfeng, Wang Huaming. Computer Engineering, 2021, 47(9): 203-209, 216
To increase the discriminability of a network's low-level features and the performance of semantic segmentation algorithms, a segmentation algorithm based on an auxiliary loss, an edge-detection auxiliary task, and an attention mechanism is proposed, with a fully convolutional network as the base model. The auxiliary-loss branch of the network is redesigned so that low-level features encode more semantic information. Within a multi-task learning setup, edge detection is chosen as the auxiliary task, and its branch is designed around an attention mechanism so the model attends more closely to object shapes and edges. The base model, the auxiliary-loss branch, and the auxiliary-task branch are then integrated into the final segmentation model. Experimental results on the VOC2012 dataset show a mean intersection-over-union of 71.5%, six percentage points higher than the base model.

8.
Pedestrian collision warning systems typically alert the driver based on pedestrian detection and time-to-collision estimation. To provide a more reliable basis for risk assessment, this paper proposes a pedestrian collision warning method that jointly analyzes road conditions and the driver's head pose, using two monocular cameras to capture the environments inside and outside the vehicle. A channel-features detector localizes pedestrians, and the longitudinal and lateral distances between pedestrian and ego vehicle are estimated with a monocular distance measurement method. A multi-task cascaded convolutional network localizes the driver's facial landmarks, and the head orientation angles, which reflect the driver's attention state, are obtained by solving a perspective-n-point problem. Combining pedestrian position information with driver state information, a fuzzy inference system is built to grade collision risk. Experimental results on real roads show that the risk levels output by the fuzzy system provide effective guidance for collision avoidance.

9.
To address the edge blurring and over-segmentation that affect variational optical flow computation on image sequences in complex scenes with illumination changes and large-displacement motion, this paper proposes a variational optical flow method based on motion-optimized semantic segmentation. First, an energy functional for variational optical flow is constructed from a zero-mean normalized matching model over local image regions. Then, the zero-mean normalized cross-correlation flow estimate is used to extract motion boundary information, which in turn optimizes the semantic segmentation, yielding a motion-constrained semantic segmentation model for variational optical flow. Finally, the flows of the differently labeled image regions are merged to obtain the final flow estimate. Experiments on the Middlebury and UCF101 databases show that the method achieves high optical flow accuracy and robustness, with particularly good edge preservation in complex scenes involving illumination changes, weak textures, and large displacements.
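The zero-mean normalized matching model referred to above is typically the zero-mean normalized cross-correlation (ZNCC), which is invariant to affine brightness changes between patches and is therefore robust to the illumination variation this abstract targets. A minimal NumPy version of the ZNCC score between two equal-size patches:

```python
import numpy as np

def zncc(a, b, eps=1e-8):
    # zero-mean normalized cross-correlation: subtract each patch's mean,
    # then correlate and normalize by the patch norms; result lies in [-1, 1]
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum()) + eps
    return float((a * b).sum() / denom)
```

A score near 1 means the patches match up to brightness gain and offset, which is what makes ZNCC a better data term than plain intensity differences under changing lighting.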

10.
Acquiring semantic information about the surrounding environment is an important task in semantic simultaneous localization and mapping (SLAM). However, using semantic or instance segmentation networks hurts the system's runtime performance, while object detection methods sacrifice some accuracy. This paper therefore proposes a pixel-level segmentation algorithm that combines depth-map clustering with object detection, improving the localization accuracy of current semantic SLAM systems while preserving real-time operation. First, invalid points in the depth map are repaired with a mean filter, making the depth information more reliable. Then, object detection is applied to the RGB image and K-means clustering to the corresponding depth image, and the two results are combined into a pixel-level object segmentation. Finally, this result is used to remove dynamic points from the surrounding environment and build a complete semantic map free of dynamic objects. Experiments on the TUM dataset and in real household scenes, covering depth-map repair, pixel-level segmentation, and comparison of estimated against ground-truth camera trajectories, show that the algorithm achieves good real-time performance and robustness.
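The two pre-processing steps named above (mean-filter repair of invalid depth pixels, then K-means on depth values inside a detection box) can be sketched as follows. The window size, the cluster count, and the convention that zero depth means "invalid" are assumptions for illustration, not the paper's exact settings:

```python
import numpy as np

def repair_depth(depth, k=1):
    # fill invalid (zero) depth pixels with the mean of valid neighbors
    # inside a (2k+1) x (2k+1) window
    out = depth.astype(float).copy()
    h, w = depth.shape
    for y, x in zip(*np.where(depth == 0)):
        win = depth[max(0, y - k):y + k + 1, max(0, x - k):x + k + 1]
        valid = win[win > 0]
        if valid.size:
            out[y, x] = valid.mean()
    return out

def kmeans_1d(values, k=2, iters=20):
    # minimal K-means on the depth values inside one detection box,
    # e.g. to split a foreground object from its background
    centers = np.linspace(values.min(), values.max(), k)
    labels = np.zeros(len(values), dtype=int)
    for _ in range(iters):
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = values[labels == j].mean()
    return labels, centers
```

Pixels whose cluster center is close to the box's dominant (foreground) depth would then form the pixel-level object mask.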

11.
Xiao Feng, Liu Baotong, Li Runa. Multimedia Tools and Applications, 2020, 79(21-22): 14593-14607

In response to the problem that primary visual features struggle to handle pedestrian detection in complex scenes, we present a method that improves pedestrian detection using a visual attention mechanism with semantic computation. After determining a saliency map with a visual attention mechanism, we calculate saliency maps for human skin and the human head-shoulders. Using a Laplacian pyramid, the static visual attention model is established to obtain a total saliency map and then complete pedestrian detection. Experimental results demonstrate that the proposed method achieves state-of-the-art performance on the INRIA dataset with 92.78% pedestrian detection accuracy at a very competitive time cost.
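A crude stand-in for a pyramid-based static saliency map is a single-scale center-surround difference: the image minus a local average, normalized to [0, 1]. This is far simpler than the Laplacian-pyramid model with skin and head-shoulder channels described above, and is offered only to make the center-surround idea concrete; the function names and window size are illustrative:

```python
import numpy as np

def box_blur(img, k=1):
    # naive box filter used as the "surround" signal
    h, w = img.shape
    out = np.empty((h, w), dtype=float)
    for y in range(h):
        for x in range(w):
            out[y, x] = img[max(0, y - k):y + k + 1,
                            max(0, x - k):x + k + 1].mean()
    return out

def saliency(img):
    # center-surround difference, normalized to [0, 1]
    s = np.abs(img.astype(float) - box_blur(img))
    rng = s.max() - s.min()
    return (s - s.min()) / rng if rng > 0 else s
```

Regions that differ strongly from their surroundings (e.g. a pedestrian against a uniform background) receive high saliency and can seed further detection.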


12.
Video pedestrian detection is an important application of computer vision. This paper uses deep learning to detect pedestrians from a near-vertical viewpoint. Detecting pedestrians alone, however, is easily disturbed by pedestrian-associated attributes that are semantically related to pedestrians (such as backpacks and hats), causing false detections. We propose a joint semantic pedestrian detection method based on Faster R-CNN: first, the network model is adjusted to strengthen its discrimination of small targets so it can effectively detect both pedestrians and their semantic attributes; then spatial relations are used to associate pedestrians with their semantic attributes, merging pedestrians with their semantic information, adaptively adjusting the scores of candidate pedestrians, and judging candidates with the help of their semantic attributes. Extensive experiments show that the method is accurate, fast, and of practical value, and that the detected pedestrians together with their semantic attributes can further support people counting and pedestrian behavior analysis.

13.
Yin Song, Chen Xueyun, Bei Xueyu. Computer Engineering, 2021, 47(6): 271-276, 283
The Mask R-CNN algorithm loses semantic information during feature extraction, and pedestrians in natural scenes vary in pose and suffer from occlusion and complex backgrounds, so the algorithm's detection accuracy is poor when applied to pedestrian instance segmentation. This paper proposes an improved Mask R-CNN. A concatenated feature pyramid network (CFPN) module is added to the Mask branch of the network to fuse the multi-level features it generates, making full use of the semantic information in different feature levels; on this basis, RoI Align is performed to generate pedestrian masks. Following the COCO dataset format, 1,000 photographs of everyday scenes were collected to build a new pedestrian dataset. Experimental results on this dataset show that the improved algorithm achieves higher detection precision than the original.

14.
Fusion of laser and vision in object detection has been accomplished by two main approaches: (1) independent integration of sensor-driven features or sensor-driven classifiers, or (2) a region of interest (ROI) is found by laser segmentation and an image classifier is used to name the projected ROI. Here, we propose a novel fusion approach based on semantic information, and embodied on many levels. Sensor fusion is based on spatial relationship of parts-based classifiers, being performed via a Markov logic network. The proposed system deals with partial segments, it is able to recover depth information even if the laser fails, and the integration is modeled through contextual information—characteristics not found on previous approaches. Experiments in pedestrian detection demonstrate the effectiveness of our method over data sets gathered in urban scenarios.

15.
Wang Xue, Li Zhanshan, Chen Haipeng. Journal of Software, 2022, 33(9): 3165-3179
U-Net-based encoder-decoder networks and their variants achieve excellent performance in medical image semantic segmentation. However, these networks lose some spatial detail during feature extraction, which hurts segmentation accuracy; moreover, their generalization ability and robustness on multi-modal medical image segmentation tasks are unsatisfactory. To address these problems, this paper proposes a saliency-guided and uncertainty-supervised deep convolutional encoder-decoder network for multi-modal medical image semantic segmentation. The algorithm uses the initially generated saliency map and uncertainty probability map as supervision signals to optimize the parameters of the segmentation network. First, a saliency detection network generates a saliency map to roughly localize the target region in the image; then, the set of pixels with uncertain classification is computed from the saliency map to produce an uncertainty probability map; finally, the saliency map and uncertainty probability map are fed into a multi-scale feature fusion network together with the original image, guiding the network to focus on learning target-region features while strengthening its representation of uncertain regions and complex boundaries, thereby improving segmentation performance. Experimental results show that the algorithm captures more semantic information, outperforms other semantic segmentation algorithms on multi-modal medical image tasks, and has good generalization ability and robustness.

16.
A new spatio-temporal segmentation approach for moving object(s) detection and tracking from a video sequence is described. Spatial segmentation is carried out using rough entropy maximization, where we use quad-tree decomposition, resulting in unequal image granulation which is closer to natural granulation. A three-point estimation based on the Beta distribution is formulated for background estimation during temporal segmentation. Reconstruction and tracking of the object in the target frame is performed after combining the two segmentation outputs using its color and shift information. The algorithm is robust to noise and gradual illumination change, because their presence is less likely to affect both its spatial and temporal segments inside the search window. The proposed methods for spatial and temporal segmentation are shown to be superior to several related methods, and reconstruction accuracy is consistently high.

17.
Affected by pedestrian pose variation, illumination, viewpoint, and background changes, existing person re-identification models usually divide each pedestrian in the dataset into several parts and extract local image features to improve recognition accuracy, but they suffer from misaligned local body features and easily lose contextual cues from non-body parts. This paper constructs an improved person re-identification model: local features from a human semantic parsing network are aligned, strengthening the semantic segmentation model's ability to model arbitrary pedestrian contours in the image, while a local attention network captures the contextual cues lost from non-body regions. Experimental results show that the model achieves mean average precision of 83.5%, 80.8%, and 92.4% on the Market-1501, DukeMTMC, and CUHK03 datasets respectively, with Rank-1 of 90.2% on DukeMTMC, demonstrating stronger robustness and transferability than re-identification models based on attention mechanisms, human semantic parsing, and local alignment networks.

18.
A new method for cell segmentation
To solve the many problems encountered in traditional color-image cell segmentation, a novel multispectral imaging technique is adopted for image acquisition, and support vector machines are exploratively applied to the segmentation of bone marrow cell images. Experiments show that this multispectral-based segmentation method generalizes well, achieves very high accuracy, and depends little on the condition of the acquisition equipment or the quality of the smear.

19.
Pedestrian appearance attributes are important semantic information for distinguishing individuals. Attribute recognition plays a vital role in intelligent video surveillance, enabling fast filtering and retrieval of target pedestrians. In person re-identification, attribute information can be used to obtain fine-grained feature representations and thus improve re-identification performance. This paper combines pedestrian attribute recognition with person re-identification in search of a way to improve re-identification, and proposes a re-identification framework based on feature localization and fusion. First, multi-task learning is used to combine re-identification with attribute recognition, and the network model's performance is improved by modifying convolution strides and using dual pooling. Second, to strengthen the expressive power of attribute features, a parallel spatial-and-channel attention module is designed: it localizes each attribute's spatial position on the feature maps and effectively mines the channel features most correlated with that attribute, while multiple parallel branch structures reduce error and further improve performance. Finally, a convolutional feature fusion module merges attribute features with pedestrian identity features to obtain more robust and expressive pedestrian features. Experiments on two common re-identification datasets, DukeMTMC-reID and Market-1501, show that the proposed method performs at the leading level among existing person re-identification methods.

20.
Objective: Pedestrian detection is widely applied in autonomous driving and video surveillance and is a popular research topic. Current deep-learning-based pedestrian detection algorithms produce false and missed detections when resolution is low and pedestrian scale is small, so a multi-scale pedestrian detection algorithm fusing multi-level features is proposed. Method: First, part of the deep residual network is removed, and feature maps are extracted from only three of its stages. The feature map from the last stage is enlarged two-fold by nearest-neighbor upsampling and fused with the lower-level map by element-wise addition, combining semantically rich high-level features with detail-rich low-level features. The three fused feature maps are then fed into region proposal networks, and softmax classification yields candidate boxes containing pedestrians, achieving pedestrian detection. Results: On the Caltech pedestrian detection dataset, at a false-positives-per-image (FPPI) rate of 10%, the algorithm's miss rate is only 57.88%, which is 3.07% lower than the 60.95% of the multi-scale convolutional neural network (MS-CNN), one of the best models. Conclusion: Deep features carry high-level semantic information and large receptive fields, while shallow features carry positional information and small receptive fields; fusing the two enhances the deep features with richer target position information. The fused multi-level feature maps contain varying degrees of detail and semantic information, which is effective for detecting pedestrians at different scales, so using the fused features improves pedestrian detection performance.
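The fusion step described in the Method section (nearest-neighbor 2x upsampling of the deeper feature map, then element-wise addition with the shallower map) is the standard FPN-style top-down pathway and can be sketched in a few lines; the array shapes here are illustrative:

```python
import numpy as np

def upsample2x_nearest(f):
    # nearest-neighbor upsampling: repeat each spatial element twice
    # along both spatial axes (channels-first layout assumed)
    return f.repeat(2, axis=-2).repeat(2, axis=-1)

def fuse_top_down(c_high, c_low):
    # add the upsampled high-level (semantic) map onto the low-level
    # (detail) map; crop in case of odd-size mismatch
    up = upsample2x_nearest(c_high)
    return c_low + up[..., :c_low.shape[-2], :c_low.shape[-1]]
```

Element-wise addition (rather than concatenation) keeps the channel count unchanged, so each fused level can feed the same region proposal network head.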


Copyright © Beijing Qinyun Technology Development Co., Ltd. 京ICP备09084417号