Similar literature
1.
Because salient objects usually account for only a small share of the data in a scene, class imbalance is a common problem in salient object detection (SOD). To address this issue and obtain consistent salient objects, we propose an adversarial focal loss network for RGB-D SOD (called AFLNet) based on an improved generative adversarial network. The color and depth branches constitute the generator, which produces the saliency map, and an adversarial branch with high-order potentials, used instead of a pixel-wise loss function, refines the output of the generator to capture contextual information about objects. We derive an adversarial focal loss function to address foreground–background class imbalance. To sufficiently fuse the high-level features of color and depth cues, an inception model is adopted in the deep layers. We conduct extensive experiments with the proposed model and its variants and compare them with state-of-the-art methods. Quantitative and qualitative results show that the proposed approach improves the accuracy of salient object detection and yields consistent objects.
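The abstract above does not give the form of its adversarial focal loss, so the following is only a minimal sketch of the standard binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), to illustrate how a focal term down-weights easy background pixels. The function names and the default values of `alpha` and `gamma` are assumptions for illustration, not taken from AFLNet.

```python
import math


def binary_focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Standard focal loss for one pixel (not the paper's adversarial variant).

    p: predicted foreground probability, y: ground-truth label (0 or 1).
    alpha and gamma are illustrative defaults, assumed here.
    """
    p = min(max(p, eps), 1.0 - eps)            # avoid log(0)
    p_t = p if y == 1 else 1.0 - p             # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_focal_loss(probs, labels, alpha=0.25, gamma=2.0):
    """Average focal loss over a flat list of pixels."""
    losses = [binary_focal_loss(p, y, alpha, gamma) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)
```

With `gamma=0` and `alpha=0.5` the expression reduces to half the ordinary cross-entropy; a larger `gamma` shrinks the contribution of pixels that are already classified well, which is the property that helps with foreground–background imbalance.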

2.
Depth maps have proven to be a useful supplement for salient object detection in recent years. However, most RGB-D salient object detection approaches ignore the fact that depth maps are often of low quality, which inevitably leads to unsatisfactory results. In this paper, we propose a depth cue enhancement and guidance network (DEGNet) for RGB-D salient object detection that enhances depth quality and uses depth cue guidance to generate predictions with highlighted objects and suppressed backgrounds. Specifically, a depth cue enhancement module is designed to generate high-quality depth maps by enhancing the contrast between the foreground and the background. Then, considering the different characteristics of unimodal RGB and depth features, we use different feature enhancement strategies to strengthen the representation capability of side-output unimodal features. Moreover, we propose a depth-guided feature fusion module that exploits the depth cues provided by the depth stream to guide the fusion of multi-modal features, making full use of the properties of each modality and thus generating discriminative cross-modal features. In addition, we aggregate cross-modal features at different levels with a pyramid feature shrinking structure to obtain the final prediction. Experimental results on six benchmark datasets demonstrate that the proposed DEGNet outperforms 17 state-of-the-art methods.

3.
Representing contextual features at multiple scales is important for RGB-D SOD. Recently, because advances in backbone convolutional neural networks (CNNs) have brought stronger multi-scale representation ability, many methods have achieved promising performance. However, most of them represent multi-scale features in a layer-wise manner, which ignores the fine-grained global contextual cues within a single layer. In this paper, we propose a novel global contextual exploration network (GCENet) to explore the performance gain of multi-scale contextual features in a fine-grained manner. Concretely, a cross-modal contextual feature module (CCFM) is proposed to represent multi-scale contextual features at a single fine-grained level, which enlarges the range of receptive fields for each network layer. Furthermore, we design a multi-scale feature decoder (MFD) that integrates the fused features from the CCFM in a top-down way. Extensive experiments on five benchmark datasets demonstrate that the proposed GCENet outperforms other state-of-the-art (SOTA) RGB-D SOD methods.

4.
朱佩佩  吴元  赖作镁 《电讯技术》2022,62(5):619-624
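In UAV target detection and recognition tasks, target size changes significantly with flight altitude. Conventional object detection models capture limited detail for small targets and therefore detect them with low accuracy, while real-time models designed for small targets tend to lose the background information of large targets, which lowers accuracy on large targets. To address these difficulties in multi-scale target detection and recognition, a real-time multi-scale detection and recognition model based on an improved Feature Pyramid Network (FPN) structure is proposed. The model adds feature pyramid levels to cover a wider range of target scales and obtain richer target information. It also uses cross connections to increase the diversity of feature fusion across scales, shorten the feature propagation distance, and preserve more complete scale features, thereby improving multi-scale detection and recognition performance. Experiments show that, compared with the original network structure and with a four-level feature pyramid structure using the same feature levels, the multi-scale detection model with the improved feature pyramid structure achieves better recognition performance.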

5.
LiDAR-based 3D object detection is important for scene perception in autonomous driving, but the point clouds produced by LiDAR are irregular and unstructured by nature and cannot be processed directly by conventional convolutional neural networks (CNNs). Recently, graph convolutional networks (GCNs) have been shown to be an effective way to handle non-Euclidean structured data, including point clouds. However, a GCN requires massive computation to search for adjacent nodes, and this heavy computational cost limits its use on large-scale LiDAR point clouds in autonomous driving. In this work, we adopt a frustum-based point cloud-image fusion scheme to reduce the number of LiDAR points, which makes GCN-based feature learning on large-scale LiDAR point clouds feasible. On this basis, we propose an efficient graph attentional network for 3D object detection in autonomous driving, which learns features directly from the raw LiDAR point cloud without any conversion. We evaluate the model on the public KITTI benchmark dataset. The 3D detection mAP is 63.72% on KITTI Cars, Pedestrians and Cyclists, and the inference speed reaches 7.9 fps on a single GPU, which is faster than other methods of the same type.
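The frustum-based reduction step mentioned above can be illustrated with a short sketch. It assumes, purely for illustration, that the points are already expressed in camera coordinates (x right, y down, z forward), that a pinhole model with hypothetical intrinsics `fx`, `fy`, `cx`, `cy` applies, and that a 2D detection box comes from an image detector; none of these details are given in the abstract.

```python
def points_in_frustum(points, box, fx, fy, cx, cy):
    """Keep the 3D points whose image projection falls inside a 2D box.

    points: iterable of (x, y, z) in camera coordinates (assumed).
    box: (u_min, v_min, u_max, v_max) in pixels, e.g. from a 2D detector.
    fx, fy, cx, cy: pinhole intrinsics (hypothetical values in any example).
    """
    u_min, v_min, u_max, v_max = box
    kept = []
    for x, y, z in points:
        if z <= 0.0:
            continue  # behind the camera, cannot project into the image
        u = fx * x / z + cx  # pinhole projection to pixel column
        v = fy * y / z + cy  # pinhole projection to pixel row
        if u_min <= u <= u_max and v_min <= v <= v_max:
            kept.append((x, y, z))
    return kept
```

Only the points returned here would be passed to the graph network, so the neighbor search runs on a small subset rather than on the full sweep.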

6.
7.
Cutting-edge RGB saliency models are prone to fail in some complex scenes, while RGB-D saliency models are often affected by inaccurate depth maps. Fortunately, light field images can provide a sufficient depiction of the spatial layout of 3D scenes. This paper therefore focuses on salient object detection in light field images and proposes a Similarity Retrieval-based Inference Network (SRI-Net). Because the focus points vary, not all focal slices extracted from light field images are beneficial for salient object detection. The key point of our model is thus to select the most valuable focal slice, which contributes the most complementary information to the RGB image. Specifically, we first design a focal slice retrieval module (FSRM) that chooses an appropriate focal slice by measuring the foreground similarity between the focal slice and the RGB image. Second, to combine the original RGB image and the selected focal slice, we design a U-shaped saliency inference module (SIM), in which a two-stream encoder extracts multi-level features and a decoder aggregates the multi-level deep features. Extensive experiments on two widely used light field datasets demonstrate the superiority and effectiveness of the proposed SRI-Net.

8.
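In real remote sensing images, targets are distributed in arbitrary orientations. The original YOLOv5 network has difficulty expressing target position and extent accurately, and its detection speed is only moderate. To address these problems, a rotated object detection model for remote sensing imagery, YOLOv5-Left-Rotation, is proposed. First, a Transformer self-attention mechanism is used so that the model pays more attention to targets of interest; Mosaic data augmentation is applied in image preprocessing, and an improved Non-Maximum Suppression algorithm is used in post-processing. Second, an angle loss function is introduced and the output dimension of the network is increased to obtain rotated-rectangle prediction boxes. Finally, a sliding-window branch is added in the shallow stage of the network to improve detection efficiency for sparse targets in large remote sensing images. The experimental datasets are the self-built aircraft dataset CASIA-plane78 and the public ship dataset HRSC2016. The results show that the improved rotated object detection algorithm raises average precision by 3.175% over the original YOLOv5 network and increases inference speed by 13.6% on large multispectral images acquired in push-broom mode by one of the Jilin-1 satellites. It reduces redundant background information as far as possible and more accurately detects the regions of densely arranged, irregularly distributed targets of interest in optical remote sensing images.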
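The abstract does not describe how its improved Non-Maximum Suppression differs from the baseline, so the sketch below shows only the standard greedy NMS for axis-aligned boxes as a point of reference. Handling rotated rectangles would require a rotated-IoU computation that is not attempted here; the function names and the threshold default are assumptions.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: visit boxes by descending score, drop those overlapping a kept box.

    Returns the indices of the kept boxes, highest score first.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

For densely packed, arbitrarily oriented targets, axis-aligned IoU tends to overestimate overlap between neighbors, which is one common motivation for rotated boxes and modified suppression rules.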

9.
Aggregating local and global contextual information by exploiting multi-level features in a fully convolutional network is a challenge for the pixel-wise salient object detection task. Most existing methods still suffer from inaccurate salient regions and blurry boundaries. In this paper, we propose a novel edge-aware global and local information aggregation network (GLNet) that fully exploits the integration of side-output local features with global contextual information and makes use of the contour information of salient objects. A global guidance module (GGM) is proposed to learn discriminative multi-level information under the direct guidance of global semantic knowledge for more accurate saliency prediction. Specifically, the GGM consists of two key components: the global feature discrimination module exploits the inter-channel relationship of global semantic features to boost representation power, and the local feature discrimination module enables different side-output local features to selectively learn informative locations by fusing with global attentive features. In addition, we propose an edge-aware aggregation module (EAM) that uses the correlation between salient edge information and salient object information to generate estimated saliency maps with explicit boundaries. We evaluate the proposed GLNet on six widely used saliency detection benchmark datasets and compare it with 17 state-of-the-art methods. Experimental results show the effectiveness and superiority of the proposed method on all six benchmark datasets.

10.
李维鹏  杨小冈  李传祥  卢瑞涛  黄攀 《红外与激光工程》2021,50(3):20200511-1-20200511-8
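To address the small scale of infrared datasets and the scarcity of labeled samples, a semi-supervised transfer learning method for infrared object detection networks is proposed. It is mainly intended to improve the training efficiency and generalization ability of detection networks on small-sample infrared datasets, and to improve the adaptability of deep learning models in scenarios with few training samples, such as infrared object detection. The paper first explains the role of unlabeled samples in improving generalization and suppressing overfitting when labeled samples are scarce. It then proposes a semi-supervised transfer learning pipeline for infrared object detection networks: a model is pre-trained on a large RGB image dataset and then fine-tuned through semi-supervised learning with a small number of labeled infrared images together with unlabeled infrared images. In addition, a pseudo-supervised loss function weighted by feature similarity is proposed, in which the predictions for samples in the same batch serve as labels for one another, so as to make full use of the feature distribution of similar targets in unlabeled images. To reduce the computational cost of semi-supervised training, each target uses only the predicted targets within the neighborhood of its feature vector as pseudo-labels when the pseudo-supervised loss is computed. Experimental results show that detection networks trained with the proposed method achieve higher test accuracy than those obtained by supervised transfer learning, with a gain of 1.1% on Faster R-CNN and a significant gain of 4.8% on YOLO-v3, which verifies the effectiveness of the proposed method.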

11.
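To address the low accuracy on small targets and the noise interference in deep-learning-based three-dimensional (3D) object detection from light detection and ranging (LiDAR) point clouds, a 3D point cloud object detection method based on a cross self-attention mechanism, CSA-RCNN (cross self-attention region convolutional neural network), is proposed. Cross self-attention (CSA) is used to learn the coordinates and the features of the point cloud simultaneously, and a multi-scale fusion (MF) module is designed to adaptively capture multi-scale features at each level. In addition, an overlapping sampling strategy is designed to selectively resample target regions of interest and obtain more foreground points, which effectively reduces noisy sampling. Tests on the widely used KITTI dataset show that the method substantially improves detection accuracy for small targets such as pedestrians, and that its mean average precision exceeds that of four classical algorithms, including PointRCNN, significantly improving 3D point cloud object detection performance.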

12.
13.
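To address the difficulty of improving both the accuracy and the speed of pedestrian detection in complex road scenes, a lightweight pedestrian detection algorithm that fuses multi-scale information with cross-dimensional feature guidance is proposed. First, with the high-performance detector YOLOX as the base framework, multi-scale lightweight convolutions are constructed and embedded in the backbone network to obtain multi-scale feature information. Then, an end-to-end lightweight feature-guided attention module is designed, which fuses spatial and channel information through cross-dimensional channel weighting and guides the model to focus on the visible regions of pedestrians. Finally, to reduce the loss of feature information during model lightweighting, a feature fusion network is built from depthwise separable convolutions with enlarged receptive fields. Experimental results show that, compared with other mainstream detection algorithms, the proposed algorithm reaches a detection accuracy of 71.03% and a detection speed of 80 FPS on the KITTI dataset, and offers good robustness and real-time performance in scenes with complex backgrounds, dense occlusion, and varying scales.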
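To show why depthwise separable convolutions suit a lightweight design, the sketch below compares weight counts for a standard convolution and a depthwise separable one (a k x k depthwise step followed by a 1 x 1 pointwise step). It ignores biases and says nothing about the specific enlarged-receptive-field layers in the paper; the function names and the example sizes are assumptions.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k conv plus a 1 x 1 pointwise conv (bias ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def reduction_ratio(k, c_in, c_out):
    """Separable / standard weight ratio, equal to 1/c_out + 1/k**2."""
    return depthwise_separable_params(k, c_in, c_out) / standard_conv_params(k, c_in, c_out)
```

For a hypothetical layer with k = 3, 64 input channels and 128 output channels, the standard convolution has 73,728 weights and the separable version 8,768, roughly 12% of the original.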

14.
Most current salient object detection (SOD) methods focus on well-lit scenes, and their performance drops when they are applied to low-light scenes because of limitations such as blurred boundaries and low contrast. To solve this problem, we propose a global guidance-based integration network (G2INet) customized for low-light SOD. First, we propose a Global Information Flow (GIF) to extract comprehensive global information for guiding the fusion of multi-level features. To facilitate information integration, we design a Multi-level features Cross Integration (MCI) module, which progressively fuses low-level details, high-level semantics, and global information by interweaving them. Furthermore, a U-shaped Attention Refinement (UAR) module is proposed to further refine edges and details for accurate saliency predictions. Extensive experimental results demonstrate that our method outperforms twelve existing state-of-the-art models in terms of five metrics.

15.
Dynamic object detection is essential for ensuring safe and reliable autonomous driving. Recently, light detection and ranging (LiDAR)-based object detection has been introduced and has shown excellent performance on various benchmarks. Although LiDAR sensors estimate distance with excellent accuracy, they lack texture and color information and have a lower resolution than conventional cameras. In addition, because of the domain gap, performance degrades when a LiDAR-based object detection model is applied to different driving environments or when sensors from different LiDAR manufacturers are used. To address these issues, a sensor-fusion-based object detection and classification method is proposed. The proposed method operates in real time, making it suitable for integration into autonomous vehicles. It performs well on our custom dataset and on publicly available datasets, demonstrating its effectiveness in real-world road environments. In addition, we will make available a novel three-dimensional moving object detection dataset called ETRI 3D MOD.

16.
In recent years, Wireless Sensor Networks (WSNs) have been applied successfully to both civil and military tasks. However, sensor networks are susceptible to multiple types of attacks because they are randomly deployed in open and unprotected environments. Effective mechanisms are therefore needed to protect sensor networks against multiple types of attacks on routing protocols. In this paper, we propose a lightweight, integrated intrusion detection framework for clustered sensor networks. Furthermore, we provide algorithms that minimize the number of triggered intrusion modules in clustered WSNs by using an over-hearing mechanism to reduce the number of alert packets sent. Our scheme can prevent most routing attacks on sensor networks. In in-depth simulations, the proposed scheme consumes less energy for intrusion detection than other schemes. Copyright © 2009 John Wiley & Sons, Ltd.

17.
曾婧  吴宏刚  张翔 《电讯技术》2017,57(11):1283-1288
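To improve the accuracy of moving object detection, a new moving object detection method that incorporates predictive oversampling is proposed. First, the target shape in the current frame is predicted using the two-dimensional Fourier transform, and the shape similarity is computed. Then, a number of reference frames are selected from historical detection results, the optical flow method is used to track the trajectories of target pixels between the reference frames and the current frame, and dense oversampling is performed within the sampling interval with the pixel trajectories as reference. Finally, a foreground model is built from the oversampled samples, and the foreground and background models are used jointly within a graph-cut framework to detect the target. Experiments on public and self-collected datasets show that the proposed method effectively improves detection accuracy compared with classical moving object detection algorithms.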

18.
Robotic soccer is nowadays a popular research domain in the area of multi-robot systems. In the context of RoboCup, the Middle Size League is one of the most challenging leagues. This paper presents an efficient omnidirectional vision system for real-time object detection, developed for CAMBADA, the robotic soccer team of the University of Aveiro. The vision system is used to find the ball and the white lines, which are used for self-localization, as well as to detect obstacles. Algorithms for detecting these objects and for calibrating most of the parameters of the vision system are presented in this paper. We also propose an efficient approach for detecting arbitrary FIFA balls, which is an important research topic in the Middle Size League. The experimental results show the effectiveness of our algorithms in terms of both accuracy and processing time, as do the results that the team has been achieving: 1st place in RoboCup 2008, 3rd place in 2009, and 1st place in the mandatory technical challenge in RoboCup 2009, where the robots have to play with an arbitrary standard FIFA ball.

19.
陈哲  王慧斌  沈洁  徐立中 《通信学报》2013,34(3):192-198
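The difficulty and complexity of image modeling and parameter estimation seriously degrade the performance of underwater object detection algorithms. Inspired by the visual information processing mechanisms of underwater organisms, a new bio-inspired information-fusion object detection method based on light intensity, spectrum, and polarization is proposed for the particular underwater optical environment. It fuses features adaptively according to the acquired underwater optical prior knowledge. The algorithm dispenses with cumbersome image preprocessing and achieves reliable detection results at low computational complexity.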

20.
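Agricultural diseases cause early defoliation and weaken photosynthesis, which lowers crop quality and reduces farmers' income. To address the false detections caused by small targets, complex backgrounds, and large variations in outdoor lighting during the early stage of disease, this paper proposes a YOLOv4 detection algorithm that incorporates a lightweight network. First, the backbone network is pruned and multi-scale grouped convolutions are added to make the model more robust to complex backgrounds. Second, a lightweight SCE (space channel expand) attention mechanism is designed to reduce the effect of detail loss in deep layers. Finally, a jump connection feature pyramid network (JC-FPN) is designed to replace the PAnet (path aggregation network) feature fusion module, making the model still more lightweight. Experimental results show that the improved algorithm reaches an mAP50 of 84.17% on the dataset used in this paper at a detection speed of 50 FPS, improvements of 0.71% and 10 FPS over the YOLOv4 detection algorithm, which meets the accuracy and speed requirements for agricultural disease detection on mobile devices.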
