Similar literature (20 articles found)
1.
Crowd counting is a challenging task in computer vision owing to scale variations, perspective distortions, and complex backgrounds. Existing research usually adopts dilated convolution networks to enlarge the receptive field and thereby handle scale variations. However, these methods easily bring background information into the large receptive fields and so generate poor-quality density maps. To address this problem, we propose a novel backbone called Context-guided Dense Attentional Dilated Network (CDADNet). CDADNet contains three components: an attentional module, a context-guided module, and a dense attentional dilated module. The attentional module provides attention maps that remove background information, while the context-guided module extracts multi-scale contextual information. The dense attentional dilated module generates high-granularity density maps, and a cascaded strategy preserves information across changing scales. To verify the feasibility of our method, we compare it with existing approaches on five crowd counting datasets (ShanghaiTech (Part_A and Part_B), WorldEXPO’10, UCSD, UCF_CC_50). The comparison results demonstrate that CDADNet is effective and robust across various scenes.
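
As a rough illustration of why dilated convolutions enlarge the receptive field, the sketch below computes the receptive field along one axis for a stack of stride-1 convolutions. The function name and the dilation schedules are illustrative assumptions, not taken from CDADNet.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (in pixels, along one axis) of stacked stride-1 convolutions."""
    rf = 1
    for d in dilations:
        # A k-tap kernel with dilation d spans (k - 1) * d + 1 input positions,
        # so each layer widens the receptive field by (k - 1) * d.
        rf += (kernel_size - 1) * d
    return rf


# Hypothetical comparison: same depth and kernel size, different dilation rates.
plain = receptive_field(3, [1, 1, 1])    # three ordinary 3x3 layers
dilated = receptive_field(3, [1, 2, 4])  # dilation doubles at each layer
print(plain, dilated)
```

With the same number of parameters, the dilated stack sees a wider context, which is also why it can pull more background into each output pixel.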

2.
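To address the difficulty of crowd counting caused by high crowd density, severe occlusion, and uneven distribution, this paper proposes a crowd counting model based on a deep adversarial network, consisting of a generator network and a discriminator network. First, the first ten layers of VGG16, which offer good transferability and feature extraction ability, serve as the front-end module for preliminary feature extraction. Then, to cope with severe occlusion and uneven crowd distribution, a purpose-designed deep dilated convolution module aggregates crowd information and fuses shallow and deep head features to improve the network's adaptability to crowds. In this process, dilated convolutions replace conventional convolutional layers so that features are extracted without loss of image resolution. Finally, the generated density map and the ground-truth density map are fed into the discriminator network, with the aim of producing density maps that more closely resemble the ground truth and thereby improving counting accuracy. Experimental results show that the proposed method performs better than other methods in both objective metrics and subjective visual quality.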

3.
沈宁静  袁健 《电子科技》2022,35(6):6-12
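Existing crowd counting algorithms use multi-column fusion structures to handle the multi-scale problem within a single image, but this approach cannot make effective use of low-level feature information, which leads to inaccurate counts. To address this shortcoming, the paper proposes a crowd counting algorithm based on residual dense connections and attention fusion. The front end uses an improved VGG16 network to extract low-level feature information. The main branch of the back end is built on a residual dense connection structure, combining residual and dense networks to capture feature information between layers and thus capture multi-scale information efficiently. The side branch introduces an attention mechanism to generate attention maps at the corresponding scales, effectively separating background from foreground in the feature maps and reducing the influence of background noise. The algorithm is validated on three mainstream public datasets. Experimental results show that it counts effectively and achieves higher counting accuracy than other algorithms.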

4.
Crowd counting has become a hot topic because of its wide applications in video surveillance and public security. However, one main problem of deep learning methods for crowd counting is that the location information about the crowd is degraded irreversibly by the spatial down-sampling of convolutional neural networks, which lowers the quality of the generated density maps. To remedy this problem, we propose an attention guided feature pyramid network (AG-FPN) for crowd counting, which adaptively generates a high-quality density map with accurate spatial locations by combining high- and low-level features. An attention block is added to each encoder layer to further emphasize the crowd regions and suppress background clutter during feature extraction. Experimental results on the ShanghaiTech, UCF_CC_50, WorldExpo’10 and UCF-QNRF datasets demonstrate the superiority of the proposed method over state-of-the-art approaches.

5.
Crowd counting is a challenging task, partly because of the multiscale variation and perspective distortion of crowd images. To solve these problems, an improved deep multiscale crowd counting network with perspective awareness is proposed. The network contains two branches. One branch uses an improved ResNet50 network to extract multiscale features, and the other extracts perspective information using a perspective-aware network formed by fully convolutional networks. The proposed network structure improves the counting accuracy when the crowd scale changes and reduces the influence of perspective distortion. To accommodate various crowd scenarios, data-driven approaches are used to fine-tune the trained convolutional neural network (CNN) model on the target scenes. Extensive experiments on three public datasets demonstrate the validity and reliability of the proposed method.

6.
吴宏林  陈稳  汤辉 《信号处理》2021,37(11):2193-2199
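Channel estimation is a key component of wireless communication and has become a research focus in recent years. For orthogonal frequency division multiplexing (OFDM) systems, where conventional channel estimation algorithms struggle to meet the communication requirements of complex scenarios and are strongly affected by noise, this paper proposes a deep learning method for channel estimation based on a deconvolution network and a dilated convolution network. The method exploits channel correlation to build a lightweight deconvolution network that uses only a few deconvolution layers to progressively perform channel interpolation and estimation, achieving good channel estimation at low complexity. To improve estimation performance, a dilated convolution network is further built to suppress channel noise and increase estimation accuracy. Simulation results show that, under different signal-to-noise ratios, the proposed method achieves lower estimation error than conventional methods while keeping complexity low.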

7.
Crowd density estimation in wide areas is a challenging problem for visual surveillance. Because of the high risk of degeneration, the safety of public events involving large crowds has always been a major concern. In this paper, we propose a video-based crowd density analysis and prediction system for wide-area surveillance applications. In monocular image sequences, the Accumulated Mosaic Image Difference (AMID) method is applied to extract crowd areas having irregular motion. The specific number of persons and the velocity of a crowd can be adequately estimated by our system from the density of crowded areas. Using a multi-camera network, we can obtain predictions of a crowd’s density several minutes in advance. The system has been used in real applications, and numerous experiments conducted in real scenes (station, park, plaza) demonstrate the effectiveness and robustness of the proposed method.

8.
雷翰林  张宝华 《激光技术》2019,43(4):476-481
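To avoid interference from depth of field and occlusion and to improve crowd counting accuracy, three models (LeNet-5, AlexNet, and VGG-16) are used to extract features of targets at different depths of field; their kernel sizes and network structures are adjusted, and the models are fused. This yields a deep convolutional neural network based on multi-model fusion, in which the last two layers use convolutional layers with 1×1 kernels in place of conventional fully connected layers to integrate the extracted feature maps and output a density map. This greatly reduces the number of network parameters while improving results, balancing efficiency and accuracy. Theoretical analysis and experimental verification are carried out. The results show that, on the two subsets of the public ShanghaiTech crowd counting dataset and on the UCF_CC_50 dataset, the mean absolute error and mean squared error of the proposed method are 97.99 and 158.02, 23.36 and 41.86, and 354.27 and 491.68, respectively, outperforming existing conventional crowd counting methods. Transfer experiments show that the proposed model generalizes well. The study is helpful for improving crowd counting accuracy.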
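
For readers unfamiliar with the two metrics quoted above, here is a minimal sketch of how they are commonly computed over per-image counts. Many crowd counting papers report the root of the mean squared error under the name "MSE"; whether this abstract follows that convention is an assumption, and the counts below are made up for illustration.

```python
import math


def mae(predicted, actual):
    """Mean absolute error between predicted and true per-image counts."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root of the mean squared error (often labeled 'MSE' in crowd counting)."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


# Hypothetical counts for three test images (not data from the paper).
predicted_counts = [102.0, 48.0, 310.0]
actual_counts = [100.0, 50.0, 300.0]

print(mae(predicted_counts, actual_counts))
print(rmse(predicted_counts, actual_counts))
```

The squared-error metric penalizes the single large miss (10 people) more heavily than the absolute-error metric does, which is why papers usually report both.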

9.
The human visual system can rapidly identify important visual information and redirect attention to it in highly complex scenes such as human crowds. Saliency prediction in human crowd scenes uses computer vision techniques to imitate the human visual system, predicting which areas of a crowd scene may attract human attention. However, identifying the factors that attract human attention is challenging because of the high complexity of crowd scenes. In this work, we propose Multiscale DenseNet - Dilated and Attention (MSDense-DAt), a convolutional neural network (CNN) that uses self-attention to integrate the result of knowledge-driven gaze in the human visual system and identify salient areas in human crowd scenes. Our method combines several state-of-the-art deep learning components to deal with the high complexity of crowd images: multiscale DenseNet for multiscale deep feature extraction, self-attention, and dilated convolution. The effectiveness of each component in our CNN architecture is then evaluated by comparing different combinations of components. Finally, the proposed method is evaluated at different crowd density levels to assess the effect of crowd density on model performance.

10.
Crowd counting algorithms have recently incorporated attention mechanisms into convolutional neural networks (CNNs) and achieved significant progress. The channel attention model (CAM), a popular attention mechanism, calculates a set of probability weights to select important channel-wise feature responses. However, most CAMs coarsely assign one weight to the entire channel-wise map, so useful and useless information are treated indiscriminately, which limits the representational capacity of the network. In this paper, we propose a multi-scale and spatial position-based channel attention network (MS-SPCANet), which integrates spatial position-based channel attention models (SPCAMs) at multiple scales into a CNN. SPCAM assigns different channel attention weights to different positions of the channel-wise maps to capture more informative features. Furthermore, an adaptive loss, which uses adaptive coefficients to combine density map loss and headcount loss, is constructed to improve network performance in sparse crowd scenes. Experimental results on four public datasets verify the superiority of the scheme.

11.
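In supervised speech enhancement, contextual information has an important effect on the estimation of the target speech. To capture richer global speech features with as few parameters as possible, this paper designs a new convolutional network for speech enhancement. The network consists of an encoder, a transfer layer, and a decoder. In the encoder and decoder, a 2-D asymmetric dilated residual (2D-ADR) module is proposed, which markedly reduces the number of trainable parameters and enlarges the receptive field, improving the network's ability to capture context. In the transfer layer, a 1-D gated dilated residual (1D-GDR) module is proposed; it combines dilated convolution, residual learning, and a gating mechanism to pass features selectively and capture more temporal correlations. Eight 1D-GDR modules are stacked with dense skip connections to strengthen information flow between layers and provide more paths for gradient propagation. Finally, skip connections with an attention mechanism link corresponding encoder and decoder layers so that decoding receives more robust low-level features. In the experiments, different parameter settings and comparison methods are used to verify the effectiveness and robustness of the network. Trained and tested under 28 noise conditions, the proposed method achieves better objective and subjective metrics than other methods with 1.25×10^6 parameters, showing strong enhancement and generalization ability.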
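
The abstract names three ingredients of the 1D-GDR module (dilated convolution, residual learning, gating) without giving its exact layout. The sketch below shows one common way to combine them, a tanh feature path multiplied by a sigmoid gate and added back to the input. That particular arrangement, the function names, and the weights are assumptions for illustration, not the authors' design.

```python
import math


def dilated_conv1d(x, weights, dilation):
    """Same-length 1-D dilated convolution with zero padding (odd kernel length)."""
    half = (len(weights) // 2) * dilation
    out = []
    for i in range(len(x)):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = i - half + j * dilation  # taps are spaced `dilation` samples apart
            if 0 <= idx < len(x):
                acc += w * x[idx]
        out.append(acc)
    return out


def gated_dilated_residual(x, feature_weights, gate_weights, dilation):
    """Residual block: x + tanh(feature path) * sigmoid(gate path)."""
    features = dilated_conv1d(x, feature_weights, dilation)
    gates = dilated_conv1d(x, gate_weights, dilation)
    return [
        xi + math.tanh(f) * (1.0 / (1.0 + math.exp(-g)))
        for xi, f, g in zip(x, features, gates)
    ]


# Hypothetical input frame sequence and kernels.
signal = [1.0, 2.0, 3.0, 4.0, 5.0]
output = gated_dilated_residual(signal, [0.0, 1.0, 0.0], [0.0, 0.0, 0.0], dilation=2)
print(output)
```

Because of the residual path, a block whose feature path outputs zero simply passes its input through, which is part of what makes deep stacks of such blocks easier to train.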

12.
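Crowd density estimation is an important part of intelligent crowd monitoring and plays an important role in public security, management and control, and business decision-making. For hospital scenarios, the paper adopts a block-based approach: for each sub-image, the number of people is analyzed quantitatively using pixel features with least-squares line fitting, and the density is analyzed qualitatively using the gray-level co-occurrence matrix with a support vector machine. This yields the head count and density distribution for each sub-image and for the whole image. Experiments show that the method effectively improves the accuracy of crowd density estimation and can also locate local density anomalies precisely.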
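
The quantitative step described above, fitting a straight line from a pixel feature to the number of people, can be sketched with ordinary least squares. Which pixel feature the paper uses is not stated in the abstract; the foreground pixel count and the training pairs below are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training pairs: foreground pixels in a sub-image vs. people counted by hand.
foreground_pixels = [1000, 2000, 3000, 4000]
people = [3, 5, 7, 9]

slope, intercept = fit_line(foreground_pixels, people)
estimate = slope * 2500 + intercept  # predicted head count for a new sub-image
print(slope, intercept, estimate)
```

A linear fit of this kind tends to hold only for sparse scenes, since occlusion breaks the proportionality between pixels and people as density rises; this is consistent with the paper switching to a texture-based classifier for the density level.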

13.
For reasons of public security, modeling large crowd distributions for counting or density estimation has attracted significant research interest in recent years. Existing crowd counting algorithms rely on predefined features and regression to estimate the crowd size. However, most of them are constrained by the following limitations: (1) they can handle crowds of a few tens of individuals, but for crowds of hundreds or thousands they can only estimate the crowd density rather than the crowd count; (2) they usually rely on the temporal sequence in crowd videos, which is not available for still images. To address these problems, we investigate a deep-learning approach for estimating the number of individuals present in a mid-level or high-level crowd visible in a single image. First, a ConvNet structure is used to extract crowd features. Then two supervisory signals, crowd count and crowd density, are employed to learn crowd features and estimate the specific count. We test our approach on a dataset containing 107 crowd images with 45,000 annotated humans, with head counts per image ranging from 58 to 2201. The efficacy of the proposed approach is demonstrated in extensive experiments that quantify the counting performance with multiple evaluation criteria.

14.
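Existing crowd counting methods are not fully applicable to rail transit scenes. To address this, a crowd counting model based on a convolutional neural network is proposed. The model uses VGG16 as the front-end network to extract shallow features. An M-Inception structure, an improvement on the Inception structure, is combined with dilated convolution to form the back-end network, enlarging the receptive field and adapting to pedestrians of different sizes seen from multiple surveillance angles. A weighted loss function that fuses the loss on the estimated total number of pedestrians with the density map loss is also proposed. The model is compared experimentally with four existing models; the results show that the proposed algorithm achieves a mean absolute error of only 1.46 and a mean squared error of only 2.13 in subway scenes, better than the four comparison models. With practical use in mind, the model is deployed on a HiSilicon embedded chip; measurements show that it achieves high computation speed and accuracy on the chip, meeting the needs of real application scenarios.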
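
A weighted loss of the kind described above can be sketched as a per-pixel density term plus a weighted count term. The abstract does not give the exact form or the weight, so the squared-error density term, the absolute-error count term, and the default `count_weight` of 0.1 are all assumptions.

```python
def density_loss(predicted_map, target_map):
    """Mean squared per-pixel error between two density maps (lists of rows)."""
    total, n = 0.0, 0
    for pred_row, target_row in zip(predicted_map, target_map):
        for p, t in zip(pred_row, target_row):
            total += (p - t) ** 2
            n += 1
    return total / n


def count_loss(predicted_map, target_map):
    """Absolute difference between the total counts implied by the two maps."""
    return abs(sum(map(sum, predicted_map)) - sum(map(sum, target_map)))


def combined_loss(predicted_map, target_map, count_weight=0.1):
    """Density map loss plus a weighted total-count loss (weight is hypothetical)."""
    return density_loss(predicted_map, target_map) + count_weight * count_loss(
        predicted_map, target_map
    )


# Toy 2x2 maps: the first prediction misplaces density, the second overcounts.
target = [[1.0, 0.0], [0.0, 1.0]]
print(combined_loss([[0.5, 0.5], [0.0, 1.0]], target))
print(combined_loss([[1.0, 1.0], [1.0, 1.0]], target))
```

The count term adds a penalty that the per-pixel term alone would understate when small errors at many pixels add up to a large error in the total.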

15.
Crowd Density Estimation in Scene Surveillance   Total citations: 3 (self-citations: 0; citations by others: 3)
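Crowd density estimation is a very important part of intelligent crowd monitoring and is of great significance for public safety. This paper proposes a crowd density feature extraction method based on the wavelet transform and the gray-level co-occurrence matrix, and then uses a support vector machine to estimate the crowd density level. Experimental results show that the proposed method is feasible.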
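
To make the texture feature concrete, the sketch below builds a gray-level co-occurrence matrix for one pixel offset and derives two standard descriptors from it, contrast and energy. The wavelet decomposition and the support vector machine stages of the paper are omitted, and the tiny 4-level image is a toy input, not data from the paper.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for the pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                # Count how often gray level a is followed by gray level b.
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """Large when co-occurring gray levels differ strongly (busy texture)."""
    return sum(p[i][j] * (i - j) ** 2 for i in range(len(p)) for j in range(len(p)))


def energy(p):
    """Large when a few gray-level pairs dominate (uniform texture)."""
    return sum(v * v for row in p for v in row)


# Toy image quantized to 4 gray levels.
toy_image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
matrix = glcm(toy_image, levels=4)
print(contrast(matrix), energy(matrix))
```

Descriptors like these, computed for several offsets, would form the feature vector that a classifier maps to a density level; dense crowds tend to produce finer texture and therefore higher contrast.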

16.
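Dense crowd counting is a classic problem in computer vision that is still limited by non-uniform scale, noise, and occlusion. This paper proposes a dense crowd counting method based on a new multi-scale attention mechanism. The deep network comprises a backbone network, a feature extraction network, and a feature fusion network. The feature extraction network contains a feature branch and an attention branch and uses a new multi-scale module composed of parallel convolution kernels, which better captures crowd features at different scales and so adapts to the non-uniform scale of dense crowds. The feature fusion network uses an attention fusion module to enhance the output features of the feature extraction network, effectively fusing attention features with image features and improving counting accuracy. Experiments on public datasets including ShanghaiTech, UCF_CC_50, Mall, and UCSD show that the proposed method outperforms existing methods on both MAE and MSE.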

17.
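To avoid the interference caused by mutual occlusion among sheep in conventional flock counting and to improve counting accuracy, a flock counting method based on VDNet, a neural network that combines Visual Geometry Group (VGG-16) with dilated convolution (DC), is adopted. The front end uses a VGG-16 network with the fully connected layers removed to extract 2-D features, and the back end uses six DC layers with different dilation rates to extract more high-level features. DC enlarges the receptive field while keeping the resolution unchanged, replaces pooling operations, and reduces network complexity. Finally, a convolutional layer with a 1×1 kernel outputs a high-quality density map, and the number of sheep in the input image is obtained by integrating over the pixels of the density map. Theoretical analysis and experimental verification are carried out. The results show that VDNet achieves a mean absolute error of 2.51, a mean squared error of 3.74, and an average accuracy of 93%. This result is helpful for flock counting tasks.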
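
The final step above, reading a count off a density map by integrating its pixels, works because each annotated object contributes a blob that sums to one. The sketch below builds such a map from point annotations with normalized Gaussians and then sums it. The fixed `sigma`, the map size, and the annotation coordinates are illustrative assumptions; the paper's own ground-truth generation is not described in the abstract.

```python
import math


def gaussian_density_map(height, width, points, sigma=1.5):
    """Density map with one Gaussian per annotated point, each normalized to sum to 1."""
    density = [[0.0] * width for _ in range(height)]
    for point_row, point_col in points:
        kernel = [
            [
                math.exp(-((r - point_row) ** 2 + (c - point_col) ** 2) / (2 * sigma ** 2))
                for c in range(width)
            ]
            for r in range(height)
        ]
        # Normalize over the image so objects near the border still count as one.
        norm = sum(map(sum, kernel))
        for r in range(height):
            for c in range(width):
                density[r][c] += kernel[r][c] / norm
    return density


def count_from_density(density):
    """Integrate (sum) the density map to obtain the estimated count."""
    return sum(map(sum, density))


# Hypothetical annotations as (row, column) positions in a 10x10 image.
annotations = [(2, 3), (5, 5), (7, 1)]
density_map = gaussian_density_map(10, 10, annotations)
estimated_count = count_from_density(density_map)
print(estimated_count)
```

A network trained to regress such maps therefore yields both a count (the sum) and a rough spatial distribution (the map itself), which helps when individuals overlap.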

18.
While some denoising methods based on deep learning achieve superior results on synthetic noise, they are far from able to handle photographs corrupted by realistic noise. Denoising real-world noisy images is more challenging because the source of the noise is more complicated than that of synthetic noise. To address this issue, we propose a novel network comprising a noise estimation module and a removal module (NERNet). The noise estimation module automatically estimates the noise level map from the information extracted by a symmetric dilated block and a pyramid feature fusion block. The removal module focuses on removing the noise from the noisy input with the help of the estimated noise level map. A dilation selective block with an attention mechanism in the removal module adaptively fuses features from convolution layers with different dilation rates and also aggregates global and local information, which helps preserve more details and textures. Experiments on two datasets of synthetic noise and three datasets of realistic noise show that NERNet achieves competitive results in comparison with other state-of-the-art methods.

19.
Objects that occupy a small portion of an image or a frame contain fewer pixels and carry less information, which makes small object detection a challenging task in computer vision. In this paper, an improved Single Shot multi-box Detector based on feature fusion and dilated convolution (FD-SSD) is proposed to address the difficulty of detecting small objects. The proposed network uses VGG-16 as the backbone network and mainly includes a multi-layer feature fusion module and a multi-branch residual dilated convolution module. In the multi-layer feature fusion module, the last two layers of the feature map are up-sampled and then concatenated at the channel level with the shallow feature map to enhance the semantic information of the shallow feature map. In the multi-branch residual dilated convolution module, three dilated convolutions with different dilation rates, built on the residual network, are combined to obtain multi-scale context information without losing the original resolution of the feature map. In addition, deformable convolution is added to each detection layer to better adapt to the shape of small objects. The proposed FD-SSD achieves 79.1% mAP on the PASCAL VOC2007 dataset and 29.7% mAP on the MS COCO dataset. Experimental results show that FD-SSD effectively improves the utilization of multi-scale information for small objects and thus significantly improves small object detection.

20.
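Because the fast super-resolution convolutional neural network (FSRCNN) algorithm has few convolutional layers and the feature information of adjacent convolutional layers lacks correlation, it is difficult to extract deep image information, which leads to poor super-resolution reconstruction. To address this problem, this paper proposes a super-resolution reconstruction method based on a deep residual network with multi-level skip connections. First, residual blocks with multi-level skip connections are designed, and a deep residual network is built from them, solving the lack of correlation between the feature information of adjacent convolutional layers. Then, the network is trained using stochastic gradient descent (SGD) with an adjustable learning rate schedule to obtain the super-resolution reconstruction model. Finally, a low-resolution image is fed into the model, the residual blocks with multi-level skip connections produce the predicted residual features, and the residual image is combined with the low-resolution image to produce the high-resolution image. The method is compared with the bicubic, A+, SRCNN, FSRCNN, and ESPCN algorithms on the Set5 and Set14 test sets and outperforms them in both visual quality and evaluation metrics.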
