Similar literature
 20 similar documents found (search time: 31 ms)
1.
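In recent years, attention mechanisms have performed well in person re-identification, but the benefit of combining different types of attention (such as spatial attention and self-attention) still leaves room for improvement. This paper therefore first proposes an improved convolutional block attention module (CBAM-Pro) and then a multi-type feature network model. CBAM-Pro is integrated with a self-attention mechanism to extract features from different attention domains, and local features at different partition granularities are introduced, so that the combined features are used jointly for person re-identification. Experiments on existing general benchmark datasets verify the effectiveness and reliability of the proposed model.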

2.
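Du Peng, Song Yonghong, Zhang Xinyao 《Acta Automatica Sinica》 2022, 48(6): 1457-1468
Person re-identification is a core technology for multi-target cross-camera tracking and can be widely applied in security, intelligent video surveillance, criminal investigation, and related fields. Challenges in general person re-identification include low camera resolution, pose variation, illumination changes, pedestrian detection errors, and occlusion. Compared with the general problem, cross-modality person re-identification adds the variation between different modalities of the same person. To address this modality variation, this paper proposes a self-attention modality fusion network. First, CycleGAN is used to generate cross-modality images. A cross-modality learning network then learns the features of both modalities simultaneously: images from the original dataset are trained with supervision using a SoftMax loss, and the generated cross-modality images are trained with supervision using an LSR (label smooth regularization) loss. Next, a self-attention module distinguishes the original images from the CycleGAN-generated images and automatically filters the features of the cross-modality learning network at the channel level. Finally, a modality fusion module fuses the two sets of filtered features. Experiments on the cross-modality dataset SYSU-MM01 show that the proposed method achieves a certain degree of performance improvement over other cross-modality person re-identification methods.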
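The difference between a plain SoftMax (cross-entropy) loss and a label-smoothed loss can be illustrated with a minimal sketch. It assumes the common formulation in which the one-hot target is mixed with a uniform distribution; the smoothing weight `eps` and all function names are illustrative assumptions, not values or code from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target_dist):
    """Cross-entropy between a target distribution and softmax(logits)."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target_dist, probs) if t > 0)

def one_hot(label, num_classes):
    """Hard target used by the plain SoftMax loss."""
    return [1.0 if k == label else 0.0 for k in range(num_classes)]

def smoothed_target(label, num_classes, eps=0.1):
    """Label-smoothed target: (1 - eps) * one_hot + eps / K (eps is an assumed value)."""
    return [(1.0 - eps) * t + eps / num_classes for t in one_hot(label, num_classes)]
```

With `eps=0` the smoothed target reduces to the one-hot target, so the two losses coincide; a larger `eps` penalizes over-confident predictions, which is one plausible reason to apply it to generated images whose labels are less reliable.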

3.
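To address the difficulty of steganalysis for adaptive image steganography and the inability of existing models to analyze the favorable regions of an image in a targeted way, an image steganalysis model based on the self-attention mechanism (self-attention steganalysis residual network, SA-SRNet) is proposed. The model introduces self-attention into SRNet (steganalysis residual network), guiding the model to focus on the regions across the whole image that are favorable for steganalysis and on long-range dependencies within the image, which solves the problem that hard attention mechanisms easily fall into local optima during training. First, a reward mechanism uses reinforcement learning so that the model finds the detection points most favorable for steganalysis. Second, the self-attention mechanism generates attention-focused images from the detection points. Finally, a replacement mechanism replaces misclassified images with the attention-focused images, improving the quality of the training set and the discriminative ability of the model. Experiments on the BOSSbase 1.01 dataset show that SA-SRNet achieves better steganalysis accuracy than SRNet, with an improvement of up to 1.8%.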

4.
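Pedestrian appearance attributes are important semantic information for distinguishing between pedestrians. Pedestrian attribute recognition plays a vital role in intelligent video surveillance, helping to filter and retrieve target pedestrians quickly. In person re-identification, attribute information can be used to obtain fine-grained feature representations and thereby improve re-identification results. This paper combines pedestrian attribute recognition with person re-identification in search of a way to improve re-identification performance, and proposes a person re-identification framework based on feature localization and fusion. First, multi-task learning is used to combine person re-identification with attribute recognition, and the performance of the network is improved by modifying the convolution stride and using dual pooling. Second, to improve the expressiveness of attribute features, a parallel spatial-channel attention module based on the attention mechanism is designed; it not only localizes the spatial position of attributes on the feature map but also effectively mines the channel features most strongly associated with the attributes, and it uses multiple parallel branches to reduce error and further improve network performance. Finally, a feature fusion module is designed with a convolutional neural network to fuse attribute features with pedestrian identity features effectively, yielding more robust and expressive pedestrian features. Experiments on two commonly used person re-identification datasets, DukeMTMC-reID and Market-1501, show that the proposed method is at a leading level among existing person re-identification methods.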

5.
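Chen Daili, Xu Guoliang 《Journal of Computer Applications》 2022, 42(5): 1391-1397
To address the severe performance degradation of person re-identification under cross-domain transfer, a cross-domain person re-identification method that learns intra-domain variation based on an attention mechanism is proposed. First, ResNet50 is taken as the backbone and adjusted to suit person re-identification; an instance-batch normalization network (IBN-Net) is introduced to improve generalization, and a region attention branch is added to extract more discriminative pedestrian features. Training on the source domain is treated as a classification task: cross-entropy loss provides supervised learning, and a triplet loss is introduced to mine the details of source-domain samples, improving classification performance on the source domain. Training on the target domain adapts to the difference in data distribution between the source and target domains by learning intra-domain variation. At test time, the output of the ResNet50 pool-5 layer is used as the image feature, and the Euclidean distance between the query image and each candidate image measures their similarity. In experiments on two large-scale public datasets, Market-1501 and DukeMTMC-reID, the proposed method reaches Rank-1 accuracies of 80.1% and 67.7% and mean average precision (mAP) of 49.5% and 44.2%, respectively. The results show that the proposed method performs well in improving model generalization.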
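The test-time procedure described above (rank candidates by Euclidean distance, then report Rank-1 and mAP) can be sketched as follows. This is a simplified illustration: the function names are hypothetical, features are plain lists of floats, and the same-camera filtering that standard re-ID evaluation protocols apply is deliberately left out.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_gallery(query_feat, gallery_feats):
    """Return gallery indices sorted from most to least similar to the query."""
    dists = [euclidean(query_feat, g) for g in gallery_feats]
    return sorted(range(len(gallery_feats)), key=lambda i: dists[i])

def rank1_hit(ranking, gallery_ids, query_id):
    """True if the closest gallery image has the same identity as the query."""
    return gallery_ids[ranking[0]] == query_id

def average_precision(ranking, gallery_ids, query_id):
    """Average precision of one query; mAP is the mean of this over all queries."""
    hits, precisions = 0, []
    for pos, idx in enumerate(ranking, start=1):
        if gallery_ids[idx] == query_id:
            hits += 1
            precisions.append(hits / pos)
    return sum(precisions) / hits if hits else 0.0
```

Rank-1 accuracy is then the fraction of queries for which `rank1_hit` is true, and mAP averages `average_precision` across queries.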

6.
Tian Peng, Mo Hongwei, Jiang Laihao 《Applied Intelligence》 2021, 51(11): 7781-7793

Understanding a scene image includes detecting and recognizing objects, estimating the interaction relationships of the detected objects, and describing image regions with sentences. However, owing to the complexity and variety of scene images, existing methods take only object detection or visual relationship estimation as the research target in scene understanding, and the results obtained are not satisfactory. In this work, we propose a Multi-level Semantic Tasks Generation Network (MSTG) that leverages the mutual connections across object detection, visual relationship detection, and image captioning to solve the three vision tasks jointly, improve their accuracy, and achieve a more comprehensive and accurate understanding of scene images. The model uses a message-passing graph to build mutual connections and iteratively update the different semantic features, improving the accuracy of scene graph generation, and introduces a fused attention mechanism to improve the accuracy of image captioning, while the mutual connections and refinement of different semantic features improve the accuracy of object detection and scene graph generation. Experiments on the Visual Genome and COCO datasets indicate that the proposed method can jointly learn the three vision tasks and improve the accuracy of each.


7.
Xiang Suncheng, Fu Yuzhuo, Chen Hao, Ran Wei, Liu Ting 《Multimedia Tools and Applications》 2020, 79(43-44): 32079-32093

Person re-identification (re-ID) aims to match a specific person in a large gallery captured by different cameras at different locations. Previous part-based methods mainly focus on part-level features with uniform partition, which increases the ability to learn discriminative features but is neither efficient nor robust in scenarios with large variances. To address this problem, we propose a novel feature fusion strategy based on a traditional convolutional neural network. A multi-branch deeper feature fusion network architecture is then designed to perform discriminative learning for three semantically aligned regions. On this basis, a novel self-attention mechanism softly assigns corresponding weights to the semantically aligned features during back-propagation. Comprehensive experiments on several large-scale benchmark datasets demonstrate that the proposed approach yields consistent and competitive re-ID accuracy compared with current single-domain re-ID methods.


8.
Person re-identification means retrieving the same person from a large number of images across disjoint camera views. An effective and robust similarity measure between a pair of person images plays an important role in re-identification. In this work, we propose a new metric learning method based on least squares for person re-identification. Specifically, similar training image pairs are used to learn a linear transformation matrix by projecting them onto a finite set of discrete discriminant points with a regression model; the metric matrix can then be deduced by solving a least squares problem with a closed-form solution. We call it the discriminant analytical least squares (DALS) metric. In addition, we develop an incremental learning scheme for DALS, which is particularly valuable for retraining the model when additional samples are given. Furthermore, DALS can be effectively kernelized to further improve matching performance. Extensive experiments on the VIPeR, GRID, PRID450S and CUHK01 datasets demonstrate the effectiveness and efficiency of our approaches.
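As a rough illustration of how a least squares problem can yield a metric matrix in closed form, the following generic ridge-regression derivation may help. It is not the DALS formulation itself: the target matrix $Y$ (standing in for the discriminant points), the regularizer $\lambda$, and the symbol names are all assumptions introduced here.

```latex
% Assumed notation: X \in \mathbb{R}^{n \times d} holds training features (one per row),
% Y \in \mathbb{R}^{n \times c} holds the regression targets, \lambda > 0 is a regularizer.
\begin{align}
W^{*} &= \arg\min_{W} \; \lVert XW - Y \rVert_F^2 + \lambda \lVert W \rVert_F^2
       = (X^{\top} X + \lambda I)^{-1} X^{\top} Y, \\
M &= W^{*} W^{*\top}, \\
d_M(x_i, x_j) &= (x_i - x_j)^{\top} M \, (x_i - x_j)
               = \lVert W^{*\top} (x_i - x_j) \rVert_2^2 .
\end{align}
```

Because $M = W^{*} W^{*\top}$ is positive semidefinite by construction, $d_M$ is a valid squared (pseudo-)distance, and no iterative optimization is needed to obtain it.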

9.
Lu Zeng, Huang Guoheng, Pun Chi-Man, Cheng Lianglun 《Multimedia Tools and Applications》 2020, 79(29-30): 21409-21439

Person re-identification is an image retrieval task that matches a given target person across different cameras. It has attracted growing research attention. However, pose changes and occlusions often occur while a person is walking, and most related methods do not use local features to handle occlusion and pose changes simply and effectively. Moreover, metric loss functions consider only the image-level case and cannot adjust the distance between local features well. To tackle these problems, a novel person re-identification scheme is proposed. Through experiments, we found that people attend to different parts of a person when looking at the person from a horizontal or a vertical perspective. First, to handle occlusion and pose changes, we propose a Cross Attention Module (CAM). It enables the network to generate a cross attention map and improves the accuracy of person re-identification by enhancing the most significant local features of persons. The horizontal and vertical attention vectors of the feature maps are extracted, a cross attention map is generated from them, and the local key features are enhanced by this attention map. Second, to address the limited expressive ability of single-level feature maps, we propose a Multi-Level Feature Complementation Module (MLFCM). In this module, the information missing from high-level features is complemented by low-level features via short skip connections, and feature selection is performed among deep feature maps. The purpose of this module is to obtain feature maps with complete information; it also solves the problem of missing contour features in high-level semantic features. Third, because current metric loss functions cannot adjust the distance between local features, we propose the Part Triple Loss Function (PTLF). It reduces the within-class distance and increases the between-class distance of person parts. Experimental results show that our model achieves high Rank-k and mAP values on Market-1501, Duke-MTMC and CUHK03-NP.
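One way to picture combining horizontal and vertical attention vectors into a cross attention map is sketched below for a single-channel feature map. The abstract does not specify the pooling or the combination rule, so average pooling, the additive combination, and the sigmoid are assumptions made purely for illustration, as are the function names.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cross_attention_map(fmap):
    """Build an H x W attention map from per-row and per-column summaries.

    fmap is a single-channel feature map given as a list of H rows of W floats.
    Assumption: each vector is obtained by average pooling along one axis.
    """
    h, w = len(fmap), len(fmap[0])
    row_vec = [sum(row) / w for row in fmap]                             # length H
    col_vec = [sum(fmap[i][j] for i in range(h)) / h for j in range(w)]  # length W
    # Assumption: position (i, j) combines its row and column summaries additively.
    return [[sigmoid(row_vec[i] + col_vec[j]) for j in range(w)] for i in range(h)]

def enhance(fmap):
    """Re-weight every position of the feature map by the cross attention map."""
    att = cross_attention_map(fmap)
    return [[f * a for f, a in zip(f_row, a_row)] for f_row, a_row in zip(fmap, att)]
```

Positions that lie on both a strongly activated row and a strongly activated column receive the largest weights, which matches the intuition of crossing the two viewing directions.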


10.
11.
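Zhu Li, Lin Xin, Xu Yifei, Liu Zhen, Ma Ying 《Journal of Integration Technology》 2023, 12(1): 91-104
In real-world smart-city security scenarios, traditional person re-identification methods can no longer meet the requirements of complex and diverse recognition tasks. To achieve multi-level person re-identification, this paper proposes deeply integrating person re-identification technology with multi-level urban information units. In person re-identification, existing models and attention mechanisms focus only on learning robust features; this paper instead proposes a difference attention module based on the differences between feature vectors to enhance the discriminative power of deep features. Building on this module, the paper develops a difference attention framework that adapts to a variety of backbone models. It also proposes two training strategies, joint training and separate training. Compared with other person re-identification methods, the difference attention framework and training strategies achieve better performance on the Market-1501, CUHK03 and MSMT17 datasets.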

12.
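Geng Yuan, Tan Hongchen, Li Jinghua, Wang Lichun 《Journal of Graphics》 2022, 43(6): 1193-1200
Most previous person re-identification methods concentrate on learning the attention regions of an image and ignore the influence of non-attention regions on the final features. Strengthening feature learning in non-attention regions while still attending to the attention regions can further enrich the final pedestrian features and help identify pedestrians accurately. On this basis, a visual information accumulation network (VIA Net) is proposed. The network adopts a two-branch structure overall: one branch tends to learn the global features of the image, while the other is expanded into a multi-branch structure that gradually strengthens the learning of local features by combining features from attention and non-attention regions, accumulating visual information and further enriching the features. Experimental results show that VIA Net achieves high performance on person re-identification datasets such as Market-1501. Experiments on the In-Shop Clothes Retrieval dataset further show that the network is also applicable to general image retrieval tasks and therefore has a degree of generality.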

13.
Zhang Tao, Sun Xing, Li Xuan, Yi Zhengming 《Applied Intelligence》 2021, 51(11): 7679-7689

Generative adversarial networks are widely used in person re-identification to expand data by generating auxiliary data. However, researchers generally believe that using too much generated data in the training phase reduces the accuracy of re-identification models. In this study, an improved generator and a constrained two-stage fusion network are proposed. A novel gesture discriminator embedded in the generator calculates the completeness of skeleton pose images. The improved generator makes the generated images more realistic, which benefits feature extraction. The role of the constrained two-stage fusion network is to extract and use the real information in the generated images for person re-identification. Unlike previous studies, this work also considers the fusion of shallow features. In detail, the proposed network has two branches based on the structure of ResNet50: one branch fuses the images produced by the generative adversarial network, and the other fuses the result of the first fusion with the original image. Experimental results show that our method outperforms most existing similar methods on Market-1501 and DukeMTMC-reID.


14.
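To address the low accuracy of existing person re-identification algorithms when handling low image resolution, illumination differences, and diverse poses and viewpoints, a multi-task person re-identification algorithm based on spatial attention and texture feature enhancement is proposed. The spatial attention module designed in the algorithm focuses on latent image regions related to pedestrian attributes and is integrated into the attribute recognition network to mine attribute features; the texture feature enhancement module of the proposed person re-identification network works by fusing the global and ... corresponding to different spatial levels...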

15.
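To address problems in target appearance features and metric algorithms in current person re-identification algorithms, a multi-feature subspace person re-identification algorithm incorporating the BOW model is proposed. A 2-D Gaussian template is applied to the pedestrian image to weaken the background; BOW feature descriptors and YUV+HSV color feature descriptors are then extracted and fused into the final feature descriptor. For similarity measurement, a subspace is learned from the original feature space, and a metric matrix is learned in that subspace. Experimental results on the VIPeR and CUHK01 datasets show that the proposed algorithm significantly improves the person re-identification rate.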

16.
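Affected by pedestrian pose variation, illumination and viewpoint, background changes, and other factors, existing person re-identification models usually divide the pedestrians in a dataset into several blocks and extract local image features for recognition in order to improve accuracy, but they suffer from mismatched local body features and the easy loss of contextual cues from non-body parts. An improved person re-identification model is constructed: the local features from a human semantic parsing network are aligned, strengthening the ability of the pedestrian semantic segmentation model to model arbitrary pedestrian contours in an image, and a local attention network captures the contextual cues lost from non-body parts. Experimental results show that the model achieves mean average precision of 83.5%, 80.8% and 92.4% on the Market-1501, DukeMTMC and CUHK03 datasets, respectively, and a Rank-1 of 90.2% on the DukeMTMC dataset; compared with person re-identification models based on attention mechanisms, pedestrian semantic parsing, and local alignment networks, it has stronger robustness and transferability.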

17.
The goal of human image generation (HIG) is to synthesize a human image in a novel pose. HIG can potentially benefit various computer vision applications and engineering tasks. Recently developed CNN-based approaches apply attention architectures to vision tasks. However, owing to the locality of CNNs, extracting and maintaining the long-range pixel interactions of input images is difficult, so existing human image generation methods are limited in content representation. In this paper, we propose a novel human image generation framework called HIGSA that can use the position information from the input source image. HIGSA contains two complementary self-attention blocks for generating photo-realistic human images, named the stripe self-attention block (SSAB) and the content attention block (CAB). SSAB establishes global dependencies in human images and computes the attention map for each pixel based on its spatial position relative to other pixels. CAB introduces an effective feature extraction module to interactively enhance both the appearance and the shape feature representations of the person. The HIGSA framework therefore inherently preserves better appearance consistency and shape consistency with sharper details. Extensive experiments on mainstream datasets demonstrate that HIGSA achieves state-of-the-art (SOTA) results.

18.
Guo Junliang, Xue Yanbing, Cai Jing, Gao Zan, Xu Guangping, Zhang Hua 《Multimedia Tools and Applications》 2021, 80(11): 16425-16440

Bus passenger re-identification is a special case of person re-identification that aims to establish identity correspondence between the front door camera and the back door camera. In a bus environment, it is hard to capture the full body of the passengers. This paper therefore proposes a bus passenger re-identification dataset, which contains 97,136 head images of 1,720 passengers obtained from hundreds of thousands of video frames with different lighting and perspectives. We also provide an evaluation on the dataset based on deep learning and triplet loss. After data augmentation, using ResNet with trihard loss as the benchmark network and pre-training on the pedestrian re-identification dataset Market-1501, we achieve an mAP of 55.79% and a Rank-1 accuracy of 67.91% on the passenger re-identification dataset.
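The trihard loss is commonly understood as a triplet loss with batch-hard mining: for each anchor, the farthest same-identity sample and the closest different-identity sample in the batch form the triplet. A minimal sketch under that reading follows; the margin value and function names are assumptions, and features are plain lists of floats rather than network outputs.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def batch_hard_triplet_loss(features, labels, margin=0.3):
    """Mean hinge loss over anchors using the hardest positive and hardest negative.

    Anchors that have no positive or no negative in the batch are skipped.
    The margin of 0.3 is an assumed value, not one taken from the paper.
    """
    losses = []
    for i, (anchor, label) in enumerate(zip(features, labels)):
        pos = [euclidean(anchor, f)
               for j, (f, l) in enumerate(zip(features, labels))
               if l == label and j != i]
        neg = [euclidean(anchor, f)
               for f, l in zip(features, labels) if l != label]
        if not pos or not neg:
            continue
        # Hardest positive is the farthest one; hardest negative is the closest one.
        losses.append(max(0.0, margin + max(pos) - min(neg)))
    return sum(losses) / len(losses) if losses else 0.0
```

The loss is zero once every identity's samples sit closer to each other than to any other identity by at least the margin.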


19.
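Person re-identification is an important research direction in computer vision. In recent years, with the growing demand for video surveillance, video-sequence-based person re-identification has received widespread attention. A typical video-based person re-identification system consists of three parts: an image feature extractor (for example, a convolutional neural network), a temporal model that extracts temporal information, and a loss function. With the feature extractor and loss function fixed, the effect of different temporal models on the performance of video person re-identification is studied, including temporal pooling, temporal attention, and recurrent neural networks. Experimental results on the Mars dataset show that, compared with an image-based person re-identification baseline, the temporal pooling and temporal attention models effectively improve recognition accuracy, whereas the recurrent neural network performs worse than the baseline.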
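Two of the temporal models mentioned above can be contrasted in a short sketch: temporal pooling averages per-frame features with equal weight, while temporal attention weights each frame by a score. Here the scores are passed in directly; in a real system they would come from a learned layer, which is omitted. All names are illustrative assumptions.

```python
import math

def temporal_average_pool(frame_feats):
    """Clip-level feature as the unweighted mean of per-frame features."""
    n, dim = len(frame_feats), len(frame_feats[0])
    return [sum(f[d] for f in frame_feats) / n for d in range(dim)]

def temporal_attention_pool(frame_feats, scores):
    """Clip-level feature as a softmax-weighted sum of per-frame features.

    scores holds one scalar per frame (assumed to be produced elsewhere).
    """
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    dim = len(frame_feats[0])
    return [sum(w * f[d] for w, f in zip(weights, frame_feats)) for d in range(dim)]
```

When all scores are equal, the attention model reduces to average pooling; unequal scores let informative frames dominate the clip-level feature.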

20.
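The pyramid scene parsing network loses image detail as network depth increases, which leads to poor semantic segmentation of small targets and object edges and to inaccurate pixel class prediction. A pyramid scene parsing network method based on an improved self-attention mechanism is proposed: the channel attention module and the spatial attention module of the self-attention mechanism are added to the backbone network and the enhanced feature extraction network of the pyramid scene parsing network, respectively, so that the two sub-networks extract the more important feature details of the image from the channel and spatial perspectives. Because existing image dimensionality reduction algorithms cannot adequately improve the computational efficiency of self-attention, a new image dimensionality reduction algorithm is designed using Hilbert curve traversal, based on an analysis of how the order of "words" affects the result of self-attention, and the algorithm is added to the spatial self-attention module to improve its computational capability. Simulation results show that the method improves accuracy on both PASCAL VOC 2012 and a polyp segmentation dataset, with finer segmentation of small targets and object edges; on the VOC 2012 training set, mean intersection over union and mean pixel accuracy reach 75.48% and 85.07%, improvements of 0.68 and 1.35 percentage points over the baseline algorithm, respectively.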
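Hilbert curve traversal can be used to flatten a 2-D grid into a 1-D sequence while keeping neighbouring sequence elements spatially adjacent. The sketch below uses the standard index-to-coordinate conversion for a square grid whose side is a power of two; how the paper integrates the traversal into its attention module is not described in the abstract, so only the flattening step is shown, and the function names are hypothetical.

```python
def hilbert_d2xy(n, d):
    """Map position d along the Hilbert curve to (x, y) on an n x n grid.

    n must be a power of two and 0 <= d < n * n.
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate or flip the current quadrant so sub-curves join up.
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def hilbert_flatten(image):
    """Flatten a square image (list of rows) into a list in Hilbert-curve order."""
    n = len(image)
    order = [hilbert_d2xy(n, d) for d in range(n * n)]
    return [image[y][x] for x, y in order]
```

Unlike row-by-row flattening, where the last pixel of one row and the first pixel of the next are far apart in the image, consecutive elements of the Hilbert order are always direct neighbours in the grid.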
