Similar literature
1.

Video anomaly detection (VAD) automatically recognizes abnormal events in surveillance videos. Existing works have made advances in recognizing whether a video contains abnormal events; however, they cannot temporally localize those events within the video. This paper presents a novel anomaly attention-based framework for accurate temporal localization of abnormal events. With the proposed framework, frame-level VAD can be achieved using only video-level labels, which significantly reduces the burden of data annotation. Our method is an end-to-end deep neural network-based approach that contains three modules: an anomaly attention module (AAM), a discriminative anomaly attention module (DAAM) and a generative anomaly attention module (GAAM). Specifically, AAM is trained to generate the anomaly attention, which measures the degree of abnormality of each frame, while DAAM and GAAM alternately augment AAM from two different aspects. On the one hand, DAAM enhances AAM by optimizing video-level classification. On the other hand, GAAM adopts a conditional variational autoencoder to model the likelihood of each frame given the attention, in order to refine AAM. As a result, AAM generates higher anomaly scores for abnormal frames and lower anomaly scores for normal frames. Experimental results show that the proposed approach outperforms state-of-the-art methods, which validates the superiority of our AAVAD.
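
As a rough illustration of how per-frame attention can connect a video-level label to frame-level scores, the sketch below pools hypothetical per-frame abnormality logits into a single video-level score using attention weights. The function names, the sigmoid attention and the weighted-average pooling are assumptions made for illustration; the abstract does not specify how AAM, DAAM or GAAM are implemented.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def video_score(frame_logits, attention_logits):
    """Attention-weighted average of per-frame abnormality scores.

    Hypothetical pooling: frames with high attention dominate the
    video-level score, so a video-level loss can shape the attention.
    """
    attn = [sigmoid(a) for a in attention_logits]  # per-frame attention in (0, 1)
    total = sum(attn)
    return sum(w * sigmoid(s) for w, s in zip(attn, frame_logits)) / total


# Toy video with four frames; the third frame is the abnormal one.
frame_logits = [-3.0, -2.5, 4.0, -3.0]
attention_logits = [-4.0, -4.0, 5.0, -4.0]
score = video_score(frame_logits, attention_logits)
plain_mean = sum(sigmoid(s) for s in frame_logits) / len(frame_logits)
```

In this toy case the attention-weighted score is much higher than the plain mean of the frame scores, because the single abnormal frame is not diluted by the normal ones.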


2.
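Objective: Computer-aided techniques and microscopic pathology image processing have made pathological diagnosis far more convenient. Pathology image segmentation is a commonly used technique for separating lesions from background tissue. Developing high-accuracy segmentation algorithms requires a large number of precisely annotated digital pathology images, but annotation is time-consuming and laborious, so precisely annotated pathology images are scarce. Moreover, pathology images are highly complex and place very high demands on the robustness and generalization of tissue segmentation algorithms. This paper therefore proposes a graph-network-based framework for pathology image segmentation. Methods: The framework has two modes, a full supervised graph network (FSGNet) and a weakly supervised graph network (WSGNet), to suit datasets with different amounts of annotation and the accuracy requirements of various application scenarios. By using graph networks to learn the irregular morphology of pathological tissue, FSGNet achieves high segmentation accuracy; WSGNet uses superpixel-level inference and needs only sparse point annotations to segment pathological tissue. Results: On two public datasets, GlaS (Gland Segmentation Challenge Dataset, whose test set is split into Part A and Part B) and CRAG (colorectal adenocarcinoma gland), and one private dataset, LUSC (lung squam...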

3.
The feasibility of deep convolutional neural networks for fabric defect detection has been proven, but detection performance often depends on large-scale labeled datasets, and it is troublesome to collect large numbers of fabric defects with pixel-level labels in industrial production. Although weakly supervised detection methods can reduce the labeling workload, fabric defect detection remains a challenging task because of the slight difference between defects and complex texture backgrounds and the diversity of defect types. To alleviate this issue, this paper proposes an effective weakly supervised shallow network, called DLSE-Net, with a Link-SE (L-SE) module and Dilation Up-Weight CAM (DUW-CAM) for fabric defect detection. Firstly, the network regards a residual connection as a new branch to alleviate the semantic gap generated by the connection of different layers. Secondly, the L-SE module forces the weights to be associated with the overall network in a global optimization manner instead of only within a single layer. Finally, a novel DUW-CAM is proposed to improve the adaptability of the network by combining dilated convolution with an attention mechanism. Moreover, DUW-CAM can effectively suppress the background and highlight defect regions, even on complex fabric textures. Experimental results demonstrate that the proposed approach localizes defects with high accuracy and outperforms state-of-the-art methods on two distinctive fabric datasets with different textures.

4.
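Object detection is one of the fundamental tasks in computer vision. Depending on the label information available, it can be divided into fully supervised, semi-supervised and weakly supervised object detection, among others. Weakly supervised object detection aims to train a detector using only image-level category labels, and then to localize and classify all target objects in test images. Because it significantly reduces annotation cost, weakly supervised object detection has attracted growing attention and has made remarkable progress. Starting from the research significance of weakly supervised object detection, this paper first introduces its label setting and problem definition, the basic framework based on multiple instance learning, and the three major challenges it faces: part domination, instance ambiguity and computational cost. It then groups the typical algorithms in the field by core network architecture into three categories, namely algorithms based on refining proposal generation, algorithms that incorporate image segmentation, and algorithms based on self-training, and describes the core contributions of each category. The paper further compares the detection performance of weakly supervised object detection algorithms experimentally on several evaluation metrics. On the VOC2007 (visual object classes 2007) dataset, the method with the highest mean average precision (mAP) is MIST (multiple instance self-training), at 54.9%, and correct localization (CorLo...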
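
To make the multiple instance learning idea concrete, the sketch below aggregates per-proposal scores into image-level class scores, so that an image-level label alone can supervise training. It follows a common two-branch pattern (a softmax over classes multiplied by a softmax over proposals, then summed over proposals); this choice, and all names in the code, are illustrative assumptions rather than the specific framework the survey describes.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]


def image_level_scores(cls_logits, det_logits):
    """Aggregate proposal scores into one score per class.

    cls_logits, det_logits: lists of shape [num_proposals][num_classes].
    The classification branch normalizes over classes for each proposal;
    the detection branch normalizes over proposals for each class.
    """
    n, c = len(cls_logits), len(cls_logits[0])
    cls = [softmax(row) for row in cls_logits]
    det = [softmax([det_logits[i][k] for i in range(n)]) for k in range(c)]
    # Sum over proposals of (class probability * proposal weight).
    return [sum(cls[i][k] * det[k][i] for i in range(n)) for k in range(c)]


# Three proposals, two classes: proposal 0 strongly supports class 0.
cls_logits = [[5.0, 0.0], [0.0, 0.0], [0.0, 5.0]]
det_logits = [[4.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
scores = image_level_scores(cls_logits, det_logits)
```

Each image-level score lies in [0, 1], so it can be compared with the image-level label using a binary cross-entropy loss; the per-proposal products then serve as detection scores at test time.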

5.
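Objective: Weakly supervised object detection trains object detectors using only image category labels. The accuracy of weakly supervised detectors has improved steadily in recent years, but two problems remain very challenging: improving the completeness of detected objects, and distinguishing a single instance among multiple objects of the same class. To address these problems, this paper proposes ProMIS (probability-based multi-object image synthesis), a weakly supervised object detection method that augments multi-object images using posterior probability maps of object layout. Methods: Detected objects are stored in an object candidate pool, and objects from the pool are inserted into input images to construct augmented images with pseudo bounding-box annotations, which are then used to train the weakly supervised object detector. The method consists of two interacting modules, image augmentation and weakly supervised object detection. The image augmentation module inserts objects from the candidate pool into an input image; the category, position and scale of each inserted object are constrained by estimating and sampling posterior probabilities, which keeps the augmented image plausible. The weakly supervised object detection module trains the detector with the augmented multi-object images, the corresponding category labels and the pseudo bounding-box labels, and stores high-confidence objects detected on the original input images in the candidate pool. During training, to avoid overfitting, a parallel detection branch based on the augmented bounding boxes is added to the baseline algorithm. This branch is trained with the pseudo bounding-box annotations obtained by augmentation, while the detection branch of the original baseline is still trained with image labels. At test time, only the branch based on augmented bounding boxes is used to produce detections. The proposed augmentation strategy and branch structure are applicable to different weakly supervised object detectors. Results: On the Pascal VOC (pattern analysis, statistical modeling and computational learning visual object classes) 2007 and Pascal VOC 2012 datasets, embedding the method into several existing weakly supervised object detectors improved mean average precision (mAP) by 2.9% and 4.2% on average, respectively. Conclusion: The paper shows that augmented images generated from the pseudo bounding-box labels of a weakly supervised detector contain rich information and help the detector learn the differences among object parts, whole objects and clusters of multiple objects.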

6.
Luo Huilan, Chen Hu. Application Research of Computers (《计算机应用研究》), 2021, 38(10): 3196-3200
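Most solutions for weakly supervised semantic segmentation train on class activation maps generated from image-level supervision. A class activation map finds only the most discriminative part of the target and differs considerably from the true pixel-level labels, so the training results are not ideal. This paper applies adversarial learning to the class activation maps of an original image and of its affine-transformed version to achieve better training. First, the image and its affine-transformed version are fed into a Siamese network, and image-level classification labels are used to obtain their respective class activation maps. The two sets of class activation maps are then fed into a discriminator network for adversarial learning, and the Siamese network is trained so that the class activation map of the original image approaches that of its affine-transformed version. On the PASCAL VOC 2012 dataset, the mean intersection over union is 63.7% on the validation set and 65.7% on the test set, which is 1.2% and 1.3% higher, respectively, than other current state-of-the-art weakly supervised semantic segmentation methods. The adversarial learning scheme makes effective use of the equivariant attention mechanism, learns more useful information, narrows the gap between class activation maps and the true pixel-level labels, improves weakly supervised performance and achieves good segmentation results.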
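
For reference, mean intersection over union, the metric reported in this and several other abstracts in the list, can be computed as in the sketch below. The function name and the flattened-list input are assumptions for illustration; benchmark evaluations typically accumulate intersections and unions over the whole dataset before averaging across classes, whereas this sketch scores a single prediction.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection over union for flattened label lists.

    pred, target: equal-length lists of integer class indices.
    Classes absent from both pred and target are skipped.
    """
    ious = []
    for k in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == k and t == k)
        union = sum(1 for p, t in zip(pred, target) if p == k or t == k)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious)


# Four pixels, two classes: one pixel of class 1 is predicted as class 0.
pred = [0, 0, 1, 1]
target = [0, 1, 1, 1]
miou = mean_iou(pred, target, num_classes=2)  # (1/2 + 2/3) / 2
```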

7.
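Weakly supervised semantic segmentation with image-level labels is currently a popular research direction, and generating class activation maps is the most commonly used approach to the problem. The sparsity of class activation maps, however, reduces the accuracy of the discriminative regions. To address this, an improved Transformer network for weakly supervised image learning is proposed. First, a spatial attention exchange layer is introduced to enlarge the coverage of the class activation map. Second, an attention adaptation module is designed to guide the model to strengthen the class response in weak regions. In particular, an adaptive cross-domain mechanism is built into the class generation process to improve the classification performance of the model. The method reaches 73.5% and 73.0% on the Pascal VOC 2012 validation and test sets, respectively. The experimental results show that the refined Transformer network learning method helps improve weakly supervised semantic segmentation performance.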

8.
Luo Ping, Ding Ling, Yang Xue, Xiang Yang. Journal of Computer Applications (《计算机应用》), 2022, 42(10): 2990-2995
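Current event detection models rely heavily on manually annotated data. When annotated data is limited, fully supervised deep learning models for event detection often overfit, while weakly supervised methods that replace time-consuming manual annotation with automatically labeled data often depend on complex predefined rules. To address these problems, a BERT-based mixed-text adversarial training (BMAD) method is proposed for Chinese event detection. The method sets up a weakly supervised learning scenario based on data augmentation and adversarial learning, and uses a span extraction model to perform event detection. First, to mitigate the shortage of data, augmentation methods such as back-translation and Mix-Text are used to enlarge the data and create a weakly supervised learning scenario for event detection. Then, an adversarial training mechanism is used for noise learning, aiming to generate samples that are as close as possible to real samples and ultimately to improve the robustness of the whole model. Experiments on the widely used real-world dataset Automatic Content Extraction (ACE) 2005 show that the proposed method improves the F1 score by at least 0.84 percentage points over algorithms such as NPN, TLNN and HCBNN.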

9.
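Visual understanding tasks such as object detection, semantic and instance segmentation, and action recognition are widely used and play a vital role in fields such as human-computer interaction and autonomous driving. In recent years, deep visual understanding networks based on fully supervised learning have achieved significant performance gains. However, data annotation for object detection, semantic and instance segmentation, and video action recognition often requires a great deal of labor and time, which has become a key factor limiting wider application. Weakly supervised learning, as an effective way to reduce annotation cost, promises a feasible solution to this problem and has therefore attracted considerable attention. Focusing on weakly supervised visual learning, this paper reviews research progress in China and abroad, taking object detection, semantic and instance segmentation, and action recognition as examples, and discusses development directions and application prospects. After briefly reviewing general weakly supervised learning models such as multiple instance learning (MIL) and the expectation-maximization (EM) algorithm, the paper summarizes work on object detection and localization from the perspectives of multiple instance learning and class attention map mechanisms, with an emphasis on self-training and supervision-form conversion methods. For semantic segmentation, it summarizes and analyzes progress according to weak supervision of different granularity, such as bounding-box annotations, image-level category annotations, and scribble or point annotations, and mainly reviews methods based on image-level category...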

10.
Fully convolutional networks (FCNs) have achieved great success in human parsing in recent years. In conventional human parsing tasks, pixel-level labeling is required to guide training, which usually involves enormous human labeling effort. To reduce this effort, we propose a novel weakly supervised human parsing method that requires only simple object keypoint annotations for learning. We develop an iterative learning method to generate pseudo part segmentation masks from keypoint labels. With these pseudo masks, we train an FCN to output pixel-level human parsing predictions. Furthermore, we develop a correlation network that jointly predicts part and object segmentation masks and improves segmentation performance. The experimental results show that our weakly supervised method achieves very competitive human parsing results. Although our method uses only simple keypoint annotations for learning, it achieves performance comparable to fully supervised methods that use expensive pixel-level annotations.

11.
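Objective: Most existing weakly supervised segmentation methods with image-level annotation use convolutional neural networks to obtain pseudo labels, which often cover too small a part of the target region. Transformer-based methods usually expand the class activation map with self-attention, but because deep-layer attention is inaccurate, the refined pseudo labels contain considerable background noise. To exploit the strengths of both kinds of feature extraction networks while taking into account the attention characteristics of different Transformer layers, a self-attention fusion and modulation network that combines convolutional and Transformer features is built for weakly supervised semantic segmentation. Methods: A convolution-enhanced Transformer (Conformer) is used as the feature extraction network; it encodes the image more comprehensively and yields the initial class activation map. A hierarchical adaptive self-attention fusion module is designed that generates fusion weights from the self-attention values and the importance of each layer; the fused self-attention suppresses background noise well. A self-attention modulation module is proposed that uses the attention relations between pixel pairs to design a modulation function that increases the activation response of foreground pixels. The modulated attention is used to refine the initial class activation map so that it covers more of the target region while effectively suppressing background noise. Results: Segmentation networks were trained with the resulting pseudo labels on the widely used PASCAL VOC 2012 (pattern analysis, statistical modeling and computational learning visual object classes 2012) and COCO 2014 (common objects in context 2014) datasets, and the proposed algorithm achieved the best results in all comparative experiments. On PASCAL VOC, the mean intersection over union (mIoU) reached 70.2% on the validation set and 70.5% on the test set. Compared with the best Transformer model among the compared algorithms, performance improved by 0.9% on both the validation and test sets; compared with the best convolutional neural network method, mIoU improved by 0.7% on the validation set and 0.8% on the test set. On the COCO 2014 validation set the result was 40.1%, a 0.5% improvement in segmentation accuracy over the best compared method. Conclusion: The proposed weakly supervised semantic segmentation model combines the strengths of convolutional neural networks and Transformers. By adaptively fusing and modulating Transformer self-attention, it obtains the best semantic segmentation results to date with image-level labels, and it can be applied in areas such as 3D reconstruction and robot scene understanding. In addition, the self-attention adaptive fusion module and the self-attention modulation module can both be embedded in Transformer architectures to obtain more robust and more discriminative features for specific vision tasks.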

12.
Object detection and location from remote sensing (RS) images is challenging, computationally expensive, and labor-intensive. Benefiting from research on convolutional neural networks (CNNs), performance in this field has improved in recent years. However, object detection methods based on CNNs require a large number of annotated images for training, and for object location these annotations must contain bounding boxes. Furthermore, objects in RS images are usually small and densely co-located, leading to a high cost of manual annotation. We tackle the problem of weakly supervised object detection under such conditions, aiming to learn detectors with only image-level annotations, i.e., without bounding box annotations. Based on the fact that the feature maps of a CNN are localizable, we hierarchically fuse the location information from the shallow feature map with the class activation map to obtain accurate object locations. To mitigate the loss of small or densely distributed objects, we introduce a divergent activation module and a similarity module into the network. The divergent activation module is used to improve the response strength of the low-response areas in the shallow feature map. Densely distributed objects in RS images, such as aircraft in an airport, often exhibit a certain similarity. The similarity module is used to improve the feature distribution of the shallow feature map and to suppress background noise. Comprehensive experiments on a public dataset and a self-assembled dataset (which we made publicly available) show the superior performance of our method compared to state-of-the-art object detectors.

13.
Multimedia Tools and Applications - Recognizing a person’s affective state from audio-visual signals is an essential capability for intelligent interaction. Insufficient training data and the...

14.
In this paper, we present a weakly supervised learning approach for spoken language understanding (SLU) in domain-specific dialogue systems. We model the task of spoken language understanding as a two-stage classification problem. Firstly, the topic classifier is used to identify the topic of an input utterance. Secondly, restricted to the recognized target topic, the slot classifiers are trained to extract the corresponding slot-value pairs. The approach is mainly data-driven and requires only a minimally annotated corpus for training while retaining the robustness and depth of understanding needed for spoken language. More importantly, it allows weakly supervised strategies to be employed for training the two kinds of classifiers, which could significantly reduce the number of labeled sentences. We investigated active learning and naive self-training for the two kinds of classifiers. We also propose a practical method for bootstrapping topic-dependent slot classifiers from a small number of labeled sentences. Experiments have been conducted in the context of the Chinese public transportation information inquiry domain and the English DARPA Communicator domain. The experimental results show the effectiveness of our proposed SLU framework and demonstrate that human labeling effort can be reduced significantly.

15.
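Objective: Image-level weakly supervised semantic segmentation methods train segmentation networks with category labels, which significantly reduces annotation cost. Most existing methods use class activation maps to locate target objects, but conventional class activation maps find only the most discriminative regions of an object, and a segmentation network trained directly on them as pseudo labels has poor accuracy. This paper proposes a saliency-guided weakly supervised semantic segmentation algorithm that improves the segmentation model by obtaining more complete class activation maps. Methods: First, saliency maps are used to hide the target in a complementary, random manner to obtain complementary image pairs, and the class activation maps of each complementary pair are fused as supervision to improve the network's ability to obtain complete class activation maps. Second, a dual attention refinement module is introduced that uses global information to correct the class activation maps and generate pseudo labels for training the segmentation network. Finally, an iterative label refinement strategy combines the initial predictions of the segmentation network, the class activation maps and the saliency maps to generate more accurate pseudo labels and iteratively train the segmentation network. Results: Class activation map generation and semantic segmentation experiments were carried out on the PASCAL VOC 2012 (pattern analysis, statistical modeling and computational learning visual object classes 2012) dataset. The generated class activation maps are more complete, with a 10.21% improvement in mean intersection over union. The semantic segmentation results are better than those of all compared methods, with a 6.9% improvement in mean intersection over union. In addition, on...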

16.
Multimedia Tools and Applications - Action recognition in still images is an interesting subject in computer vision. One of the most important problems in still image-based action recognition is...

17.
This paper approaches the relation classification problem in the information extraction framework with different machine learning strategies, from strictly supervised to weakly supervised. A number of learning algorithms are presented and empirically evaluated on a standard data set. We show that a supervised SVM classifier using various lexical and syntactic features can achieve competitive classification accuracy. Furthermore, a variety of weakly supervised learning algorithms can be applied to take advantage of large amounts of unlabeled data when labeling is expensive. Newly introduced random-subspace-based algorithms demonstrate their empirical advantage over competitors in the context of both active learning and bootstrapping.

18.
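A weakly supervised classification method based on a density-center graph is proposed, which performs weakly supervised learning from a small number of labeled samples together with a large number of samples of unknown pattern. Using the density information of the sample space, density center points are computed to accurately reflect the spatial geometry of the data. A graph is built on this basis, and label propagation is used so that similar vertices are assigned the same class label as far as possible. The method has the sound mathematical foundation of graph-based weakly supervised algorithms, can discover classes of arbitrary shape, and is insensitive to noise. It also has near-linear time complexity, which makes it better suited to large-scale data. Applied to the UCI machine learning datasets, the method is shown experimentally to achieve good classification results.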
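
The label propagation step can be sketched as follows. This is a generic iterative propagation over a small similarity graph with the known labels clamped; it does not compute density centers, and the function name, the update rule and the toy graph are assumptions for illustration rather than the algorithm of the abstract.

```python
def propagate_labels(weights, labels, num_classes, iters=50):
    """Spread known labels over a graph by repeated weighted averaging.

    weights: symmetric n x n similarity matrix (lists of floats); every
             unlabeled node must have at least one positive edge.
    labels:  list with a class index for labeled nodes, None otherwise.
    Returns the predicted class index for every node.
    """
    n = len(weights)
    uniform = [1.0 / num_classes] * num_classes
    f = [[1.0 if labels[i] == k else 0.0 for k in range(num_classes)]
         if labels[i] is not None else list(uniform)
         for i in range(n)]
    for _ in range(iters):
        new_f = []
        for i in range(n):
            if labels[i] is not None:
                new_f.append(f[i])  # keep known labels fixed
                continue
            degree = sum(weights[i])
            new_f.append([sum(weights[i][j] * f[j][k] for j in range(n)) / degree
                          for k in range(num_classes)])
        f = new_f
    return [max(range(num_classes), key=lambda k: row[k]) for row in f]


# Chain of four vertices: 0-1 and 2-3 are similar, 1-2 are only weakly linked.
weights = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.1, 0.0],
    [0.0, 0.1, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
labels = [0, None, None, 1]
predicted = propagate_labels(weights, labels, num_classes=2)
```

Vertex 1 takes the label of its strongly connected neighbor 0, and vertex 2 takes the label of vertex 3, which is the behavior the abstract describes: similar vertices end up with the same class label.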

19.
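Weakly supervised temporal action localization aims to locate the start and end boundaries of action instances in a video and to recognize the corresponding actions. Although existing methods have made great progress, they still suffer from incomplete action localization and missed detection of short actions. To this end, a localization method based on feature mining and region enhancement (FMRE) is proposed. First, a base branch computes similarity scores between video segments and uses them to aggregate contextual information, yielding more discriminative segment classification scores and complete action localization. Then, an enhancement branch is added that dynamically upsamples, along the temporal dimension, the short-duration action proposals located by the base branch, and uses a multi-head self-attention mechanism to explicitly model the temporal structure among action proposals, which promotes the localization of temporally dependent actions and prevents short actions from being missed. Finally, mutual pseudo-label supervision is built between the two branches to gradually improve the quality of the action proposals generated during training. The algorithm achieves detection performance of 70.3% and 40.7% on the THUMOS14 and ActivityNet1.3 datasets, respectively, which demonstrates its effectiveness.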

20.
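Objective: Pixel-level annotation of medical images requires a great deal of manual effort. To address this problem, this paper takes optic disc segmentation in fundus images, a typical medical imaging task, as an example and proposes a weakly supervised optic disc segmentation algorithm with a size constraint. Methods: The conventional convolutional neural network framework is improved by designing a new convolutional fusion layer based on the structural characteristics of the optic disc, which improves segmentation performance. To further improve the accuracy of optic disc segmentation, a size constraint is imposed on the output of the convolutional neural network, and a new loss function is used to optimize the size constraint; the proposed loss can be optimized with standard stochastic gradient descent. Results: Experiments were conducted on the RIM-ONE optic disc dataset and compared with classical fully supervised optic disc segmentation methods. The results show that, using only image-level labels, the proposed algorithm achieves a mean accuracy (mAcc) of 0.852, a mean precision (mPre) of 0.831 and a mean intersection over union (mIoU) of 0.827. Conclusion: The proposed algorithm segments the optic disc accurately without pixel-level annotation by experts, reaching with image-level annotation the segmentation accuracy obtained with pixel-level annotation, which alleviates the difficulty of pixel-level annotation in medical images.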
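
One simple way to express a size constraint on a segmentation output is a penalty that is zero while the predicted foreground size stays within given bounds and grows quadratically outside them. The sketch below shows that form; the quadratic penalty, the bounds and the function name are assumptions for illustration, since the abstract does not give the loss formula.

```python
def size_penalty(probs, lower, upper):
    """Quadratic penalty on the soft foreground size of a prediction.

    probs: foreground probabilities for each pixel (flattened list).
    lower, upper: hypothetical bounds on the expected object size in pixels.
    The soft size is a sum of probabilities, so the penalty is
    differentiable in the network output and suits gradient descent.
    """
    size = sum(probs)
    if size < lower:
        return (size - lower) ** 2
    if size > upper:
        return (size - upper) ** 2
    return 0.0


# Four-pixel toy predictions with an expected size between 1 and 3 pixels.
within = size_penalty([0.5, 0.5, 0.5, 0.5], lower=1, upper=3)     # size 2.0
too_large = size_penalty([1.0, 1.0, 1.0, 1.0], lower=1, upper=3)  # size 4.0
too_small = size_penalty([0.0, 0.0, 0.0, 0.0], lower=1, upper=3)  # size 0.0
```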

