Similar Literature
20 similar documents retrieved.
1.
Railway inspection and monitoring generate massive volumes of image data, and classifying these images by scene is of great value for their subsequent analysis and management. This paper proposes a visual scene classification model that combines a deep convolutional neural network (DCNN) with gradient-weighted class activation mapping (Grad-CAM). The DCNN is adapted to a railway scene classification image dataset by transfer learning to extract features, while Grad-CAM computes class weights from the global average of the gradients to produce class-weighted heatmaps and activation scores, improving the interpretability of the classification model. Experiments compare how different DCNN architectures affect performance on railway image scene classification, provide visual explanations of the scene classification model, and, building on the visualization, propose an optimization workflow that improves classification performance by reducing internal dataset bias, verifying the effectiveness of deep learning for image scene classification.
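As an illustration of the Grad-CAM step described above (class weights from the global average of gradients, then a ReLU-ed weighted sum of feature maps), here is a minimal sketch assuming PyTorch with a torchvision ResNet-18 backbone; the hooked layer, input size, and pretrained weights are assumptions, not details from the paper.

```python
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights="IMAGENET1K_V1").eval()
activations, gradients = {}, {}

def fwd_hook(module, inp, out):
    activations["value"] = out.detach()

def bwd_hook(module, grad_in, grad_out):
    gradients["value"] = grad_out[0].detach()

# Hook the last convolutional block (layer choice is an assumption).
model.layer4.register_forward_hook(fwd_hook)
model.layer4.register_full_backward_hook(bwd_hook)

def grad_cam(image, class_idx=None):
    """image: (1, 3, H, W) tensor normalized like ImageNet."""
    scores = model(image)
    if class_idx is None:
        class_idx = scores.argmax(dim=1).item()
    model.zero_grad()
    scores[0, class_idx].backward()
    # Weights = global average of the gradients over the spatial dimensions.
    weights = gradients["value"].mean(dim=(2, 3), keepdim=True)
    # Weighted sum of feature maps, then ReLU.
    cam = F.relu((weights * activations["value"]).sum(dim=1, keepdim=True))
    # Upsample to the input resolution and normalize to [0, 1].
    cam = F.interpolate(cam, size=image.shape[2:], mode="bilinear", align_corners=False)
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
    return cam.squeeze(), class_idx

dummy = torch.randn(1, 3, 224, 224)
heatmap, pred = grad_cam(dummy)
print(heatmap.shape, pred)
```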

2.
In indoor scene classification, the structural complexity and diversity of scenes introduce various interfering factors that reduce classification accuracy. To address this, the paper proposes an indoor scene classification algorithm based on information enhancement of visually sensitive regions: local features enhanced with visually sensitive region information are fused with global features into a multi-scale spatial-frequency feature, which is used to classify indoor scenes. Experiments on three standard test sets show that the algorithm achieves good results on several different scene classification datasets and generalizes well.

3.
4.
Acoustic scene classification infers the recording environment from audio recorded in public spaces and plays an important role in daily life. Unlike traditional classification problems, in which classes are unrelated, the categories in acoustic scene classification form a hierarchy (parent and child classes); for example, the parent class of both airport and shopping mall is indoor. Existing methods do not account for this property and ignore the dependency between parent and child classes. This paper therefore exploits the hierarchical relationship among acoustic scene categories and proposes an acoustic scene classification method based on hierarchical information fusion. The method trains separate classifiers for parent and child classes, fuses parent-class information into the child-class classification, and introduces a hierarchy-dependency loss that penalizes predictions in which the predicted parent and child classes do not match. Experiments on the TAU Urban Acoustic Scenes 2020 Mobile development dataset show that hierarchical information fusion effectively improves the performance of the acoustic scene classification model, raising classification accuracy by 1.1%.
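A minimal sketch of the two-head parent/child design and a hierarchy-dependency penalty described above, assuming a PyTorch classifier over precomputed audio embeddings; the parent-to-child mapping, embedding size, fusion form, and penalty weight are illustrative rather than taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical hierarchy: 3 parents (indoor, outdoor, transport), 6 children.
CHILD_TO_PARENT = torch.tensor([0, 0, 1, 1, 2, 2])  # child index -> parent index

class HierarchicalASC(nn.Module):
    def __init__(self, embed_dim=128, n_parents=3, n_children=6):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(embed_dim, 64), nn.ReLU())
        self.parent_head = nn.Linear(64, n_parents)
        # The child head also sees the parent logits (hierarchical information fusion).
        self.child_head = nn.Linear(64 + n_parents, n_children)

    def forward(self, x):
        h = self.backbone(x)
        parent_logits = self.parent_head(h)
        child_logits = self.child_head(torch.cat([h, parent_logits], dim=1))
        return parent_logits, child_logits

def hierarchy_loss(parent_logits, child_logits, parent_y, child_y, lam=0.5):
    ce = F.cross_entropy(parent_logits, parent_y) + F.cross_entropy(child_logits, child_y)
    # Dependency penalty: probability mass the child distribution assigns to
    # children whose parent disagrees with the predicted parent.
    child_prob = child_logits.softmax(dim=1)
    pred_parent = parent_logits.argmax(dim=1)
    mismatch = (CHILD_TO_PARENT.unsqueeze(0) != pred_parent.unsqueeze(1)).float()
    penalty = (child_prob * mismatch).sum(dim=1).mean()
    return ce + lam * penalty

model = HierarchicalASC()
x = torch.randn(8, 128)                 # 8 clips, 128-dim embeddings (placeholder)
child_y = torch.randint(0, 6, (8,))
parent_y = CHILD_TO_PARENT[child_y]
loss = hierarchy_loss(*model(x), parent_y, child_y)
loss.backward()
print(loss.item())
```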

5.
The goal of scene classification is to establish semantic context for various visual processing tasks, especially object recognition. Binocular vision systems are now widely installed on intelligent robots, yet scene classification has mostly used only monocular images, and because of the complexity of indoor scenes, classification performance with monocular images is low. This paper proposes an indoor scene classification method based on binocular vision that uses the parameters of several planes fitted within specific regions as scene features. A hierarchical classification strategy is adopted: based on the disparity map, scenes are first divided into open and enclosed spaces, and these two classes are then subdivided using the proposed scene features together with Gist features. A four-category image dataset was built to validate the method, and experimental results show that it achieves good classification performance.

6.
Remote sensing image scene classification is important for land resource management, but in high-resolution remote sensing images the distribution of ground objects is complex and the images contain redundant information unrelated to the current scene, which hurts accurate classification. To address this, a scene classification method based on the sparse representation of a spiking convolutional neural network (SCNN) is proposed. Starting from sparse representation, the sparse spiking output of spiking neurons is exploited to design an SCNN that removes scene-irrelevant redundancy from remote sensing images and produces a sparse representation of each image. A backpropagation algorithm based on a cross-entropy loss over the spiking outputs is then proposed, and gradient descent built on it is used to train the SCNN, optimize its parameters, and perform remote sensing scene classification. The method is validated on the Google and UCM remote sensing image datasets and compared with a conventional convolutional neural network (CNN). The results show that the proposed method yields sparse representations of remote sensing images, performs scene classification, and outperforms the CNN on remote sensing scene classification.

7.
Over the past few decades, aerial images and video have been widely used in urban planning, coastal surveillance, military missions, and other applications, so understanding the content of aerial images and identifying the scene types captured in aerial video has become very important. Most popular scene classification algorithms target natural scenes; few address high-resolution aerial scene classification. This paper presents a hierarchical algorithm for classifying high-resolution aerial images. The algorithm first extracts robust local patch features with the scale-invariant feature transform (SIFT); then, on top of a bag of visual words, a deep belief network (DBN) initialized with restricted Boltzmann machines (RBM) models the relationship between the low-level features and high-level scene features, with the DBN also serving as the classifier. Experimental results show that the algorithm slightly outperforms current mainstream methods on high-resolution aerial scene classification.
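As a rough sketch of the bag-of-visual-words stage described above (the RBM/DBN part is omitted), the following builds SIFT-based visual-word histograms with OpenCV and scikit-learn; the vocabulary size and file names are assumptions.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def sift_descriptors(image_paths):
    """Extract SIFT descriptors from every image in a list of file paths."""
    sift = cv2.SIFT_create()
    per_image = []
    for path in image_paths:
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        _, desc = sift.detectAndCompute(gray, None)
        per_image.append(desc if desc is not None else np.empty((0, 128), np.float32))
    return per_image

def build_bovw(image_paths, n_words=200):
    """Cluster descriptors into a visual vocabulary and return per-image histograms."""
    per_image = sift_descriptors(image_paths)
    all_desc = np.vstack([d for d in per_image if len(d)])
    vocab = KMeans(n_clusters=n_words, n_init=4, random_state=0).fit(all_desc)
    hists = []
    for desc in per_image:
        hist = np.zeros(n_words, dtype=np.float32)
        if len(desc):
            words, counts = np.unique(vocab.predict(desc), return_counts=True)
            hist[words] = counts
        hists.append(hist / max(hist.sum(), 1.0))  # L1-normalize the histogram
    return np.array(hists), vocab

# Usage (paths are placeholders):
# hists, vocab = build_bovw(["aerial_001.jpg", "aerial_002.jpg"])
```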

8.
Prior research in scene classification has focused on mapping a set of classic low-level vision features to semantically meaningful categories using a classifier engine. In this paper, we propose improving the established paradigm by using a simplified low-level feature set to predict multiple semantic scene attributes that are integrated probabilistically to obtain a final indoor/outdoor scene classification. An initial indoor/outdoor prediction is obtained by classifying computationally efficient, low-dimensional color and wavelet texture features using support vector machines. Similar low-level features can also be used to explicitly predict the presence of semantic features including grass and sky. The semantic scene attributes are then integrated using a Bayesian network designed for improved indoor/outdoor scene classification.
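A toy sketch of the probabilistic integration step, using a naive-Bayes-style update as a simplification of the paper's Bayesian network; the conditional probabilities and attribute set are invented for illustration.

```python
import numpy as np

# Hypothetical conditional probabilities P(attribute detected | scene),
# chosen for illustration only (the paper learns its network from data).
P_ATTR_GIVEN_SCENE = {
    # attribute: (P(detected | indoor), P(detected | outdoor))
    "sky":   (0.05, 0.70),
    "grass": (0.03, 0.45),
}

def integrate(initial_outdoor_prob, detected):
    """Combine an initial SVM-based P(outdoor) with binary attribute detections
    using a naive-Bayes-style update (a simplification of a Bayesian network)."""
    log_post = {
        "indoor": np.log(1.0 - initial_outdoor_prob + 1e-9),
        "outdoor": np.log(initial_outdoor_prob + 1e-9),
    }
    for attr, present in detected.items():
        p_in, p_out = P_ATTR_GIVEN_SCENE[attr]
        log_post["indoor"] += np.log(p_in if present else 1.0 - p_in)
        log_post["outdoor"] += np.log(p_out if present else 1.0 - p_out)
    # Normalize the log-posteriors back to probabilities.
    m = max(log_post.values())
    z = sum(np.exp(v - m) for v in log_post.values())
    return {k: float(np.exp(v - m) / z) for k, v in log_post.items()}

print(integrate(0.55, {"sky": True, "grass": False}))
```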

9.
This paper presents a shift invariant scene classification method based on local autocorrelation of similarities with subspaces. Although conventional scene classification methods used bag-of-visual words for scene classification, superior accuracy of kernel principal component analysis (KPCA) of visual words to bag-of-visual words was reported. Here we also use KPCA of visual words to extract rich information for classification. In the original KPCA of visual words, all local parts mapped into subspace were integrated by summation to be robust to the order, the number, and the shift of local parts. This approach discarded the effective properties for scene classification such as the relation with neighboring regions. To use them, we use (normalized) local autocorrelation (LAC) feature of the similarities with subspaces (outputs of KPCA of visual words). The feature has both the relation with neighboring regions and the robustness to shift of objects in scenes. The proposed method is compared with conventional scene classification methods using the same database and protocol, and we demonstrate the effectiveness of the proposed method.

10.
A thousand words in a scene
This paper presents a novel approach for visual scene modeling and classification, investigating the combined use of text modeling methods and local invariant features. Our work attempts to elucidate (1) whether a text-like bag-of-visterms (BOV) representation (histogram of quantized local visual features) is suitable for scene (rather than object) classification, (2) whether some analogies between discrete scene representations and text documents exist, and (3) whether unsupervised, latent space models can be used both as feature extractors for the classification task and to discover patterns of visual co-occurrence. Using several data sets, we validate our approach, presenting and discussing experiments on each of these issues. We first show, with extensive experiments on binary and multiclass scene classification tasks using a 9,500-image data set, that the BOV representation consistently outperforms classical scene classification approaches. In other data sets, we show that our approach competes with or outperforms other recent, more complex methods. We also show that probabilistic latent semantic analysis (PLSA) generates a compact scene representation, is discriminative for accurate classification, and is more robust than the BOV representation when less labeled training data is available. Finally, through aspect-based image ranking experiments, we show the ability of PLSA to automatically extract visually meaningful scene patterns, making such representation useful for browsing image collections.

11.
In this paper, we present an effective and efficient framework for baseball video scene classification. The results of scene classification provide the basis for baseball video abstraction and high-level event extraction. Most conventional approaches are shot-based, in which shot change detection and key-frame extraction are necessary prerequisite procedures. In contrast, we propose a frame-based approach: our framework includes an efficient playfield segmentation technique, and the reduced field maps are then used as scene templates. Because shot change detection and key-frame extraction are not required, the framework is very simple and efficient. Experimental results demonstrate the effectiveness of the proposed framework for baseball video scene classification, and the template-based approach can easily be extended to other kinds of sports video.
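A rough sketch of what the playfield-segmentation and template-matching steps could look like with OpenCV; the HSV thresholds, field-map size, and template names are assumptions, not values from the paper.

```python
import cv2
import numpy as np

def playfield_map(frame_bgr, map_size=(16, 16)):
    """Segment grass-colored pixels and reduce the mask to a small field map
    usable as a frame-level scene template."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    # Rough green range for grass; tune per broadcast conditions.
    mask = cv2.inRange(hsv, (35, 60, 40), (85, 255, 255))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    # Downsample to a coarse field-coverage map (values in [0, 1]).
    reduced = cv2.resize(mask.astype(np.float32) / 255.0, map_size,
                         interpolation=cv2.INTER_AREA)
    return reduced

def classify_frame(frame_bgr, templates):
    """Assign a frame to the scene whose template field map is closest (L2 distance)."""
    fmap = playfield_map(frame_bgr)
    return min(templates, key=lambda name: np.linalg.norm(fmap - templates[name]))

# Usage (template files and frame are placeholders):
# templates = {"pitch_view": np.load("pitch.npy"), "outfield": np.load("outfield.npy")}
# label = classify_frame(cv2.imread("frame.jpg"), templates)
```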

12.
In the era of artificial intelligence, with massive data and strong computing power, audio scene classification has become an important part of scene understanding. To address the difficulty of modeling audio scenes and the limited accuracy of existing approaches, this paper proposes a model combining a convolutional neural network with the extreme gradient boosting (XGBoost) algorithm. The preprocessed audio signal is first converted into a mel spectrogram, which is fed to a convolutional neural network for abstract feature extraction, and the extracted features are then classified with XGBoost. Classification performance is evaluated on the UrbanSound8K urban audio scene dataset; the hybrid model reaches 89% classification accuracy, outperforming conventional neural network models and demonstrating its effectiveness for audio scene classification.
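A small sketch of the mel-spectrogram-plus-XGBoost pipeline using librosa and xgboost; for brevity, mean/std pooling of the log-mel spectrogram stands in for the paper's CNN feature extractor, and the file paths and hyperparameters are placeholders.

```python
import numpy as np
import librosa
import xgboost as xgb

def mel_features(wav_path, sr=22050, n_mels=64):
    """Load audio, compute a log-mel spectrogram, and pool it over time.
    (The paper uses a CNN for this step; mean/std pooling stands in here.)"""
    y, sr = librosa.load(wav_path, sr=sr, mono=True)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    log_mel = librosa.power_to_db(mel)
    return np.concatenate([log_mel.mean(axis=1), log_mel.std(axis=1)])

def train(paths, labels):
    """Fit an XGBoost classifier on pooled log-mel features."""
    X = np.stack([mel_features(p) for p in paths])
    clf = xgb.XGBClassifier(n_estimators=300, max_depth=6, learning_rate=0.1)
    clf.fit(X, np.asarray(labels))
    return clf

# Usage (paths/labels are placeholders for UrbanSound8K files):
# clf = train(["dog_bark.wav", "siren.wav"], [0, 1])
# print(clf.predict([mel_features("street_music.wav")]))
```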

13.
In classic pattern recognition problems, classes are mutually exclusive by definition. Classification errors occur when the classes overlap in the feature space. We examine a different situation, occurring when the classes are, by definition, not mutually exclusive. Such problems arise in semantic scene and document classification and in medical diagnosis. We present a framework to handle such problems and apply it to the problem of semantic scene classification, where a natural scene may contain multiple objects such that the scene can be described by multiple class labels (e.g., a field scene with a mountain in the background). Such a problem poses challenges to the classic pattern recognition paradigm and demands a different treatment. We discuss approaches for training and testing in this scenario and introduce new metrics for evaluating individual examples, class recall and precision, and overall accuracy. Experiments show that our methods are suitable for scene classification; furthermore, our work appears to generalize to other classification problems of the same nature.
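The evaluation ideas mentioned above (per-class recall and precision, per-example scoring, overall accuracy for non-mutually-exclusive labels) correspond to standard multi-label metrics; here is a small scikit-learn sketch with made-up labels, noting that the paper's exact metric definitions may differ.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score, accuracy_score, jaccard_score

# Made-up ground truth and predictions over 3 labels (e.g. field, mountain, beach).
y_true = np.array([[1, 1, 0],
                   [0, 1, 0],
                   [1, 0, 1]])
y_pred = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [1, 0, 0]])

# Per-class recall and precision (one value per label).
print("class recall:     ", recall_score(y_true, y_pred, average=None))
print("class precision:  ", precision_score(y_true, y_pred, average=None, zero_division=0))
# Exact-match accuracy: every label of the example must be correct.
print("exact match:      ", accuracy_score(y_true, y_pred))
# Example-based (Jaccard) score: partial credit for partially correct label sets.
print("jaccard (samples):", jaccard_score(y_true, y_pred, average="samples"))
```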

14.
王艳玲  张玘  罗诗途 《微计算机信息》2007,23(34):220-221,259
To solve the problem of classifying scenes in the vehicle-mounted image tracking system of an armored vehicle, a scene image classification method based on fuzzy pattern recognition is proposed. A feature vector is first constructed to describe the scene image, vector membership functions are designed for the selected feature parameters, and the scene image is then classified by evaluating these membership functions. Experiments show that the scene classification results provide useful guidance for choosing the target extraction method in the tracking system.
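A toy sketch of fuzzy-membership scene classification; the features, triangular membership functions, and class prototypes are invented for illustration and are not those of the paper.

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function peaking at b over the support [a, c]."""
    return np.clip(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0, 1.0)

# Hypothetical classes described by membership functions over two features:
# mean gray level and edge density (both scaled to [0, 1]).
CLASSES = {
    "sky_background":   lambda f: min(tri(f[0], 0.5, 0.8, 1.0), tri(f[1], 0.0, 0.1, 0.3)),
    "ground_clutter":   lambda f: min(tri(f[0], 0.2, 0.45, 0.7), tri(f[1], 0.3, 0.6, 0.9)),
    "urban_structures": lambda f: min(tri(f[0], 0.3, 0.55, 0.8), tri(f[1], 0.5, 0.8, 1.0)),
}

def classify(features):
    """Pick the class with the largest membership (max-membership principle)."""
    memberships = {name: fn(features) for name, fn in CLASSES.items()}
    label = max(memberships, key=memberships.get)
    return label, memberships

print(classify(np.array([0.75, 0.15])))   # expected label: sky_background
```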

15.
To address the limited accuracy and robustness of surveillance-video scene recognition under single-modality features, a semi-supervised scene recognition system based on feature fusion is proposed. The model first uses pretrained convolutional neural networks to extract scene descriptors from video frames and from audio, then fuses them at the video level in a way tailored to scene recognition. A deep belief network is trained in an unsupervised manner and fine-tuned with supervision using a cost function that adds a relative-entropy (KL-divergence) regularization term; finally, the classification performance of the model is analyzed by simulation. The results show that the model effectively improves surveillance scene classification accuracy and meets public-security needs for automated structured analysis of massive surveillance video.

16.
A classification algorithm for small moving objects in video surveillance systems
A classification algorithm for small moving objects in video surveillance is presented. First, maximum mutual information is used to obtain a reliable, independent, and discriminative set of object features. A multi-class support vector machine organized as a directed acyclic graph then performs classification. The classifier is trained in two steps: a baseline classifier is first trained on scene-independent features, and it is then further trained with both scene-dependent and scene-independent features to improve its accuracy. Experimental results show that the algorithm not only meets the required classification accuracy but also adapts well to new scenes.
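A small sketch of the feature-selection-plus-SVM pipeline; scikit-learn's mutual_info_classif and its (internally one-vs-one) SVC stand in for the paper's maximum-mutual-information selection and DAG multi-class SVM, and the data are random placeholders.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 40))            # 300 objects, 40 candidate features
y = rng.integers(0, 4, size=300)          # 4 object classes (person, vehicle, ...)

# Step 1: keep the 10 features carrying the most mutual information with the label.
# Step 2: multi-class SVM (SVC trains one-vs-one pairs internally; a DAG-SVM is similar in spirit).
clf = make_pipeline(
    SelectKBest(mutual_info_classif, k=10),
    StandardScaler(),
    SVC(kernel="rbf", C=1.0, gamma="scale"),
)
clf.fit(X[:250], y[:250])
print("held-out accuracy:", clf.score(X[250:], y[250:]))
```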

17.
This article presents a deep learning-based Multi-scale Bag-of-Visual Words (MBVW) representation for scene classification of high-resolution aerial imagery. Specifically, the convolutional neural network (CNN) is introduced to learn and characterize the complex local spatial patterns at different scales. Then, the learnt deep features are exploited in a novel way to generate visual words. Moreover, the MBVW representation is constructed using the statistics of the visual word co-occurrences at different scales, which are derived from a training data set. We apply our technique to the challenging aerial scene data set: the University of California (UC) Merced data set consisting of 21 different aerial scene categories with sub-metre resolution. The experimental results show that the statistics of deeply described visual words can characterize the scene well and improve classification accuracy. It demonstrates that the proposed method is highly effective in the scene classification of high-resolution remote-sensing imagery.

18.
Joint scene classification and segmentation based on hidden Markov model
Scene classification and segmentation are fundamental steps for efficiently accessing, retrieving and browsing large amounts of video data. We have developed a scene classification scheme using a Hidden Markov Model (HMM)-based classifier. By utilizing the temporal behaviors of different scene classes, the HMM classifier can effectively classify presegmented clips into one of the predefined scene classes. In this paper, we describe three approaches for joint classification and segmentation based on HMMs, which search for the most likely class transition path using the dynamic programming technique. All these approaches utilize audio and visual information simultaneously. The first two approaches search for the optimal scene class transition based on the likelihood values computed for short video segments belonging to a particular class, but with different search constraints. The third approach searches for the optimal path in a super HMM built by concatenating the HMMs for the different scene classes.
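The dynamic-programming search for the most likely class-transition path can be sketched as a Viterbi pass over per-segment class log-likelihoods; the likelihoods and transition probabilities below are made up, and the per-class HMMs that would produce the likelihoods are omitted.

```python
import numpy as np

def best_class_path(log_lik, log_trans):
    """Viterbi search for the most likely scene-class sequence.
    log_lik:   (T, K) log-likelihood of each of T segments under each of K classes.
    log_trans: (K, K) log-probability of moving from class i to class j."""
    T, K = log_lik.shape
    score = np.full((T, K), -np.inf)
    back = np.zeros((T, K), dtype=int)
    score[0] = log_lik[0]
    for t in range(1, T):
        cand = score[t - 1][:, None] + log_trans        # (K previous, K next)
        back[t] = cand.argmax(axis=0)
        score[t] = cand.max(axis=0) + log_lik[t]
    # Backtrack the optimal path from the best final class.
    path = [int(score[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Toy example: 3 classes (news, commercial, sports), 5 presegmented clips.
log_lik = np.log(np.array([[0.7, 0.2, 0.1],
                           [0.6, 0.3, 0.1],
                           [0.2, 0.7, 0.1],
                           [0.1, 0.8, 0.1],
                           [0.1, 0.2, 0.7]]))
log_trans = np.log(np.full((3, 3), 0.2) + np.eye(3) * 0.4)  # sticky class transitions
print(best_class_path(log_lik, log_trans))   # prints [0, 0, 1, 1, 2]
```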

19.
Scene classification is a complicated task because it involves a great deal of content whose distribution is difficult to capture. A novel hierarchical serial scene classification framework is presented in this paper. First, we use hierarchical features to represent both the global scene and local patches containing specific objects; the hierarchy is expressed through spatial pyramid matching, and our own codebook is built from two different types of words. Second, we train the visual words by generative and discriminative methods respectively, based on spatial pyramid matching, which yields the local patch labels efficiently. Then, we use a neural network to simulate the human decision process, which derives the final scene category from the local labels. Experiments show that the hierarchical serial scene image representation and classification model achieves superior accuracy.

20.
刘宏  普杰信 《计算机工程》2011,37(21):182-184
Scene classification methods based on the global semantic scene descriptor gist require heavy computation during feature extraction and achieve relatively low recognition accuracy on natural scenes. To address this, an improved feature extraction method is proposed that combines a three-scale gist feature with a histogram of oriented gradients feature to describe the scene and uses a support vector machine for classification. Experimental results show that the improved method speeds up feature extraction and raises classification accuracy.
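A small sketch of the combined-descriptor-plus-SVM idea; since a gist implementation is not part of scikit-image, a coarse downsampled-intensity layout stands in for the gist component, and the HOG parameters and data are placeholders.

```python
import numpy as np
from skimage.feature import hog
from skimage.transform import resize
from sklearn.svm import SVC

def scene_descriptor(gray):
    """Concatenate a HOG descriptor with a coarse global-layout summary.
    (A true multi-scale gist descriptor would replace the coarse summary.)"""
    gray = resize(gray, (128, 128), anti_aliasing=True)
    hog_feat = hog(gray, orientations=8, pixels_per_cell=(16, 16),
                   cells_per_block=(2, 2), feature_vector=True)
    layout = resize(gray, (8, 8), anti_aliasing=True).ravel()  # crude gist stand-in
    return np.concatenate([hog_feat, layout])

def train_scene_svm(images, labels):
    """images: list of 2-D grayscale arrays; labels: list of scene class ids."""
    X = np.stack([scene_descriptor(img) for img in images])
    clf = SVC(kernel="rbf", C=10.0, gamma="scale")
    clf.fit(X, np.asarray(labels))
    return clf

# Usage with random stand-in data (replace with real grayscale scene images):
rng = np.random.default_rng(0)
imgs = [rng.random((240, 320)) for _ in range(20)]
labels = [i % 4 for i in range(20)]
clf = train_scene_svm(imgs, labels)
print(clf.predict([scene_descriptor(imgs[0])])[0])
```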
