1.

Hybrid graphical model for semantic image segmentation

《Journal of Visual Communication and Image Representation》2015

To make full use of both non-causal and causal cues in natural images, we propose a hybrid model that combines a hierarchical Conditional Random Field (HCRF) with a Bayesian Network (BN) for semantic image segmentation. The HCRF captures non-causal relationships, such as appearance features and inter-class co-occurrence statistics, to produce initial semantic sub-scene predictions. The BN models contextual interactions for each semantic sub-scene in the form of class statistics from its neighboring regions; its conditional probabilities are learned automatically from training data. The learned BN structure is then used to encode the contextual dependencies among sub-scenes in the initial predictions and to generate the final refined predictions. Experiments on the Stanford 8-class dataset and the LHI 15-class dataset show that the hybrid model outperforms pure CRF models by 2–4% in average classification accuracy.

2.

Multisource classification of remotely sensed data: fusion of Landsat TM and SAR images

Solberg A.H.S. Jain A.K. Taxt T. 《Geoscience and Remote Sensing, IEEE Transactions on》1994,32(4):768-778

Proposes a new method for statistical classification of multisource data. The method is suited for land-use classification based on the fusion of remotely sensed images of the same scene captured at different dates from multiple sources. It incorporates a priori information about the likelihood of changes between the acquisition of the different images to be fused. A framework for the fusion of remotely sensed data based on a Bayesian formulation is presented. First, a simple fusion model is given, and then the basic model is extended to take into account the temporal attribute if the different data sources are acquired at different dates. The performance of the model is evaluated by fusing Landsat TM images and ERS-1 SAR images for land-use classification. The fusion model gives significant improvements in the classification error rates compared to the conventional single-source classifiers.

3.

Fusing integrated visual vocabularies-based bag of visual words and weighted colour moments on spatial pyramid layout for natural scene image classification

Yousef Alqasrawi Daniel Neagu Peter I. Cowling 《Signal, Image and Video Processing》2013,7(4):759-775

The bag of visual words (BOW) model is an efficient image representation technique for image categorization and annotation tasks. Building good visual vocabularies, from automatically extracted image feature vectors, produces discriminative visual words, which can improve the accuracy of image categorization tasks. Most approaches that use the BOW model in categorizing images ignore useful information that can be obtained from image classes to build visual vocabularies. Moreover, most BOW models use intensity features extracted from local regions and disregard colour information, which is an important characteristic of any natural scene image. In this paper, we show that integrating visual vocabularies generated from each image category improves the BOW image representation and improves accuracy in natural scene image classification. We use a keypoint density-based weighting method to combine the BOW representation with image colour information on a spatial pyramid layout. In addition, we show that visual vocabularies generated from training images of one scene image dataset can plausibly represent another scene image dataset on the same domain. This helps in reducing time and effort needed to build new visual vocabularies. The proposed approach is evaluated over three well-known scene classification datasets with 6, 8 and 15 scene categories, respectively, using 10-fold cross-validation. The experimental results, using support vector machines with histogram intersection kernel, show that the proposed approach outperforms baseline methods such as Gist features, rgbSIFT features and different configurations of the BOW model.
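
The histogram intersection kernel named in this abstract can be illustrated with a minimal sketch. It shows only the kernel itself (the sum of bin-wise minima) and a Gram matrix built from it; the function names and the 4-bin histograms are assumptions made for illustration and do not come from the paper.

```python
def histogram_intersection(h1, h2):
    """Histogram intersection kernel: the sum of bin-wise minima."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(h1, h2))


def gram_matrix(histograms):
    """Pairwise kernel values for a list of histograms."""
    return [[histogram_intersection(a, b) for b in histograms] for a in histograms]


# Hypothetical L1-normalised visual-word histograms (illustrative values only).
hists = [
    [0.5, 0.25, 0.25, 0.0],
    [0.25, 0.25, 0.25, 0.25],
]
K = gram_matrix(hists)
```

A matrix such as `K` is the kind of input that an SVM implementation accepting precomputed kernels would take; how the authors configured their SVM is not stated in the abstract.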

4.

Large-scene point cloud classification based on binary neural networks

章国道刘儒瑜张志勇孔德伟邱飞岳《光电子．激光》2022,33(4):364-372

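In recent years, with the continuous development of 3D data acquisition technology, large-scene point cloud data has become increasingly easy to obtain. Deep learning frameworks are increasingly mature in 2D image processing, but large-scene point clouds are irregular 3D data, and processing them directly with 3D convolutional neural networks suffers from low classification accuracy and high computational complexity. To address the long computation time and low accuracy of deep-learning-based point cloud classification, this paper proposes a large-scene point cloud classification method based on a binary neural network. A feature-value computation method is designed for irregular 3D point cloud data; the input point cloud feature images are processed by the IR-Net binary neural network; and the Dynamic ReLU activation function is further adopted to improve the network's computational efficiency, finally yielding the point cloud classification result. Experiments show that the proposed method reaches a classification accuracy of 97.6% on the Oakland dataset and 92.3% and 97.2% on the GML dataset, demonstrating that Dy-ResNet effectively improves point cloud classification accuracy, reduces computational complexity, and improves training efficiency.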

5.

Dual-band scene classification based on convolutional features and Bayesian decision

邱晓华李敏张丽琼董琳《激光与光电子学进展》2021,58(4):358-366

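To address the scarcity of labeled image samples and the low quality of feature fusion in visible and near-infrared dual-band scene classification, a dual-band scene classification method based on convolutional neural network (CNN) feature extraction and naive Bayes decision fusion is proposed. First, a pre-trained CNN model is used as the feature extractor for the dual-band images, avoiding the overfitting caused by the scarcity of labeled samples. Then, principal component analysis for dimensionality reduction and feature normalization are applied to improve the computational speed of the support vector machine and the classification performance of each band. Finally, taking the dual-band posterior probabilities as the naive Bayes prior probabilities, a decision fusion model is built to perform fused scene classification. Experiments on a public dataset show that, compared with single-band classification and dual-band feature-concatenation fusion, the recognition rate of the proposed method improves markedly, reaching 94.3%; this is 6.4 percentage points higher than the best method based on traditional features and close to that of CNN-based methods, while the method remains simple and efficient to run.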
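
One common way to fuse per-band classifier outputs at the decision level is the product rule: multiply the class posteriors of the two bands and renormalise. The sketch below shows that rule under the assumptions of conditional independence between bands and a uniform class prior; it is not necessarily the exact fusion model of the paper, and the function name and probability values are hypothetical.

```python
def fuse_posteriors(p_visible, p_nir):
    """Fuse two per-class posterior vectors with the product rule, then renormalise."""
    if len(p_visible) != len(p_nir):
        raise ValueError("both bands must score the same set of classes")
    joint = [a * b for a, b in zip(p_visible, p_nir)]
    total = sum(joint)
    if total == 0:
        raise ValueError("the two posteriors share no class with non-zero probability")
    return [j / total for j in joint]


# Hypothetical per-band posteriors over three scene classes (illustrative values only).
p_vis = [0.6, 0.3, 0.1]
p_nir = [0.5, 0.4, 0.1]
fused = fuse_posteriors(p_vis, p_nir)

# Index of the class with the highest fused probability.
predicted = max(range(len(fused)), key=fused.__getitem__)
```

The product rule sharpens the decision when both bands agree and lets a confident band outweigh an uncertain one.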

6.

SAR image target recognition combining multi-source features and a Gaussian process model

辛海燕童有为《电讯技术》2021,61(4):454-460

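For synthetic aperture radar (SAR) image target recognition, a method combining multi-source features with a Gaussian process model is proposed. Feature vectors are extracted from SAR images using principal component analysis (PCA), non-negative matrix factorization (NMF), and the monogenic signal, and are concatenated into a single vector. The three types of features describe the target characteristics of SAR images from different perspectives and thus provide more effective information for target recognition. In the decision stage, a Gaussian process model performs multi-class classification, yielding the best decision in a probabilistic sense from the fused feature vector. In the experiments, the MSTAR dataset is used to set up recognition problems with 3 target classes, 10 target classes, configuration variants, and depression angle variations; the results verify the superior performance of the proposed method.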

7.

Audio scene classification based on a spectrogram Transformer

袁双杨立东郭勇牛大伟张丹丹《信号处理》2023,39(4):730-736

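Audio scene classification is an important part of scene understanding. Learning audio scene features and classifying them accurately strengthens the ability of machines to interact with their environment, and its importance in the big-data era is self-evident. Because classification performance depends on dataset size while practical tasks face a severe shortage of data, this paper proposes data augmentation and network pre-training strategies and combines the spectrogram Transformer model with the audio scene classification task. First, log-Mel energy spectrograms are extracted from the audio signal and fed into the model; then the model's dynamic interaction capability is used to strengthen the spatial relationships within the audio sequence; finally, classification is performed from the token vector. The method is tested on the public DCASE2019 task1 and DCASE2020 task1 datasets, reaching classification accuracies of 96.489% and 93.227% respectively, a clear improvement over existing algorithms. This indicates that the method is suitable for high-accuracy audio scene classification and lays a foundation for high-precision smart devices to perceive environmental content and detect environmental dynamics.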

8.

Biologically Inspired Feature Manifold for Scene Classification (cited 2 times in total; self-citations: 0; citations by others: 2)

Dongjin Song Dacheng Tao 《IEEE transactions on image processing》2010,19(1):174-184

Biologically inspired feature (BIF) and its variations have been demonstrated to be effective and efficient for scene classification. It is unreasonable to measure the dissimilarity between two BIFs based on their Euclidean distance. This is because BIFs are extrinsically very high dimensional and intrinsically low dimensional, i.e., BIFs are sampled from a low-dimensional manifold and embedded in a high-dimensional space. Therefore, it is essential to find the intrinsic structure of a set of BIFs, obtain a suitable mapping to implement the dimensionality reduction, and measure the dissimilarity between two BIFs in the low-dimensional space based on their Euclidean distance. In this paper, we study the manifold constructed by a set of BIFs utilized for scene classification, form a new dimensionality reduction algorithm, termed Discriminative and Geometry Preserving Projections (DGPP), by preserving both the intra-class geometry of BIFs and the inter-class discriminative information, and construct a new framework for scene classification. In this framework, we represent an image based on a new BIF, which combines the intensity channel, the color channel, and the C1 unit of a color image; then we project the high-dimensional BIF to a low-dimensional space based on DGPP; and, finally, we conduct the classification based on the multiclass support vector machine (SVM). Thorough empirical studies based on the USC scene dataset demonstrate that, compared with the gist approach proposed by Siagian and Itti in 2007, the proposed framework improves classification rates by around 100% in relative terms and training speed by a factor of about 60 across different sites.

9.

Scene classification method based on local feature saliency

张家辉谢毓湘郭延明《信号处理》2020,36(11):1804-1810

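Scene image classification is a popular direction in machine vision; scene images are rich in content and conceptually complex. Existing deep-network-based scene classification algorithms usually improve recognition by modifying the network structure or by data augmentation, but they give little consideration to the relationship between scene elements and object elements in an image. Building on an analysis of existing deep-network-based scene classification techniques, this paper proposes a scene classification algorithm based on local feature saliency. The algorithm combines the characteristics of scene local features and object local features and exploits the complementary relationship between the two types of features to optimize each of them, obtaining a more discriminative scene feature description. The algorithm achieves a test accuracy of 88.88% on the MIT Indoor67 dataset, and the experimental results verify its effectiveness.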

10.

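A multiple kernel learning image classification method based on sparse coding (cited 2 times in total; self-citations: 0; citations by others: 2)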

亓晓振王庆《电子学报》2012,40(4):773-779

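This paper proposes a multiple kernel learning image classification method based on sparse coding. Traditional sparse coding methods lose spatial information when classifying images; this paper adds spatial constraints to the features by partitioning the image at multiple levels with a spatial pyramid. When a nonlinear SVM is used for image classification, each level of the spatial pyramid forms its own kernel matrix. Multiple kernel learning is used to solve for the weight of each kernel matrix, and a linear combination of the kernel matrices yields the kernel matrix with the strongest discriminative power over the whole classification set. Experimental results demonstrate the effectiveness and robustness of the proposed method. On the Scene Categories dataset it reaches a classification accuracy of 83.10%, which is reported to be the highest classification accuracy achieved on this dataset to date.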
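
The linear combination of per-level kernel matrices described in this abstract can be sketched as follows. The weights here are fixed by hand purely for illustration; in the paper they are solved for by multiple kernel learning, which this sketch does not implement. The function name, the two 2x2 kernel matrices, and the weights are all hypothetical.

```python
def combine_kernels(kernels, weights):
    """Return the weighted sum of equally sized square kernel matrices."""
    if len(kernels) != len(weights):
        raise ValueError("one weight is needed per kernel matrix")
    if any(w < 0 for w in weights):
        raise ValueError("kernel weights must be non-negative")
    n = len(kernels[0])
    combined = [[0.0] * n for _ in range(n)]
    for kernel, weight in zip(kernels, weights):
        for i in range(n):
            for j in range(n):
                combined[i][j] += weight * kernel[i][j]
    return combined


# Hypothetical kernel matrices for two spatial pyramid levels (two images each).
K_level0 = [[1.0, 0.5], [0.5, 1.0]]
K_level1 = [[1.0, 0.25], [0.25, 1.0]]

# Hand-picked weights standing in for the ones an MKL solver would return.
weights = [0.5, 0.5]
K = combine_kernels([K_level0, K_level1], weights)
```

Keeping the weights non-negative means the combination of valid kernels is itself a valid kernel, which is why the check is there.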

11.

SAR image target detection algorithm based on convolutional neural networks

杜兰刘彬王燕刘宏伟代慧《电子与信息学报》2016,38(12):3018-3025

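This paper studies target detection in synthetic aperture radar (SAR) images with a convolutional neural network (CNN) when training samples are insufficient. An existing complete dataset is used to assist detection on a dataset with complex scenes and insufficient training samples. First, a CNN classification model is trained on the existing complete dataset and used to initialize the parameters of the region proposal network and the target detection network; then the complete dataset is used to augment the training dataset; finally, the region proposal model and the target detection model are obtained through a four-step training procedure. Experimental results on measured data show that the proposed method achieves good detection performance in SAR image target detection.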

12.

Unsupervised Domain Adaptation via Principal Subspace Projection for Acoustic Scene Classification

Mezza Alessandro Ilic Habets Emanuël A. P. Müller Meinard Sarti Augusto 《Journal of Signal Processing Systems》2022,94(2):197-213

Existing acoustic scene classification (ASC) systems often fail to generalize across different recording devices. In this work, we present an unsupervised domain adaptation method for ASC based on data standardization and feature projection. First, log-amplitude spectro-temporal features are standardized in a band-wise fashion over samples and time. Then, both source- and target-domain samples are projected onto the span of the principal eigenvectors of the covariance matrix of source-domain training data. The proposed method, being devised as a preprocessing procedure, is independent of the choice of the classification algorithm and can be readily applied to any ASC model at a minimal cost. Using the TUT Urban Acoustic Scenes 2018 Mobile Development dataset, we show that the proposed method can provide an absolute increment of over 10% compared to state-of-the-art unsupervised adaptation methods. Furthermore, the proposed method consistently outperforms a recent ASC model that ranked first in Task 1-A of the 2021 DCASE Challenge when evaluated on various unseen devices from the TAU Urban Acoustic Scenes 2020 Mobile Development dataset. In addition, our method appears robust even when provided with a small amount of target-domain data, proving effective using as few as 90 seconds of test audio recordings. Finally, we show that the proposed adaptation method can also be employed as a feature extraction stage for shallower neural networks, thus significantly reducing model complexity.
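
The first step named in this abstract, band-wise standardization over samples and time, can be sketched in plain Python. The sketch assumes features are indexed as `features[sample][band][frame]` and computes the statistics from whatever set is passed in; which data the authors use to compute the statistics is not stated in the abstract, and the subsequent projection onto principal eigenvectors is not shown. All names and values are hypothetical.

```python
from math import sqrt


def bandwise_stats(features):
    """Per-band (mean, std) pooled over all samples and time frames."""
    n_bands = len(features[0])
    stats = []
    for b in range(n_bands):
        values = [v for sample in features for v in sample[b]]
        mean = sum(values) / len(values)
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        stats.append((mean, sqrt(variance)))
    return stats


def standardize(features, stats, eps=1e-8):
    """Subtract each band's mean and divide by its std (eps avoids division by zero)."""
    return [
        [[(v - stats[b][0]) / (stats[b][1] + eps) for v in band]
         for b, band in enumerate(sample)]
        for sample in features
    ]


# Hypothetical log-amplitude features: 2 samples, 2 frequency bands, 2 time frames.
feats = [
    [[1.0, 3.0], [10.0, 10.0]],
    [[5.0, 7.0], [20.0, 20.0]],
]
stats = bandwise_stats(feats)
normed = standardize(feats, stats)
```

After this step each band has approximately zero mean and unit variance across the pooled samples and frames.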

13.

High-speed railway wireless channel scenario recognition based on LSTM and multi-feature fusion

王英捷周涛陶成《电波科学学报》2021,36(3):453-459,476

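To meet the quality-of-service requirements of user communications in 5G mobile communication systems, a recognition method based on long short-term memory (LSTM) and multi-feature fusion is proposed to accurately identify high-speed railway wireless channel scenarios. The method can be combined with an intelligent decision system to improve the overall performance of the communication system. First, the characteristics of different channel scenarios and the channel feature parameters are described, and the training set and test set of the overall dataset are...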

14.

Object segmentation and classification using 3-D range camera

《Journal of Visual Communication and Image Representation》2014,25(1):74-85

15.

A Markov random field model for classification of multisource satellite imagery

Solberg A.H.S. Taxt T. Jain A.K. 《Geoscience and Remote Sensing, IEEE Transactions on》1996,34(1):100-113

A general model for multisource classification of remotely sensed data based on Markov random fields (MRF) is proposed. A specific model for fusion of optical images, synthetic aperture radar (SAR) images, and GIS (geographic information systems) ground cover data is presented in detail and tested. The MRF model exploits spatial class dependencies (spatial context) between neighboring pixels in an image, and temporal class dependencies between different images of the same scene. By including the temporal aspect of the data, the proposed model is suitable for detection of class changes between the acquisition dates of different images. The performance of the proposed model is investigated by fusing Landsat TM images, multitemporal ERS-1 SAR images, and GIS ground-cover maps for land-use classification, and on agricultural crop classification based on Landsat TM images, multipolarization SAR images, and GIS crop field border maps. The performance of the MRF model is compared to a simpler reference fusion model. On average, the MRF model results in slightly higher (2%) classification accuracy when the same data is used as input to the two models. When GIS field border data is included in the MRF model, the classification accuracy of the MRF model improves by 8%. For change detection in agricultural areas, 75% of the actual class changes are detected by the MRF model, compared to 62% for the reference model. Based on the well-founded theoretical basis of Markov random field models for classification tasks and the encouraging experimental results in our small-scale study, the authors conclude that the proposed MRF model is useful for classification of multisource satellite imagery.

16.

Remote sensing image scene classification based on transfer learning and channel attention

舒薪行温显斌袁立明徐海霞史芙蓉《光电子．激光》2024,35(7):716-722

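To address the small number of training samples and the complex backgrounds of remote sensing images in remote sensing scene classification, this paper introduces transfer learning and channel attention into convolutional neural networks (CNNs) and proposes a remote sensing image scene classification method based on transfer learning and channel attention. The method first selects two CNNs pre-trained on the ImageNet natural image dataset as backbones and introduces a channel attention mechanism that adaptively enhances primary features and suppresses secondary ones; it then fuses the features extracted by the two networks for classification; finally, learning and classification on the target domain are achieved through fine-tuning-based transfer learning. The proposed method is evaluated on several classic public datasets, and the experimental results show that it achieves performance comparable to other state-of-the-art methods in remote sensing image scene classification.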

17.

Face feature extraction network with sparse feature reuse

胡超李春国杨绿溪《信号处理》2021,37(7):1153-1163

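To improve the performance of face feature extraction networks and thereby the accuracy of face recognition algorithms, this paper studies face feature extraction networks based on convolutional neural networks and proposes SFRNet (Sparse Feature Reuse Network). First, the network structure of SFRNet is presented, built on three innovations: sparse feature reuse, mixed feature fusion, and center-Gaussian pooling. Then, experiments on the image classification dataset ImageNet and the face recognition datasets LFW (Labeled Faces in the Wild) and MegaFace verify SFRNet's feature extraction ability in general scenes and in the specific scenario of face recognition, respectively. The experiments show that SFRNet not only has a small computational cost and parameter count but also extracts face features effectively and generalizes well in general scenes.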

18.

Fast SAR ship detection method for near-shore regions combined with scene classification

付晓雅王兆成《信号处理》2020,36(12):2123-2130

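Synthetic aperture radar (SAR) image scenes are usually large. When deep convolutional networks are used for SAR ship detection, sub-images are usually extracted by dense sliding windows as preprocessing, and a detection network is then applied directly to each sub-image. This process involves a great deal of redundant information, which greatly limits detection efficiency. In near-shore regions, land scenes predominate and the scenes are complex. To address these problems, this paper proposes a fast SAR ship detection method for near-shore regions that incorporates scene classification (SC-SSD). The method consists of two stages, scene classification and target detection, implemented by a scene classification network (Convolutional Neural Network for Scene Classification, SC-CNN) and a target detection network (Single Shot Detector, SSD), respectively. SC-CNN quickly and coarsely selects the sub-images that may contain ships, and the selected sub-images are then fed into the SSD network for fine-grained ship detection. Experiments on the high-resolution SAR ship detection dataset AIR-SARShip-1.0 show that, compared with traditional ship detection methods, the proposed method is markedly faster while maintaining high detection accuracy.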
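
The two-stage idea in this abstract (a cheap scene classifier screens sub-images, and only the survivors reach the detector) can be sketched as a simple control flow. The two callables below are hypothetical stand-ins for SC-CNN and SSD, and the string "tiles" replace real image data; nothing here reproduces the networks themselves.

```python
def detect_ships(sub_images, may_contain_ship, detector):
    """Run the detector only on sub-images that pass the scene classifier."""
    detections = []
    passed = 0
    for index, sub_image in enumerate(sub_images):
        if not may_contain_ship(sub_image):
            continue  # discarded by the coarse screening stage
        passed += 1
        for box in detector(sub_image):
            detections.append((index, box))
    return detections, passed


def toy_scene_classifier(tile):
    """Stand-in for SC-CNN: pure-land tiles are assumed to hold no ships."""
    return tile != "land"


def toy_detector(tile):
    """Stand-in for SSD: returns one (x, y, w, h) box for a harbour tile."""
    return [(0, 0, 10, 10)] if tile == "harbour" else []


tiles = ["land", "sea", "harbour", "land"]
detections, passed = detect_ships(tiles, toy_scene_classifier, toy_detector)
```

In this toy run only two of the four tiles reach the detector, which is where the speed-up of such a pipeline comes from.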

19.

A real-time detector for parked vehicles based on hybrid background modeling

《Journal of Visual Communication and Image Representation》2016

In this paper, a real-time detection system based on hybrid background modeling is proposed for detecting parked vehicles along the side of a road. The hybrid background model consists of three components: (1) a scene background model, (2) a computed restricted area map, and (3) a dynamic threshold curve for vehicles. By exploiting the motion information of normal activity in the scene, we propose a hybrid background model that determines the location of the road, estimates the roadside and generates the adaptive threshold of the vehicle size. The system triggers a notification when a large vehicle-like foreground object has been stationary for more than a pre-set number of video frames (or time). The proposed method is tested on the AVSS 2007 PV dataset. The results are satisfactory compared to other state-of-the-art methods.
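
The trigger rule in this abstract (notify once an object has been stationary for more than a pre-set number of frames) can be sketched as a small counter. The class name, the threshold of 3 frames, and the observation sequence are assumptions for illustration; how the system decides that an object is stationary is outside this sketch.

```python
class StationaryTimer:
    """Counts consecutive stationary frames for one tracked object."""

    def __init__(self, max_frames):
        self.max_frames = max_frames
        self.count = 0

    def update(self, is_stationary):
        """Feed one frame; return True once the stationary run exceeds the limit."""
        self.count = self.count + 1 if is_stationary else 0
        return self.count > self.max_frames


# Hypothetical per-frame observations for a single vehicle-like object.
timer = StationaryTimer(max_frames=3)
observations = [True, True, False, True, True, True, True]
alerts = [timer.update(seen_still) for seen_still in observations]
```

Any frame in which the object moves resets the counter, so brief stops do not raise a notification.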

20.

Pedestrian detection based on candidate region localization and combined HOG-CLBP features

尧佼于凤芹《激光与光电子学进展》2021,58(2):157-164

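Pedestrian detection algorithms based on histogram of oriented gradients (HOG) features and the local binary pattern (LBP) operator use a sliding-window search strategy, which scans an excessively large area and is computationally complex, resulting in slow detection. In view of this, a pedestrian detection algorithm is proposed. First, a selective search algorithm is used to localize target regions, and the height-to-width ratio of candidate regions is restricted to a certain range to filter out invalid windows. Then, to compensate for the shortcomings of the LBP operator in texture representation, the completed local binary pattern (CLBP) operator is introduced to improve the expressiveness of texture features. Next, because the excessively high dimensionality of HOG and CLBP features affects the recognition ability of the classifier, principal component analysis is applied to reduce the dimensionality of the HOG features and the CLBP operator separately, and the reduced features are then concatenated. Finally, hard example mining is introduced when training the support vector machine classifier, which trains the model more thoroughly and thereby lowers the false detection rate. Simulation results on the INRIA dataset show that the proposed algorithm improves both recognition rate and recognition speed to some extent.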
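
The candidate-filtering step in this abstract, restricting the height-to-width ratio of candidate regions to a range, can be sketched as below. The abstract does not give the range, so the bounds of 1.5 and 4.0 are placeholders, and the `(x, y, width, height)` box format and all sample boxes are assumptions for illustration.

```python
def filter_by_aspect_ratio(boxes, min_ratio=1.5, max_ratio=4.0):
    """Keep (x, y, width, height) boxes whose height/width ratio lies in the range."""
    kept = []
    for x, y, width, height in boxes:
        if width <= 0 or height <= 0:
            continue  # degenerate box, cannot contain a pedestrian
        ratio = height / width
        if min_ratio <= ratio <= max_ratio:
            kept.append((x, y, width, height))
    return kept


# Hypothetical candidate regions, e.g. as a region proposal step might return them.
candidates = [
    (0, 0, 40, 100),    # upright, ratio 2.5
    (5, 5, 100, 40),    # wide, ratio 0.4
    (10, 10, 20, 200),  # very thin, ratio 10
    (0, 0, 0, 50),      # zero width
]
kept = filter_by_aspect_ratio(candidates)
```

Only the first box survives here; discarding the rest before feature extraction is what reduces the number of windows the classifier has to score.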