1.

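杭昊, 黄影平, 张栩瑞, 罗鑫. Opto-Electronic Engineering (《光电工程》), 2024, (1): 108-121

Semantic segmentation of road scenes is an important task in environment perception for autonomous driving. In recent years, Transformer networks have been applied in computer vision with very good results. To address the low segmentation accuracy and weak recognition of small objects in complex scene images, this paper proposes a road-scene semantic segmentation algorithm based on a shifted-window Transformer with multi-scale feature fusion. The network adopts an encoder-decoder structure: the encoder uses an improved shifted-window Transformer feature extractor to extract features from road-scene images, and the decoder consists of an attention fusion module and a feature pyramid network that together fuse multi-scale semantic features. Validation on the Cityscapes urban road-scene dataset shows that, compared with several existing semantic segmentation algorithms, the proposed method achieves a considerable improvement in segmentation accuracy.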
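
The abstract does not describe the internals of its improved feature extractor, so the following is only a minimal sketch of the general shifted-window idea: a feature map is split into non-overlapping windows, and in alternating layers the map is first rolled cyclically so that the window borders fall in different places. The function names, the toy 4 x 4 map, and the window size are assumptions made for illustration, not the authors' implementation.

```python
def cyclic_shift(grid, shift):
    """Roll a 2-D grid up and left by `shift` cells, wrapping around the edges."""
    h, w = len(grid), len(grid[0])
    return [[grid[(r + shift) % h][(c + shift) % w] for c in range(w)]
            for r in range(h)]


def window_partition(grid, m):
    """Split an H x W grid into non-overlapping m x m windows (row-major order)."""
    h, w = len(grid), len(grid[0])
    if h % m or w % m:
        raise ValueError("grid height and width must be multiples of the window size")
    windows = []
    for top in range(0, h, m):
        for left in range(0, w, m):
            windows.append([row[left:left + m] for row in grid[top:top + m]])
    return windows


# Toy 4 x 4 feature map whose entries are just the position indices 0..15.
feature_map = [[r * 4 + c for c in range(4)] for r in range(4)]

# Regular layer: windows aligned with the top-left corner.
regular = window_partition(feature_map, 2)

# Shifted layer: roll by half a window first, so each new window mixes
# positions that belonged to different windows in the regular layer.
shifted = window_partition(cyclic_shift(feature_map, 1), 2)
```

Attention would then be computed inside each window separately; alternating the two partitions is what lets information cross window borders.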

2.

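Research on a Scene Recognition Method Based on Semantic Information

杨清溪, 张丽红. Journal of Test and Measurement Technology (《测试技术学报》), 2021, 35(6): 521-528

In scene recognition, scene images exhibit large intra-class variation and high inter-class similarity, and different scene categories show similar appearance and object distributions, which easily causes recognition to fail. To solve this problem, this paper proposes a scene recognition model that combines semantic segmentation with an efficient network. The model consists of a semantic branch and an RGB branch. The semantic branch further extracts contextual information from the image on the basis of semantic segmentation, while the RGB branch uses the efficient network to extract global image features. The output features of the two branches are fused through an attention mechanism and then fed to a linear classifier to predict the scene class. The proposed model was trained and tested on three datasets: ADE20K, MIT Indoor 67, and SUN397. Experimental results show that it significantly reduces the number of network parameters while improving scene recognition accuracy.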

3.

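Semantic Segmentation of Communication Equipment Room Cabinet Images for Intelligent Maintenance

China Measurement & Test (《中国测试》), 2019, (11): 126-130

Intelligent maintenance of cabinets in communication equipment rooms is one of the core tasks in achieving unmanned, intelligent supervision of equipment. Combining semantic segmentation with equipment image recognition, position detection, and determination of maintenance operation points yields a broadly applicable artificial intelligence method. Starting from deep-learning semantic segmentation methods, this paper proposes a Mask R-CNN-based scheme for semantic segmentation of equipment room cabinet images, which recognizes and segments cabinet images under different fields of view and in the presence of object occlusion. Simulating the performance of different semantic segmentation algorithms in the cabinet detection scenario shows that the Mask R-CNN-based approach is accurate, with a Top-1 error rate of 7.1% and a pixel-level segmentation accuracy (mIOU) of 82.3%.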

4.

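Segmentation of Multi-Station Laser Point Clouds of Large Outdoor Scenes with Integrated Registration

徐鹏, 徐方勇, 陈辉. Acta Metrologica Sinica (《计量学报》), 2022, 43(3): 325-330

To address the wide extent of outdoor scenes, the difficulty of segmenting them, and unsatisfactory recognition results, a segmentation method for large outdoor scenes that incorporates multi-station point cloud registration is proposed. First, given the wide field of view and huge point cloud data volume of outdoor scenes, point sets of building scenes with large overlapping regions across multiple viewpoints are selected and registered automatically by combining the SAC-IA and ICP methods, which builds a complete structure of the large outdoor scene with relatively uniform point density. Then, the public dataset Semantic...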

5.

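A Real-Time Lane Detection Method Based on Semantic Segmentation

张冲, 黄影平, 郭志阳, 杨静怡. Opto-Electronic Engineering (《光电工程》), 2022, 49(5): 210378-1-210378-12

Lane recognition is an important task in environment perception for autonomous driving. In recent years, deep learning methods based on convolutional neural networks have achieved very good results in object detection and scene segmentation. Drawing on the idea of semantic segmentation, this paper designs a lightweight lane segmentation network with an encoder-decoder structure. To address the heavy computation of convolutional neural networks, depthwise separable convolution is introduced in place of standard convolution to reduce the number of convolution operations. In addition, more efficient convolution structures, LaneConv and LaneDeconv, are proposed to further improve computational efficiency. To obtain a better representation of lane features, a dual attention module (CBAM) that chains spatial attention and channel attention in series is introduced in the encoding stage to improve lane segmentation accuracy. Extensive experiments on the Tusimple lane dataset show that the method significantly increases lane segmentation speed and delivers good segmentation results and robustness under various conditions. Compared with existing lane segmentation models, it achieves similar or better segmentation accuracy with a clear improvement in speed.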
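
To make the saving from depthwise separable convolution concrete, the sketch below counts weights and multiply-accumulate operations for a standard convolution and for its depthwise separable replacement. The layer sizes are hypothetical, biases are ignored, and stride 1 with matching output size is assumed; nothing here reproduces the paper's LaneConv or LaneDeconv structures.

```python
def standard_conv_cost(k, c_in, c_out, h_out, w_out):
    """Weights and multiply-accumulates of a k x k standard convolution."""
    params = k * k * c_in * c_out
    macs = params * h_out * w_out
    return params, macs


def depthwise_separable_cost(k, c_in, c_out, h_out, w_out):
    """Same quantities for a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 conv that mixes the channels
    params = depthwise + pointwise
    macs = params * h_out * w_out
    return params, macs


# Hypothetical layer: 3 x 3 kernel, 64 -> 128 channels, 56 x 56 output.
std_params, std_macs = standard_conv_cost(3, 64, 128, 56, 56)
sep_params, sep_macs = depthwise_separable_cost(3, 64, 128, 56, 56)

# The ratio equals 1 / c_out + 1 / k**2, independent of the spatial size.
ratio = sep_params / std_params
print(std_params, sep_params, round(ratio, 4))
```

For a 3 x 3 kernel the ratio approaches 1/9 as the number of output channels grows, which is why the substitution is common in lightweight networks.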

6.

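A Moving Shadow Detection Method Based on Energy Minimization (total citations: 1; self-citations: 1; citations by others: 0)

杨源, 查宇飞, 毕笃彦. Opto-Electronic Engineering (《光电工程》), 2008, 35(7): 68-72

Traditional methods often detect moving shadows as foreground. To address this, this paper formulates detection as an energy function and detects moving shadows by minimizing it. The method first uses conventional background subtraction to obtain the static background and the moving objects, which contain both true foreground and moving shadows. Within the moving objects, it then constructs an energy function from the color invariance and texture invariance of shadows and from the spatio-temporal consistency of shadows and foreground. Finally, minimizing the energy function accurately segments the true foreground from the moving objects, thereby eliminating moving shadows. The method was tested on videos containing moving shadows and compared with other methods. Experimental results show that it separates foreground and shadows well in both indoor and outdoor scenes.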

7.

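Research on a Localization Method for Steel Billet Detection and Recognition in Complex Production-Line Scenes (total citations: 1; self-citations: 0; citations by others: 1)

俞喆俊, 洪汉玉, 章秀华, 张天序. Opto-Electronic Engineering (《光电工程》), 2012, 39(1): 54-61

In detecting and recognizing steel billets on a production line, accurately locating the character-string target on the billet end face in scenes with complex illumination is a key technical problem. To solve it, this paper proposes using Mean Shift to suppress the complex scene image, applying multi-level segmentation filtering and clustering to highlight and find the region of interest containing the character string, and adaptively correcting the tilt angle by the least-squares method, thereby locating the character string precisely. Experimental results show that, compared with the traditional projection-based localization method, the proposed method precisely locates billet character strings in all kinds of complex scenes with good stability and accuracy. It solves the localization problem for billet character strings in complex scenes and provides a key technique for billet character recognition.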
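
The abstract does not say which points the least-squares step is fitted to. As a rough illustration only, the sketch below assumes the input is a list of character-centre coordinates, fits a straight line through them by ordinary least squares, and reports the line's inclination as the tilt angle. The function name and sample coordinates are hypothetical.

```python
import math


def tilt_angle_degrees(points):
    """Inclination (degrees) of the least-squares line y = a*x + b through the points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("points are vertically aligned; the slope is undefined")
    slope = sxy / sxx             # closed-form least-squares slope
    return math.degrees(math.atan(slope))


# Hypothetical character centres (x, y) in pixels, drifting 2 px per 20 px.
centers = [(10, 20.0), (30, 22.0), (50, 24.0), (70, 26.0)]
angle = tilt_angle_degrees(centers)
```

Rotating the region of interest by the negative of this angle would level the string. In image coordinates the y axis points down, so the sign convention has to match the rotation routine used.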

8.

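A Semantic Segmentation Method for Remote Sensing Images Based on a Multi-Residual Network

杨甜甜, 郭大波, 孙佳. Journal of Test and Measurement Technology (《测试技术学报》), 2021, 35(3): 245-252

High-resolution remote sensing images contain much complex ground-object information, and their semantic segmentation suffers from low accuracy and blurred boundaries. This paper proposes a new multi-scale semantic segmentation network model to improve the segmentation accuracy of remote sensing images. The model has an encoder-decoder structure: the encoder uses a residual network to extract image features; the decoder uses deconvolution for upsampling; residual connections take the extracted high-level...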

9.

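A Recommender System Based on Clustering of Complex Structured Data

李琳娜, 张志平. Chinese High Technology Letters (《高技术通讯》), 2011, 21(11): 1115-1120

Current recommender systems cannot handle recommendation in domains with complex structure and rich semantics, have a narrow and simplistic understanding of the essential characteristics of the item space and user space, and suffer from sparsity and scalability problems. To address these issues, a recommendation method based on clustering of complex structured data was studied, and HRSCCSD, a novel, effective, and highly scalable hybrid recommender system based on such clustering, was proposed. The system can simultaneously fuse multiple aspects such as user semantics, item semantics, and item collaboration...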

10.

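A Spatio-Temporal Joint Algorithm for Infrared Moving Target Extraction

杨威, 李俊山, 史德琴. Opto-Electronic Engineering (《光电工程》), 2008, 35(5): 50-55

In view of the poor contrast and blurred edges of infrared images, a new method based on joint spatio-temporal processing is proposed for extracting targets from infrared image sequences. The algorithm makes full use of the brightness features of infrared targets, background information, and motion information. In temporal segmentation, a Gaussian distribution model of the frame-difference image background is built, and a change detection mask is used to determine the constraint region of the infrared target. Then, a membership matrix of the spatial relationship between image pixels and regions is constructed and imposed as a constraint on the conventional fuzzy clustering algorithm; spatial segmentation uses this fuzzy clustering to segment the target constraint region effectively. Finally, fusing the temporal and spatial segmentation results yields the final infrared target extraction. Experimental results show that the method is simple and effective and accurately extracts infrared targets in dynamic scenes.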

11.

Race Classification Using Deep Learning

Khalil Khan, Rehan Ullah Khan, Jehad Ali, Irfan Uddin, Sahib Khan, Byeong-hee Roh. Computers, Materials & Continua, 2021, 68(3): 3483-3498

Race classification is a long-standing challenge in the field of face image analysis. The investigation of salient facial features is an important task to avoid processing all face parts. Face segmentation strongly benefits several face analysis tasks, including ethnicity and race classification. We propose a race-classification algorithm using a prior face segmentation framework. A deep convolutional neural network (DCNN) was used to construct a face segmentation model. For training the DCNN, we label face images according to seven different classes, that is, nose, skin, hair, eyes, brows, back, and mouth. The DCNN model developed in the first phase was used to create segmentation results. The probabilistic classification method is used, and probability maps (PMs) are created for each semantic class. We investigated five salient facial features from among seven that help in race classification. Features are extracted from the PMs of five classes, and a new model is trained based on the DCNN. We assessed the performance of the proposed race classification method on four standard face datasets, reporting superior results compared with previous studies.

12.

Indexing audiovisual databases through joint audio and video processing

Caterina Saraceno, Riccardo Leonardi. International Journal of Imaging Systems and Technology, 1998, 9(5): 320-331

13.

Semantic Segmentation by Using Down-Sampling and Subpixel Convolution: DSSC-UNet

Young-Man Kwon, Sunghoon Bae, Dong-Keun Chung, Myung-Jae Lim. Computers, Materials & Continua, 2023, 75(1): 683-696

Recently, semantic segmentation has been widely applied to image processing, scene understanding, and many other areas. In deep learning-based semantic segmentation in particular, U-Net, with its convolutional encoder-decoder architecture, is a representative model that was proposed for image segmentation in the biomedical field. It uses max pooling to reduce the image size and to make the model robust to noise. However, rather than reducing the complexity of the model, max pooling has the disadvantage of discarding some image information as it reduces the image. This paper therefore uses two diagonal elements of a down-sampling operation in its place. We think that the down-sampled feature maps intrinsically hold more information than max-pooled feature maps because they keep to the Nyquist theorem and latent information can be extracted from them. In addition, the paper uses the two other diagonal elements for the skip connection. In decoding, sub-pixel convolution is used rather than transposed convolution to decode the encoded feature maps efficiently. Combining these ideas, the paper proposes a new encoder-decoder model called Down-Sampling and Subpixel Convolution U-Net (DSSC-UNet). To demonstrate the performance of the proposed model, the paper measured U-Net and DSSC-UNet on Cityscapes. DSSC-UNet achieved 89.6% Mean Intersection over Union (Mean-IoU) and U-Net achieved 85.6% Mean-IoU, confirming that DSSC-UNet performs better.
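
The upsampling half of sub-pixel convolution is a pure rearrangement: a convolution first produces r*r times as many channels, and those channels are then interleaved into an image r times larger in each direction. The sketch below shows only that rearrangement on nested lists. The function name and the channel ordering are assumptions of this illustration, and the convolution itself is omitted; it is not taken from the DSSC-UNet paper.

```python
def pixel_shuffle(feature, r):
    """Rearrange a (C*r*r, H, W) nested list into a (C, H*r, W*r) nested list."""
    channels = len(feature)
    if channels % (r * r):
        raise ValueError("channel count must be divisible by r * r")
    h, w = len(feature[0]), len(feature[0][0])
    out_channels = channels // (r * r)
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(out_channels)]
    for c in range(out_channels):
        for y in range(h * r):
            for x in range(w * r):
                # Offset inside the r x r block selects the source channel;
                # integer division selects the low-resolution source pixel.
                src = c * r * r + (y % r) * r + (x % r)
                out[c][y][x] = feature[src][y // r][x // r]
    return out


# Four 1 x 1 channels become one 2 x 2 channel when r = 2.
low_res = [[[1]], [[2]], [[3]], [[4]]]
high_res = pixel_shuffle(low_res, 2)
```

Because every output pixel is copied from exactly one input value, no zero-padded positions are multiplied, which is the usual efficiency argument against transposed convolution.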

14.

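An Optimized Image Semantic Segmentation Method Based on DeepLabv3+

郑斌军, 孔玲君. Packaging Engineering (《包装工程》), 2022, 43(1): 187-194

Objective: To achieve good image semantic segmentation accuracy while reducing the number of network parameters as far as possible and speeding up network training, an optimized image semantic segmentation method based on DeepLabv3+ is proposed. Methods: An attention mechanism module is added to the encoder backbone network, a denser feature pooling module is adopted to aggregate multi-scale features effectively, and depthwise separable convolution is used to reduce the computational complexity of the network. Results: Comparative experiments on the CamVid dataset show that the optimized network reaches an MIoU score of 71.03%, improves slightly on the original network in other evaluation metrics such as pixel accuracy and mean pixel accuracy, and has 12% fewer parameters. On the Cityscapes test set, the MIoU score is 75.1%. Conclusion: The experimental results show that the optimized network extracts image feature information effectively and improves semantic segmentation accuracy while reducing model complexity. The network was tested on urban road-scene datasets and can serve as a reference for future applications of driverless technology, which gives it practical significance.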
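
The three metrics named in the results (pixel accuracy, mean pixel accuracy, and MIoU) can all be read off a class confusion matrix. The sketch below uses the usual definitions with rows as ground-truth classes and columns as predictions; the two-class matrix is made-up illustration data, not a result from the paper.

```python
def segmentation_metrics(confusion):
    """Return (pixel accuracy, mean pixel accuracy, mean IoU).

    confusion[i][j] counts pixels of true class i that were predicted as class j.
    Classes that never occur are skipped when averaging.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    diag = [confusion[i][i] for i in range(n)]
    row_sums = [sum(confusion[i]) for i in range(n)]                        # true pixels per class
    col_sums = [sum(confusion[i][j] for i in range(n)) for j in range(n)]   # predicted pixels per class

    pixel_acc = sum(diag) / total
    class_acc = [diag[i] / row_sums[i] for i in range(n) if row_sums[i] > 0]
    ious = []
    for i in range(n):
        union = row_sums[i] + col_sums[i] - diag[i]   # |truth| + |prediction| - |intersection|
        if union > 0:
            ious.append(diag[i] / union)
    return pixel_acc, sum(class_acc) / len(class_acc), sum(ious) / len(ious)


# Made-up two-class example: 100 pixels, 85 of them classified correctly.
confusion = [[50, 10],
             [5, 35]]
pa, mpa, miou = segmentation_metrics(confusion)
```

MIoU penalizes both false positives and false negatives for each class, which is why it typically comes out lower than either accuracy figure on the same predictions.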

15.

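A Road Segmentation Method Based on Cross Fusion of Point Clouds and Images

张莹, 黄影平, 郭志阳, 张冲. Opto-Electronic Engineering (《光电工程》), 2021, 48(12): 210340-1-210340-12

Road detection is a prerequisite for autonomous driving. In recent years, deep-learning-based multi-source data fusion has become a hot topic in autonomous driving research. This paper uses a convolutional neural network to fuse LiDAR point clouds with image data in order to segment the road in traffic scenes. Several pixel-level, feature-level, and decision-level fusion schemes are proposed; in particular, four cross-fusion schemes are designed for feature-level fusion. The schemes are compared and the best one is identified. In terms of network architecture, a semantic segmentation convolutional neural network with an encoder-decoder structure serves as the base network, and point cloud normal features are cross-fused with RGB image features at different levels. The fused data is restored by the decoder, and an activation function then produces the detection result. Experiments on the KITTI dataset evaluate the performance of the fusion schemes and show that the proposed fusion scheme E has the best segmentation performance. Comparison with other road detection methods shows that the proposed method achieves good overall performance.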

16.

Adaptive real-time motion segmentation technique based on statistical background model

A. K. S. Kushwaha, C. M. Sharma, M. Khare, O. Prakash. The Imaging Science Journal, 2014, 62(5): 285-302

Motion segmentation is a crucial step in video analysis and has many applications. This paper proposes a motion segmentation method based on the construction of a statistical background model. The variance and covariance of pixels are computed to construct the model of the scene background. Average frame differencing is performed against this model to extract the objects of interest from the video frames. Morphological operations are used to smooth the object segmentation results. The proposed technique adapts to a dynamically changing background caused by changes in lighting conditions and in the scene background, and it can relearn the background to adapt to these variations. The immediate advantage of the proposed method is its high processing speed of 30 frames per second on large (high-resolution) videos. The proposed method was compared with five other popular object segmentation methods to demonstrate its effectiveness. Experimental results demonstrate the novelty of the proposed method in terms of various performance parameters. The method can segment the video stream in real time when the background changes, when lighting conditions vary, and even in the presence of clutter and occlusion.
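
As a rough illustration of a statistical background model, the sketch below keeps a per-pixel mean and variance over a few background frames and flags a pixel as foreground when it deviates from the mean by more than k standard deviations. This is a deliberately simplified stand-in: the covariance terms, average frame differencing, morphological smoothing, and background relearning described in the abstract are all omitted, and the threshold values and frame data are hypothetical.

```python
def build_background_model(frames):
    """Per-pixel mean and variance over a list of equally sized grayscale frames."""
    n = len(frames)
    h, w = len(frames[0]), len(frames[0][0])
    mean = [[sum(f[r][c] for f in frames) / n for c in range(w)] for r in range(h)]
    var = [[sum((f[r][c] - mean[r][c]) ** 2 for f in frames) / n for c in range(w)]
           for r in range(h)]
    return mean, var


def foreground_mask(frame, mean, var, k=2.5, min_std=2.0):
    """True where a pixel differs from the background mean by more than k std devs."""
    mask = []
    for r, row in enumerate(frame):
        mask_row = []
        for c, value in enumerate(row):
            # Floor the deviation so perfectly static pixels do not fire on sensor noise.
            std = max(var[r][c] ** 0.5, min_std)
            mask_row.append(abs(value - mean[r][c]) > k * std)
        mask.append(mask_row)
    return mask


# Three hypothetical 2 x 2 background frames with slight noise.
background = [
    [[100, 102], [98, 101]],
    [[101, 100], [99, 100]],
    [[99, 101], [100, 102]],
]
mean, var = build_background_model(background)

# New frame in which only the top-right pixel changes sharply.
mask = foreground_mask([[100, 180], [99, 101]], mean, var)
```

Adapting to lighting changes would mean updating `mean` and `var` over time rather than computing them once.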

17.

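A Parallel Multi-Resolution Semantic Segmentation Algorithm with Feedback Attention

孙红, 袁巫凯, 赵迎志. Packaging Engineering (《包装工程》), 2023, 44(1): 141-150

Objective: To further improve semantic segmentation accuracy and to address problems in current semantic segmentation algorithms, namely low feature-map resolution, arbitrary discarding of low-level feature information, and failure to take important contextual information into account, a parallel multi-resolution semantic segmentation algorithm incorporating a feedback attention module is proposed. Methods: The algorithm uses a parallel network structure that fuses high- and low-resolution information, retaining as much high-dimensional information as possible, reducing the loss of low-level information elements, and raising the resolution of the segmented image. A perceptual attention module with a feedback mechanism is also embedded in the backbone network; it obtains weight information for each sample from the channel, spatial, and global perspectives and emphasizes the importance of features across samples. During training, an improved loss function is used to reduce the difficulty of training and optimization. Results: Experiments show that the proposed model achieves MIOU scores of 77.78% and 58.67% on PASCAL VOC2012 and Camvid, respectively, and 42.52% on ADE20K, reflecting good segmentation performance. Conclusion: The proposed model improves to some extent on previous segmentation networks, and some of its modules still perform well when embedded in other backbone networks, which shows that the model has a degree of effectiveness and generalization ability.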

18.

Real-Time Recognition and Location of Indoor Objects

Jinxing Niu, Qingsheng Hu, Yi Niu, Tao Zhang, Sunil Kumar Jha. Computers, Materials & Continua, 2021, 68(2): 2221-2229

Object recognition and location has long been a research hotspot in machine vision. It is of great value and significance to the development and application of service robots, industrial automation, unmanned driving, and other fields. To realize real-time recognition and location of objects in indoor scenes, this article proposes an improved YOLOv3 neural network model that combines densely connected networks and residual networks to construct a new YOLOv3 backbone network, which is applied to the detection and recognition of objects in indoor scenes. A RealSense D415 RGB-D camera is used to obtain the RGB map and the depth map, and the actual distance value is calculated after each pixel in the scene image is mapped to the real scene. Experimental results show that the detection and recognition accuracy and the real-time performance of the new network are clearly improved compared with the original YOLOv3 neural network model in the same scene. After the improvement, the network detects more objects that the unimproved YOLOv3 network could not detect. The running time of object detection and recognition is reduced to less than half of the original. The improved network has reference value for practical engineering applications.
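
Mapping an image pixel to the real scene is commonly done with the pinhole camera model: given the depth at a pixel and the camera intrinsics, the pixel is back-projected to a 3-D point, from which a distance follows. The sketch below assumes an undistorted pinhole camera; the intrinsic values are hypothetical placeholders rather than the D415's calibration, and the article may use a different procedure.

```python
import math


def deproject(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth in metres to a 3-D point in camera coordinates."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return x, y, depth_m


def distance_to_camera(point):
    """Euclidean distance from the camera origin to the 3-D point."""
    return math.sqrt(sum(coord * coord for coord in point))


# Hypothetical intrinsics: focal lengths fx, fy and principal point (cx, cy), in pixels.
point = deproject(u=420, v=240, depth_m=2.0, fx=600.0, fy=600.0, cx=320.0, cy=240.0)
distance = distance_to_camera(point)
```

For a detected object, the same computation would typically be applied at the centre of its bounding box, using the depth map aligned to the RGB image.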

19.

Automatic Terrain Debris Recognition Network Based on 3D Remote Sensing Data

Xu Han, Huijun Yang, Qiufeng Shen, Jiangtao Yang, Huihui Liang, Cancan Bao, Shuang Cang. Computers, Materials & Continua, 2020, 65(1): 579-596
