
1.

吴士林 朱枫 《计算机工程与科学》 2012,34(3):91-95

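To recognize and segment multiple object classes in complex natural scenes, this paper models object features with a conditional probability model (CM) that fuses texture features, texture-context features, and location features, and uses the scene category to model the mutual constraints among object classes. On this basis, it studies the application of the scene-category-based conditional probability model (sCM) to multi-class object recognition and segmentation. The model is evaluated on the Oliva & Torralba database and compared with other methods from abroad. Experimental results show that the algorithm performs well in multi-class object recognition and segmentation: it raises the overall recognition rate while also improving recognition and segmentation accuracy along object boundaries, which gives better visual results.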

2.

A road-scene semantic segmentation model based on a multi-scale attention mechanism

范润泽 刘宇红 张荣芬 李景玉 《计算机工程》 2023,49(2):288-295

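Semantic segmentation of road scenes helps a vehicle perceive its surroundings so that it can avoid pedestrians, other vehicles, and various small obstacles, which improves driving safety. To address the low recognition accuracy for small objects and the excessive number of network parameters in road-scene semantic segmentation, a semantic segmentation model based on a multi-scale attention mechanism is proposed. Exploiting the multi-scale, multi-frequency analysis properties of the wavelet transform, a multi-scale wavelet attention module is designed and embedded in the encoder; by fusing feature information at different scales and frequencies, it preserves more edge and contour detail. Hierarchical connections between the encoder and decoder, together with an improved pyramid pooling module, extract features from multiple aspects, retaining contextual feature information while capturing more image detail. A multi-level loss function is designed to train the network and speed up convergence. Experiments on the Cambridge-driving Labeled Video Database (CamVid) show that the model reaches a mean intersection over union of 60.21% and has nearly 30% fewer parameters than DeepLabV3+ and DenseASPP; it improves segmentation accuracy without adding parameters and is robust across different scenes.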
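The headline metric in this abstract, mean intersection over union, can be illustrated with a minimal sketch. This is the standard per-class definition averaged over the classes that occur, not the authors' evaluation code; the function name `mean_iou` and the toy labels are assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection over union, averaged over classes that occur.

    pred and target are flat sequences of integer class labels of equal length.
    """
    inter = [0] * num_classes  # pixels where prediction and label agree on the class
    union = [0] * num_classes  # pixels where either prediction or label has the class
    for p, t in zip(pred, target):
        if p == t:
            inter[p] += 1
            union[p] += 1
        else:
            union[p] += 1
            union[t] += 1
    # Skip classes that appear in neither the prediction nor the label.
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious)


# Hypothetical 6-pixel example with 3 classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, 3)  # per-class IoU: 1/3, 2/3, 1/2
```

Real evaluations usually accumulate the intersection and union counts over the whole test set before dividing, rather than averaging per-image scores.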

3.

A multi-temporal masking classification method for vineyard monitoring in central Spain

S. Lanjeri J. Melia D. Segarra 《International journal of remote sensing》2013,34(16):3167-3186

Abstract

The objective of image segmentation in remote sensing is to define regions in an image that correspond to objects in the ground scene. Traditional scene models underlying image segmentation procedures have assumed that objects as manifest in images have internal variances that are both low and equal. This scene model is unrealistically simple. An alternative scene model recognizes different scales of objects in scenes. Each level in the hierarchy is nested, or composed of objects or categories of objects from the preceding level. Different objects may have distinct attributes, allowing for relaxation of assumptions like equal variance.

A multiple-pass, region-based segmentation algorithm improves the segmentation of images from scenes better modelled as a nested hierarchy. A multiple-pass approach allows slow and careful growth of regions while inter-region distances are below a global threshold. Past the global threshold, a minimum region size parameter forces development of regions in areas of high local variance. Maximum and viable region size parameters limit the development of undesirably large regions.

Application of the segmentation algorithm for forest stand delineation in Landsat TM imagery yields regions corresponding to identifiable features in the landscape. The use of a local variance, adaptive-window texture channel in conjunction with spectral bands improves the ability to define regions corresponding to sparsely-stocked forest stands which have high internal variance.

4.

Video object segmentation with a feature attention pyramid modulation network


汤润发 宋慧慧 张开华 姜斯浩 《中国图象图形学报》 2019,24(8):1349-1357

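Objective: Video object segmentation aims to segment the object of interest throughout a video sequence, given the annotated object mask of the first frame. Because the objects to be segmented vary widely in scale, existing video object segmentation algorithms lack an effective strategy for fusing feature information at different scales. This paper therefore proposes a feature attention pyramid modulation network module for video object segmentation. Method: A visual modulator network and a spatial modulator network first learn the visual and spatial information of the target object, which serves as a prior that guides the segmentation model to adapt to the appearance of the specific object. A feature attention pyramid module then mines global context to handle the multi-scale nature of the objects. Results: On the DAVIS 2016 dataset, without online fine-tuning, the method is competitive with state-of-the-art methods that use online fine-tuning, reaching a J-mean of 78.7%. With online fine-tuning, it achieves the best results on the DAVIS 2017 dataset, with a J-mean of 68.8%. Conclusion: While segmenting the object of interest, the feature attention pyramid modulation network effectively combines contextual information for object masks at different scales, reduces the loss of detail, and achieves high-quality video object segmentation.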

5.

Semantic segmentation of complex traffic scenes based on DenseNet

蒋斌 涂文轩 杨超 刘虹雨 赵子龙 《模式识别与人工智能》 2019,32(5):472-480

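To address the large parameter counts, low computational efficiency, and insufficient accuracy of semantic segmentation methods for traffic scenes, this paper proposes a multi-scale, end-to-end semantic segmentation model based on a fully convolutional DenseNet. First, a densely connected module containing hybrid dilated convolutions is built, and the modules are concatenated along the channel dimension to extract image features. Then, multi-scale visual information is collected and fed back into the original channels as a supervision signal. Finally, the prediction is obtained by bilinear interpolation. Tests on the CityScapes dataset show that the method parses complex traffic scenes well, with high prediction accuracy and segmentation efficiency.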
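The final step named in this abstract, bilinear interpolation of a coarse prediction map up to output resolution, can be sketched as follows. This is a generic pure-Python version using an align-corners mapping, which is an assumption: the abstract does not say which sampling convention the authors use. The function name `bilinear_upsample` is illustrative.

```python
def bilinear_upsample(img, out_h, out_w):
    """Resize a 2D list of floats with bilinear interpolation.

    Uses an align-corners mapping: the corner samples of the input and
    output grids coincide.
    """
    in_h, in_w = len(img), len(img[0])
    out = []
    for i in range(out_h):
        # Fractional source row for this output row.
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        dy = y - y0
        row = []
        for j in range(out_w):
            # Fractional source column for this output column.
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            dx = x - x0
            # Interpolate along the row direction first, then between rows.
            top = img[y0][x0] * (1 - dx) + img[y0][x1] * dx
            bottom = img[y1][x0] * (1 - dx) + img[y1][x1] * dx
            row.append(top * (1 - dy) + bottom * dy)
        out.append(row)
    return out


coarse = [[0.0, 2.0], [4.0, 6.0]]
fine = bilinear_upsample(coarse, 3, 3)
# fine == [[0.0, 1.0, 2.0], [2.0, 3.0, 4.0], [4.0, 5.0, 6.0]]
```

In a segmentation network the same operation is applied per class channel to the score map before taking the arg-max.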

6.

Spatiotemporal just noticeable difference modeling with heterogeneous temporal visual features

《Displays》2021

Developing accurate just-noticeable difference (JND) models is challenged by complicated HVS characteristics and the nonstationary features of video sequences. Great efforts have been devoted to JND modeling, and inspiring performance improvements are witnessed in the literature, especially for spatial JND models. However, there is both an urgent need and technical potential for improving temporal JND models so that they fully account for temporal perception characteristics. Temporal JND modeling poses two challenges: how to extract perceptual feature parameters of the source video, and how to quantitatively characterize the interaction between feature parameters and HVS characteristics. Firstly, this work extracts content-aware temporal feature parameters that have a predominant impact on visual perception, including motion (foreground/background), pixel-correspondence duration and inter-frame residue fluctuation intensity along the temporal trajectory, and investigates the HVS responses to these four heterogeneous feature parameters. Secondly, this work proposes respective probability density functions (PDF) in the perception sense to quantitatively depict the attention and suppression perception responses of feature parameters, accounting for the temporal perception characteristics. Using these PDF models, we fuse the heterogeneous feature parameters from the viewpoint of uniform dimension, i.e. self-information measured visual attention and information entropy measured masking uncertainty, achieving heterogeneous parameter homogenization. Thirdly, with the self-information and entropy results, this work proposes a temporal weight model that strikes a balance between visual attention and masking suppression to adjust the spatial JND threshold, and then develops the improved spatiotemporal JND model.
Intensive simulation results verify the effectiveness of the proposed spatiotemporal JND profile, with competitive model accuracy compared with state-of-the-art candidate models.
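The two quantities this abstract uses as a common scale, self-information and information entropy, have standard definitions that a short sketch can make concrete. The perceptual PDFs themselves are not given in the abstract, so the probabilities below are made-up inputs and the function names are illustrative.

```python
import math


def self_information(p):
    """Self-information in bits of an event with probability p (0 < p <= 1)."""
    return -math.log2(p)


def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return sum(p * self_information(p) for p in probs if p > 0)


# Hypothetical values: a rare feature value carries more self-information
# than a common one.
rare = self_information(0.25)   # 2.0 bits
common = self_information(0.5)  # 1.0 bit
uncertainty = entropy([0.5, 0.5])  # 1.0 bit
```

Expressing every feature in bits is what allows heterogeneous parameters such as motion and duration to be combined on one scale.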

7.

Efficient region-based motion segmentation for a video monitoring system

《Pattern recognition letters》2003,24(1-3):113-128

This paper presents an efficient region-based motion segmentation method for segmenting moving objects in a traffic scene, with a focus on a video monitoring system (VMS). The method consists of two phases. First, in the motion detection phase, the positions of moving objects in a scene are determined using an adaptive thresholding method: to detect regions that change because of moving objects, the threshold value is chosen automatically rather than set manually. Second, in the motion segmentation phase, pixels that have similar intensity and motion information are segmented by applying a weighted k-means clustering algorithm to the binary region of the motion mask obtained in motion detection. Because the whole image need not be processed, computation time is reduced. Experimental results demonstrate robustness not only to variation in luminance and changes in environmental conditions, but also to occlusions among multiple moving objects.
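The motion detection phase can be sketched as frame differencing followed by an automatically chosen threshold. The abstract does not say how the adaptive threshold is computed, so the rule used here (mean plus `k` standard deviations of the difference image) is a stand-in assumption, and `motion_mask` is a hypothetical name.

```python
from statistics import mean, pstdev


def motion_mask(prev, curr, k=2.0):
    """Binary motion mask from two grayscale frames given as 2D lists.

    The threshold is derived from the difference image itself
    (mean + k * standard deviation), so no manual value is needed.
    """
    diff = [[abs(c - p) for p, c in zip(row_p, row_c)]
            for row_p, row_c in zip(prev, curr)]
    flat = [d for row in diff for d in row]
    threshold = mean(flat) + k * pstdev(flat)
    mask = [[d > threshold for d in row] for row in diff]
    return mask, threshold


# Hypothetical 3x3 frames: only the centre pixel changes substantially.
prev = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
curr = [[10, 11, 10], [10, 200, 10], [11, 10, 10]]
mask, threshold = motion_mask(prev, curr)
```

Only the pixels flagged in `mask` would then be passed to the clustering step, which is what keeps the second phase from processing the whole image.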

8.

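Stereo video object extraction based on disparity and threshold segmentation (total citations: 1, self-citations: 0, citations by others: 1)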

安平 刘苏醒 高欣 张兆杨 《中国图象图形学报》 2006,11(11):1669-1672

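Video object segmentation and extraction is a key problem in content-based video processing such as coding, communication, and video retrieval. To extract objects from stereo video sequences that have only a single global motion and contain multiple overlapping objects, an object extraction method based on disparity analysis and threshold segmentation is proposed. The method first estimates stereo disparity with an improved region-matching method, speeding up matching by reducing the computation of the matching window and by setting the search path according to the characteristics of disparity. It then performs a second-stage segmentation, applying an iterative threshold method or an adaptive threshold method depending on the object in the image. Finally, the individual objects are extracted from the thresholding results. The video objects extracted at each depth layer are of good quality, showing that the method is effective for object extraction from stereo video sequences with global motion.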
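An iterative threshold method of the kind named in this abstract can be sketched with the classic mean-of-class-means scheme. Whether the authors use exactly this variant is not stated, so treat it as an assumption; the function name, the `eps` stopping tolerance, and the sample values are illustrative.

```python
def iterative_threshold(pixels, eps=0.5):
    """Pick a threshold by repeatedly averaging the two class means.

    Starts from the global mean and stops when the threshold moves by
    less than eps, or when one side becomes empty.
    """
    t = sum(pixels) / len(pixels)
    while True:
        low = [p for p in pixels if p <= t]
        high = [p for p in pixels if p > t]
        if not low or not high:
            return t
        new_t = (sum(low) / len(low) + sum(high) / len(high)) / 2
        if abs(new_t - t) < eps:
            return new_t
        t = new_t


# Hypothetical disparity or intensity values forming two clear groups.
values = [10, 12, 11, 13, 200, 210, 205, 198]
t = iterative_threshold(values)  # lands between the two groups
```

For values with a clearly bimodal histogram the loop typically settles in a few iterations.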

9.

Toward coherent object detection and scene layout understanding

Sid Yingze Bao Min Sun Silvio Savarese 《Image and vision computing》2011,29(9):569-579

Detecting objects in complex scenes while recovering the scene layout is a critical functionality in many vision-based applications. In this work, we advocate the importance of geometric contextual reasoning for object recognition. We start from the intuition that objects' location and pose in the 3D space are not arbitrarily distributed but rather constrained by the fact that objects must lie on one or multiple supporting surfaces. We model such supporting surfaces by means of hidden parameters (i.e. not explicitly observed) and formulate the problem of joint scene reconstruction and object recognition as that of finding the set of parameters that maximizes the joint probability of having a number of detected objects on K supporting planes given the observations. As a key ingredient for solving this optimization problem, we demonstrate a novel relationship between object location and pose in the image, and the scene layout parameters (i.e. normal of one or more supporting planes in 3D and camera pose, location and focal length). Using a novel probabilistic formulation and the above relationship, our method has the unique ability to jointly: i) reduce the false alarm and false negative rates of object detection; ii) recover object location and supporting planes within the 3D camera reference system; iii) infer camera parameters (viewpoint and focal length) from a single uncalibrated image. Quantitative and qualitative experimental evaluation on two datasets (desk-top dataset [1] and LabelMe [2]) supports our theoretical claims.

10.

Synchronization of nonlinear electronic oscillators for neural computation.

J Cosp J Madrenas E Alarcon E Vidal G Villar 《Neural Networks, IEEE Transactions on》2004,15(5):1315-1327

This paper deals with coupled oscillators as the building blocks of a bioinspired computing paradigm and their implementation. To meet the low-power and fast-processing requirements of autonomous applications, we study the microelectronic analog implementation of physical oscillators instead of a software, computer-simulated implementation. With this aim, the original oscillator has been adapted to a suitable microelectronic form. To study the hardware nonlinear oscillators, we propose two macro models and demonstrate that they preserve the synchronization properties. Secondary effects such as mismatch and output delay, and their relation to network synchronization, are analyzed and discussed. We show the correct operation of the proposed electronic oscillators with simulations and experimental results from a manufactured integrated test circuit. The proposed architecture is intended to perform the scene segmentation stage of an autonomous focal-plane self-contained visual processing system for artificial vision applications.

11.

Part‐Based Mesh Segmentation: A Survey


Rui S. V. Rodrigues José F. M. Morgado Abel J. P. Gomes 《Computer Graphics Forum》2018,37(6):235-274

This paper surveys mesh segmentation techniques and algorithms, with a focus on part‐based segmentation, that is, segmentation that divides a mesh (featuring a 3D object) into meaningful parts. Part‐based segmentation applies to a single object and also to a family of objects (i.e. co‐segmentation). However, we shall not address here chart‐based segmentation, though some mesh co‐segmentation methods employ such chart‐based segmentation in the initial step of their pipeline. Finally, the taxonomy proposed in this paper is new in the sense that one classifies each segmentation algorithm regarding the dimension (i.e. 1D, 2D and 3D) of the representation of object parts. The leading idea behind this survey is to identify the properties and limitations of the state‐of‐the‐art algorithms to shed light on the challenges for future work.

12.

A transient-chaotic autoassociative network (TCAN) based on Lee oscillators.

R T Lee 《Neural Networks, IEEE Transactions on》2004,15(5):1228-1243

In the past few decades, neural networks have been extensively adopted in various applications ranging from simple synaptic memory coding to sophisticated pattern recognition problems such as scene analysis. Moreover, current studies on neuroscience and physiology have reported that in a typical scene segmentation problem our major senses of perception (e.g., vision, olfaction, etc.) are highly involved in temporal (or what we call "transient") nonlinear neural dynamics and oscillations. This paper is an extension of the author's previous work on the dynamic neural model (EGDLM) of memory processing and on composite neural oscillators for scene segmentation. Moreover, it is inspired by the work of Aihara et al. and Wang on chaotic neural oscillators in pattern association. In this paper, the author proposes a new transient chaotic neural oscillator, namely the "Lee oscillator," to provide temporal neural coding and an information processing scheme. To illustrate its capability for memory association, a chaotic autoassociative network, namely the Transient-Chaotic Auto-associative Network (TCAN) was constructed based on the Lee oscillator. Different from classical autoassociators such as the celebrated Hopfield network, which provides a "time-independent" pattern association, the TCAN provides a remarkable progressive memory association scheme [what we call "progressive memory recalling" (PMR)] during the transient chaotic memory association. This is exactly consistent with the latest research in psychiatry and perception psychology on dynamic memory recalling schemes.

13.

Surveillance and human–computer interaction applications of self-growing models

José García-Rodríguez Juan Manuel García-Chamizo 《Applied Soft Computing》2011,11(7):4413-4431

The aim of the work is to build self-growing based architectures to support visual surveillance and human–computer interaction systems. The objectives include identifying and tracking persons or objects in the scene, and interpreting user gestures for interaction with services, devices and systems implemented in the digital home. The system must address multiple vision tasks at various levels, such as segmentation, representation or characterization, and analysis and monitoring of movement, to allow the construction of a robust representation of the environment and the interpretation of the elements of the scene. It is also necessary to integrate the vision module into a global system that operates in a complex environment, receiving images from acquisition devices at video frequency, offering results to higher-level systems, and monitoring and taking decisions in real time. The system must meet a set of requirements such as time constraints, high availability, robustness, high processing speed and re-configurability. Based on our previous work with neural models to represent objects, in particular the Growing Neural Gas (GNG) model and the study of topology preservation as a function of parameter selection, it is proposed to extend the capabilities of this self-growing model to track objects and represent their motion in image sequences under temporal restrictions. These neural models have various interesting features, such as the ability to readjust to new input patterns without restarting the learning process, the adaptability to represent deformable objects and even objects that are divided into different parts, and the intrinsic resolution of the feature-matching problem for sequence analysis and monitoring of movement. It is proposed to build an architecture based on the GNG, called GNG-Seq, to represent and analyze motion in image sequences.
Several experiments are presented that demonstrate the validity of the architecture for solving problems of target tracking, motion analysis and human–computer interaction.

14.

Statistics of natural image categories (total citations: 9, self-citations: 0, citations by others: 9)

Torralba A Oliva A 《Network (Bristol, England)》2003,14(3):391-412

In this paper we study the statistical properties of natural images belonging to different categories and their relevance for scene and object categorization tasks. We discuss how second-order statistics are correlated with image categories, scene scale and objects. We propose how scene categorization could be computed in a feedforward manner in order to provide top-down and contextual information very early in the visual processing chain. Results show how visual categorization based directly on low-level features, without grouping or segmentation stages, can benefit object localization and identification. We show how simple image statistics can be used to predict the presence and absence of objects in the scene before exploring the image.

15.

Automatic Lighting Design using a Perceptual Quality Metric (total citations: 1, self-citations: 0, citations by others: 1)

Ram Shacked & Dani Lischinski 《Computer Graphics Forum》2001,20(3):215-226

Lighting has a crucial impact on the appearance of 3D objects and on the ability of an image to communicate information about a 3D scene to a human observer. This paper presents a new automatic lighting design approach for comprehensible rendering of 3D objects. Given a geometric model of a 3D object or scene, the material properties of the surfaces in the model, and the desired viewing parameters, our approach automatically determines the values of various lighting parameters by optimizing a perception-based image quality objective function. This objective function is designed to quantify the extent to which an image of a 3D scene succeeds in communicating scene information, such as the 3D shapes of the objects, fine geometric details, and the spatial relationships between the objects.
Our results demonstrate that the proposed approach is an effective lighting design tool, suitable for users without expertise or knowledge in visual perception or in lighting design.

16.

Sequential model-based segmentation and recognition of image structures driven by visual features and spatial relations

Geoffroy Fouquier Jamal Atif Isabelle Bloch 《Computer Vision and Image Understanding》2012,116(1):146-165

A sequential segmentation framework, where objects in an image are successively segmented, generally raises some questions about the “best” segmentation sequence to follow and/or how to avoid error propagation. In this work, we propose original approaches to answer these questions in the case where the objects to segment are represented by a model describing the spatial relations between objects. The process is guided by a criterion derived from visual attention, and more precisely from a saliency map, along with some spatial information to focus the attention. This criterion is used to optimize the segmentation sequence. Spatial knowledge is also used to ensure the consistency of the results and to allow backtracking on the segmentation order if needed. The proposed approach was applied for the segmentation of internal brain structures in magnetic resonance images. The results show the relevance of the optimization criteria and the interest of the backtracking procedure to guarantee good and consistent results.

17.

RGB-D indoor scene semantic segmentation based on an attention mechanism and pyramid fusion

余娜 刘彦 魏雄炬 万源 《计算机应用》 2022,42(3):844-853

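To address the failure of existing RGB-D indoor scene semantic segmentation methods to fuse multimodal features effectively, an RGB-D indoor scene image semantic segmentation network model, APFNet, based on an attention mechanism and pyramid fusion is proposed, with two new modules designed for it: an attention mechanism fusion module and a pyramid fusion module. The attention mechanism fusion module extracts attention weights for the RGB features and the Depth features separately and makes full use of the complementarity of the two kinds of features, so that the network...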

18.

A Knowledge-Based Approach to Visual Information

Elisa Bertino Ahmed K. Elmagarmid Mohand-Saïd Hacid 《Journal of Intelligent Information Systems》2002,19(3):319-341


19.

Exploration trees on highly complex scenes: A new approach for 3D segmentation

P. Merchán A. Adán 《Pattern recognition》2007,40(7):1879-1898

A new strategy for automatic object extraction in highly complex scenes is presented in this paper. The proposed method gives a solution for 3D segmentation that avoids most of the restrictions imposed by other techniques. Thus, our technique is applicable to unstructured 3D information (i.e. a cloud of points), with a single view of the scene, scenes consisting of several objects where contact, occlusion and shadows are allowed, objects with uniform intensity/texture, and without restrictions of shape, pose or location. In order to have a fast segmentation stopping criterion, the number of objects in the scene is taken as input. The method is based on a new distributed segmentation technique that explores the 3D data by establishing a set of suitable observation directions. For each exploration viewpoint, a strategy [3D data]-[2D projected data]-[2D segmentation]-[3D segmented data] is carried out. This strategy differs from current 3D segmentation strategies. The method has been successfully tested in our lab on a set of real complex scenes. The results of these experiments, conclusions and future improvements are also shown in the paper.

20.

A driving-scene segmentation enhancement algorithm based on multi-dimensional attention fusion

刘奕晨 章坚武 胡晶 《计算机应用研究》 2023,40(10):3180-3185

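To address the imbalance between computational cost and accuracy in semantic segmentation models that use attention mechanisms, a lightweight attention enhancement algorithm for semantic segmentation is proposed. First, a strip-wise, per-dimension attention mechanism is designed around the shapes of objects in driving scenes; it uses strip pooling in place of conventional square convolution and combines this with dimensionality reduction to extract long-range semantic dependencies one dimension at a time, cutting the model's computation. Next, attention in the channel and spatial domains is fused into a lightweight multi-dimensional attention fusion module that can be stacked or taken apart, extracting feature information comprehensively and further improving accuracy. Finally, the module is inserted into an encoder-decoder network built on a ResNet-101 backbone, where it guides the fusion of high- and low-level semantics, corrects edge information in the feature maps, and fills in prediction detail. Experiments show that the module is robust and generalizes well: compared with attention mechanisms of the same type, it cuts about 90% of the parameters and 80% of the computation while still delivering a steady gain in segmentation accuracy.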
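The strip pooling named in this abstract can be illustrated on a single feature map: instead of pooling over square windows, values are averaged along entire rows and entire columns, so each output summarizes a long, thin region. This sketch shows only that pooling step; the rest of the authors' attention module is not described in enough detail to reproduce, and `strip_pool` is a hypothetical name.

```python
def strip_pool(feature):
    """Average a 2D feature map along full rows and full columns.

    Returns (row_strip, col_strip): one value per row (pooled across the
    width) and one value per column (pooled across the height).
    """
    h, w = len(feature), len(feature[0])
    row_strip = [sum(row) / w for row in feature]
    col_strip = [sum(feature[i][j] for i in range(h)) / h for j in range(w)]
    return row_strip, col_strip


# Hypothetical 2x3 feature map.
feature = [[1.0, 2.0, 3.0],
           [4.0, 5.0, 6.0]]
rows, cols = strip_pool(feature)
# rows == [2.0, 5.0], cols == [2.5, 3.5, 4.5]
```

The two strips cost H + W values rather than H × W, which is one reason strip-shaped pooling suits elongated structures such as roads and poles at low computational cost.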