Similar Documents
20 similar documents found.
1.
A wrapper-based approach to image segmentation and classification.
The traditional processing flow of segmentation followed by classification in computer vision assumes that the segmentation is able to successfully extract the object of interest from the background image. It is extremely difficult to obtain a reliable segmentation without any prior knowledge about the object being extracted from the scene. This is further complicated by the lack of clearly defined metrics for evaluating the quality of a segmentation or for comparing segmentation algorithms. We propose a method of segmentation that addresses both of these issues by using the object classification subsystem as an integral part of the segmentation. This provides contextual information about the objects to be segmented and allows us to use the probability of correct classification as a metric for segmentation quality. We view traditional segmentation as a filter operating on the image, independent of the classifier, much like the filter methods for feature selection. We propose a new paradigm for segmentation and classification that follows the wrapper methods of feature selection: our method wraps the segmentation and classification together and uses classification accuracy as the metric to determine the best segmentation. By using shape as the classification feature, we are able to develop a segmentation algorithm that relaxes the requirement that the object of interest be homogeneous in some low-level image parameter, such as texture, color, or grayscale. This is an improvement over segmentation methods that use classification information only to modify segmenter parameters, since those algorithms still require an underlying homogeneity in some parameter space.
Rather than considering our method as yet another segmentation algorithm, we propose that our wrapper method be considered an image segmentation framework within which existing image segmentation algorithms may be executed. We show the performance of our proposed wrapper-based segmenter on real-world, complex images of automotive vehicle occupants for the purpose of recognizing infants on the passenger seat and disabling the vehicle airbag. The complexity of these images makes this an interesting application for testing the robustness of our approach, and consequently we believe the algorithm will be suitable for many other real-world applications.
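The wrapper idea above — scoring candidate segmentations by downstream classification quality and keeping the best one — can be sketched as a simple search over segmenter parameters. The thresholding segmenter and the `classify_score` callable below are illustrative stand-ins, not the paper's shape-based classifier:

```python
import numpy as np

def wrapper_segment(image, candidate_thresholds, classify_score):
    """Pick the segmentation whose mask yields the highest classifier score.

    `classify_score(mask)` is a hypothetical stand-in for the downstream
    classifier; any callable returning a scalar quality score works.
    Thresholding is used here only as a trivial example segmenter.
    """
    best_mask, best_score = None, -np.inf
    for t in candidate_thresholds:
        mask = image > t                 # candidate segmentation
        score = classify_score(mask)     # wrapper metric: classifier quality
        if score > best_score:
            best_mask, best_score = mask, score
    return best_mask, best_score
```

In the paper's setting, the candidate segmentations would come from varying the parameters of an existing segmentation algorithm, and the score would be the probability of correct classification of the segmented shape.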

2.
A scene text localization method based on color distance minimization and maximum color difference (MCD) is proposed. First, repeated K-means clustering combined with color distance minimization is used to extract candidate text connected components from scene images of varying complexity. Since color clustering is sensitive to illumination, a complementary MCD-based extraction of text connected components is added; by combining color with gradient information, it partially compensates for illumination effects. The resulting connected components are then grouped into text lines by a set of character-merging rules. Because candidate text lines usually contain falsely detected non-text lines, a verification stage based on feature extraction and machine learning is finally applied to the candidates to improve detection precision and produce the localization result. Evaluated on the public ICDAR2011 and ICDAR2013 datasets, the method achieves a recall of 0.66 and a precision of 0.77 on ICDAR2011, and a recall of 0.65 and a precision of 0.77 on ICDAR2013. Comparison with other text detection algorithms confirms the feasibility and effectiveness of the method.

3.
4.
Natural and Seamless Image Composition With Color Control
While state-of-the-art image composition algorithms subtly handle the object boundary to achieve seamless copy-and-paste, they are unable to preserve the color fidelity of the source object, often require a considerable amount of user interaction, and often fail to achieve realism when there is a salient discrepancy between the background textures in the source and destination images. These observations motivate our research toward color-controlled, natural, and seamless image composition with minimal user interaction. In particular, based on the Poisson image editing framework, we first propose a variational model that considers both the gradient constraint and color fidelity. The proposed model allows users to control the coloring effect caused by gradient-domain fusion. Second, to reduce user interaction, we propose a distance-enhanced random walks algorithm, through which we avoid the need for accurate image segmentation while still being able to highlight the foreground object. Third, we propose a multiresolution framework that performs image composition in different subbands, separating the texture and color components to simultaneously achieve smooth texture transition and the desired color control. The experimental results demonstrate that our framework achieves better and more realistic results for images with salient background color or texture differences, while providing results comparable to the state-of-the-art algorithms for images that neither require preservation of the object's color fidelity nor exhibit significant background texture discrepancy.
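A minimal form of such a variational model (an illustrative assumption based on the description above, not necessarily the paper's exact energy) augments the Poisson gradient constraint with a color-fidelity term:

$$E(f) = \iint_{\Omega} \left( \lVert \nabla f - \nabla s \rVert^{2} + \lambda \, (f - s)^{2} \right) \, dx \, dy,$$

where $f$ is the composite over the pasted region $\Omega$, $s$ is the source object, and $\lambda \ge 0$ is the user-controlled trade-off between gradient-domain seamlessness and fidelity to the source colors; $\lambda = 0$ recovers standard Poisson image editing.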

5.
Because of degraded capture conditions, the key to non-ideal iris recognition is correctly segmenting the iris region, which contains the texture used for individual identification. This paper proposes a statistics-based segmentation method for non-ideal iris images consisting of three stages: inner-boundary localization, outer-boundary localization, and eyelid detection. In the inner-boundary stage, a Gaussian mixture model (GMM) and a multi-chord-length equalization strategy accurately localize the pupil and iris centers. In the outer-boundary stage, a simplified region-based curve evolution method is combined with order-statistic filtering (OSF) to ensure the curve converges to the outer iris boundary. In the eyelid-detection stage, the eyelids are modeled with quadratic curves. Experiments on several databases show that the method effectively overcomes specular reflections, eyelash and eyelid occlusion, blurred outer boundaries, and other adverse factors, and accurately segments non-ideal iris images.

6.
Edge-based color constancy.
Color constancy is the ability to measure the colors of objects independent of the color of the light source. A well-known color constancy method is based on the gray-world assumption, which assumes that the average reflectance of surfaces in the world is achromatic. In this paper, we propose a new hypothesis for color constancy, namely the gray-edge hypothesis, which assumes that the average edge difference in a scene is achromatic. Based on this hypothesis, we propose an algorithm for color constancy. Contrary to existing color constancy algorithms, which are computed from the zero-order structure of images, our method is based on the derivative structure of images. Furthermore, we propose a framework which unifies a variety of known algorithms (gray-world, max-RGB, Minkowski norm) with the newly proposed gray-edge and higher-order gray-edge algorithms. The quality of the various instantiations of the framework is tested and compared to state-of-the-art color constancy methods on two large data sets of images recording objects under a large number of different light sources. The experiments show that the proposed color constancy algorithms obtain results comparable to the state-of-the-art methods with the merit of being computationally more efficient.
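The gray-edge estimate described above reduces to taking a per-channel Minkowski p-norm of the image derivatives and reading the result as the illuminant color. A minimal numpy sketch (the value of p and the plain finite-difference derivative are illustrative choices, not the paper's exact instantiation):

```python
import numpy as np

def gray_edge_illuminant(img, p=6):
    """Estimate the illuminant color via the gray-edge hypothesis.

    Per channel, the Minkowski p-norm of the gradient magnitude is taken
    as proportional to the light-source color; the result is normalized
    to unit length. `img` is an H x W x 3 float array.
    """
    dx = np.diff(img, axis=1)[:-1, :, :]   # horizontal finite differences
    dy = np.diff(img, axis=0)[:, :-1, :]   # vertical finite differences
    grad = np.sqrt(dx**2 + dy**2)          # per-channel gradient magnitude
    e = (grad**p).mean(axis=(0, 1)) ** (1.0 / p)
    return e / np.linalg.norm(e)           # unit-length illuminant estimate
```

Dividing each channel of the image by the corresponding component of this estimate (von Kries correction) would then discount the illuminant; p=1 recovers a gray-world-style average over edges, while large p approaches a max-edge estimate.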

7.
Scene Image Classification Based on a Multi-Feature Extended pLSA Model
江悦  王润生 《信号处理》2010,26(4):539-544
Scene image classification has attracted wide attention in recent years, and statistical-model-based methods are a particularly active research topic in scene classification. We propose a new scene image classification framework based on multi-feature fusion and an extended pLSA model. Each image is first partitioned into local patches by multiscale regular segmentation; multiresolution histogram-moment features and SIFT features are then extracted from each patch; finally, an extended probabilistic generative model is used to model and test the image collection. Our method not only represents the semantic properties of images well but is also unsupervised in the model training stage. In three sets of comparative experiments on three commonly used databases, it achieves better recognition results than previous methods.

8.
Extraction of foreground content from complex-background document images is very difficult, since the background texture and color and the foreground font, size, color, and tilt are not known in advance. In this work, we propose an RGB color model for input complex color document images. An algorithm is also developed that detects text regions using Gabor filters and then extracts text using the color feature luminance. The proposed approach consists of three stages. In stage 1, candidate image segments containing text are detected based on Gabor features. Because of the complex background, a certain number of high-frequency non-text objects in the background are also detected as text objects in stage 1. In stage 2, some of these false text objects are discarded by connected component analysis. In stage 3, the image segments containing textual information obtained from the previous stage are binarized to extract the foreground text. The color feature luminance is extracted from the input color document image, and the binarization threshold is derived automatically from it. The proposed approach handles both printed and handwritten color document images with foreground text in any color, font, size, and orientation. For experimental evaluation, we considered a variety of document images with non-uniform/uniform textured and multicolored backgrounds. Segmentation of foreground text is evaluated on a commercially available OCR; the results show better recognition accuracy for foreground characters in processed document images than in unprocessed ones.
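Stage 1 above relies on a bank of Gabor filter responses. A sketch of a single real-valued Gabor kernel is shown below; the parameter names and values are illustrative, not the paper's filter bank, and a full detector would convolve the image with several orientations and wavelengths:

```python
import numpy as np

def gabor_kernel(ksize, sigma, theta, lam):
    """Real (cosine-phase) Gabor kernel: a Gaussian envelope modulating a
    sinusoid oriented at angle `theta` with wavelength `lam`.

    Text regions give strong responses at the stroke frequency/orientation.
    """
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # rotate coordinates into the filter's orientation
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + yr**2) / (2.0 * sigma**2))
    carrier = np.cos(2.0 * np.pi * xr / lam)
    return envelope * carrier
```

Convolving the luminance channel with such kernels and thresholding the response energy is one common way to flag candidate text blocks before the connected-component analysis of stage 2.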

9.
Detection and segmentation of remote sensing imagery is typically achieved by extracting image features and mining deep features with deep learning algorithms. However, traditional features (color, texture, spatial relations, etc.) cannot fully describe the semantic information of an image, and single-structure or cascaded algorithms cannot fully mine its deep features and contextual semantics. To address these problems, this paper uses word embedding to map spatial-relation features into dense real-valued vectors, which are combined with color and texture features. It then constructs a parallel remote sensing detection-and-segmentation algorithm, ATGIR (Attention Graph Convolution Networks and Independently Recurrent Neural Network), based on an attention mechanism. The algorithm first assigns probabilistic weights to the combined features via the attention mechanism; graph convolutional networks (GCNs) then further mine the highly weighted features and generate orientation labels, while an independently recurrent neural network (IndRNN) mines contextual information in the image features; finally, a sigmoid classifier completes the detection and segmentation task. Taking the detection and segmentation of Populus euphratica forests in remote sensing imagery as an example, we verify that the proposed feature extraction method and the ATGIR algorithm effectively improve performance on this task.

10.
Segmenting semantically meaningful whole objects from images is a challenging problem, and it becomes especially so without higher level common sense reasoning. In this paper, we present an interactive segmentation framework that integrates image appearance and boundary constraints in a principled way to address this problem. In particular, we assume that small sets of pixels, which are referred to as seed pixels, are labeled as the object and background. The seed pixels are used to estimate the labels of the unlabeled pixels using Dirichlet process multiple-view learning, which leverages 1) multiple-view learning that integrates appearance and boundary constraints and 2) Dirichlet process mixture-based nonlinear classification that simultaneously models image features and discriminates between the object and background classes. With the proposed learning and inference algorithms, our segmentation framework is experimentally shown to produce both quantitatively and qualitatively promising results on a standard dataset of images. In particular, our proposed framework is able to segment whole objects from images given insufficient seeds.

11.
In this paper, we present the Touch Text exTractor (Touch TT), an interactive text segmentation tool for extracting scene text from camera-based images. Touch TT provides a natural interface in which a user simply indicates the location of text regions with a touchline. Touch TT then automatically estimates the text color and roughly locates the text regions. By inferring text characteristics from the estimated text color and text region, Touch TT can extract text components. Touch TT can also handle partially drawn lines that cover only a small section of the text area. The proposed system achieves reasonable accuracy for text extraction on moderately difficult examples from the ICDAR 2003 database and our own database.

12.
In this paper, we propose an interactive color natural image segmentation method. The method integrates a color feature with a multiscale nonlinear structure tensor texture (MSNST) feature and then uses the GrabCut method to obtain the segmentation. The MSNST feature describes the texture of an image and is integrated into the GrabCut framework to overcome the problem of scale differences in textured images. In addition, we extend the Gaussian mixture model (GMM) to the MSNST feature and construct a GMM based on MSNST to describe the energy function, so that the texture feature can be suitably integrated into the GrabCut framework and fused with the color feature, achieving better segmentation performance than the original GrabCut method. For easier implementation and more efficient computation, the symmetric KL divergence is chosen to estimate the tensor statistics instead of the Riemannian structure of tensor space. A conjugate norm obtained with the Locality Preserving Projections (LPP) technique is employed as the distance measure in the color space for more discriminating power. An adaptive fusing strategy effectively adjusts the mixing factor so that the color and MSNST texture features are efficiently integrated for more robust segmentation. Finally, an iteration convergence criterion is proposed that dramatically reduces the iteration time of the GrabCut algorithm while maintaining satisfactory segmentation accuracy. Experiments on synthetic texture images and real natural scene images demonstrate the superior performance of the proposed method.
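The symmetric KL divergence used in place of the Riemannian tensor metric has a simple closed form when the tensors are treated as covariances of zero-mean Gaussians. A sketch under that assumption (A and B are symmetric positive-definite matrices; whether the paper uses exactly this zero-mean form is an assumption on our part):

```python
import numpy as np

def symmetric_kl(A, B):
    """Symmetric KL (J-divergence) between zero-mean Gaussians with
    SPD covariance matrices A and B:
        J(A, B) = 0.5 * (tr(B^-1 A) + tr(A^-1 B)) - d
    The log-determinant terms of the two one-sided KLs cancel.
    """
    d = A.shape[0]
    return 0.5 * (np.trace(np.linalg.solve(B, A)) +
                  np.trace(np.linalg.solve(A, B))) - d
```

Unlike the affine-invariant Riemannian distance, this requires only linear solves, which is the computational saving the abstract alludes to.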

13.
Overlay text carries important semantic clues in video content analysis tasks such as video information retrieval and summarization, since the content of the scene or the editor's intention can be well represented by the inserted text. Most previous approaches to extracting overlay text from videos are based on low-level features such as edge, color, and texture information. However, existing methods have difficulty handling text with varying contrast or text inserted into a complex background. In this paper, we propose a novel framework to detect and extract overlay text from video scenes. Based on our observation that transient colors exist between inserted text and its adjacent background, a transition map is first generated. Candidate regions are then extracted by a reshaping method, and the overlay text regions are determined based on the occurrence of overlay text in each candidate. The detected overlay text regions are localized accurately using the projection of overlay text pixels in the transition map, and text extraction is finally conducted. The proposed method is robust to different character sizes, positions, contrasts, and colors, and is language independent. Overlay text region updating between frames is also employed to reduce processing time. Experiments on diverse videos confirm the efficiency of the proposed method.

14.
A deep learning framework for 3D point cloud processing is proposed in this work. In a point cloud, local neighborhoods have various shapes, and the semantic meaning of each point is determined within the local shape context. Thus, we propose shape-adaptive filters (SAFs), which are dynamically generated from the distributions of local points. The proposed SAFs can extract robust features against noise or outliers, by employing local shape contexts to suppress them. Also, we develop the SAF-Nets for classification and segmentation using multiple SAF layers. Extensive experimental results demonstrate that the proposed SAF-Nets significantly outperform the state-of-the-art conventional algorithms on several benchmark datasets. Moreover, it is shown that SAFs can improve scene flow estimation performance as well.

15.
Scene Text Segmentation Combining Color and MGD Features with an MRF Model
To address the difficulty of segmenting scene text affected by illumination, complex backgrounds, and other factors, a scene text segmentation method is proposed that fuses color and maximum gradient difference (MGD) features within a Markov random field (MRF). First, the MGD feature, which effectively captures the texture characteristics of text, is extracted and combined with color features in a probabilistic framework to model the observed image. The conventional potential function is then improved by incorporating spatial relations and attribute differences between neighboring pixels. Finally, an MRF model for scene text segmentation is built and solved efficiently with the graph cut algorithm. Experimental results show that combining color with MGD features and improving the potential function considerably improves the segmentation, and the method outperforms other algorithms especially under uneven illumination and complex backgrounds.
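The MGD feature is commonly computed as the difference between the maximum and minimum of the horizontal gradient inside a sliding window, so that text strokes, with their alternating positive and negative gradients, yield large values. A numpy sketch (the window size is an illustrative choice):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def mgd(row_grad, win=5):
    """Maximum gradient difference along one image row.

    `row_grad` holds the horizontal gradient of a row; each output value is
    max - min of the gradient within a sliding window of length `win`.
    Output length is len(row_grad) - win + 1.
    """
    w = sliding_window_view(row_grad, win)
    return w.max(axis=-1) - w.min(axis=-1)
```

Applying this to every row and thresholding the response is a standard way to obtain the text-texture map that the probabilistic model above fuses with color.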

16.
Image Retrieval Based on Wavelet-Domain Fuzzy-Clustering Region Segmentation
吴冬升  吴乐南  黄波 《信号处理》2002,18(5):422-426
Content-based image retrieval has been a research focus in recent years; this paper presents a region-segmentation-based image retrieval algorithm. The image is first wavelet-transformed according to the JPEG2000 standard, and color and texture features are extracted from the resulting low-frequency subband for fuzzy clustering, segmenting the subband into regions. The segmentation is then mapped back to the full image, feature vectors are extracted for each region of the full image and used for region similarity comparison, and the overall similarity between images is finally obtained under a region matching criterion. Experimental results show that the algorithm achieves good image retrieval performance.
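The fuzzy clustering step on the subband feature vectors can be sketched with standard fuzzy c-means; the fuzzifier m, the iteration count, and the toy feature matrix X are illustrative, not the paper's settings:

```python
import numpy as np

def fuzzy_cmeans(X, c, m=2.0, iters=50, seed=0):
    """Fuzzy c-means on an n x d feature matrix X with c clusters.

    Returns the n x c membership matrix U (rows sum to 1) and the
    c x d cluster centers. `m` is the fuzzifier (m > 1).
    """
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)            # random fuzzy memberships
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]   # weighted means
        d = np.linalg.norm(X[:, None, :] - centers[None], axis=-1) + 1e-12
        U = 1.0 / (d ** (2.0 / (m - 1.0)))       # inverse-distance update
        U /= U.sum(axis=1, keepdims=True)
    return U, centers
```

In the retrieval pipeline, each low-frequency-subband pixel's color/texture vector would be a row of X, and the argmax membership gives the region label that is mapped back to the full image.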

17.
An Asymmetric Parallel Semantic Segmentation Model Based on Fully Convolutional Networks
李宝奇  贺昱曜  何灵蛟  强伟 《电子学报》2019,47(5):1058-1064
RGB images carry rich color detail, while infrared images are highly sensitive to shape cues such as target contour, size, and boundary. Exploiting this, an asymmetric parallel semantic segmentation model, APFCN (Asymmetric Parallelism Fully Convolutional Networks), is proposed. The upper path of APFCN is a five-layer dilated convolution network with non-uniform kernel sizes that extracts high-level contour features of targets from the infrared image; the lower path uses convolution plus pooling to extract detail features from the RGB image at three scales; the back end fuses the infrared high-level features with the three-scale RGB detail features and outputs the 4x-upsampled fused features as the semantic segmentation result. Results show that APFCN outperforms FCN (with either RGB or infrared input) in pixel accuracy and intersection-over-union, and is suited to the semantic segmentation of ground targets against a consistent background.

18.
Describes a visual monitoring system that performs scene segmentation based on color and texture information. Color information is combined with texture, and corresponding segmentation algorithms are developed to detect and measure changes (loss/gain) in a given scene or environment over a period of time. The xyY color space is used to represent the color information: the two chromaticity coordinates (x, y) are combined into one, providing the chrominance (spectral) part of the image, while Y describes the luminance (intensity) information. The proposed color/texture segmentation system processes luminance and chrominance separately. Luminance is processed in three stages: filtering, smoothing, and boundary detection. Chrominance is processed in two stages: histogram multi-thresholding and region growing. Two or more images may be combined at the end to detect scene changes, using logical pixel operators. As a case study, the methodology is used to determine wetland loss/gain. For comparison, results in both the xyY and HSI (hue, saturation, intensity) color spaces are presented.
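The xyY separation used above follows from the standard RGB→XYZ conversion; a minimal sketch assuming linear sRGB primaries with a D65 white point (the paper does not state which RGB primaries it assumes):

```python
import numpy as np

def rgb_to_xyY(rgb):
    """Convert a linear sRGB triple to xyY.

    (x, y) are the chromaticity coordinates (chrominance), Y the luminance.
    Uses the sRGB/D65 RGB -> XYZ matrix.
    """
    M = np.array([[0.4124, 0.3576, 0.1805],
                  [0.2126, 0.7152, 0.0722],
                  [0.0193, 0.1192, 0.9505]])
    X, Y, Z = M @ np.asarray(rgb, dtype=float)
    s = X + Y + Z
    return (X / s, Y / s, Y)   # chromaticity x, y and luminance Y
```

Feeding (x, y) to the chrominance pipeline (histogram multi-thresholding, region growing) and Y to the luminance pipeline (filtering, smoothing, boundary detection) mirrors the two-branch processing the abstract describes.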

19.
An edge-based video text detection algorithm is proposed. The Canny operator is first used to detect image edges; edge strokes that do not belong to characters are then filtered out according to the characteristics of text edge strokes. Finally, using the similarity of text stroke regions, a combined threshold is applied to obtain the final text regions. Experimental results show that the algorithm not only achieves high recall on regularly arranged text, but also accurately localizes irregularly arranged and distorted text, and is insensitive to illumination, shadows, and similar conditions.

20.
熊余  单德明  姚玉  张宇 《红外技术》2022,44(1):9-20
To address the underuse of spatial-spectral features in existing convolutional neural network classifiers for hyperspectral remote sensing images, a hyperspectral image classification strategy based on a hybrid convolutional capsule network with multi-feature fusion is proposed. First, principal component analysis and non-negative matrix factorization are jointly used to reduce the dimensionality of the hyperspectral dataset; the resulting principal components are then turned into a multidimensional feature set via superpixel segmentation and cosine clustering; finally, the stacked feature set is fed through a 2D/3D multiscale hybrid convolutional network for spatial-spectral feature extraction and classified with a capsule network. Experiments on several hyperspectral datasets show that, at the same 20-dimensional spectral dimensionality, the proposed strategy clearly improves overall accuracy, average accuracy, and the Kappa coefficient over conventional classification strategies.
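The PCA half of the joint dimensionality reduction (the paper combines it with NMF) can be sketched with a plain SVD over the pixels-by-bands matrix:

```python
import numpy as np

def pca_reduce(X, k):
    """Project an n x d matrix (n pixels, d spectral bands) onto its
    first k principal components via SVD of the centered data.

    Returns the n x k score matrix; component variances are in
    decreasing order and each score column has zero mean.
    """
    Xc = X - X.mean(axis=0)                      # center each band
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                         # scores on top-k components
```

NMF would instead factor the (non-negative) spectra as X ≈ WH with W, H ≥ 0; the abstract's strategy stacks both reductions before superpixel segmentation and clustering.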
