Similar literature
 20 similar articles found
1.
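Text extraction from color images based on graph-theoretic clustering   Total citations: 3, self-citations: 0, citations by others: 3
This paper proposes a method for automatically extracting text regions from color images. First, a statistical color model is applied to greatly reduce the size of the image's color space. Second, colors are clustered with a graph-theoretic method, decomposing the image into multiple binary images, one per cluster. Connected-component analysis is then performed on these binary images to extract candidate text regions, which are subsequently verified. Finally, the extraction results from all binary images are combined to obtain the text regions in the original color image. For specific applications, the extracted text regions can, after further processing, be fed into an optical character recognition (OCR) system. Experimental results demonstrate the effectiveness of the proposed method.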
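The connected-component step described in this abstract can be illustrated with a minimal sketch. It assumes 4-connectivity and a binary image stored as nested lists of 0/1; neither detail is stated in the abstract, and the function name `connected_components` is hypothetical. It returns one bounding box per foreground region, which is the kind of candidate a later text/non-text verification stage would inspect.

```python
from collections import deque

def connected_components(binary):
    """Find 4-connected foreground (1) regions; return their bounding boxes.

    Each box is (top, left, bottom, right), inclusive, in scan order.
    """
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from the first unseen foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes

# One binary image, as would be produced for a single color cluster.
image = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
boxes = connected_components(image)
```

In a pipeline like the one described, this would run once per binary image, and the resulting boxes would be filtered (for example by size or aspect ratio) before being merged across clusters; those filters are not specified in the abstract.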

2.
Automatic detection and recognition of signs from natural scenes   Total citations: 5, self-citations: 0, citations by others: 5
In this paper, we present an approach to automatic detection and recognition of signs from natural scenes, and its application to a sign translation task. The proposed approach embeds multiresolution and multiscale edge detection, adaptive searching, color analysis, and affine rectification in a hierarchical framework for sign detection, with different emphases at each phase to handle text of different sizes, orientations, color distributions and backgrounds. We use affine rectification to recover deformation of the text regions caused by an inappropriate camera view angle. The procedure can significantly improve text detection rate and optical character recognition (OCR) accuracy. Instead of using binary information for OCR, we extract features from an intensity image directly. We propose a local intensity normalization method to effectively handle lighting variations, followed by a Gabor transform to obtain local features, and finally a linear discriminant analysis (LDA) method for feature selection. We have applied the approach in developing a Chinese sign translation system, which can automatically detect and recognize Chinese signs as input from a camera, and translate the recognized text into English.

3.
Extracting foreground content from document images with complex backgrounds is very difficult because the background texture and color and the foreground font, size, color, and tilt are not known in advance. In this work, we propose an RGB color model for the input of complex color document images. An algorithm that detects text regions using Gabor filters and then extracts the text using the luminance color feature is also developed. The proposed approach consists of three stages. In stage 1, candidate image segments containing text are detected based on Gabor features. Because of the complex background, a certain number of high-frequency non-text objects in the background are also detected as text objects in stage 1. In stage 2, some of these false text objects are discarded by connected-component analysis. In stage 3, the image segments containing textual information obtained from the previous stage are binarized to extract the foreground text. The luminance color feature is extracted from the input color document image, and the threshold value is derived automatically from this feature. The proposed approach handles both printed and handwritten color document images with foreground text in any color, font, size and orientation. For experimental evaluation, we considered a variety of document images with uniform or non-uniform textured and multicolored backgrounds. Foreground text segmentation performance is evaluated with a commercially available OCR system. The results show better recognition accuracy for foreground characters in the processed document images than in the unprocessed ones.

4.
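Many current image tampering detection methods perform feature matching mainly by measuring distances between image features and ignore the image's color information, which leads to many false and missed detections. To address this, this paper introduces color information into the feature matching process and designs a tampering detection algorithm based on a color constraint model. Image features are extracted with the Laplacian and Harris operators. Using the red (R), green (G), and blue (B) primary-color values of pixels together with the feature descriptors, a color constraint model is built to measure the color information between feature points. This measure is then combined with the distance between feature points to perform feature matching, which removes mismatches and improves matching accuracy. The algorithm also constructs a distance penalty model from the variance of the distances between feature points and clusters the matched features to identify tampered content accurately. Experimental results show that, compared with other tampering detection algorithms, the proposed algorithm not only detects forged content more accurately but also adapts better to operations such as blurring and rotation.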

5.
We present a two-pass image retrieval system in which retrieval techniques for text and image documents are combined in a novel approach. In the first pass, the text-based initial query is matched against the text captions of the images in the database to obtain the initial retrieved set. In the second pass, text and image features obtained from this initial retrieved set are used to expand the initial query. Additional images from the database are then retrieved based on the expanded query. The image features that we have used are color histograms, DC coefficients from the discrete cosine transform, and two texture features: multiresolution simultaneous autoregressive model and local binary pattern. These are low-level statistical image features that can be easily computed. Extensive experiments have been performed on 1019 color pictures of mixed variety with captions, relevance judgments and queries supplied by a national archives agency. Objective precision-recall results have been obtained with various combinations of text and image features. The results show that the image features do not perform well when used on their own. However, when image features are used in query expansion, they increase the average precision more significantly than text annotations. Moreover, these findings are valid at all precision levels and are not sensitive to the image feature acquisition parameters.

6.
Vehicle detection using normalized color and edge map.   Total citations: 4, self-citations: 0, citations by others: 4
This paper presents a novel approach for detecting vehicles in static images using color and edges. Unlike traditional methods, which use motion features to detect vehicles, this method introduces a new color transform model to find important "vehicle color" for quickly locating possible vehicle candidates. Since vehicles have various colors under different weather and lighting conditions, few works have proposed detecting vehicles by color. The proposed color transform model has excellent capabilities to distinguish vehicle pixels from the background, even when the pixels are lit under varying illumination. After finding possible vehicle candidates, three important features, including corners, edge maps, and coefficients of wavelet transforms, are used to construct a cascade multichannel classifier. With this classifier, an effective scan can be performed to verify all possible candidates quickly. The scan is fast because most background pixels are eliminated in advance by the color feature. Experimental results show that the integration of global color features and local edge features is powerful in the detection of vehicles. The average accuracy rate of vehicle detection is 94.9%.

7.
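Binarization is an important step in optical character recognition (OCR) and directly affects its success rate. Current local binarization algorithms based on luminance segmentation perform well, but they are complex and time-consuming. Fast binarization algorithms have a simple pipeline but are sensitive to noise. Low-brightness images generally contain non-negligible noise, and their text has low contrast. To capture low-contrast text, a fast binarization algorithm must be sensitive to luminance gradients, but this also leads to broken or missing text and heavy background noise in the result. To achieve high-quality fast binarization, this paper applies non-local means filtering to suppress noise while avoiding over-smoothing the image. An improved Bradley algorithm is used to extract low-contrast text and resolves problems such as broken characters. Finally, dilation and erosion are applied to suppress binarization noise. The method is suitable for non-uniformly lit images of both low and high brightness. Experimental results show that under non-uniform high brightness the method performs on par with other fast binarization algorithms, while under non-uniform low brightness it extracts more text with fewer broken characters and less noise. The OCR recall on the method's binarization results reaches 93.5%.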
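For orientation, the baseline Bradley adaptive threshold that this abstract builds on can be sketched as follows. This is the standard integral-image formulation, not the paper's improved variant, whose modifications are not described here. The function name `bradley_threshold`, the window size, and the percentage `t` are illustrative assumptions.

```python
def bradley_threshold(gray, window=3, t=15):
    """Baseline Bradley adaptive threshold.

    gray: rows of integer intensities. Returns a mask where 1 marks a pixel
    more than t percent darker than the mean of its local window.
    """
    rows, cols = len(gray), len(gray[0])
    # Integral image with a zero border: integral[y][x] = sum of gray[0:y][0:x].
    integral = [[0] * (cols + 1) for _ in range(rows + 1)]
    for y in range(rows):
        row_sum = 0
        for x in range(cols):
            row_sum += gray[y][x]
            integral[y + 1][x + 1] = integral[y][x + 1] + row_sum
    half = window // 2
    mask = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            # Clip the window to the image bounds.
            y0, y1 = max(0, y - half), min(rows, y + half + 1)
            x0, x1 = max(0, x - half), min(cols, x + half + 1)
            count = (y1 - y0) * (x1 - x0)
            total = (integral[y1][x1] - integral[y0][x1]
                     - integral[y1][x0] + integral[y0][x0])
            # Integer form of: pixel < local_mean * (1 - t / 100).
            if gray[y][x] * count * 100 < total * (100 - t):
                mask[y][x] = 1
    return mask

# A single dark pixel on a bright background is marked as text.
gray = [[200, 200, 200], [200, 50, 200], [200, 200, 200]]
mask = bradley_threshold(gray)
```

Because each window sum costs four lookups regardless of window size, the method stays fast. Its sensitivity to local gradients is also why noise suppression before and after thresholding, as the abstract describes, matters on low-brightness images.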

8.
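A scene text localization method based on color distance minimization and maximum color difference (MCD) is proposed. First, repeated K-means clustering and color distance minimization are used to extract text connected components from scene images of varying complexity. Because color clustering is easily affected by illumination, an MCD-based method is used to extract additional text connected components as a supplement; since it combines color with gradient information, it can overcome the effect of illumination to some extent. The resulting connected components are then grouped into text lines using predefined character merging rules. Candidate text lines usually include falsely detected non-text lines, so to improve detection precision, the candidates are finally verified with a method based on feature extraction and machine learning, yielding the text localization result. The method was tested on the public ICDAR2011 and ICDAR2013 datasets. For ICDAR2011, the recall, precision, and F-measure are given as 0.66, 0.77; for ICDAR2013, as 0.65, 0.77 (the source lists only two values for the three metrics in each case). Comparison with other text detection algorithms shows that the method is feasible and effective.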

9.
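A saliency detection method based on the RGB color space and random rectangular regions is proposed. The method uses R, G, and B as image features, randomly generates rectangular regions of different positions and sizes, and computes, within each rectangle, the distance between each pixel's feature value and the region's feature mean. The results over all rectangles and all features are then combined to produce the final saliency map. Because no color space conversion is needed, computation time is greatly reduced. In addition, luminance varies fairly consistently across the three RGB channels, so feature fusion can make full use of the information in all features, which yields better detection results. Experimental results show that the method detects salient regions in images faster and more effectively.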
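The random-rectangle idea in this abstract can be sketched in a few lines. The abstract does not say which distance is used or how rectangles and channels are combined, so the sketch assumes absolute difference from the rectangle mean and plain summation over rectangles and channels. The function name `random_rect_saliency`, the rectangle count, and the uniform sampling of corners are likewise assumptions.

```python
import random

def random_rect_saliency(image, n_rects=200, seed=0):
    """image: rows of (R, G, B) tuples. Returns an un-normalized saliency map."""
    rng = random.Random(seed)
    rows, cols = len(image), len(image[0])
    saliency = [[0.0] * cols for _ in range(rows)]
    for _ in range(n_rects):
        # Two distinct cut positions per axis give a non-empty rectangle
        # covering rows y0..y1-1 and columns x0..x1-1.
        y0, y1 = sorted(rng.sample(range(rows + 1), 2))
        x0, x1 = sorted(rng.sample(range(cols + 1), 2))
        area = (y1 - y0) * (x1 - x0)
        for ch in range(3):
            mean = sum(image[y][x][ch]
                       for y in range(y0, y1) for x in range(x0, x1)) / area
            # Accumulate each pixel's deviation from the rectangle mean
            # (assumed distance: absolute difference).
            for y in range(y0, y1):
                for x in range(x0, x1):
                    saliency[y][x] += abs(image[y][x][ch] - mean)
    return saliency

# A gray 5x5 image with one red pixel in the middle.
gray_px, red_px = (128, 128, 128), (255, 0, 0)
image = [[gray_px] * 5 for _ in range(5)]
image[2][2] = red_px
saliency = random_rect_saliency(image)
```

Pixels that differ from their surroundings at many scales accumulate large deviations, so the odd-colored pixel ends up with the highest score. A practical version would use integral images for the rectangle means and normalize the map to a fixed range.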

10.
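朱国康, 王运锋. 《信号处理》 (Journal of Signal Processing), 2011, 27(10): 1616-1620
In road traffic sign detection, images captured in natural real-world scenes pose difficulties such as the uncertain size and position of the traffic signs. This paper proposes a road traffic sign detection method based on multi-feature fusion for real-scene images. The samples are divided into training and test sets. First, blind restoration is applied to the training images. Second, adaptive shape-region cropping is applied to the restored images, and color, texture, and shape features are extracted from the cropped regions. Third, SVM classification is performed separately on the color, texture, and shape features, yielding three classification models. Finally, the model weights are computed adaptively to obtain a weighted feature fusion model. Evaluation on the test samples shows that the feature fusion recognition method is highly accurate, and comparative experiments show that the fusion model improves the accuracy and robustness of road traffic sign detection.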

11.
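A method for detecting headline bars and recognizing their text in news video, based on color and edge features, is proposed. The method first uses color and edge features to detect the frames of a news video that contain a headline bar, then uses prior knowledge to locate the caption region within those frames, and applies preprocessing and optical character recognition (OCR) to the caption region to obtain the text. Experimental results show that the method achieves high recall and precision.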

12.
In this paper, we propose a novel framework to extract text regions from scene images with complex backgrounds and multiple text appearances. This framework consists of three main steps: boundary clustering (BC), stroke segmentation, and string fragment classification. In BC, we propose a new bigram-color-uniformity-based method to model both text and attachment surface, and cluster edge pixels based on color pairs and spatial positions into boundary layers. Then, stroke segmentation is performed at each boundary layer by color assignment to extract character candidates. We propose two algorithms to combine the structural analysis of text stroke with color assignment and filter out background interferences. Further, we design a robust string fragment classification based on Gabor-based text features. The features are obtained from feature maps of gradient, stroke distribution, and stroke width. The proposed framework of text localization is evaluated on scene images, born-digital images, broadcast video images, and images of handheld objects captured by blind persons. Experimental results on respective datasets demonstrate that the framework outperforms state-of-the-art localization algorithms.

13.
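A text classification method based on category distribution difference and VPRS feature selection   Total citations: 3, self-citations: 0, citations by others: 3
Weight computation and feature dimensionality reduction are two important steps affecting the accuracy and efficiency of text classification. This paper first filters features according to the differences in the category distribution of feature terms. It then analyzes the shortcomings of the traditional TF-IDF weighting formula and adopts an improved formula, denoted TF-CDF, which is used to compute the weight of each feature term and generate the vector space model (VSM) of the document collection. Next, a feature selection method based on variable precision rough set (VPRS) theory is proposed to further select the features that contribute most to classification, and it is implemented in SQL. Finally, experiments with the LibSVM support vector machine classifier show that the feature filtering and selection methods and the TF-CDF weighting formula help improve classification accuracy and efficiency.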
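As a point of reference, the traditional TF-IDF weighting that this abstract takes as its baseline can be sketched as below. The TF-CDF formula itself is not given in the abstract, so it is not reproduced. The sketch assumes length-normalized term frequency and an unsmoothed natural-log IDF, one common variant among several; the function name `tf_idf` is hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: number of documents each term occurs in.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

docs = [
    ["color", "image", "text"],
    ["text", "feature"],
    ["text", "image", "image"],
]
vsm = tf_idf(docs)  # one sparse vector per document
```

A term such as "text" that occurs in every document gets weight zero regardless of how it is spread across categories, which is the kind of category-blindness a class-distribution-aware weight would aim to address.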

14.
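豆增发. 《现代导航》 (Modern Navigation), 2014, 5(3): 214-218
To select effective text features from text data, this paper proposes a new feature selection algorithm based on improved binary particle swarm optimization. The algorithm uses a flip angle, a local flip factor, and a global flip factor to drive the evolution of the swarm. By solving for the optimum of the objective function, it obtains binary feature selection coefficients, and the features whose coefficient is 1 are selected as effective features. Experiments show that the method both reduces computational cost and improves text classification accuracy.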

15.
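Research on image retrieval combining semantic and color features   Total citations: 2, self-citations: 2, citations by others: 0
For image retrieval in multimedia search engine systems, this paper proposes using high-level semantic features and low-level color features of images as combined retrieval criteria, fusing the textual and visual information of images, and presents an architecture for an image retrieval system that combines semantic and color features. The aim is to bridge the gap between low-level multimedia features and high-level semantics. On this basis, related algorithms are proposed so that image retrieval meets user needs and improves in efficiency and precision.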

16.
Color image segmentation, an ill-posed problem, can be treated as the process of dividing a color image into constituent regions, each of which is homogeneous. In this study, a saliency-directed color image segmentation approach using “simple” modified particle swarm optimization (PSO) is proposed, in which both low-level features and high-level image semantics extracted from each color image are employed. To extract high-level image semantics, a visual attention saliency map is generated for each color image from three feature maps (color, intensity, and orientation) and is used to guide region merging with “simple” modified PSO and a hybrid fitness function. The proposed approach contains four stages, namely, color quantization, feature extraction, small region elimination, and region merging using “simple” modified PSO. Based on the experimental results obtained in this study, the proposed approach usually provides better color image segmentation results than the four comparison approaches.

17.
Player Information Extraction for Semantic Annotation in Golf Videos   Total citations: 1, self-citations: 0, citations by others: 1
In sports videos, text provides semantic information about the game such as scores and players. This paper provides an accurate method for extracting player information in golf videos. First, a new method for detecting the key captions containing the player information is presented. Since the location of these key captions is not fixed during a golf game, we use the color pattern of the captions and its temporal repetition property, instead of the location property, to identify the key captions. Second, a dual binarization method is presented to segment texts with different color polarities (i.e. dark and bright texts) easily from the background in the key captions. Finally, the binarization results are recognized by OCR and converted to plain text. The player is recognized by comparing the plain text with a pre-stored player name database. Experiments on a large database show that our method can extract the player information efficiently in golf videos.

18.
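To address topic-oriented search, a new topic-oriented retrieval technique is proposed. The problems facing topic-oriented search are first analyzed, and a solution based on data mining is then proposed. Data mining techniques are used to extract multi-level semantic features of text, forming a multi-precision representation of the text; the extracted features include both single-word and multi-word features. A sample retrieval system was built, and experiments show that multi-level text features support topic-oriented text retrieval well.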

19.
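This paper proposes a scene text detection method to address the challenges of detecting text in complex natural scenes. The method adopts dual attention and multi-scale feature fusion; a dual attention fusion mechanism strengthens the correlation between text feature channels and improves overall detection performance. Considering the loss of semantic information that up- and down-sampling of deep feature maps may cause, a dilated convolution multi-scale feature fusion pyramid structure (MFPN) is proposed, which uses a dual fusion mechanism to strengthen semantic features and helps overcome the effect of scale variation. To address the semantic conflicts caused by fusing information of different densities and the limited expressiveness of multi-scale features, a multi-scale feature fusion module (MFFM) is introduced. In addition, a feature refinement module (FRM) is introduced for small text that is easily masked by conflicting information. Experiments show that the method is effective for text detection in complex scenes, with F-measures of 85.6%, 87.1%, and 86.3% on the CTW1500, ICDAR2015, and Total-Text datasets, respectively.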

20.
Due to the rapid development of mobile devices equipped with cameras, instant translation of any text seen in any context is possible. Mobile devices can serve as a translation tool by recognizing the text present in captured scenes. Images captured by cameras embed more external or unwanted effects, which need not be considered in traditional optical character recognition (OCR). In this paper, we segment a text image captured by a mobile device into individual characters to facilitate OCR kernel processing. Before character segmentation, text detection and text line construction need to be performed. A novel character segmentation method which integrates touched character filters is employed on text images captured by cameras. In addition, periphery features are extracted from the segmented images of touched characters and fed as inputs to support vector machines to calculate confidence values. In our experiment, the accuracy rate of the proposed character segmentation system is 94.90%, which demonstrates the effectiveness of the proposed method.
