Similar Documents
20 similar documents found (search time: 0 ms)
1.
In this paper, we present an accelerated system for segmenting flower images based on the graph-cut technique, which formulates segmentation as the minimization of an energy function. The contribution of this paper is an improvement of the classical energy function, which is composed of a data-consistent term and a boundary term: we integrate an additional data-consistent term based on a spatial prior, and we add gradient information to the boundary term. We then propose an automated coarse-to-fine segmentation method composed of two levels: coarse segmentation and fine segmentation. The coarse segmentation level is based on minimizing the proposed energy function; the fine segmentation then optimizes the energy function through the standard graph-cut technique. Experiments were performed on a subset of the Oxford flower database, and the results are compared to a reimplementation of the method of Nilsback et al. [1]. The evaluation shows that our method consumes less CPU time and achieves satisfactory accuracy compared with that method [1].
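The energy in this family of methods has the standard graph-cut form, a data-consistent term plus a boundary term. The sketch below is an illustration, not the authors' implementation: it minimizes that kind of energy exactly on a 1-D intensity signal by dynamic programming, with illustrative means, a gradient-aware boundary weight, and toy data. For 2-D images one would use max-flow/min-cut as in the paper.

```python
# Minimal illustration of a graph-cut style energy
#   E(x) = sum_i D_i(x_i) + sum_i V_i(x_i, x_{i+1})
# minimized exactly on a 1-D "image" by dynamic programming (Viterbi).
# All means, weights and data below are illustrative assumptions.

def segment_1d(pixels, fg_mean, bg_mean, lam=1.0):
    """Binary labeling (1 = foreground, 0 = background) of a 1-D signal.

    Data term: squared distance to the foreground/background mean.
    Boundary term: lam / (1 + |gradient|), i.e. cutting across a strong
    edge is cheap, mimicking a gradient-aware boundary term.
    """
    n = len(pixels)

    def data(i, lab):
        mean = fg_mean if lab else bg_mean
        return (pixels[i] - mean) ** 2

    def smooth(i, a, b):
        # transition penalty between pixel i-1 and pixel i
        if a == b:
            return 0.0
        return lam / (1.0 + abs(pixels[i] - pixels[i - 1]))

    cost = [data(0, 0), data(0, 1)]
    back = []
    for i in range(1, n):
        new, arg = [0.0, 0.0], [0, 0]
        for lab in (0, 1):
            cands = [cost[prev] + smooth(i, prev, lab) for prev in (0, 1)]
            prev = 0 if cands[0] <= cands[1] else 1
            new[lab] = cands[prev] + data(i, lab)
            arg[lab] = prev
        cost = new
        back.append(arg)

    lab = 0 if cost[0] <= cost[1] else 1   # best final label
    labels = [lab]
    for arg in reversed(back):             # backtrack
        lab = arg[lab]
        labels.append(lab)
    return labels[::-1]

signal = [0.1, 0.2, 0.15, 0.9, 0.95, 1.0, 0.85, 0.2, 0.1]
print(segment_1d(signal, fg_mean=0.9, bg_mean=0.15))
# -> [0, 0, 0, 1, 1, 1, 1, 0, 0]
```

The gradient term makes label changes cheapest exactly where the intensity jumps, which is the role the boundary term plays in the 2-D graph-cut formulation.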

2.
Multimedia Tools and Applications - An optimal contour segmentation for ultrasonic liver cyst image is presented through combining graph-based method with particle swarm optimization (PSO) in this...

3.
刘晓佩. Control and Decision, 2015, 30(11): 1987-1992

To address the difficulty of effectively segmenting text in complex scenes, a complex-scene text segmentation method is proposed. First, the simple linear iterative clustering (SLIC) algorithm partitions the original image into local regions, and a graph-cut model is built on the region adjacency graph. Then, Gaussian mixture models (GMMs) and a support vector machine (SVM) posterior-probability model are used to model the scene text, and the degree of match between each local region and the models is introduced to compute the likelihood energy. To strengthen the discriminative power of the GMMs, a model-performance descriptor is introduced into parameter learning so that model parameters are obtained adaptively. Experimental results show that the proposed algorithm handles complex-scene text segmentation well and clearly improves the text recognition rate.


4.
Based on the observation that, after segmentation, natural-scene images show a clear separation between signboards and background, a signboard text extraction algorithm based on border removal is proposed. First, the signboard region is located in the binarized image using edge detection and projection algorithms; then, the signboard text is extracted with a border-removal algorithm. Extensive experimental results show that the method can accurately locate and extract irregular signboard text.

5.
In multi-view reconstruction systems, the recovered point cloud often contains numerous unwanted background points. We propose a graph-cut based method for automatically segmenting point clouds from multi-view reconstruction. Based on the observation that the object of interest is likely to be central to the intended multi-view images, our method requires no user interaction beyond two roughly estimated parameters describing the extent of the object in the central area of the images. The proposed segmentation is carried out in two steps. First, we build a weighted graph whose nodes represent points and whose edges connect each point to its k-nearest neighbors. The potentials of each point being object or background are estimated according to the distances between its projections in the images and the corresponding image centers. The pairwise potentials between each point and its neighbors are computed from their positions, colors and normals. Graph-cut optimization is then used to find an initial binary segmentation into object and background points. Second, to refine the initial segmentation, Gaussian mixture models (GMMs) are created from the color and density features of points in the object and background classes, respectively. The potentials of each point being object or background are re-calculated based on the learned GMMs; the graph is updated and the segmentation is improved by graph-cut optimization. This second step is iterated until convergence. Our method requires no manually labeled points and employs the information about point clouds available from multi-view systems. We test the approach on real-world data generated by multi-view reconstruction systems.
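The first step's unary potentials come from how close each point's projections fall to the image centers. A minimal sketch of that idea follows, with a toy pinhole camera; the focal length, sigma, and all names are illustrative assumptions, not the authors' setup.

```python
import math

# Toy sketch of a projection-based "object" score for a reconstructed
# 3-D point: Gaussian falloff of the distance between the point's
# projection and each image center. Camera model and parameters are
# illustrative assumptions only.

def project(point, focal=1.0, center=(0.5, 0.5)):
    """Project a 3-D point (x, y, z), z > 0, with a toy pinhole camera."""
    x, y, z = point
    return (focal * x / z + center[0], focal * y / z + center[1])

def object_score(point, image_centers=((0.5, 0.5),), sigma=0.2):
    """Average Gaussian falloff of projection distance to the image centers."""
    u, v = project(point)
    total = 0.0
    for cx, cy in image_centers:
        d2 = (u - cx) ** 2 + (v - cy) ** 2
        total += math.exp(-d2 / (2 * sigma ** 2))
    return total / len(image_centers)

# A point on the optical axis projects to the image center (score 1.0);
# an off-axis background point scores much lower.
print(object_score((0.0, 0.0, 2.0)))
print(object_score((2.0, 0.0, 2.0)))
```

In the actual method these scores would seed the terminal weights of the graph before the first graph-cut pass, and would be replaced by GMM likelihoods in the refinement iterations.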

6.
Objective: Scene text detection is an important task in scene understanding and text recognition. Although deep-learning-based algorithms have significantly improved detection accuracy, existing methods extract the local semantics of characters and the global semantics across text instances insufficiently, so multi-level text semantics are not modeled and detection accuracy suffers. To address this, a scene text detection algorithm with hierarchical semantic fusion is proposed. Method: The method comprises a text-segment-based local semantic understanding module and a text-instance-based global semantic understanding module, which guide the network to attend to multi-level semantic information within characters and across text instances, respectively. First, the local module divides text into segments by relative position and, supervised by a fine-grained optimization target, enhances the network's perception of local semantics. Then, the global module uses the coarse segmentation of text segments to filter background regions and extract reliable text-region features, and adaptively captures the global semantics of arbitrarily shaped text through an attention mechanism to obtain the final segmentation. In addition, to reduce the interference of prediction noise in boundary regions on the aggregation of hierarchical semantics, a boundary-aware loss is proposed to reduce the ambiguity of boundary-region features. Results: The algorithm is evaluated on three common scene text detection datasets and compared with other methods, showing significant gains: on Total-Text, the F-measure is 87.0%, a 1.0% improvement over other models; on MSRA-TD500 (MSRA text detection 500 database), the F-measure is 88.2%, a 1.0% improvement; on ICDAR 2015 (International Conference on Document Analysis and Recognition), the F-measure is 87.0%. Conclusion: By building semantic context at different levels and adding an extra penalty on ambiguous features, the proposed model resolves insufficient hierarchical semantic extraction and achieves higher detection accuracy.

7.
International Journal on Document Analysis and Recognition (IJDAR) - Recently scene text detection has become a hot research topic. Arbitrary-shaped text detection is more challenging due to the...

8.
席志红, 韩双全, 王洪旭. Journal of Computer Applications, 2019, 39(10): 2847-2851
To address the problem that dynamic objects degrade pose estimation in indoor simultaneous localization and mapping (SLAM) systems, a semantic-segmentation-based SLAM system for dynamic scenes is proposed. After the camera captures an image, PSPNet (Pyramid Scene Parsing Network) first performs semantic segmentation; image feature points are then extracted, feature points lying on dynamic objects are discarded, and the camera pose is estimated from the static feature points; finally, a semantic point-cloud map and a semantic octree map are built. Repeated comparative tests on five dynamic sequences from public datasets show that, relative to a SLAM system using the SegNet network, the standard deviation of the absolute trajectory error of the proposed system decreases by 6.9%-89.8%, and the standard deviations of translational and rotational drift improve by up to 73.61% and 72.90%, respectively, in highly dynamic scenes. The results show that the improved system markedly reduces pose-estimation error in dynamic scenes and estimates the camera pose accurately.

9.
Saliency text consists of characters arranged with high visibility and expressivity. It also contains important clues for video analysis, indexing, and retrieval. Thus, in order to localize saliency text, a critical stage is to collect key points from real text pixels. In this paper, we propose an evidence-based model of saliency feature extraction (SFE) to probe saliency text points (STPs), which have a strong text signal structure across multiple observations simultaneously and always appear between text and its background. Through the multi-observations, each signal structure with rhythms of signal segments is extracted at every location in the visual field. This provides the sources of evidence for our evidence-based model, where evidences are measured to effectively estimate the degrees of plausibility for obtaining the STPs. Evaluation results on benchmark datasets demonstrate that our proposed approach achieves state-of-the-art performance in exploring real text pixels and significantly outperforms existing algorithms for detecting text candidates. The STPs can serve as extremely reliable text candidates for future text detectors.

10.
Due to the exponential growth of documents on the Internet and the emergent need to organize them, the automated categorization of documents into predefined labels has received ever-increasing attention in recent years. A wide range of supervised learning algorithms has been introduced to deal with text classification. Among these classifiers, K-Nearest Neighbors (KNN) is widely used in the text categorization community because of its simplicity and efficiency. However, KNN still suffers from inductive biases or model misfits that result from its assumptions, such as the presumption that training data are evenly distributed among all categories. In this paper, we propose a new refinement strategy for the KNN classifier, which we call DragPushing. Experiments on three benchmark evaluation collections show that DragPushing achieves a significant improvement in the performance of the KNN classifier.
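For context, the following is a minimal bag-of-words cosine-similarity KNN text classifier of the kind being refined; the DragPushing strategy itself is not reproduced here, and all training data and names are illustrative.

```python
from collections import Counter
import math

# Minimal KNN text categorization: bag-of-words vectors, cosine
# similarity, majority vote among the k nearest training documents.
# Toy data; not the DragPushing refinement described in the abstract.

def bow(text):
    """Bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def knn_predict(train, text, k=3):
    """train: list of (text, label) pairs. Majority vote among k nearest."""
    sims = sorted(((cosine(bow(t), bow(text)), lab) for t, lab in train),
                  reverse=True)
    votes = Counter(lab for _, lab in sims[:k])
    return votes.most_common(1)[0][0]

train = [
    ("stock market shares rally", "finance"),
    ("bank interest rates rise", "finance"),
    ("team wins league match", "sports"),
    ("player scores winning goal", "sports"),
]
print(knn_predict(train, "shares rise on the stock market", k=3))
# -> finance
```

The skew the paper targets is visible even here: if one category dominates the training set, it dominates the vote regardless of similarity, which is the kind of bias a refinement strategy must correct.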

11.
Objective: Current convolutional neural network (CNN) based text detection methods struggle to localize small-scale text in natural scenes. However, text in natural-scene images is strongly associated with other objects: it usually appears together with specific objects such as billboards and road signs. Based on this, a cascaded CNN scene text detection method that accounts for object association is proposed. Method: First, a CNN detects both text and text-bearing associated objects, yielding text candidate boxes and candidate boxes of text-bearing objects. The regions of the text-bearing object candidates are then enlarged and cropped from the original image, and the cropped images are fed to the CNN again to detect text candidates more precisely. Finally, non-maximum suppression merges the text candidates from the two steps to obtain the detection result. Results: The method detects small-scale text effectively, achieving recall, precision, and F-measure of 0.817, 0.880, and 0.847, respectively, on the ICDAR-2013 dataset. Conclusion: By exploiting the strong association between text and text-bearing objects in natural scenes, the method improves the recall of small-scale text detection in natural-scene images.

12.
Error measures for scene segmentation
Scene segmentation is an important problem in pattern recognition. Current subjective methods for evaluation and comparison of scene segmentation techniques are inadequate and objective quantitative measures are desirable. Two error measures, the percentage area misclassified (p) and a new pixel distance error (ε) were defined and evaluated in terms of their correlation with human observation for comparison of multiple segmentations of the same scene and multiple scenes segmented by the same technique. The results indicate that both these measures can be helpful in the evaluation and comparison of scene segmentation procedures.
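The percentage-area-misclassified measure p can be sketched directly; the pixel distance error ε, which additionally weights misclassified pixels by their distance to the correctly labeled region, is omitted here. Data is illustrative.

```python
# Sketch of the percentage-area-misclassified measure p: the fraction of
# pixels whose segment label disagrees with the reference segmentation,
# expressed as a percentage. (The pixel distance error epsilon is not
# shown.) Toy labelings below are illustrative.

def percent_misclassified(reference, segmented):
    """reference, segmented: equal-size 2-D lists of class labels."""
    total = wrong = 0
    for ref_row, seg_row in zip(reference, segmented):
        for r, s in zip(ref_row, seg_row):
            total += 1
            wrong += (r != s)
    return 100.0 * wrong / total

ref = [[0, 0, 1, 1],
       [0, 0, 1, 1]]
seg = [[0, 0, 1, 1],
       [0, 1, 1, 0]]
print(percent_misclassified(ref, seg))  # 2 of 8 pixels differ -> 25.0
```

A limitation the paper's ε addresses is that p is blind to where the errors lie: a misclassified pixel deep inside the wrong region counts the same as one adjacent to the true boundary.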

13.
Standard methods of image segmentation do not take into account the three-dimensional nature of the underlying scene. For example, histogram-based segmentation tacitly assumes that the image intensity is piecewise constant, which is not true when the scene contains curved surfaces. This paper introduces a method of taking 3D information into account in the segmentation process. The image intensities are adjusted to compensate for the effects of estimated surface orientation; the adjusted intensities can be regarded as reflectivity estimates. When histogram-based segmentation is applied to these new values, the image is segmented into parts corresponding to surfaces of constant reflectivity in the scene.
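The compensation idea can be sketched as dividing out an estimated Lambertian shading factor before thresholding. In this sketch the orientation estimates are given rather than computed, and all numbers are illustrative, not from the paper.

```python
# Sketch: a Lambertian surface has intensity I = R * cos(theta), so
# dividing by the estimated cos(theta) recovers a reflectivity estimate R.
# A simple fixed histogram threshold then segments by reflectivity.
# Orientations and reflectances here are assumed, not estimated.

def reflectivity(intensity, cos_theta):
    return [i / c for i, c in zip(intensity, cos_theta)]

def threshold_segment(values, t):
    return [1 if v > t else 0 for v in values]

# One curved matte surface (reflectance 0.8) next to a darker flat one
# (reflectance 0.3): the curved surface's raw intensity varies with its
# orientation, so raw thresholding splits it in two.
intensity = [0.80, 0.69, 0.40, 0.30, 0.30]
cos_theta = [1.00, 0.86, 0.50, 1.00, 1.00]

raw = threshold_segment(intensity, 0.5)
adj = threshold_segment(reflectivity(intensity, cos_theta), 0.5)
print(raw)  # [1, 1, 0, 0, 0] -- curved surface broken in two
print(adj)  # [1, 1, 1, 0, 0] -- constant-reflectivity regions recovered
```

After the adjustment, the histogram of the values is again bimodal, which is exactly the piecewise-constant assumption the standard method needs.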

14.
牛钦. Computer Era, 2021, (6): 19-21, 25
Scene text detection is a major research direction in computer vision. This article surveys applications of deep learning to scene text detection in recent years, including a description of the problems in scene text image detection, a classification and analysis of recent scene text detection algorithms, and an introduction to scene text detection datasets. Finally, future trends in scene text detection are summarized and discussed.

15.
Image segmentation is a challenging problem in computer vision with wide application. It is a process that applies a similarity criterion to separate an image into homogeneous connected regions. First, an Optimized Adaptive Connectivity and Shape Prior in Modified Graph Cut Segmentation method is applied to handle structural irregularities in images. Second, an Optimized Adaptive Connectivity and Shape Prior in Modified Fuzzy Graph Cut Segmentation (Opac-MFGseg) method is proposed to partition images based on feature values. In this method, a fuzzy rule-based system is used with an optimization algorithm to indicate how strongly a specific feature is involved in image boundaries. The graph obtained from this fuzzy approach is then used with the adaptive shape prior in the modified graph-cuts framework. Moreover, the method supports moving images (videos): for this case, a fully dynamic method called Optimized Adaptive Connectivity and Shape Prior in Dynamic Fuzzy Graph Cut Segmentation (Opac-DFGseg) is proposed. The effectiveness of the Opac-MFGseg and Opac-DFGseg methods is tested in terms of average sensitivity, precision, area overlap measure, relative error, accuracy, and computation time.

16.
The capability of extracting and recognizing characters printed in color documents will widen immensely the applications of OCR systems. This paper describes a new method of color segmentation to extract character areas from a color document. At first glance, the characters seem to be printed in a single color, but actual measurements reveal that the color image has a distribution of components. Compared with clustering algorithms, our method prevents oversegmentation and fusion with the background while maintaining real-time usability. It extracts the representative colors based on a histogram analysis of the color space. Our method also contains a selective local color averaging technique that removes the problem of mesh noise on high-resolution color images. Received: 25 July 2003, Revised: 10 August 2003, Published online: 6 February 2004. Correspondence to: Hiroyuki Hase. Current address: 3-9-1 Bunkyo, Fukui-shi 910-8507, Japan.

17.
Text recognition in natural scene images is a challenging task that has recently been garnering increased research attention. In this paper, we propose a method for recognizing text by utilizing the layout consistency of a text string. We estimate the layout (four lines of a text string) using initial character extraction and recognition result. On the basis of the layout consistency across a word, we perform character extraction and recognition again using four lines, which is more accurate than the first process. Our layout estimation method is different from previous methods in terms of exploiting character recognition results and its use of a class-conditional layout model. More accurate and robust estimation is achieved, and it can be used to refine character extraction and recognition. We call this two-way process—from extraction and recognition to layout, and from layout to extraction and recognition—“bidirectional” to discriminate it from previous feedback refinement approaches. Experimental results demonstrate that our bidirectional processes provide a boost to the performance of word recognition.

18.
Automatic text segmentation and text recognition for video indexing
Efficient indexing and retrieval of digital video is an important function of video databases. One powerful index for retrieval is the text appearing in them. It enables content-based browsing. We present our new methods for automatic segmentation of text in digital videos. The algorithms we propose make use of typical characteristics of text in videos in order to enable and enhance segmentation performance. The unique features of our approach are the tracking of characters and words over their complete duration of occurrence in a video and the integration of the multiple bitmaps of a character over time into a single bitmap. The output of the text segmentation step is then directly passed to a standard OCR software package in order to translate the segmented text into ASCII. Also, a straightforward indexing and retrieval scheme is introduced. It is used in the experiments to demonstrate that the proposed text segmentation algorithms together with existing text recognition algorithms are suitable for indexing and retrieval of relevant video sequences in and from a video database. Our experimental results are very encouraging and suggest that these algorithms can be used in video retrieval applications as well as to recognize higher level semantics in videos.
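The bitmap-integration idea, averaging a character's binarized bitmaps over its duration of occurrence and re-thresholding, can be sketched as follows. The 3x3 bitmaps are toy data, not the authors' implementation.

```python
# Sketch of temporal bitmap integration: the same character is visible
# for many frames, so averaging its noisy binarized bitmaps over time and
# re-thresholding yields a cleaner single bitmap to hand to OCR.

def integrate_bitmaps(frames, threshold=0.5):
    """frames: list of equal-size binary bitmaps (2-D lists of 0/1)."""
    h, w = len(frames[0]), len(frames[0][0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            mean = sum(f[y][x] for f in frames) / len(frames)
            row.append(1 if mean >= threshold else 0)
        out.append(row)
    return out

# Three noisy observations of a vertical stroke; each frame has one flaw.
f1 = [[0, 1, 0],
      [0, 1, 0],
      [0, 0, 0]]   # bottom stroke pixel dropped
f2 = [[0, 1, 0],
      [0, 1, 0],
      [0, 1, 0]]   # clean
f3 = [[1, 1, 0],
      [0, 1, 0],
      [0, 1, 0]]   # spurious corner pixel
print(integrate_bitmaps([f1, f2, f3]))
# -> [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
```

Pixel-wise majority voting cancels independent per-frame noise, which is why integration over the full duration of occurrence helps downstream OCR.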

19.
Scene extraction is the first step toward semantic understanding of a video. It also provides improved browsing and retrieval facilities to users of video database. This paper presents an effective approach to movie scene extraction based on the analysis of background images. Our approach exploits the fact that shots belonging to one particular scene often have similar backgrounds. Although part of the video frame is covered by foreground objects, the background scene can still be reconstructed by a mosaic technique. The proposed scene extraction algorithm consists of two main components: determination of the shot similarity measure and a shot grouping process. In our approach, several low-level visual features are integrated to compute the similarity measure between two shots. On the other hand, the rules of film-making are used to guide the shot grouping process. Experimental results show that our approach is promising and outperforms some existing techniques.
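One plausible low-level component of such a shot-similarity measure is normalized histogram intersection between key frames. The sketch below shows only that single feature on toy gray-level data; the paper itself integrates several features, and all values here are illustrative.

```python
# Sketch of one candidate low-level shot similarity: normalized gray-level
# histogram intersection between key frames of two shots. Toy data;
# illustrative of the idea, not the paper's full measure.

def histogram(pixels, bins=4, vmax=256):
    """Normalized histogram of gray levels in [0, vmax)."""
    h = [0] * bins
    for p in pixels:
        h[min(p * bins // vmax, bins - 1)] += 1
    total = len(pixels)
    return [c / total for c in h]

def intersection(h1, h2):
    """Histogram intersection in [0, 1]; 1 means identical distributions."""
    return sum(min(a, b) for a, b in zip(h1, h2))

shot_a = [10, 20, 30, 200, 210, 220]    # key-frame gray levels (toy data)
shot_b = [15, 25, 35, 205, 215, 225]    # similar background distribution
shot_c = [100, 110, 120, 130, 140, 150]  # different scene

print(intersection(histogram(shot_a), histogram(shot_b)))  # high
print(intersection(histogram(shot_a), histogram(shot_c)))  # low
```

In a shot-grouping pass, pairs whose similarity exceeds a threshold would be candidates for belonging to the same scene, subject to the film-making rules the paper adds on top.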

20.
Recently, segmentation-based scene text detection has drawn wide research interest due to its flexibility in describing scene text instances of arbitrary shapes, such as curved text. However, existing methods usually need complex post-processing stages to handle ambiguous labels, i.e., the labels of pixels near the text boundary, which may belong to the text or the background. In this paper, we present a framework for segmentation-based scene text detection that learns from ambiguous labels. We use label distribution learning to handle the label ambiguity of text annotation, which achieves good performance without an additional post-processing stage. Experiments on benchmark datasets demonstrate that our method produces better results than state-of-the-art methods for segmentation-based scene text detection.
