首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
2.
Text line segmentation in handwritten documents is an important task in the recognition of historical documents. Handwritten document images contain text lines with multiple orientations, touching and overlapping characters between consecutive text lines and different document structures, making line segmentation a difficult task. In this paper, we present a new approach for handwritten text line segmentation solving the problems of touching components, curvilinear text lines and horizontally overlapping components. The proposed algorithm formulates line segmentation as finding the central path in the area between two consecutive lines. This is solved as a graph traversal problem. A graph is constructed using the skeleton of the image. Then, a path-finding algorithm is used to find the optimum path between text lines. The proposed algorithm has been evaluated on a comprehensive dataset consisting of five databases: ICDAR2009, ICDAR2013, UMD, the George Washington and the Barcelona Marriages Database. The proposed method outperforms the state-of-the-art considering the different types and difficulties of the benchmarking data.  相似文献   

3.
4.
目的 手写文本行提取是文档图像处理中的重要基础步骤,对于无约束手写文本图像,文本行都会有不同程度的倾斜、弯曲、交叉、粘连等问题。利用传统的几何分割或聚类的方法往往无法保证文本行边缘的精确分割。针对这些问题提出一种基于文本行回归-聚类联合框架的手写文本行提取方法。方法 首先,采用各向异性高斯滤波器组对图像进行多尺度、多方向分析,利用拖尾效应检测脊形结构提取文本行主体区域,并对其骨架化得到文本行回归模型。然后,以连通域为基本图像单元建立超像素表示,为实现超像素的聚类,建立了像素-超像素-文本行关联层级随机场模型,利用能量函数优化的方法实现超像素的聚类与所属文本行标注。在此基础上,检测出所有的行间粘连字符块,采用基于回归线的k-means聚类算法由回归模型引导粘连字符像素聚类,实现粘连字符分割与所属文本行标注。最后,利用文本行标签开关实现了文本行像素的操控显示与定向提取,而不再需要几何分割。结果 在HIT-MW脱机手写中文文档数据集上进行文本行提取测试,检测率DR为99.83%,识别准确率RA为99.92%。结论 实验表明,提出的文本行回归-聚类联合分析框架相比于传统的分段投影分析、最小生成树聚类、Seam Carving等方法提高了文本行边缘的可控性与分割精度。在高效手写文本行提取的同时,最大程度地避免了相邻文本行的干扰,具有较高的准确率和鲁棒性。  相似文献   

5.
In this paper, we present a segmentation methodology of handwritten documents in their distinct entities, namely, text lines and words. Text line segmentation is achieved by applying Hough transform on a subset of the document image connected components. A post-processing step includes the correction of possible false alarms, the detection of text lines that Hough transform failed to create and finally the efficient separation of vertically connected characters using a novel method based on skeletonization. Word segmentation is addressed as a two class problem. The distances between adjacent overlapped components in a text line are calculated using the combination of two distance metrics and each of them is categorized either as an inter- or an intra-word distance in a Gaussian mixture modeling framework. The performance of the proposed methodology is based on a consistent and concrete evaluation methodology that uses suitable performance measures in order to compare the text line segmentation and word segmentation results against the corresponding ground truth annotation. The efficiency of the proposed methodology is demonstrated by experimentation conducted on two different datasets: (a) on the test set of the ICDAR2007 handwriting segmentation competition and (b) on a set of historical handwritten documents.  相似文献   

6.
In this paper, we present an effective approach for grouping text lines in online handwritten Japanese documents by combining temporal and spatial information. With decision functions optimized by supervised learning, the approach has few artificial parameters and utilizes little prior knowledge. First, the strokes in the document are grouped into text line strings according to off-stroke distances. Each text line string, which may contain multiple lines, is segmented by optimizing a cost function trained by the minimum classification error (MCE) method. At the temporal merge stage, over-segmented text lines (caused by stroke classification errors) are merged with a support vector machine (SVM) classifier for making merge/non-merge decisions. Last, a spatial merge module corrects the segmentation errors caused by delayed strokes. Misclassified text/non-text strokes (stroke type classification precedes text line grouping) can be corrected at the temporal merge stage. To evaluate the performance of text line grouping, we provide a set of performance metrics for evaluating from multiple aspects. In experiments on a large number of free form documents in the Tokyo University of Agriculture and Technology (TUAT) Kondate database, the proposed approach achieves the entity detection metric (EDM) rate of 0.8992 and the edit-distance rate (EDR) of 0.1114. For grouping of pure text strokes, the performance reaches EDM of 0.9591 and EDR of 0.0669.  相似文献   

7.
Variations in inter-line gaps and skewed or curled text-lines are some of the challenging issues in segmentation of handwritten text-lines. Moreover, overlapping and touching text-lines that frequently appear in unconstrained handwritten text documents significantly increase segmentation complexities. In this paper, we propose a novel approach for unconstrained handwritten text-line segmentation. A new painting technique is employed to smear the foreground portion of the document image. The painting technique enhances the separability between the foreground and background portions enabling easy detection of text-lines. A dilation operation is employed on the foreground portion of the painted image to obtain a single component for each text-line. Thinning of the background portion of the dilated image and subsequently some trimming operations are performed to obtain a number of separating lines, called candidate line separators. By using the starting and ending points of the candidate line separators and analyzing the distances among them, related candidate line separators are connected to obtain segmented text-lines. Furthermore, the problems of overlapping and touching components are addressed using some novel techniques. We tested the proposed scheme on text-pages of English, French, German, Greek, Persian, Oriya and Bangla and remarkable results were obtained.  相似文献   

8.
康厚良  杨玉婷 《图学学报》2022,43(5):865-874
以卷积神经网络(CNN)为代表的深度学习技术在图像分类和识别领域表现出了非常优异的性能。但东巴象形文字未有标准、公开的数据集,无法借鉴或使用已有的深度学习算法。为了快速建立权威、有效的东巴文字库,分析已出版东巴文档的版面结构,从文档中提取文本行、东巴字成为了当前的首要任务。因此,结合东巴象形文字文档图像的结构特点,给出了东巴文档图像的文本行自动分割算法。首先利用基于密度和距离的k-均值聚类算法确定了文本行的分类数量和分类标准;然后,通过文字块的二次处理矫正了分割中的错误结果,提高了算法的准确率。在充分利用东巴字文档结构特征的同时,保留了机器学习模型客观、无主观经验影响的优势。通过实验表明,该算法可用于东巴文档图像、脱机手写汉字、东巴经的文本行分割,以及文本行中东巴字和汉字的分割,具有实现简单、准确性高、适应性强的特点,从而为东巴文字库的建立奠定基础。  相似文献   

9.
10.
Two novel approaches to extract text lines and words from handwritten document are presented. The line segmentation algorithm is based on locating the optimal succession of text and gap areas within vertical zones by applying Viterbi algorithm. Then, a text-line separator drawing technique is applied and finally the connected components are assigned to text lines. Word segmentation is based on a gap metric that exploits the objective function of a soft-margin linear SVM that separates successive connected components. The algorithms tested on the benchmarking datasets of ICDAR07 handwriting segmentation contest and outperformed the participating algorithms.  相似文献   

11.
This paper presents a new Bayesian-based method of unconstrained handwritten offline Chinese text line recognition. In this method, a sample of a real character or non-character in realistic handwritten text lines is jointly recognized by a traditional isolated character recognizer and a character verifier, which requires just a moderate number of handwritten text lines for training. To improve its ability to distinguish between real characters and non-characters, the isolated character recognizer is negatively trained using a linear discriminant analysis (LDA)-based strategy, which employs the outputs of a traditional MQDF classifier and the LDA transform to re-compute the posterior probability of isolated character recognition. In tests with 383 text lines in HIT-MW database, the proposed method achieved the character-level recognition rates of 71.37% without any language model, and 80.15% with a bi-gram language model, respectively. These promising results have shown the effectiveness of the proposed method for unconstrained handwritten offline Chinese text line recognition.  相似文献   

12.
Learning a proper distance metric is an important problem in document classification, because the similarities of samples in many problems are usually measured by distance metric. In this paper, we address the nonlinear metric leaning problem with applying in the document classification. First, we propose a new representation about nonlinear metric by using a linear combination of some basic kernels. Second, we give a linear metric learning method by a triplet constraint and k-nearest neighbors, and then we develop it to a nonlinear method based on multiple kernel by above nonlinear metric. Further, the corresponding problem can be rewritten as an unconstrained optimization problem on positive definite matrices groups. At last, to ensure the learned distance matrix must be a positive definite matrix, we provide an improved intrinsic steepest descent algorithm with adaptive step-size to solve this unconstrained optimization. The experimental results show that our proposed method is effective on some document classification problems.  相似文献   

13.
针对脱机手写维吾尔文本行图像中单词切分问题,提出了FCM融合K-means的聚类算法。通过该算法得到单词内距离和单词间距离两种分类。以聚类结果为依据,对文字区域进行合并,得到切分点,再对切分点内的文字进行连通域标注,进行着色处理。以50幅不同的人书写的维吾尔脱机手写文本图像为实验对象,共有536行和4?002个单词,正确切分率达到80.68%。实验结果表明,该方法解决了手写维吾尔文在切分过程中,单词间距离不规律带来的切分困难的问题和一些单词间重叠的问题。同时实现了大篇幅手写文本图像的整体处理。  相似文献   

14.
各种文档中经常包含有各种特殊作用的横线、手划线等,当这些文档通过扫描等数字化方式存入计算机并需要进一步识别处理成文字编码时,这些线条却成为OCR的干扰因素,降低了文档内容的识别率.为此,本文提出一种新的文档干扰线去除算法,先将文档图像二值化,二值化过程考虑了不均匀光照带来的影响;然后将前景细化为单像素,减少线条粗细造成的影响;接着通过一种改进的贪婪算法计算横、竖两个方向线段的权重,判断权重较高的线段为干扰线;最后通过与干扰线距离的大小判断图像中每个前景像素的归属,从而获得一个完整的文档恢复图.仿真实验表明,本文提出的算法能够有效去除干扰线,特别在干扰线与文字粘连的情况下,去除干扰线的同时较少地影响文档图像的质量,且具有较高的计算速度和较好的去除效果,为图像进一步OCR识别提供了良好的基础.  相似文献   

15.
16.
手写文档的非结构化,导致对手写文档的编辑很困难。文本行是手写文档中一个显著的结构,它的可靠提取对于更高级别结构化文档(图形与文字分离,段结构的提取,文字的提取)及编辑文档非常重要。目前关于手写文档的结构化,分为联机和脱机两种。使用联机算法提取文本行,然后讨论文本行的提取对手势设计的影响。  相似文献   

17.
A boosted tree classifier is proposed to segment machine printed, handwritten and overlapping text from documents with handwritten annotations. Each node of the tree-structured classifier is a binary weak learner. Unlike a standard decision tree (DT) which only considers a subset of training data at each node and is susceptible to over-fitting, we boost the tree using all available training data at each node with different weights. The proposed method is evaluated on a set of machine-printed documents which have been annotated by multiple writers in an office/collaborative environment. The experimental results show that the proposed algorithm outperforms other methods on an imbalanced data set.  相似文献   

18.
Automatic character recognition and image understanding of a given paper document are the main objectives of the computer vision field. For these problems, a basic step is to isolate characters and group words from these isolated characters. In this paper, we propose a new method for extracting characters from a mixed text/graphic machine-printed document and an algorithm for distinguishing words from the isolated characters. For extracting characters, we exploit several features (size, elongation, and density) of characters and propose a characteristic value for classification using the run-length frequency of the image component. In the context of word grouping, previous works have largely been concerned with words which are placed on a horizontal or vertical line. Our word grouping algorithm can group words which are on inclined lines, intersecting lines, and even curved lines. To do this, we introduce the 3D neighborhood graph model which is very useful and efficient for character classification and word grouping. In the 3D neighborhood graph model, each connected component of a text image segment is mapped onto 3D space according to the area of the bounding box and positional information from the document. We conducted tests with more than 20 English documents and more than ten oriental documents scanned from books, brochures, and magazines. Experimental results show that more than 95% of words are successfully extracted from general documents, even in very complicated oriental documents. Received August 3, 2001 / Accepted August 8, 2001  相似文献   

19.
Offline handwritten Amharic word recognition   总被引:1,自引:0,他引:1  
This paper describes two approaches for Amharic word recognition in unconstrained handwritten text using HMMs. The first approach builds word models from concatenated features of constituent characters and in the second method HMMs of constituent characters are concatenated to form word model. In both cases, the features used for training and recognition are a set of primitive strokes and their spatial relationships. The recognition system does not require segmentation of characters but requires text line detection and extraction of structural features, which is done by making use of direction field tensor. The performance of the recognition system is tested by a dataset of unconstrained handwritten documents collected from various sources, and promising results are obtained.  相似文献   

20.
Despite several decades of research in document analysis, recognition of unconstrained handwritten documents is still considered a challenging task. Previous research in this area has shown that word recognizers perform adequately on constrained handwritten documents which typically use a restricted vocabulary (lexicon). But in the case of unconstrained handwritten documents, state-of-the-art word recognition accuracy is still below the acceptable limits. The objective of this research is to improve word recognition accuracy on unconstrained handwritten documents by applying a post-processing or OCR correction technique to the word recognition output. In this paper, we present two different methods for this purpose. First, we describe a lexicon reduction-based method by topic categorization of handwritten documents which is used to generate smaller topic-specific lexicons for improving the recognition accuracy. Second, we describe a method which uses topic-specific language models and a maximum-entropy based topic categorization model to refine the recognition output. We present the relative merits of each of these methods and report results on the publicly available IAM database.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号