1.

Word searching in unconstrained layout using character pair coding

Partha Pratim Roy Umapada Pal Josep Lladós 《International Journal on Document Analysis and Recognition》2014,17(4):343-358

Word searching in non-structural layouts such as graphical documents is a difficult task due to the arbitrary orientations of text words and the presence of graphical symbols. This paper presents an approach for word searching in documents of non-structural layout based on efficient indexing and retrieval. The proposed indexing scheme stores spatial information about the text characters of a document using a character spatial feature table (CSFT). The spatial feature of a text component is derived from its neighbor component information. The character labeling of multi-scaled and multi-oriented components is performed using support vector machines. For searching, the positional information of characters is obtained from the query string by splitting it into possible combinations of character pairs. Each of these character pairs is used to look up the position of the corresponding text in the document with the help of the CSFT. Next, the retrieved text components are joined into sequences by spatial information matching. A string matching algorithm is then applied to match the query word with the character pair sequences in the document. Experimental results are presented on two different datasets of graphical documents: a maps dataset and a seal/logo image dataset. The results show that the method is efficient at searching for query words in unconstrained document layouts of arbitrary orientation.
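The character-pair lookup described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the table layout, the use of adjacent pairs only, the distance threshold `max_dist`, and the function names are all assumptions made for illustration, and the toy table stands in for the CSFT only loosely.

```python
from collections import defaultdict


def character_pairs(query):
    """Split a query word into its adjacent character pairs, in order."""
    return [(query[i], query[i + 1]) for i in range(len(query) - 1)]


def build_pair_table(components, max_dist):
    """Index ordered pairs of labeled components that lie close together.

    components: list of (label, x, y) tuples, one per recognized character.
    Returns a dict mapping (label_a, label_b) -> list of (index_a, index_b).
    This is a hypothetical stand-in for a spatial feature table.
    """
    table = defaultdict(list)
    for i, (a, ax, ay) in enumerate(components):
        for j, (b, bx, by) in enumerate(components):
            if i != j and (ax - bx) ** 2 + (ay - by) ** 2 <= max_dist ** 2:
                table[(a, b)].append((i, j))
    return table


def candidate_hits(query, table):
    """Look up every character pair of the query in the pair table."""
    return {pair: table.get(pair, []) for pair in character_pairs(query)}


# Three characters laid out on a line, 10 units apart.
components = [("M", 0, 0), ("A", 10, 0), ("P", 20, 0)]
table = build_pair_table(components, max_dist=12)
hits = candidate_hits("MAP", table)
```

Chaining hits whose indices overlap (the second component of one pair being the first of the next) would give candidate character sequences to pass to a string matcher.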

2.

Language identification for handwritten document images using a shape codebook

Guangyu Zhu Xiaodong Yu Yi Li David Doermann 《Pattern recognition》2009,42(12):3184-3191

3.

Multi-oriented Bangla and Devnagari text recognition

Umapada Pal Partha Pratim Roy Nilamadhaba Tripathy Josep Lladós 《Pattern recognition》2010,43(12):4124-4136

4.

A Character Flow Framework for Multi-Oriented Scene Text Detection

Wen-Jun Yang Bei-Ji Zou Kai-Wen Li Shu Liu 《Journal of Computer Science and Technology》2021,36(3):465-477

Scene text detection plays a significant role in various applications, such as object recognition, document management, and visual navigation. Instance-segmentation-based methods have been widely used in existing research because of their advantages in dealing with multi-oriented text. However, the labels used during model training contain a large number of non-text pixels, which leads to text mis-segmentation. In this paper, we propose a novel multi-oriented scene text detection framework that includes two main modules: character instance segmentation (one instance corresponds to one character) and character flow construction (one character flow corresponds to one word). We use a feature pyramid network (FPN) to predict character and non-character instances with arbitrary directions. A joint network of FPN and bidirectional long short-term memory (BLSTM) is developed to explore the context information among isolated characters, which are finally grouped into character flows. Extensive experiments are conducted on the ICDAR2013, ICDAR2015, MSRA-TD500 and MLT datasets to demonstrate the effectiveness of our approach. The F-measures are 92.62%, 88.02%, 83.69% and 77.81%, respectively.
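For readers unfamiliar with the metric reported here, the F-measure is conventionally the harmonic mean of precision and recall. The sketch below assumes that standard definition; the abstract does not state the precision and recall values behind its figures, so the inputs in the usage line are purely illustrative.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (the usual F1 definition)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when nothing is detected
    return 2 * precision * recall / (precision + recall)


# Illustrative inputs only, not values taken from any paper listed here.
example = f_measure(0.9, 0.6)
```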

5.

Multi-oriented touching text character segmentation in graphical documents using dynamic programming

Partha Pratim Roy Umapada Pal Josep Lladós Mathieu Delalandre 《Pattern recognition》2012,45(5):1972-1983

The touching character segmentation problem becomes complex when touching strings are multi-oriented. Moreover, in graphical documents, the characters in a single touching string sometimes have different orientations. Segmentation of such complex touching strings is more challenging. In this paper, we present a scheme for segmenting English multi-oriented touching strings into individual characters. When two or more characters touch, they generate a large cavity region in the background portion. Using convex hull information, we first use this background information to find initial points for segmenting a touching string into possible primitives (a primitive consists of a single character or part of a character). Next, the primitives are merged to obtain the optimum segmentation. A dynamic programming algorithm is applied for this purpose, using the total likelihood of characters as the objective function. An SVM classifier is used to find the likelihood of a character. To handle multi-oriented touching strings, the features used in the SVM are invariant to character orientation. Experiments were performed on different databases of real and synthetic touching characters, and the results show that the method is efficient in segmenting touching characters of arbitrary orientations and sizes.
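The primitive-merging step can be sketched as a small dynamic program. This is an illustration under stated assumptions rather than the paper's algorithm: the objective is taken to be the sum of log-likelihoods, `likelihood(start, end)` is a hypothetical callback standing in for the SVM score of the merged primitives `start..end-1`, and `max_merge` is an assumed cap on how many primitives may form one character.

```python
import math


def best_segmentation(n, likelihood, max_merge=3):
    """Group primitives 0..n-1 into consecutive characters.

    Returns a list of (start, end) half-open index ranges that maximizes
    the summed log-likelihood, or [] if no valid grouping exists.
    """
    best = [-math.inf] * (n + 1)  # best[k]: best score for the first k primitives
    back = [0] * (n + 1)          # back[k]: start index of the last group
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_merge), end):
            p = likelihood(start, end)
            if p <= 0:
                continue  # this merge cannot be a character
            score = best[start] + math.log(p)
            if score > best[end]:
                best[end] = score
                back[end] = start
    if best[n] == -math.inf:
        return []
    groups = []
    end = n
    while end > 0:  # walk the back-pointers from the end of the string
        groups.append((back[end], end))
        end = back[end]
    return groups[::-1]


# Toy likelihoods for three primitives; the values are made up.
toy = {(0, 1): 0.2, (1, 2): 0.3, (2, 3): 0.9,
       (0, 2): 0.8, (1, 3): 0.1, (0, 3): 0.05}
result = best_segmentation(3, lambda s, e: toy.get((s, e), 0.0))
```

In the toy case, merging the first two primitives into one character and keeping the third separate scores highest, so `result` is `[(0, 2), (2, 3)]`.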

6.

A direct and efficient approximation-based localization method for Chinese characters in natural scenes

Zhao Fan Zhang Lin Wen Zhiquan Yang Linlin Lin Guangfeng 《Computer Engineering and Applications》2021,57(6):159-167

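To improve the accuracy of classical object detection algorithms in natural scene text localization, and to overcome the incorrect segmentation of Chinese characters that traditional character detection models suffer because strokes are not connected, a direct and efficient approximation-based localization method for Chinese characters in natural scenes is proposed. The classical EAST algorithm is first used to detect text in the scene image. The initially detected text boxes are then adjusted so that they enclose the text more tightly and more completely; this adjustment consists of three parts: extraction of connected stroke components, Chinese character segmentation, and text shape approximation. Finally, the text region is rectified and the text content is recognized. Experimental results show that, while maintaining an average frame rate of 3.1 frames/s, the proposed algorithm achieves F-scores of 83.5%, 72.8%, and 81.1% on the text localization task for the three multi-oriented datasets ICDAR2015, ICDAR2017-MLT, and MSRA-TD500, respectively; ablation experiments verify the effectiveness of each module. Its performance on the combined detection and recognition evaluation task on the ICDAR2015 dataset also shows that the method outperforms some recent methods.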

7.

Rough-fuzzy clustering and multiresolution image analysis for text-graphics segmentation

《Applied Soft Computing》2015

8.

An intelligent character recognizer for Telugu scripts using multiresolution analysis and associative memory

Arun K. Pujari C. Dhanunjaya Naidu M. Sreenivasa Rao B. C. Jinaga 《Image and vision computing》2004,22(14):V278-1227

The present work is an attempt to develop a robust character recognizer for Telugu texts. We aim at designing a recognizer that exploits the inherent characteristics of the Telugu script. Our proposed method uses wavelet multi-resolution analysis for extracting features and an associative memory model to accomplish the recognition tasks. Our system learns the style and font from the document itself and then recognizes the remaining characters in the document. The major contributions of the present study can be outlined as follows. It is a robust OCR system for Telugu printed text. It avoids the feature extraction process and exploits the inherent characteristics of the Telugu character by a careful selection of the wavelet basis function, which extracts the invariant features of the characters. It has a Hopfield-based Dynamic Neural Network (DNN) for learning and recognition. This is important because it overcomes the inherent difficulties of memory limitation and spurious states in the Hopfield network. The DNN has been demonstrated to be efficient for associative memory recall. Although it is normally not suitable for image processing applications, the multi-resolution analysis reduces the sizes of the images enough to make the DNN applicable to the present domain. Our experiments show extremely promising results.

9.

Skew detection in document images based on rectangular active contour

Huijie Fan Linlin Zhu Yandong Tang 《International Journal on Document Analysis and Recognition》2010,13(4):261-269

Digitizing documents frequently produces images with small rotation angles. These skew angles degrade the performance of optical character recognition (OCR) tools. Therefore, skew detection plays an important role in automatic document analysis systems. In this paper, we propose a Rectangular Active Contour Model (RAC Model) for content region detection and skew angle calculation by imposing a rectangular shape constraint on the zero-level set in the Chan–Vese Model (C-V Model), according to the rectangular feature of content regions in document images. Our algorithm differs from other skew detection methods in that it does not rely on local image features. Instead, it uses global image features and a shape constraint to obtain strong robustness in detecting skew angles of document images. We experimented on different types of document images. Compared with other skew detection algorithms, our algorithm is more accurate in detecting the skew of complex document images with different fonts, tables, illustrations, and layouts. The original image does not need to be pre-processed, even if it is noisy, and the rectangular content region of the document image is detected at the same time.

10.

Texture sparseness for pixel classification of business document images

Melissa Cote Alexandra Branzan Albu 《International Journal on Document Analysis and Recognition》2014,17(3):257-273

11.

Signature Detection and Matching for Document Image Retrieval

Guangyu Zhu Yefeng Zheng Doermann D. Jaeger S. 《IEEE transactions on pattern analysis and machine intelligence》2009,31(11):2015-2031

As one of the most pervasive methods of individual identification and document authentication, signatures present convincing evidence and provide an important form of indexing for effective document image processing and retrieval in a broad range of applications. However, detection and segmentation of free-form objects such as signatures from cluttered background is currently an open document analysis problem. In this paper, we focus on two fundamental problems in signature-based document image retrieval. First, we propose a novel multiscale approach to jointly detecting and segmenting signatures from document images. Rather than focusing on local features that typically have large variations, our approach captures the structural saliency using a signature production model and computes the dynamic curvature of 2D contour fragments over multiple scales. This detection framework is general and computationally tractable. Second, we treat the problem of signature retrieval in the unconstrained setting of translation, scale, and rotation invariant nonrigid shape matching. We propose two novel measures of shape dissimilarity based on anisotropic scaling and registration residual error and present a supervised learning framework for combining complementary shape information from different dissimilarity metrics using LDA. We quantitatively study state-of-the-art shape representations, shape matching algorithms, measures of dissimilarity, and the use of multiple instances as query in document image retrieval. We further demonstrate our matching techniques in offline signature verification. Extensive experiments using large real-world collections of English and Arabic machine-printed and handwritten documents demonstrate the excellent performance of our approaches.

12.

Text detection in street level images

Jonathan Fabrizio Beatriz Marcotegui Matthieu Cord 《Pattern Analysis & Applications》2013,16(4):519-533

13.

Scene text detection based on polygon offset masks and boundary enhancement

Zhang Zhi Qin Yao Gu Jinguang 《Application Research of Computers》2021,38(8):2474-2478,2484

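Multi-oriented text detection methods have achieved good performance on various datasets, but detecting text of arbitrary shape remains difficult, especially for text instances of different sizes, shapes, orientations, colors, and styles. To better distinguish adjacent arbitrarily shaped text instances from the surrounding non-text regions, a segmentation-based text detector is proposed that detects arbitrarily shaped scene text using polygon offset masks and boundary enhancement. To evaluate the method, several groups of comparative experiments were carried out on public datasets such as ICDAR2015 and Total-Text, and the results show that the method achieves superior performance.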

14.

A neural-network-based oriented text detector for natural scenes

Zhou Boyan Yang Peng 《Computer & Digital Engineering》2020,48(1):163-166

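Scene text detection is an important step in scene text recognition and a challenging problem. Unlike general object detection, the main challenges of scene text detection are that text in natural scene images has arbitrary orientation, small size, and widely varying aspect ratios. Building on TextBoxes[8], this paper proposes a detector for arbitrarily oriented text, named OSTD (Oriented Scene Text Detector), which detects arbitrarily oriented text in natural scenes effectively and accurately. OSTD is evaluated on public datasets. All experimental results show that OSTD is highly competitive in both accuracy and real-time performance. On the ICDAR2015 Incidental Text dataset[16] at 1024×1024 resolution, OSTD achieves an F-measure of 0.794 at 10.7 FPS.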

15.

Text feature selection with a robust weight scheme and dynamic dimension reduction to text document clustering

《Expert systems with applications》2017

This paper proposes three feature selection algorithms with a feature weight scheme and dynamic dimension reduction for the text document clustering problem. Text document clustering is a new trend in text mining; in this process, text documents are separated into several coherent clusters according to carefully selected informative features by using a proper evaluation function, which usually depends on term frequency. Informative features in each document are selected using feature selection methods. Genetic algorithm (GA), harmony search (HS) algorithm, and particle swarm optimization (PSO) algorithm are the most successful feature selection methods established using a novel weighting scheme, namely, length feature weight (LFW), which depends on term frequency and the appearance of features in other documents. A new dynamic dimension reduction (DDR) method is also provided to reduce the number of features used in clustering and thus improve the performance of the algorithms. Finally, k-means, a popular clustering method, is used to cluster the set of text documents based on the terms (or features) obtained by dynamic reduction. Seven benchmark text datasets of different sizes and complexities are evaluated. Analysis with k-means shows that particle swarm optimization with length feature weight and dynamic reduction produces the best outcomes for almost all datasets tested. This paper provides new alternatives for the text mining community to cluster text documents by using cohesive and informative features.

16.

Logo and seal based administrative document image retrieval: A survey

《Computer Science Review》2016

With the advance of technology, business offices and organizations together with their clients create a massive amount of administrative documents every day. Administrative documents commonly contain salient entities such as logos, stamps or seals as the means of their authentication and proprietorship. These salient entities provide quite discriminative information, which can effectively be used for different tasks of document image retrieval, classification and recognition in document-based applications. Thus, proper detection/recognition of these entities in document images increases the performance of such applications in terms of document retrieval, classification, and recognition. To present the state-of-the-art research on the retrieval of administrative document images, this paper presents a survey of administrative document image retrieval in relation to seals and logos. All the available datasets, feature extraction and classification techniques for logo and seal detection/recognition are discussed systematically. The shortcomings of the present technologies for logo and seal based document processing are also highlighted. Avenues for future work are further given for the benefit of readers. To the best of the authors’ knowledge, there is no survey on administrative document image retrieval and hence the authors hope that this work will be helpful to the researchers of the document analysis community.

17.

Cigarette carton recognition using energy features (total citations: 1; self-citations: 0; citations by others: 1)

Sun Dong Ming Jun Zang Xiaoxi 《Microcomputer Development》2006,16(4):132-134

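To address the problem of image feature extraction in cigarette carton recognition, a new image feature is defined. This feature describes the distribution of the image's energy spectrum, reflects the image's color, texture, and shape characteristics in a combined way, and is invariant to rotation and translation of the image. Applying this feature to cigarette carton recognition yields satisfactory results.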

18.

Detection and analysis of table of contents based on content association

Xiaofan Lin Yan Xiong 《International Journal on Document Analysis and Recognition》2006,8(2-3):132-143

As a special type of table understanding, the detection and analysis of tables of contents (TOCs) play an important role in the digitization of multi-page documents. Most previous TOC analysis methods only concentrate on the TOC itself without taking into account the other pages in the same document. Besides, they often require manual coding or at least machine learning of document-specific models. This paper introduces a new method to detect and analyze TOCs based on content association. It fully leverages the text information throughout the whole multi-page document and can be directly applied to a wide range of documents without the need to build or learn the models for individual documents. In addition, the associations of general text and page numbers are combined to make the TOC analysis more accurate. Natural language processing and layout analysis are integrated to improve the TOC functional tagging. The applications of the proposed method in a large-scale digital library project are also discussed.

19.

Offline handwritten Amharic word recognition (total citations: 1; self-citations: 0; citations by others: 1)

Yaregal Assabie Josef Bigun 《Pattern recognition letters》2011,32(8):1089-1099

This paper describes two approaches for Amharic word recognition in unconstrained handwritten text using HMMs. The first approach builds word models from concatenated features of constituent characters, and in the second, HMMs of constituent characters are concatenated to form the word model. In both cases, the features used for training and recognition are a set of primitive strokes and their spatial relationships. The recognition system does not require segmentation of characters but requires text line detection and extraction of structural features, which is done by making use of the direction field tensor. The performance of the recognition system is tested on a dataset of unconstrained handwritten documents collected from various sources, and promising results are obtained.

20.

Shape based local thresholding for binarization of document images

Jichuan Shi Nilanjan Ray Hong Zhang 《Pattern recognition letters》2012,33(1):24-32

This paper presents a novel local threshold algorithm for the binarization of document images. Stroke width of handwritten and printed characters in documents is utilized as the shape feature. As a result, in addition to the intensity analysis, the proposed algorithm introduces the stroke width as shape information into local thresholding. Experimental results for both synthetic and practical document images show that the proposed local threshold algorithm is superior in terms of segmentation quality to the threshold approaches that solely use intensity information.
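As a rough illustration of what a stroke width measurement might look like, the sketch below estimates it from a preliminary binary image as the most frequent horizontal run length of foreground pixels. This is an assumed, simplified estimator chosen for illustration; the abstract does not say how the paper measures stroke width or how that value enters the threshold, and the function name is hypothetical.

```python
from collections import Counter


def estimate_stroke_width(binary_rows):
    """Return the most frequent horizontal run length of foreground pixels.

    binary_rows: iterable of rows, each a sequence of 0 (background) and
    1 (foreground). Returns 0 if the image has no foreground pixels.
    """
    runs = Counter()
    for row in binary_rows:
        length = 0
        for pixel in row:
            if pixel:
                length += 1
            elif length:
                runs[length] += 1  # a run just ended
                length = 0
        if length:
            runs[length] += 1      # run touching the right border
    if not runs:
        return 0
    return runs.most_common(1)[0][0]


# Tiny hand-made image: five runs of length 2 and one run of length 3.
image = [
    [0, 1, 1, 0, 0, 1, 1, 0],
    [0, 1, 1, 0, 1, 1, 1, 0],
    [1, 1, 0, 0, 0, 0, 1, 1],
]
width = estimate_stroke_width(image)
```

A local thresholding scheme could, for example, tie its window size to such an estimate, but that coupling is a design guess and not something the abstract specifies.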