20 similar documents found; search time: 15 ms
1.
This paper presents a morphology-based text line extraction algorithm for extracting text regions from cluttered images. First, the method defines a novel set of morphological operations for extracting important contrast regions as possible text
line candidates. The contrast feature is robust to lighting changes and invariant against different image transformations
like image scaling, translation, and skewing. In order to detect skewed text lines, a moment-based method is then used for
estimating their orientations. According to the orientation, an x-projection technique can be applied to extract various text geometries from the text-analogue segments for text verification.
However, due to noise, a text line region is often fragmented into separate segments. Therefore, after the projection, a recovery algorithm is proposed to reassemble a complete text line from its fragments. Finally, a verification scheme checks each extracted candidate text line against its text geometry.
Experimental results show that the proposed method improves on state-of-the-art work in effectiveness and robustness of text line detection.
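The projection and fragmentation steps above can be sketched in a few lines (an illustrative sketch only, not the authors' implementation: the projection here is axis-aligned for simplicity, whereas the paper projects along the estimated orientation, and the `min_gap` parameter is an assumption):

```python
import numpy as np

def x_projection_profile(binary):
    """Column-wise count of foreground (text-candidate) pixels."""
    return binary.sum(axis=0)

def split_segments(profile, min_gap=2):
    """Split the profile into runs separated by at least min_gap empty
    columns; these runs are the fragments a recovery step would rejoin."""
    segments, start, gap = [], None, 0
    for x, v in enumerate(profile):
        if v > 0:
            if start is None:
                start = x
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                segments.append((start, x - gap))
                start, gap = None, 0
    if start is not None:
        segments.append((start, len(profile) - 1))
    return segments
```

A noisy text line typically yields several such runs, which is why the recovery stage in the abstract is needed.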
2.
Pietro Parodi Roberto Fontana 《International Journal on Document Analysis and Recognition》1999,2(2-3):67-79
This paper describes a novel method for extracting text from document pages of mixed content. The method works by detecting
pieces of text lines in small overlapping columns of a fixed width (a fraction of the image width), shifted with respect to each other by a fixed number of image elements, and by merging these pieces in a bottom-up fashion to form complete text lines and blocks of text lines. The algorithm requires
about 1.3 s for a 300 dpi image on a PC with a 300 MHz Pentium II CPU and an Intel 440LX motherboard. The algorithm is largely independent
of the layout of the document, the shape of the text regions, and the font size and style. The main assumptions are that the
background be uniform and that the text sit approximately horizontally. For a skew of up to about 10 degrees no skew correction
mechanism is necessary. The algorithm has been tested on the UW English Document Database I of the University of Washington
and its performance has been evaluated by a suitable measure of segmentation accuracy. Also, a detailed analysis of the segmentation
accuracy achieved by the algorithm as a function of noise and skew has been carried out.
Received April 4, 1999 / Revised June 1, 1999
3.
Hwan-Chul Park Se-Young Ok Young-Jung Yu Hwan-Gue Cho 《International Journal on Document Analysis and Recognition》2001,4(2):115-130
Automatic character recognition and image understanding of a given paper document are main objectives of computer vision. A basic step toward both is to isolate characters and then group the isolated characters into words. In
this paper, we propose a new method for extracting characters from a mixed text/graphic machine-printed document and an algorithm
for distinguishing words from the isolated characters. For extracting characters, we exploit several features (size, elongation,
and density) of characters and propose a characteristic value for classification using the run-length frequency of the image
component. In the context of word grouping, previous works have largely been concerned with words which are placed on a horizontal
or vertical line. Our word grouping algorithm can group words which are on inclined lines, intersecting lines, and even curved
lines. To do this, we introduce the 3D neighborhood graph model which is very useful and efficient for character classification
and word grouping. In the 3D neighborhood graph model, each connected component of a text image segment is mapped onto 3D
space according to the area of the bounding box and positional information from the document. We conducted tests with more
than 20 English documents and more than ten oriental documents scanned from books, brochures, and magazines. Experimental
results show that more than 95% of words are successfully extracted from general documents, even in very complicated oriental
documents.
Received August 3, 2001 / Accepted August 8, 2001
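The run-length-based characteristic value mentioned above can be sketched roughly as follows (an assumed formulation for illustration; the paper's exact characteristic value and classification rule are not reproduced here):

```python
from collections import Counter

def run_length_frequencies(rows):
    """Count how often each horizontal foreground run length occurs.

    rows: iterable of 0/1 sequences, one per scan line of an image
    component. Characters tend to produce many short runs; graphics
    produce long ones, which makes the histogram a usable feature."""
    freq = Counter()
    for row in rows:
        run = 0
        for v in row:
            if v:
                run += 1
            elif run:
                freq[run] += 1
                run = 0
        if run:                 # run touching the right border
            freq[run] += 1
    return freq
```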
4.
In this paper, we present a new approach to extract characters on a license plate of a moving vehicle, given a sequence of
perspective-distortion-corrected license plate images. Different from many existing single-frame approaches, our method simultaneously
utilizes spatial and temporal information. We first model the extraction of characters as a Markov random field (MRF), where
the randomness is used to describe the uncertainty in pixel label assignment. With the MRF modeling, the extraction of characters
is formulated as the problem of maximizing the a posteriori probability given prior knowledge and observations. A genetic algorithm with a local greedy mutation operator is
employed to optimize the objective function. Experiments and a comparison study were conducted, and representative experimental results are presented in the paper. It is shown that our approach provides better performance than other single-frame methods.
Received: 13 August 1997 / Accepted: 7 October 1997
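Viewed as an energy (a negative log-posterior), the MAP objective can be sketched with a generic Potts-style model (a toy formulation, not the authors' exact model; the class means `mu` and smoothness weight `beta` are assumptions). The genetic algorithm then searches for the label field that minimizes this energy:

```python
import numpy as np

def mrf_energy(labels, image, mu=(0.2, 0.8), beta=1.0):
    """Energy of a binary character/background labelling under a simple
    Potts MRF: a data term (deviation from assumed class means) plus a
    smoothness term counting disagreeing 4-neighbours."""
    mu = np.asarray(mu)
    data = ((image - mu[labels]) ** 2).sum()        # observation term
    h = (labels[:, 1:] != labels[:, :-1]).sum()     # horizontal boundaries
    v = (labels[1:, :] != labels[:-1, :]).sum()     # vertical boundaries
    return data + beta * (h + v)
```

Lower energy means higher posterior probability, so the optimizer's fitness can simply be the negated energy.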
5.
David Crandall Sameer Antani Rangachar Kasturi 《International Journal on Document Analysis and Recognition》2003,5(2-3):138-157
The popularity of digital video is increasing rapidly. To help users navigate libraries of video, algorithms that automatically
index video based on content are needed. One approach is to extract text appearing in video, which often reflects a scene's
semantic content. This is a difficult problem due to the unconstrained nature of general-purpose video. Text can have arbitrary
color, size, and orientation. Backgrounds may be complex and changing. Most work so far has made restrictive assumptions about
the nature of text occurring in video. Such work is therefore not directly applicable to unconstrained, general-purpose video.
In addition, most work so far has focused only on detecting the spatial extent of text in individual video frames. However,
text occurring in video usually persists for several seconds. This constitutes a text event that should be entered only once
in the video index. Therefore it is also necessary to determine the temporal extent of text events. This is a non-trivial
problem because text may move, rotate, grow, shrink, or otherwise change over time. Such text effects are common in television
programs and commercials but so far have received little attention in the literature. This paper discusses detecting, binarizing,
and tracking caption text in general-purpose MPEG-1 video. Solutions are proposed for each of these problems and compared
with existing work found in the literature.
Received: January 29, 2002 / Accepted: September 13, 2002
D. Crandall is now with Eastman Kodak Company, 1700 Dewey Avenue, Rochester, NY 14650-1816, USA; e-mail: david.crandall@kodak.com
S. Antani is now with the National Library of Medicine, 8600 Rockville Pike, Bethesda, MD 20894, USA; e-mail: antani@nlm.nih.gov
Correspondence to: David Crandall
6.
Paul Clark Majid Mirmehdi 《International Journal on Document Analysis and Recognition》2002,4(4):243-257
We present two different approaches to the location and recovery of text in images of real scenes. The techniques we describe
are invariant to the scale and 3D orientation of the text, and allow recovery of text in cluttered scenes. The first approach
uses page edges and other rectangular boundaries around text to locate a surface containing text, and to recover a fronto-parallel
view. This is performed using line detection, perceptual grouping, and comparison of potential text regions using a confidence
measure. The second approach uses low-level texture measures with a neural network classifier to locate regions of text in
an image. Then we recover a fronto-parallel view of each located paragraph of text by separating the individual lines of text
and determining the vanishing points of the text plane. We illustrate our results using a number of images.
Received May 20, 2001 / Accepted June 19, 2001
7.
Efficient extraction of primitives from line drawings composed of horizontal and vertical lines (total citations: 6; self-citations: 0; by others: 6)
The performance of the algorithms for the extraction of primitives for the interpretation of line drawings is usually affected
by the degradation of the information contained in the document due to factors such as low print contrast, defocusing, skew,
etc. In this paper, we propose two algorithms for the extraction of primitives with good performance under degradation.
The application of the algorithms is restricted to line drawings composed of horizontal and vertical lines. The performance
of the algorithms has been evaluated by using a protocol described in the literature.
Received: 6 August 1996 / Accepted: 16 July 1997
8.
Automatic text segmentation and text recognition for video indexing (total citations: 13; self-citations: 0; by others: 13)
Efficient indexing and retrieval of digital video is an important function of video databases. One powerful index for retrieval is the text appearing in the videos, which enables content-based browsing. We present our new methods for automatic segmentation of
text in digital videos. The algorithms we propose make use of typical characteristics of text in videos in order to enable
and enhance segmentation performance. The unique features of our approach are the tracking of characters and words over their
complete duration of occurrence in a video and the integration of the multiple bitmaps of a character over time into a single
bitmap. The output of the text segmentation step is then directly passed to a standard OCR software package in order to translate
the segmented text into ASCII. Also, a straightforward indexing and retrieval scheme is introduced. It is used in the experiments
to demonstrate that the proposed text segmentation algorithms together with existing text recognition algorithms are suitable
for indexing and retrieval of relevant video sequences in and from a video database. Our experimental results are very encouraging
and suggest that these algorithms can be used in video retrieval applications as well as to recognize higher level semantics
in videos.
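The integration of a character's multiple bitmaps over time can be sketched as a simple temporal vote (an assumed fusion rule for illustration; the paper's actual integration method may differ):

```python
import numpy as np

def integrate_bitmaps(frames, threshold=0.5):
    """Fuse several aligned binary bitmaps of the same character into one:
    a pixel survives if it is foreground in at least `threshold` of the
    frames, so transient background clutter is voted away."""
    stack = np.stack(frames).astype(float)
    return (stack.mean(axis=0) >= threshold).astype(np.uint8)
```

The fused bitmap, being cleaner than any single frame, is what would be handed to the OCR package.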
9.
Identifying facsimile duplicates using radial pixel densities (total citations: 2; self-citations: 0; by others: 2)
P. Chatelain 《International Journal on Document Analysis and Recognition》2002,4(4):219-225
A method for detecting full layout facsimile duplicates based on radial pixel densities is proposed. It caters for facsimiles,
including text and/or graphics. Pages may be positioned upright or inverted on the scanner bed. The method is not dependent
on the computation of text skew or text orientation. Using a database of original documents, 92% of non-duplicates and upright
duplicates as well as 89% of inverted duplicates could be correctly identified. The method is vulnerable to double scanning.
This occurs when documents are copied using a photocopier and the copies are subsequently transmitted using a facsimile machine.
Received September 29, 2000 / Revised August 23, 2001
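One plausible reading of radial pixel densities, which also explains why inverted pages need no special handling, is a ring histogram around the page centre (a sketch under assumptions; the number of rings and the binning are illustrative, not the paper's parameters):

```python
import numpy as np

def radial_densities(binary, n_rings=8):
    """Fraction of foreground pixels in concentric rings around the image
    centre. The feature is exactly invariant to a 180-degree rotation,
    i.e., to a page scanned upside down."""
    h, w = binary.shape
    ys, xs = np.indices((h, w))
    r = np.hypot(ys - (h - 1) / 2, xs - (w - 1) / 2)
    edges = np.linspace(0, r.max() + 1e-9, n_rings + 1)
    dens = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (r >= lo) & (r < hi)
        dens.append(binary[mask].mean() if mask.any() else 0.0)
    return np.array(dens)
```

Comparing two pages then reduces to comparing two short density vectors, which is what makes duplicate detection fast.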
10.
11.
Markus Junker Rainer Hoch 《International Journal on Document Analysis and Recognition》1998,1(2):116-122
In the literature, many feature types are proposed for document classification. However, an extensive and systematic evaluation
of the various approaches has not yet been done. In particular, evaluations on OCR documents are very rare. In this paper
we investigate seven text representations based on n-grams and single words. We compare their effectiveness in classifying OCR texts and the corresponding correct ASCII texts
in two domains: business letters and abstracts of technical reports. Our results indicate that n-grams are an attractive technique that compares well even with techniques relying on morphological analysis. This holds for OCR texts as well as for correct ASCII texts.
Received February 17, 1998 / Revised April 8, 1998
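The robustness of character n-grams to OCR noise is easy to see in a sketch (illustrative only; the paper's seven representations and its classifiers are not reproduced here, and the `overlap` similarity is an assumption):

```python
from collections import Counter

def char_ngrams(text, n=3):
    """Bag of character n-grams. A single OCR error corrupts at most n
    grams, so most features of a word survive recognition noise."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def overlap(a, b):
    """Similarity of two n-gram bags: shared gram mass over total mass."""
    shared = sum((a & b).values())
    total = sum((a | b).values())
    return shared / total if total else 0.0
```

An OCR-corrupted word still shares grams with its correct form, while unrelated words share almost none:

```python
# "rep0rt" (OCR error) still overlaps "report"; "invoice" does not.
```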
12.
Automatic acquisition of CAD models from existing objects requires accurate extraction of geometric and topological information
from the input data. This paper presents a range image segmentation method based on local approximation of scan lines. The
method employs edge models that are capable of detecting noise pixels as well as position and orientation discontinuities
of varying strengths. Region-based techniques are then used to achieve a complete segmentation. Finally, a geometric representation
of the scene, in the form of a surface CAD model, is produced. Experimental results on a large number of real range images
acquired by different range sensors demonstrate the efficiency and robustness of the method.
Received: 1 August 2000 / Accepted: 23 January 2002
Correspondence to: I. Khalifa
13.
Amit Kumar Das Sanjoy Kumar Saha Bhabatosh Chanda 《International Journal on Document Analysis and Recognition》2002,4(3):183-190
Document image segmentation is the first step in document image analysis and understanding. One major problem centres on
the performance analysis of evolving segmentation algorithms. The use of standard document databases maintained at universities and research laboratories solves the problem of obtaining authentic data sources, but sound methodologies are still needed for performance analysis of the segmentation. We describe a new document model in terms
of a bounding box representation of its constituent parts and suggest an empirical measure of performance of a segmentation
algorithm based on this new graph-like model of the document. Besides the global error measures, the proposed method also
produces segment-wise details of common segmentation problems such as horizontal and vertical split and merge as well as invalid
and mismatched regions.
Received July 14, 2000 / Revised June 12, 2001
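The segment-wise split and merge detection can be sketched with bounding-box overlaps (a simplified reading of the proposed measure; a real evaluation would also weigh overlap areas and handle the invalid/mismatched cases the abstract mentions):

```python
def overlaps(a, b):
    """Axis-aligned boxes (x0, y0, x1, y1); True if interiors intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def split_and_merge(truth, result):
    """Count split errors (one true region covered by several output
    segments) and merge errors (one output segment covering several
    true regions)."""
    splits = sum(1 for t in truth
                 if sum(overlaps(t, r) for r in result) > 1)
    merges = sum(1 for r in result
                 if sum(overlaps(r, t) for t in truth) > 1)
    return splits, merges
```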
14.
15.
Claudia Wenzel Heiko Maus 《International Journal on Document Analysis and Recognition》2001,3(4):248-260
Knowledge-based systems for document analysis and understanding (DAU) are quite useful whenever analysis has to deal with
the changing of free-form document types which require different analysis components. In this case, declarative modeling is
a good way to achieve flexibility. An important application domain for such systems is the business letter domain. Here, high
accuracy and correct assignment to the right people and processes are crucial success factors. Our solution is a comprehensive knowledge-centered approach: we model not only comparatively static knowledge concerning
document properties and analysis results within a declarative formalism, but also the analysis task and the current context of the system environment. This allows an easy definition of new analysis tasks
and also an efficient and accurate analysis by using expectations about incoming documents as context information. The approach
described has been implemented within the VOPR (VOPR is an acronym for the Virtual Office PRototype.) system. This DAU system
gains the required context information from a commercial workflow management system (WfMS) by constant exchanges of expectations
and analysis tasks. Further interaction between these two systems covers the delivery of results from DAU to the WfMS and
the delivery of corrected results vice versa.
Received June 19, 1999 / Revised November 8, 2000
16.
Using local deviations of vectorization to enhance the performance of raster-to-vector conversion systems (total citations: 1; self-citations: 0; by others: 1)
Eugene Bodansky Morakot Pilouk 《International Journal on Document Analysis and Recognition》2000,3(2):67-72
This paper presents a method of quantitatively measuring local vectorization errors that evaluates the deviation of the vectorization
of arbitrary (regular and irregular) raster linear objects. This measurement of the deviation does not depend on the thickness
of the linear object. One of the most time-consuming procedures of raster-to-vector conversion of large linear drawings is
manually verifying the results. Performance of raster-to-vector conversion systems can be enhanced with auto-localization
of places that have to be corrected. The local deviations can be used for testing results and automatically showing the parts
of resulting curves where deviations are greater than a threshold value and have to be corrected.
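The local deviation measure can be sketched as the distance from each raster centre-line point to the approximating polyline, flagging points beyond a tolerance (an assumed formulation consistent with the abstract; note that, as claimed, it does not depend on line thickness):

```python
import math

def point_segment_dist(p, a, b):
    """Euclidean distance from point p to segment ab."""
    ax, ay = a; bx, by = b; px, py = p
    dx, dy = bx - ax, by - ay
    if dx == dy == 0:
        return math.hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def local_deviations(points, polyline):
    """Distance from each raster centre-line point to the nearest
    polyline segment."""
    return [min(point_segment_dist(p, a, b)
                for a, b in zip(polyline, polyline[1:]))
            for p in points]

def flagged(points, polyline, tol):
    """Indices of points whose deviation exceeds the tolerance, i.e.,
    the places an operator would be directed to correct."""
    return [i for i, d in enumerate(local_deviations(points, polyline)) if d > tol]
```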
17.
18.
Shuhua Wang Yang Cao Shijie Cai 《International Journal on Document Analysis and Recognition》2001,4(1):27-34
The most noticeable characteristic of a construction tender document is that its hierarchical architecture is not obviously
expressed but is implied in the citing information. Currently available methods cannot deal with such documents. In this paper,
the intra-page and inter-page relationships are analyzed in detail. The creation of citing relationships is essential to extracting
the logical structure of tender documents. The hierarchy of tender documents naturally leads to extracting and displaying
the logical structure as a tree structure. This method is successfully implemented in VHTender and is the key to the efficiency
and flexibility of the whole system.
Received February 28, 2000 / Revised October 20, 2000
19.
Stefan Klink Thomas Kieninger 《International Journal on Document Analysis and Recognition》2001,4(1):18-26
Document image processing is a crucial process in office automation; it begins with OCR, but the real difficulties lie in document analysis and understanding. This paper presents a hybrid and comprehensive approach to document structure analysis. Hybrid
in the sense that it makes use of layout (geometrical) as well as textual features of a given document. These features are
the basis for conditions which in turn are used to express fuzzy-matched rules of an underlying rule base. Rules
can be formulated based on features which might be observed within one specific layout object. However, rules can also express
dependencies between different layout objects. In addition to its rule driven analysis, which allows an easy adaptation to
specific domains with their specific logical objects, the system contains domain-independent markup algorithms for common
objects (e.g., lists).
Received June 19, 2000 / Revised November 8, 2000
20.
Christian Shin David Doermann Azriel Rosenfeld 《International Journal on Document Analysis and Recognition》2001,3(4):232-247
Searching for documents by their type or genre is a natural way to enhance the effectiveness of document retrieval. The layout
of a document contains a significant amount of information that can be used to classify it by type in the absence of domain-specific
models. Our approach to classification is based on “visual similarity” of layout structure and is implemented by building
a supervised classifier, given examples of each class. We use image features such as percentages of text and non-text (graphics,
images, tables, and rulings) content regions, column structures, relative point sizes of fonts, density of content area, and
statistics of features of connected components which can be derived without class knowledge. In order to obtain class labels
for training samples, we conducted a study where subjects ranked document pages with respect to their resemblance to representative
page images. Class labels can also be assigned based on known document types, or can be defined by the user. We implemented
our classification scheme using decision tree classifiers and self-organizing maps.
Received June 15, 2000 / Revised November 15, 2000