Overlapping and multi-touching text-line segmentation by Block Covering analysis |
| |
Authors: | Abderrazak Zahour, Bruno Taconet, Laurence Likforman-Sulem, Wafa Boussellaa |
| |
Affiliation: | 1. IUT, Université du Havre/GED, Place Robert Schuman, 76610, Le Havre, France; 2. TELECOM ParisTech/TSI and CNRS-LTCI, 46 rue Barrault, 75013, Paris, France; 3. Université de Sfax, REGIM, ENIS Route Soukra, 3038, Sfax (BPW), Tunisia
|
| |
Abstract: | This paper presents a new approach to text-line segmentation based on Block Covering, which solves the problem of overlapping and multi-touching components. Block Covering is the core of a system that processes
a set of ancient Arabic documents from historical archives. The system is designed to separate text-lines even when they
are overlapping and multi-touching. We exploit the Block Covering technique in three steps: a new fractal analysis (Block Counting) for document classification, a statistical analysis of block heights for block classification, and a neighboring analysis
for building text-lines. The Block Counting fractal analysis, combined with a fuzzy C-means scheme, is performed on document
images to classify them according to their complexity: tightly (closely) spaced documents (TSD) or widely spaced
documents (WSD). An optimal Block Covering is applied to TSDs, which include overlapping and multi-touching lines.
The large blocks generated by the covering are then segmented using the statistical analysis of block heights. The
final labeling into text-lines is based on a block neighboring analysis. Experimental results on images from the Tunisian
Historical Archives demonstrate the feasibility of the Block Covering technique for segmenting ancient Arabic documents. |
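The abstract does not define the Block Counting analysis in detail, so the following is only a minimal sketch of the standard box-counting fractal dimension, shown here as a plausible stand-in for the kind of measure that could feed a document-complexity classifier. The function names, the block sizes, and the list-of-lists binary image format are assumptions made for illustration, not the authors' method.

```python
import math


def box_count(image, size):
    """Count the size x size blocks that contain at least one ink pixel.

    `image` is a list of rows of 0/1 values (assumed format, 1 = ink).
    """
    rows, cols = len(image), len(image[0])
    count = 0
    for top in range(0, rows, size):
        for left in range(0, cols, size):
            if any(
                image[r][c]
                for r in range(top, min(top + size, rows))
                for c in range(left, min(left + size, cols))
            ):
                count += 1
    return count


def box_counting_dimension(image, sizes=(1, 2, 4, 8)):
    """Estimate the box-counting dimension of a binary image.

    The estimate is the least-squares slope of log N(s) against log(1/s),
    where N(s) is the number of occupied blocks of side s.
    """
    counts = [box_count(image, s) for s in sizes]
    if any(n == 0 for n in counts):
        raise ValueError("image contains no ink pixels")
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(n) for n in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den
```

Under this sketch, a densely inked page yields a value closer to 2 and a sparse page a value closer to 1. Such a per-image value could then be passed to a two-cluster fuzzy C-means step to separate TSD from WSD images, although the exact features the system uses are not stated in the abstract.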
| |
This article is indexed in SpringerLink and other databases. |
|