Affiliation: | (1) Department of Computer Science, University of Fukui, 3-9-1 Bunkyo, 910-8507 Fukui-shi, Japan; (2) Department of Intellectual Information Systems Engineering, Toyama University, Gofuku, 930-8555 Toyama-shi, Japan; (3) Department of Systems and Social Informatics, Graduate School of Information Science, Nagoya University, Furo-cho, Chigusa-ku, 464-8603 Nagoya-shi, Japan; (4) CENPARMI, Concordia University, Suite GM-606, 1455 de Maisonneuve Blvd West, H3G 1M8 Montréal, Québec, Canada |
Abstract: | The ability to extract and recognize characters printed in color documents would greatly widen the applications of OCR systems. This paper describes a new color segmentation method for extracting character areas from a color document. At first glance, characters appear to be printed in a single color, but actual measurements reveal that the color image contains a distribution of color components. Compared with clustering algorithms, our method prevents oversegmentation and fusion with the background while maintaining real-time usability. It extracts the representative colors by analyzing a histogram of the color space. The method also includes a selective local color averaging technique that removes mesh noise from high-resolution color images. Received: 25 July 2003, Revised: 10 August 2003, Published online: 6 February 2004. Correspondence to: Hiroyuki Hase. Current address: 3-9-1 Bunkyo, Fukui-shi 910-8507, Japan |