Segmentation and classification of mixed text/graphics/image documents
Authors:Kuo-Chin Fan  Chi-Hwa Liu  Yuan-Kai Wang
Affiliation:Institute of Computer Science and Electronic Engineering, National Central University, Chung-Li, Taiwan, ROC

Abstract:This paper presents a feature-based document analysis system that uses domain knowledge to segment and classify mixed text/graphics/image documents. The approach first applies a run-length smearing operation, followed by a stripe merging procedure, to segment the blocks embedded in a document. Classification is then performed using domain knowledge induced from the primitives associated with each type of medium. Proper use of domain knowledge proves effective in speeding up segmentation and reducing classification error. The experimental study demonstrates the feasibility of the new technique for segmenting and classifying mixed text/graphics/image documents.
Keywords:Document segmentation  Block classification  Projection feature  Connectivity histogram
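
The run-length smearing step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `smear_row` and `smear_image`, the threshold value, the convention that 1 marks a black pixel, and the choice to fill only gaps bounded by black pixels on both sides are all assumptions made for illustration. The stripe merging procedure is not shown because the abstract does not describe it.

```python
def smear_row(row, threshold):
    """Horizontal run-length smearing on one binary row (1 = black, 0 = white).

    White runs of length <= threshold that lie between two black pixels
    are filled with black. Leading and trailing white runs are left alone
    (an assumption; some variants fill them too).
    """
    out = list(row)
    last_black = None  # index of the most recent black pixel seen so far
    for i, pixel in enumerate(row):
        if pixel == 1:
            if last_black is not None:
                gap = i - last_black - 1  # number of white pixels in between
                if 0 < gap <= threshold:
                    for j in range(last_black + 1, i):
                        out[j] = 1
            last_black = i
    return out


def smear_image(image, threshold):
    """Apply horizontal smearing to every row of a binary image (list of lists)."""
    return [smear_row(row, threshold) for row in image]


# Hypothetical example: the 2-pixel gap is bridged, the 4-pixel gap is kept.
example = [[1, 0, 0, 1, 0, 0, 0, 0, 1, 0]]
smeared = smear_image(example, threshold=2)
```

With a threshold of 2, nearby black pixels such as characters within a word merge into one solid run, while wider gaps stay open and separate candidate blocks.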
This article is indexed in ScienceDirect and other databases.