Text extraction method for historical Tibetan document images based on block projections
Authors: DUAN Li-juan, ZHANG Xi-qun, MA Long-long, WU Jian
Affiliations (in author order): (1) Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China; Beijing Key Laboratory on Integration and Analysis of Large-scale Stream Data, Beijing University of Technology, Beijing 100124, China. (2) Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China; Beijing Key Laboratory of Trusted Computing, Beijing University of Technology, Beijing 100124, China. (3) Chinese Information Processing Laboratory, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China. (4) Chinese Information Processing Laboratory, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China.
Abstract: Text extraction is an important initial step in digitizing historical documents. In this paper, we present a text extraction method for historical Tibetan document images based on block projections. The task of text extraction is treated as a text area detection and localization problem. Each image is divided equally into blocks, and the blocks are filtered using the categories of their connected components and their corner point density. By analyzing the projections of the filtered blocks, the approximate text areas are located and the text regions are extracted. Experiments on a dataset of historical Tibetan documents demonstrate the effectiveness of the proposed method.
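The block-projection idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the filtering rules, so a simple ink-density test (with hypothetical thresholds `lo` and `hi`) stands in for the connected-component and corner-point filters, and the function names `block_mask`, `span`, and `locate_text_area` are invented for illustration. The input is assumed to be a binarized image stored as a list of rows, with 1 for ink and 0 for background.

```python
def block_mask(img, bh, bw, lo=0.05, hi=0.9):
    """Split a binary image (1 = ink) into bh x bw blocks and keep the
    blocks whose ink density lies in [lo, hi]. The density test is an
    assumed stand-in for the connected-component and corner filters."""
    rows, cols = len(img) // bh, len(img[0]) // bw
    mask = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            ink = sum(img[y][x]
                      for y in range(r * bh, (r + 1) * bh)
                      for x in range(c * bw, (c + 1) * bw))
            if lo <= ink / (bh * bw) <= hi:
                mask[r][c] = 1
    return mask


def span(profile, min_count=1):
    """Return (first, last) indices where the projection profile reaches
    min_count, or None if it never does."""
    hits = [i for i, v in enumerate(profile) if v >= min_count]
    return (hits[0], hits[-1]) if hits else None


def locate_text_area(img, bh, bw):
    """Return the pixel box (top, left, bottom, right) covering the kept
    blocks, with bottom and right exclusive, or None if no block is kept."""
    mask = block_mask(img, bh, bw)
    row_proj = [sum(row) for row in mask]        # kept blocks per block row
    col_proj = [sum(col) for col in zip(*mask)]  # kept blocks per block column
    rs, cs = span(row_proj), span(col_proj)
    if rs is None or cs is None:
        return None
    return (rs[0] * bh, cs[0] * bw, (rs[1] + 1) * bh, (cs[1] + 1) * bw)
```

Projecting the kept-block mask rather than raw pixels means that isolated stains or solid borders, once rejected by the block filter, no longer influence the located text area. A real pipeline would likely split the profiles into several runs to find multiple text areas instead of one bounding box.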
Keywords:
This article is indexed in SpringerLink and other databases.
Source: Optoelectronics Letters (《光电子快报》)