
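Text Image Classification Based on Combinations of Low-Level Image Features
Cite this article: ZENG Dong-hong, HUANG Chao-zhi, HUANG Xi-mei. Text image classification based on combinations of low-level image features [J]. Journal of Southern Institute of Metallurgy, 2013(5): 82-87.
Authors: ZENG Dong-hong  HUANG Chao-zhi  HUANG Xi-mei
Affiliation: (Jiangxi University of Science and Technology, a. School of Electrical Engineering and Automation; b. Faculty of Information Engineering, Ganzhou, Jiangxi 341000)
Funding: National Natural Science Foundation of China (70971043); Natural Science Foundation of Jiangxi Province (2008GZS0028)
Abstract: Targeting the image features specific to text images, this paper proposes a text image classification method based on a combination of low-level image features. The method uses a two-layer C4.5 decision tree classifier and effectively divides text images into caption text images, document images, and scene text images. Sample images are first converted to grayscale and gray-level histogram features are extracted; document images can be separated first, based on differences in these histogram features. The remaining images are then converted to binary images and their GLCM (gray-level co-occurrence matrix) texture features are extracted, and scene text images are distinguished from caption text images according to the GLCM features. Simulation experiments in the open-source WEKA data mining software show that the method is feasible and achieves high recall and precision.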
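The abstract names two feature families but does not say which statistics are computed from them. As a rough illustration only, the sketch below derives a few commonly used gray-level histogram statistics (mean, variance, entropy) and GLCM statistics (contrast, energy, homogeneity) in plain Python. The function names, the choice of statistics, and the single horizontal offset are assumptions made for this example, not details taken from the paper.

```python
import math
from collections import Counter


def gray_histogram_features(gray):
    """Statistics of the normalized gray-level histogram.

    gray: 2D list of integer gray levels. The choice of mean, variance
    and entropy is an assumption; the paper does not list its features.
    """
    pixels = [p for row in gray for p in row]
    n = len(pixels)
    probs = {g: c / n for g, c in Counter(pixels).items()}
    mean = sum(g * p for g, p in probs.items())
    variance = sum((g - mean) ** 2 * p for g, p in probs.items())
    entropy = -sum(p * math.log2(p) for p in probs.values())
    return {"mean": mean, "variance": variance, "entropy": entropy}


def glcm(image, levels, dx=1, dy=0):
    """Normalized, non-symmetric co-occurrence matrix for offset (dx, dy).

    image: 2D list of integers in [0, levels); levels=2 for a binary image.
    """
    h, w = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                counts[image[y][x]][image[y2][x2]] += 1
                total += 1
    if total == 0:
        raise ValueError("image too small for the requested offset")
    return [[c / total for c in row] for row in counts]


def glcm_features(p):
    """Contrast, energy and homogeneity of a normalized GLCM (assumed set)."""
    idx = range(len(p))
    contrast = sum((i - j) ** 2 * p[i][j] for i in idx for j in idx)
    energy = sum(p[i][j] ** 2 for i in idx for j in idx)
    homogeneity = sum(p[i][j] / (1 + abs(i - j)) for i in idx for j in idx)
    return {"contrast": contrast, "energy": energy, "homogeneity": homogeneity}
```

In practice the resulting feature dictionaries would be exported as attribute vectors (for example in ARFF format) for training in WEKA; that export step is not shown here.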

Keywords: text image  C4.5 decision tree classifier  gray-level histogram  image texture

A text image classification research based on the underlying image feature combination
ZENG Dong-hong, HUANG Chao-zhi, HUANG Xi-mei. A text image classification research based on the underlying image feature combination [J]. Journal of Southern Institute of Metallurgy, 2013(5): 82-87.
Authors:ZENG Dong-hong  HUANG Chao-zhi  HUANG Xi-mei
Affiliation:(a. School of Electrical Engineering and Automation; b. Faculty of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China)
Abstract:A text image classification method based on a combination of low-level image features is proposed. Using a two-layer C4.5 decision tree classifier, the method divides text images into caption text images, document images, and scene text images. Classification is a two-step process. First, each sample image is converted to a grayscale image and gray-level histogram features are extracted; document images can then be separated according to differences in these histogram features. Second, the remaining images are converted into binary images and their GLCM (gray-level co-occurrence matrix) texture features are extracted, according to which scene text images and caption text images are distinguished. Simulation experiments carried out in the open-source WEKA data mining software show that the method is feasible and achieves high recall and precision.
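The two-step routing described in the abstract can be sketched as a small cascade. This is only an illustration of the control flow: the two decision functions stand in for trained C4.5 trees (which the paper builds in WEKA), and the function name, feature keys, and threshold values in the usage example are hypothetical.

```python
from typing import Callable, Dict

FeatureDict = Dict[str, float]


def two_stage_classify(
    hist_features: FeatureDict,
    compute_glcm_features: Callable[[], FeatureDict],
    stage1_is_document: Callable[[FeatureDict], bool],
    stage2_is_scene: Callable[[FeatureDict], bool],
) -> str:
    """Route one image through a two-layer cascade.

    Layer 1 separates document images using histogram features.
    Layer 2 runs only on the remaining images and uses GLCM features
    to separate scene text from caption text.
    """
    if stage1_is_document(hist_features):
        return "document"
    # GLCM features are computed lazily, only for non-document images.
    texture = compute_glcm_features()
    return "scene_text" if stage2_is_scene(texture) else "caption_text"


# Hypothetical stand-ins for the two trained decision trees.
def toy_stage1(f: FeatureDict) -> bool:
    return f["entropy"] < 2.0  # made-up threshold, for illustration only


def toy_stage2(f: FeatureDict) -> bool:
    return f["contrast"] > 0.5  # made-up threshold, for illustration only


label = two_stage_classify(
    {"entropy": 5.1},
    lambda: {"contrast": 0.8},
    toy_stage1,
    toy_stage2,
)
```

Passing the GLCM computation as a callable mirrors the abstract's ordering, in which texture features are extracted only for images that the first layer did not label as documents.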
Keywords:text image  C4.5 decision tree classifier  gray-level histogram  image texture
This article is indexed in databases including CNKI and VIP (维普).