
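A Chinese Document Image Retrieval Method by Keywords
Cite this article:HUANG Xiang-lin, GAO Yun, YANG Li-fang, WANG Peng-peng. A Chinese Document Image Retrieval Method by Keywords[J]. Journal of Chinese Information Processing, 2007, 21(4): 61-64.
Authors:HUANG Xiang-lin  GAO Yun  YANG Li-fang  WANG Peng-peng
Affiliation:Computer College, Communication University of China, Beijing 100024
Funding:CNGI project of the National Development and Reform Commission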
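Abstract:This paper proposes a keyword-based retrieval method for Chinese document images that performs keyword retrieval directly on the image features of Chinese characters, without OCR (Optical Character Recognition). The document image is first segmented into individual Chinese character images. Stroke feature data are then extracted from each character image. Finally, similarity between the feature data is measured using the WMHD (Weighted Modified Hausdorff Distance). The method is unaffected by font size and has some tolerance to font variation, and experiments show that it achieves relatively good retrieval performance.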
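
The similarity step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the WMHD weighting scheme or the stroke feature format, so this sketch assumes each character is a list of (x, y) stroke feature points with optional per-point weights, and it uses the common form of the modified Hausdorff distance (the mean nearest-neighbour distance, taking the larger of the two directions). The function names `normalize`, `directed_distance` and `wmhd` are hypothetical, and scaling the points to a unit box is only one possible way to obtain independence from font size.

```python
import math


def normalize(points):
    """Scale points into a unit box so that character size does not matter.

    Assumption: size invariance is obtained by rescaling; the paper may
    achieve it differently.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    # One common scale factor preserves the aspect ratio of the character.
    span = max(max(xs) - min_x, max(ys) - min_y) or 1.0
    return [((x - min_x) / span, (y - min_y) / span) for x, y in points]


def directed_distance(a, b, weights=None):
    """Weighted mean, over points in a, of the distance to the nearest point in b."""
    if weights is None:
        weights = [1.0] * len(a)  # uniform weights give the plain modified form
    total = sum(w * min(math.dist(p, q) for q in b) for p, w in zip(a, weights))
    return total / sum(weights)


def wmhd(a, b, weights_a=None, weights_b=None):
    """Symmetric weighted modified Hausdorff distance between two point sets."""
    a, b = normalize(a), normalize(b)
    return max(directed_distance(a, b, weights_a),
               directed_distance(b, a, weights_b))


# Hypothetical feature points: the same shape at two sizes, and a different shape.
small = [(0, 0), (0, 2), (2, 0), (2, 2)]
large = [(0, 0), (0, 20), (20, 0), (20, 20)]
other = [(0, 0), (1, 1)]

same_shape = wmhd(small, large)   # identical after normalization
different = wmhd(small, other)    # two corners have no close match
```

Under these assumptions, a keyword search would compare each query character against each segmented character image and accept matches whose distance falls below a chosen threshold. The threshold and the weights would have to be tuned on real data.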

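Keywords:computer application  Chinese information processing  Chinese document image  keyword retrieval  weighted modified Hausdorff distance (WMHD)
Article ID:1003-0077(2007)04-0061-04
Received:2006-10-18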
Revised:2007-04-25

A Chinese Document Image Retrieval Method by Keywords
HUANG Xiang-lin, GAO Yun, YANG Li-fang, WANG Peng-peng. A Chinese Document Image Retrieval Method by Keywords[J]. Journal of Chinese Information Processing, 2007, 21(4): 61-64.
Authors:HUANG Xiang-lin  GAO Yun  YANG Li-fang  WANG Peng-peng
Affiliation:Computer College, Communication University of China, Beijing 100024, China
Abstract:A keyword-based retrieval method for Chinese document images is proposed. It searches for keywords directly on the image features of Chinese characters, without OCR (Optical Character Recognition). First, the document image is segmented into individual Chinese character images. Next, stroke feature data are extracted from each character image. Finally, the similarity between character images is measured by the weighted modified Hausdorff distance (WMHD) between their feature data. The method is insensitive to character size and has some robustness to font variation. Experimental results show good retrieval performance.
Keywords:computer application  Chinese information processing  Chinese document image  retrieval by keywords  WMHD (Weighted Modified Hausdorff Distance)
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).