
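Xixia Characters Recognition Based on Elastic Mesh
Cite this article: MEN Guangfu, PAN Chen, LIU Changqing. Xixia Characters Recognition Based on Elastic Mesh[J]. Journal of Chinese Information Processing, 2011, 25(5): 109-114.
Authors: MEN Guangfu, PAN Chen, LIU Changqing
Affiliation: 1. School of Mathematics and Computer Science, Ningxia University, Yinchuan, Ningxia 750021, China; 2. College of Information Engineering, China Jiliang University, Hangzhou, Zhejiang 310018, China
Funding: National Natural Science Foundation of China (60803104); Ningxia University Natural Science Foundation (ndrz09-34)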
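Abstract: As Xixia studies deepen in China and abroad, large numbers of ancient Xixia documents held in collections around the world have been published in facsimile. Digitizing these documents and converting them to text is therefore of great importance. This paper studies Xixia character recognition using an elastic mesh method together with Linear Discriminant Analysis (LDA). The facsimile documents are first preprocessed and thinned. A non-uniform elastic mesh is then constructed from the stroke distribution of each Xixia character and applied separately to the character's four directional components, and the probability distribution of pixels across the mesh cells is taken as the feature vector. Finally, LDA reduces the dimensionality of the extracted features. In 4-fold cross-validation on 9,600 Xixia character samples from 240 classes, the average recognition rate reaches 87.99%, indicating that the method is effective.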
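The feature extraction step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how grid lines are placed or how directional components are computed, so the sketch assumes that grid lines are positioned to give each band an equal share of stroke pixels, and that a pixel belongs to a directional component when it has a stroke neighbor along that direction. The names `elastic_boundaries`, `directional_planes`, `cell_distribution`, and `extract_features` are hypothetical.

```python
# Assumed inputs: a thinned, binarized character as a list of rows of 0/1.

def elastic_boundaries(profile, n_cells):
    """Place n_cells + 1 grid lines so each band holds a roughly equal share
    of the stroke pixels in `profile` (assumed placement rule)."""
    total = sum(profile)
    size = len(profile)
    if total == 0:
        # No stroke pixels at all: fall back to a uniform grid.
        return [round(i * size / n_cells) for i in range(n_cells + 1)]
    bounds = [0]
    cumulative = 0
    for idx, value in enumerate(profile):
        cumulative += value
        # Add a grid line each time the running mass reaches the next share.
        while len(bounds) < n_cells and cumulative * n_cells >= len(bounds) * total:
            bounds.append(idx + 1)
    bounds.append(size)
    return bounds


# (dy, dx) steps for the four assumed stroke directions.
DIRECTIONS = [("horizontal", (0, 1)), ("vertical", (1, 0)),
              ("diagonal_down", (1, 1)), ("diagonal_up", (-1, 1))]


def directional_planes(image):
    """Split a thinned image into four planes; a stroke pixel joins a plane
    when it has a stroke neighbor on either side along that direction."""
    height, width = len(image), len(image[0])

    def is_on(y, x):
        return 0 <= y < height and 0 <= x < width and image[y][x] == 1

    planes = []
    for _name, (dy, dx) in DIRECTIONS:
        planes.append([
            [1 if image[y][x] and (is_on(y + dy, x + dx) or is_on(y - dy, x - dx)) else 0
             for x in range(width)]
            for y in range(height)
        ])
    return planes


def cell_distribution(plane, row_bounds, col_bounds):
    """Fraction of the plane's stroke pixels that fall in each mesh cell."""
    total = sum(sum(row) for row in plane)
    cells = []
    for r in range(len(row_bounds) - 1):
        for c in range(len(col_bounds) - 1):
            count = sum(plane[y][x]
                        for y in range(row_bounds[r], row_bounds[r + 1])
                        for x in range(col_bounds[c], col_bounds[c + 1]))
            cells.append(count / total if total else 0.0)
    return cells


def extract_features(image, n_rows, n_cols):
    """Build one mesh from the whole character, then apply it to each plane."""
    row_bounds = elastic_boundaries([sum(row) for row in image], n_rows)
    col_bounds = elastic_boundaries([sum(col) for col in zip(*image)], n_cols)
    features = []
    for plane in directional_planes(image):
        features.extend(cell_distribution(plane, row_bounds, col_bounds))
    return features


# Toy 8x8 "glyph": one horizontal stroke and one vertical stroke.
glyph = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
features = extract_features(glyph, 2, 2)  # 4 planes x (2 x 2) cells
print(len(features), features)
```

Because the mesh follows the stroke mass rather than fixed pixel coordinates, dense regions of a character get narrower cells. The resulting vector has one block per directional plane, which is the input a dimensionality reduction step such as LDA would then receive.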

Keywords: Xixia characters; elastic mesh; directional features; Linear Discriminant Analysis (LDA)

Xixia Characters Recognition Based on Elastic Mesh
MEN Guangfu, PAN Chen, LIU Changqing. Xixia Characters Recognition Based on Elastic Mesh[J]. Journal of Chinese Information Processing, 2011, 25(5): 109-114.
Authors: MEN Guangfu, PAN Chen, LIU Changqing
Affiliation:1. School of Mathematics and Computer Science, Ningxia University, Yinchuan, Ningxia 750021, China;
2. College of Information Engineering, China Jiliang University, Hangzhou, Zhejiang 310018, China
Abstract: Research on Xixia characters has deepened in recent years, and a large number of Xixia documents have been published in their original form at home and abroad. Digitizing these documents quickly is of great importance. We first preprocess the documents with smoothing and thinning algorithms. Elastic meshes are then applied to each directional pattern of a character, and the probability distribution of pixels within each mesh cell is computed as the character's features. Finally, lower-dimensional features are extracted by Linear Discriminant Analysis (LDA). In 4-fold cross-validation on a total of 9600 samples from 240 categories of Xixia characters, the method achieves a recognition rate of 87.99%.
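The 4-fold cross-validation protocol mentioned in the abstract can be illustrated with a small sketch. It is only an illustration under assumptions: the abstract gives 9600 samples and 240 categories, and the code assumes these are balanced at 40 samples per category and that folds are stratified by category, neither of which the abstract states. `stratified_folds`, `cross_validate`, and the `evaluate` callback (which would wrap feature extraction, LDA, and classification) are hypothetical names.

```python
import random


def stratified_folds(labels, k, seed=0):
    """Assign sample indices to k folds so every class is spread evenly."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled samples of this class round-robin into the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def cross_validate(labels, k, evaluate):
    """Hold out each fold once and average the rates returned by `evaluate`.

    `evaluate(train_indices, test_indices)` is a caller-supplied function
    that trains on the first list and returns the recognition rate on the
    second.
    """
    folds = stratified_folds(labels, k)
    rates = []
    for held_out in range(k):
        test_indices = folds[held_out]
        train_indices = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        rates.append(evaluate(train_indices, test_indices))
    return sum(rates) / k


# Assumed balanced layout: 240 classes x 40 samples = 9600 samples.
labels = [c for c in range(240) for _ in range(40)]
folds = stratified_folds(labels, 4)
print([len(fold) for fold in folds])  # sizes of the four held-out sets
```

Under the balanced assumption, each round trains on 7200 samples and tests on the remaining 2400, with 10 test samples per category; the reported figure would then be the mean of the four per-fold recognition rates.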
Keywords: Xixia characters; elastic mesh; direction features; LDA