Discovering similar Chinese characters in online handwriting with deep convolutional neural networks

Authors:	Shuye Zhang, Lianwen Jin, Liang Lin

Affiliation:	1. School of Electronic and Information Engineering, South China University of Technology, Guangzhou, People’s Republic of China; 2. School of Data and Computer Science, Sun Yat-Sen University, Guangzhou, People’s Republic of China

Abstract:	A primary cause of performance degradation in unconstrained online handwritten Chinese character recognition is the subtle difference between similar characters. Previous work has proposed various methods to address the problem of similar characters. These methods basically consist of two components: similar character discovery and cascaded classifiers. The goal of similar character discovery is to make the similar character pairs/sets cover as many misclassified samples as possible. We observe that the confidence of a convolutional neural network (CNN) is produced in an end-to-end manner and can be interpreted as a kind of probability metric. In this paper, we propose an algorithm that leverages CNN confidence to discover similar character pairs/sets. Specifically, a deep CNN outputs the top-ranked candidates and their confidence scores, which are then accumulated and averaged. We found experimentally that the number of similar character pairs differs from class to class and that the degree of confusion varies across pairs. To address these problems, we propose an entropy-based similarity measurement to rank the similar character pairs/sets and reject those with low similarity. The experimental results indicate that, using 30,000 similar character pairs, our method achieves hit rates of 98.44% and 98.05% on the CASIA-OLHWDB1.0 and CASIA-OLHWDB1.0–1.2 datasets, respectively, which are significantly higher than the corresponding results of the MQDF-based method (95.42% and 94.49%). Furthermore, recognizing ten randomly selected similar character subsets with a two-stage classification scheme yields a relative error reduction of 30.11% compared with the traditional single-stage scheme, showing the potential of the proposed method.
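
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes each sample arrives as a true label plus a dictionary of CNN confidence scores, and all function names (`average_confusion`, `select_pairs`, `hit_rate`, `relative_error_reduction`) are hypothetical. The abstract does not define the entropy-based similarity measurement, so the sketch ranks pairs by their averaged confidence as a plainly labeled stand-in. The hit rate is assumed to be the fraction of misclassified samples whose (true, predicted) pair is among the selected pairs.

```python
from collections import defaultdict


def average_confusion(samples, top_k=3):
    """Accumulate and average top-k CNN confidences per (true, candidate) pair.

    samples: iterable of (true_label, scores), where scores maps each
    candidate label to a confidence in [0, 1] (assumed input format).
    Returns {(true_label, candidate): confidence averaged over all
    samples of true_label}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for true_label, scores in samples:
        counts[true_label] += 1
        ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
        for candidate, confidence in ranked[:top_k]:
            if candidate != true_label:
                sums[(true_label, candidate)] += confidence
    return {pair: total / counts[pair[0]] for pair, total in sums.items()}


def select_pairs(avg_conf, max_pairs, min_score=0.0):
    """Rank pairs and reject weak ones.

    Stand-in ranking: averaged confidence, NOT the paper's entropy-based
    measure, which the abstract does not specify.
    """
    ranked = sorted(avg_conf.items(), key=lambda kv: kv[1], reverse=True)
    return {pair for pair, score in ranked[:max_pairs] if score >= min_score}


def hit_rate(pairs, errors):
    """Fraction of misclassified (true, predicted) samples covered by pairs."""
    errors = list(errors)
    if not errors:
        return 1.0
    return sum(1 for e in errors if e in pairs) / len(errors)


def relative_error_reduction(err_single, err_two_stage):
    """Relative drop in error rate from single-stage to two-stage scheme."""
    return (err_single - err_two_stage) / err_single


# Toy data (invented for illustration): two visually similar characters.
samples = [
    ("未", {"未": 0.6, "末": 0.3, "木": 0.1}),
    ("未", {"末": 0.5, "未": 0.4, "木": 0.1}),
    ("末", {"末": 0.7, "未": 0.2, "本": 0.1}),
]
avg = average_confusion(samples, top_k=2)
pairs = select_pairs(avg, max_pairs=1)
```

In this toy setup, `hit_rate(pairs, [("未", "末"), ("末", "未")])` measures how many of the two listed errors the single retained pair covers. The `max_pairs` parameter plays the role of the pair budget (30,000 in the reported experiments).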

Keywords:
This article is indexed in SpringerLink and other databases.
