Impact of online handwriting recognition performance on text categorization |
| |
Authors: | Sebastián Peña Saldarriaga, Christian Viard-Gaudin, Emmanuel Morin |
| |
Affiliation: | 1. LINA UMR CNRS 6241, Université de Nantes, Nantes, France; 2. IRCCyN UMR CNRS 6597, Université de Nantes, Nantes, France |
| |
Abstract: | Today, there is an increasing demand for efficient archival and retrieval methods for online handwritten data. For such tasks,
text categorization is of particular interest. The textual data available in online documents can be extracted through online
handwriting recognition; however, this process introduces errors into the resulting text. This work reports experiments on the
categorization of online handwritten documents based on their textual content. We analyze the effect of word recognition
errors on categorization performance by comparing the performance of a categorization system on texts obtained
through online handwriting recognition with its performance on the same texts available as ground truth. Two well-known categorization algorithms
(kNN and SVM) are compared in this work. A subset of the Reuters-21578 corpus consisting of more than 2,000 handwritten documents
has been collected for this study. Results show that the loss in classification rate is not significant, and that the loss in precision is only
significant for recall values of 60–80%, depending on the noise level. |
| |
Keywords: | |
This article has been indexed by SpringerLink and other databases. |
|
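The comparison described in the abstract (the same categorizer evaluated on ground-truth text and on recognizer output) can be illustrated with a minimal sketch. This is not the authors' system: it assumes a 1-nearest-neighbour classifier with cosine similarity over raw word counts, and the documents, labels, and recognition errors below are invented toy data rather than Reuters-21578 material. All function and variable names are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lower-case, whitespace-tokenized word counts (toy preprocessing)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_1nn(train, text):
    """Return the label of the most similar training document."""
    query = bag_of_words(text)
    best_text, best_label = max(
        train, key=lambda item: cosine(bag_of_words(item[0]), query)
    )
    return best_label


def accuracy(train, test):
    """Fraction of test documents whose predicted label is correct."""
    hits = sum(predict_1nn(train, text) == label for text, label in test)
    return hits / len(test)


# Invented training documents (assumed to be clean, typed text).
train = [
    ("oil prices rise crude barrel", "crude"),
    ("crude oil output opec barrel", "crude"),
    ("wheat grain harvest tonnes export", "grain"),
    ("grain corn crop tonnes farmers", "grain"),
]

# The same test documents twice: as ground truth, and with invented
# word substitutions standing in for handwriting recognition errors.
ground_truth = [
    ("opec raises crude oil prices", "crude"),
    ("wheat and corn harvest tonnes", "grain"),
    ("grain export crop", "grain"),
]
recognized = [
    ("open raises crude oil prices", "crude"),
    ("wheel and com harvest tonnes", "grain"),
    ("gram expert chop", "grain"),
]

clean_acc = accuracy(train, ground_truth)
noisy_acc = accuracy(train, recognized)
print(f"ground truth accuracy: {clean_acc:.2f}")
print(f"recognized accuracy:   {noisy_acc:.2f}")
print(f"classification rate loss: {clean_acc - noisy_acc:.2f}")
```

In this toy setting, documents that keep some of their informative words after recognition are still matched to the right category, while a document whose words are all misrecognized no longer overlaps with any training text. The figures it prints say nothing about the results reported in the paper.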