A video-based framework for the analysis of presentations/posters |
| |
Authors: | A. Zandifar, R. Duraiswami, L. S. Davis |
| |
Affiliation: | (1) Perceptual Interfaces and Reality Lab (PIRL), University of Maryland, College Park, MD 20742, USA |
| |
Abstract: | Detection and recognition of textual information in an image or video sequence is important for many applications. The increased resolution and capabilities of digital cameras and faster mobile processing allow for the development of interesting systems. We present an application based on the capture of information presented at a slide-show presentation or at a poster session. We describe the development of a system to process the textual and graphical information in such presentations. The application integrates video and image processing, document layout understanding, optical character recognition (OCR), and pattern recognition. The digital imaging device captures slides/poster images, and the computing module preprocesses and annotates the content. Various problems related to metric rectification, key-frame extraction, text detection, enhancement, and system integration are addressed. The results are promising for applications such as a mobile text reader for the visually impaired. By using powerful text-processing algorithms, we can extend this framework to other applications, e.g., document and conference archiving, camera-based semantics extraction, and ontology creation. Received: 18 December 2003, Revised: 1 November 2004, Published online: 2 February 2005 |
| |
Keywords: | |
This article is indexed in SpringerLink and other databases. |
|