A generalized temporal context model for classifying image collections |
| |
Authors: | Matthew Boutell, Jiebo Luo, Christopher Brown |
| |
Affiliation: | (1) Department of Computer Science, University of Rochester, Rochester, New York, USA; (2) Research and Development Laboratories, Eastman Kodak Company, New York, USA |
| |
Abstract: | Semantic scene classification is an open problem in computer vision, especially when information from only a single image
is employed. In applications involving image collections, however, images are clustered sequentially, allowing surrounding
images to be used as temporal context. We present a general probabilistic temporal context model in which the first-order
Markov property is used to integrate content-based and temporal context cues. The model uses elapsed-time-dependent transition probabilities between images to encode the fact that images captured within a shorter period of time are more
likely to be related. This model is generalized in that it allows arbitrary elapsed time between images, making it suitable
for classifying image collections. In addition, we derived a variant of this model to use in ordered image collections for
which no timestamp information is available, such as film scans. We applied the proposed context models to two problems, achieving
significant gains in accuracy in both cases. The two algorithms used to implement inference within the context model, Viterbi
and belief propagation, yielded similar results with a slight edge to belief propagation.
Matthew Boutell received the BS degree in Mathematical Science from Worcester Polytechnic Institute, Massachusetts, in 1993, the MEd degree
from University of Massachusetts at Amherst in 1994, and the PhD degree in Computer Science from the University of Rochester,
Rochester, NY, in 2005. He served for several years as a mathematics and computer science instructor at Norton High School
and Stonehill College and as a research intern/consultant at Eastman Kodak Company. Currently, he is Assistant Professor of
Computer Science and Software Engineering at Rose-Hulman Institute of Technology in Terre Haute, Indiana. His research interests
include image understanding, machine learning, and probabilistic modeling.
Jiebo Luo received his PhD degree in Electrical Engineering from the University of Rochester, Rochester, NY in 1995. He is a Senior
Principal Scientist with the Kodak Research Laboratories.
He was a member of the Organizing Committee of the 2002 IEEE International Conference on Image Processing and the 2006 IEEE International
Conference on Multimedia and Expo, a guest editor for the Journal of Wireless Communications and Mobile Computing Special
Issue on Multimedia Over Mobile IP and the Pattern Recognition journal Special Issue on Image Understanding for Digital Photos,
and a Member of the Kodak Research Scientific Council.
He is on the editorial boards of the IEEE Transactions on Multimedia, Pattern Recognition, and Journal of Electronic Imaging.
His research interests include image processing, pattern recognition, computer vision, medical imaging, and multimedia communication.
He has authored over 100 technical papers and holds over 30 granted US patents. He is a Kodak Distinguished Inventor and a
Senior Member of the IEEE.
Chris Brown (BA Oberlin 1967, PhD University of Chicago 1972) is Professor of Computer Science at the University of Rochester.
He has published in many areas of computer vision and robotics. He wrote COMPUTER VISION with his colleague Dana Ballard,
and influential work on the “active vision” paradigm was reported in two special issues of the International Journal of Computer
Vision. He edited the first two volumes of ADVANCES IN COMPUTER VISION for Erlbaum and (with D. Terzopoulos) REAL-TIME COMPUTER
VISION, from Cambridge University Press. He is the co-editor of VIDERE, the first entirely on-line refereed computer vision
journal (MIT Press).
His most recent PhD students have done research in infrared tracking and face recognition, features and strategies for image
understanding, augmented reality, and three-dimensional reconstruction algorithms.
He supervised the undergraduate team that twice won the AAAI Host Robot competition (and came third in the Robot Rescue competition
in 2003). |
| |
Keywords: | Semantic scene classification; Content-based cues; Temporal context cues; Hidden Markov Model; Camera metadata |
This article is indexed in SpringerLink and other databases. |
|
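To make the idea in the abstract concrete, the following is a minimal sketch of Viterbi decoding over an image sequence in which the transition probabilities depend on the elapsed time between consecutive images. It is not the authors' implementation: the exponential blend between a short-gap transition matrix and the class prior, the time constant `tau`, and the names `transition_matrix`, `viterbi`, `BASE`, `PRIOR`, and `LIKELIHOODS` are all assumptions made for illustration. The abstract does not specify the form of the transition function, and belief propagation is not shown.

```python
import math


def transition_matrix(dt, base, prior, tau=600.0):
    """Hypothetical elapsed-time-dependent transitions (an assumed form).

    For a short gap dt (seconds) the matrix is close to `base`, which favors
    staying in the same class; for a long gap it decays toward the class
    prior, so neighboring images stop influencing each other.
    """
    w = math.exp(-dt / tau)  # weight 1.0 at dt = 0, approaching 0.0 for long gaps
    n = len(prior)
    return [[w * base[i][j] + (1.0 - w) * prior[j] for j in range(n)]
            for i in range(n)]


def _log(p):
    """Logarithm that maps zero probability to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(likelihoods, gaps, base, prior, tau=600.0):
    """Return the most probable class sequence under a first-order Markov model.

    likelihoods[t][s]: content-based evidence for class s on image t.
    gaps[t]: elapsed seconds between image t and image t + 1.
    """
    n = len(prior)
    score = [_log(prior[s]) + _log(likelihoods[0][s]) for s in range(n)]
    backpointers = []
    for t in range(1, len(likelihoods)):
        a = transition_matrix(gaps[t - 1], base, prior, tau)
        new_score, pointers = [], []
        for j in range(n):
            # Best predecessor class for class j at step t.
            best = max(range(n), key=lambda i: score[i] + _log(a[i][j]))
            pointers.append(best)
            new_score.append(score[best] + _log(a[best][j])
                             + _log(likelihoods[t][j]))
        score = new_score
        backpointers.append(pointers)
    # Trace the best path backward from the best final class.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Illustrative two-class problem (0 = indoor, 1 = outdoor); values are made up.
BASE = [[0.9, 0.1], [0.1, 0.9]]
PRIOR = [0.5, 0.5]
LIKELIHOODS = [[0.9, 0.1], [0.4, 0.6], [0.9, 0.1]]  # middle image is ambiguous

print(viterbi(LIKELIHOODS, [10, 10], BASE, PRIOR))    # [0, 0, 0]
print(viterbi(LIKELIHOODS, [1e6, 1e6], BASE, PRIOR))  # [0, 1, 0]
```

With gaps of ten seconds, the ambiguous middle image is pulled toward the class of its neighbors. With very long gaps, the transitions flatten to the prior and each image is effectively classified on its own content-based evidence.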