A generalized temporal context model for classifying image collections
Authors: Matthew Boutell, Jiebo Luo, Christopher Brown
Affiliation: (1) Department of Computer Science, University of Rochester, Rochester, New York, USA; (2) Research and Development Laboratories, Eastman Kodak Company, New York, USA
Abstract: Semantic scene classification is an open problem in computer vision, especially when information from only a single image is employed. In applications involving image collections, however, images are clustered sequentially, allowing surrounding images to be used as temporal context. We present a general probabilistic temporal context model in which the first-order Markov property is used to integrate content-based and temporal context cues. The model uses elapsed time-dependent transition probabilities between images to enforce the fact that images captured within a shorter period of time are more likely to be related. This model is generalized in that it allows arbitrary elapsed time between images, making it suitable for classifying image collections. In addition, we derived a variant of this model to use in ordered image collections for which no timestamp information is available, such as film scans. We applied the proposed context models to two problems, achieving significant gains in accuracy in both cases. The two algorithms used to implement inference within the context model, Viterbi and belief propagation, yielded similar results with a slight edge to belief propagation.
Matthew Boutell received the BS degree in Mathematical Science from Worcester Polytechnic Institute, Massachusetts, in 1993, the MEd degree from University of Massachusetts at Amherst in 1994, and the PhD degree in Computer Science from the University of Rochester, Rochester, NY, in 2005. He served for several years as a mathematics and computer science instructor at Norton High School and Stonehill College and as a research intern/consultant at Eastman Kodak Company. Currently, he is Assistant Professor of Computer Science and Software Engineering at Rose-Hulman Institute of Technology in Terre Haute, Indiana. His research interests include image understanding, machine learning, and probabilistic modeling.
Jiebo Luo received his PhD degree in Electrical Engineering from the University of Rochester, Rochester, NY in 1995. He is a Senior Principal Scientist with the Kodak Research Laboratories. He was a member of the Organizing Committee of the 2002 IEEE International Conference on Image Processing and 2006 IEEE International Conference on Multimedia and Expo, a guest editor for the Journal of Wireless Communications and Mobile Computing Special Issue on Multimedia Over Mobile IP and the Pattern Recognition journal Special Issue on Image Understanding for Digital Photos, and a Member of the Kodak Research Scientific Council. He is on the editorial boards of the IEEE Transactions on Multimedia, Pattern Recognition, and Journal of Electronic Imaging. His research interests include image processing, pattern recognition, computer vision, medical imaging, and multimedia communication. He has authored over 100 technical papers and holds over 30 granted US patents. He is a Kodak Distinguished Inventor and a Senior Member of the IEEE.
Chris Brown (BA Oberlin 1967, PhD University of Chicago 1972) is Professor of Computer Science at the University of Rochester. He has published in many areas of computer vision and robotics. He wrote COMPUTER VISION with his colleague Dana Ballard, and influential work on the “active vision” paradigm was reported in two special issues of the International Journal of Computer Vision. He edited the first two volumes of ADVANCES IN COMPUTER VISION for Erlbaum and (with D. Terzopoulos) REAL-TIME COMPUTER VISION, from Cambridge University Press. He is the co-editor of VIDERE, the first entirely on-line refereed computer vision journal (MIT Press). His most recent PhD students have done research in infrared tracking and face recognition, features and strategies for image understanding, augmented reality, and three-dimensional reconstruction algorithms. He supervised the undergraduate team that twice won the AAAI Host Robot competition (and came third in the Robot Rescue competition in 2003).
Keywords: Semantic scene classification; Content-based cues; Temporal context cues; Hidden Markov Model; Camera metadata
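The idea summarized in the abstract, a first-order Markov chain over scene labels whose transition probabilities depend on the elapsed time between consecutive images, decoded with Viterbi, can be illustrated with a minimal sketch. Everything specific below is an assumption made for illustration and is not taken from the paper: the function names `transition_matrix` and `viterbi`, the exponential decay with time constant `tau`, the uniform class prior, and the toy evidence values. This record does not state how the authors obtained their transition probabilities.

```python
import math


def transition_matrix(dt_seconds, n_classes, tau=600.0):
    """Hypothetical elapsed-time-dependent transition matrix.

    Blends "stay in the same class" (identity) with "no information"
    (uniform). The identity weight decays exponentially with elapsed
    time, so images taken close together are assumed to be more related.
    """
    w = math.exp(-dt_seconds / tau)
    uniform = 1.0 / n_classes
    return [
        [w * (1.0 if i == j else 0.0) + (1.0 - w) * uniform
         for j in range(n_classes)]
        for i in range(n_classes)
    ]


def viterbi(evidence, timestamps, prior=None, tau=600.0):
    """Most likely label sequence under a first-order Markov chain.

    evidence[t][c] is the content-based score of image t for class c,
    treated here as an emission likelihood. timestamps[t] is the capture
    time of image t in seconds.
    """
    n, k = len(evidence), len(evidence[0])
    if prior is None:
        prior = [1.0 / k] * k  # assumed uniform prior
    eps = 1e-12  # guards against log(0)

    # Log-domain scores for the first image.
    score = [math.log(prior[c] + eps) + math.log(evidence[0][c] + eps)
             for c in range(k)]
    backpointers = []

    for t in range(1, n):
        # The transition matrix is rebuilt for each pair of images
        # from the time gap between them.
        A = transition_matrix(timestamps[t] - timestamps[t - 1], k, tau)
        new_score, ptr = [], []
        for j in range(k):
            best_i = max(range(k),
                         key=lambda i: score[i] + math.log(A[i][j] + eps))
            new_score.append(score[best_i]
                             + math.log(A[best_i][j] + eps)
                             + math.log(evidence[t][j] + eps))
            ptr.append(best_i)
        score = new_score
        backpointers.append(ptr)

    # Trace the best path backwards from the best final state.
    path = [max(range(k), key=lambda c: score[c])]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


# Toy evidence for three images and two classes (0 and 1).
# The middle image is ambiguous and leans slightly toward class 1.
evidence = [[0.9, 0.1], [0.45, 0.55], [0.9, 0.1]]

print(viterbi(evidence, [0, 10, 20]))    # images 10 s apart
print(viterbi(evidence, [0, 1e6, 2e6]))  # images very far apart
```

In this toy setting, the closely spaced sequence is labeled `[0, 0, 0]`: the neighbors override the weak evidence on the middle image. With very large gaps the transition matrix becomes uniform, each image is effectively classified on its own, and the result is `[0, 1, 0]`. Belief propagation, the other inference method named in the abstract, is not shown here.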
This article is indexed in SpringerLink and other databases.