Finding the most unusual time series subsequence: algorithms and applications |
| |
Authors: | Eamonn Keogh Jessica Lin Sang-Hee Lee Helga Van Herle |
| |
Affiliation: | (1) Department of Computer Science and Engineering, University of California, Riverside, CA, USA;(2) Department of Information and Software Engineering, George Mason University, Fairfax, VA, USA;(3) Anthropology Department, University of California, Riverside, CA, USA;(4) David Geffen School of Medicine, University of California, Los Angeles, CA, USA |
| |
Abstract: | In this work we introduce the new problem of finding time seriesdiscords. Time series discords are subsequences of longer time series that are maximally different to all the rest of the time series subsequences. They thus capture the sense of the most unusual subsequence within a time series. While discords have many uses for data mining, they are particularly attractive as anomaly detectors because they only require one intuitive parameter (the length of the subsequence) unlike most anomaly detection algorithms that typically require many parameters. While the brute force algorithm to discover time series discords is quadratic in the length of the time series, we show a simple algorithm that is three to four orders of magnitude faster than brute force, while guaranteed to produce identical results. We evaluate our work with a comprehensive set of experiments on diverse data sources including electrocardiograms, space telemetry, respiration physiology, anthropological and video datasets.
Eamonn Keogh is an Assistant Professor of computer science at the University of California, Riverside. His research interests include data mining, machine learning and information retrieval. Several of his papers have won best paper awards, including papers at SIGKDD and SIGMOD. Dr. Keogh is the recipient of a 5-year NSF Career Award for “Efficient discovery of previously unknown patterns and relationships in massive time series databases.”
Jessica Lin is an Assistant Professor of information and software engineering at George Mason University. She received her Ph.D. from the University of California, Riverside. Her research interests include data mining and informational retrieval.
Sang-Hee Lee is a paleoanthropologist at the University of California, Riverside. Her research interests include the evolution of human morphological variation and how different mechanisms (such as taxonomy, sex, age, and time) explain what is observed in fossil data. Dr. Lee obtained her Ph.D. in anthropology from the University of Michigan in 1999.
Helga Van Herle is an Assistant Clinical Professor of medicine at the Division of Cardiology of the Geffen School of Medicine at UCLA. She received her M.D. from UCLA in 1993; completed her residency in internal medicine at the New York Hospital (Cornell University, 1993–1996) and her cardiology fellowship at UCLA (1997–2001). Dr. Van Herle holds a M.Sc. in bioengineering from Columbia University (1987) and a B.Sc. in Chemical Engineering from UCLA (1985) |
| |
Keywords: | Time series data mining Anomaly detection Clustering |
本文献已被 SpringerLink 等数据库收录! |
|