Real-time data mining of massive data streams from synoptic sky surveys
Affiliation: 1. California Institute of Technology, Pasadena, CA 91125, USA; 2. Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA 91109, USA
Abstract: The nature of scientific and technological data collection is evolving rapidly: data volumes and rates grow exponentially, with increasing complexity and information content, and there has been a transition from static data sets to data streams that must be analyzed in real time. Interesting or anomalous phenomena must be quickly characterized and followed up with additional measurements via optimal deployment of limited assets. Modern astronomy presents a variety of such phenomena in the form of transient events in digital synoptic sky surveys, including cosmic explosions (supernovae, gamma-ray bursts), relativistic phenomena (black hole formation, jets), and potentially hazardous asteroids. We have been developing a set of machine learning tools to detect, classify, and plan a response to transient events for astronomy applications, using the Catalina Real-time Transient Survey (CRTS) as a scientific and methodological testbed. The ability to respond rapidly to the potentially most interesting events is a key bottleneck that limits the scientific returns from current and anticipated synoptic sky surveys. Similar challenges arise in other contexts, from environmental monitoring using sensor networks to autonomous spacecraft systems. Given the exponential growth of data rates and the need for a time-critical response, a fully automated and robust approach is required. We describe the results obtained to date and possible future developments.
Keywords: Sky surveys; Massive data streams; Machine learning; Bayesian methods; Automated decision making
This article is indexed in ScienceDirect and other databases.