Multiple query scheduling for distributed semantic caches
Authors: Beomseok Nam (1), Minho Shin (2), Henrique Andrade (3), Alan Sussman (4)
Affiliations: 1. Oracle, 100 Oracle Parkway, Redwood Shores, CA 94065, United States; 2. ISTS, Dartmouth College, Hanover, NH 03755, United States; 3. IBM T. J. Watson Research Center, 19 Skyline Drive, Hawthorne, NY 10532, United States; 4. Department of Computer Science, University of Maryland, College Park, MD 20742, United States
Abstract: In distributed query processing systems, load balancing plays an important role in maximizing system throughput. When queries can leverage cached intermediate results, improving the cache hit ratio becomes as important as load balancing in query scheduling, especially when dealing with computationally expensive queries. The scheduling policies must be designed to take into consideration the dynamic contents of the distributed caching infrastructure. In this paper, we propose and discuss several distributed query scheduling policies that directly consider the available cache contents by employing distributed multidimensional indexing structures and an exponential moving average approach to predicting cache contents. These approaches are shown to produce better query plans and faster query response times than traditional scheduling policies that do not predict dynamic contents in distributed caches. We experimentally demonstrate the utility of the scheduling policies using MQO, which is a distributed, Grid-enabled, multiple query processing middleware system we developed to optimize query processing for data analysis and visualization applications.
Keywords: Multiple query optimization; Distributed query scheduling; Data intensive computing
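
The abstract's idea of predicting cache contents with an exponential moving average can be illustrated with a minimal sketch. This is not the paper's algorithm or the MQO API: the class `EmaScheduler`, its parameters `alpha` and `load_weight`, and the choice to summarize each query by the center point of its multidimensional range are all assumptions made for illustration. Each server keeps an EMA of the query centers it has recently been assigned, which serves as a rough guess at what its cache holds, and a new query goes to the server whose EMA point is nearest, with an optional penalty for pending work.

```python
import math
from typing import List, Optional, Sequence


class EmaScheduler:
    """Hypothetical scheduler: one EMA point per server approximates its cache."""

    def __init__(self, n_servers: int, alpha: float = 0.3, load_weight: float = 0.0):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha              # weight given to the newest query (assumed)
        self.load_weight = load_weight  # cost per pending query (assumed)
        self.ema: List[Optional[List[float]]] = [None] * n_servers
        self.pending: List[int] = [0] * n_servers

    def _cost(self, server: int, center: Sequence[float]) -> float:
        ema = self.ema[server]
        # A server with no history is treated as distance 0, so idle servers
        # are tried before an occupied server is reused.
        dist = 0.0 if ema is None else math.dist(ema, center)
        return dist + self.load_weight * self.pending[server]

    def assign(self, center: Sequence[float]) -> int:
        """Pick the cheapest server (lowest index wins ties) and update its EMA."""
        server = min(range(len(self.ema)), key=lambda s: self._cost(s, center))
        old = self.ema[server]
        if old is None:
            self.ema[server] = list(center)
        else:
            # ema_new = alpha * query + (1 - alpha) * ema_old, per dimension
            self.ema[server] = [
                self.alpha * c + (1.0 - self.alpha) * o for c, o in zip(center, old)
            ]
        self.pending[server] += 1
        return server

    def complete(self, server: int) -> None:
        """Record that one query on this server has finished."""
        self.pending[server] = max(0, self.pending[server] - 1)


if __name__ == "__main__":
    sched = EmaScheduler(n_servers=2, alpha=0.5)
    for q in [(0.0, 0.0), (10.0, 10.0), (1.0, 1.0), (9.0, 9.0)]:
        print(q, "-> server", sched.assign(q))
```

With `load_weight` at 0 the policy chases cache affinity only; raising it trades some predicted hits for a more even load, which is the tension the abstract describes between hit ratio and load balancing.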
This article is indexed in ScienceDirect and other databases.