Optimizing entity join queries when data transmission cost dominates
Authors: Pauray S. M. Tsai, Arbee L. P. Chen
Affiliation:

a Department of Information Management, Ming Hsin Institute of Technology & Commerce, Hsin-Feng, 304, Hsinchu, Taiwan, ROC

b Department of Computer Science, National Tsing Hua University, 300, Hsinchu, Taiwan, ROC

Abstract: Heterogeneities are inherent in a multidatabase environment. For example, a real-world entity may be represented differently in relations of different databases; in particular, the keys of these relations may be incompatible. In this paper, we consider the processing of entity join queries when data transmission cost dominates. An entity join operation ‘integrates’ tuples that represent the same entities from different relations, in which inconsistent data may exist. A natural way to process an entity join is to transmit both relations to one site, resolve the possible conflicts between corresponding attributes, and then perform the join, which is very costly. We propose an approach that correctly transforms a global query into local subqueries, so that entity join queries can be preprocessed at multiple sites with the aim of lowering the cost of data transmission. In addition, an extension of the traditional semijoin, named the extended semijoin, is proposed to further reduce the data transmission cost of entity join query processing.
Keywords: Entity join; Extended semijoin; Inconsistent data; Local processing; Multidatabase; Query optimization; Query transformation; Selectivity
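
For readers unfamiliar with the traditional semijoin that the abstract builds on, the following is a minimal sketch of the classical technique only. It is not the extended semijoin proposed in the paper, whose definition the abstract does not give. The sketch assumes the two relations share a compatible key and contain no conflicting data, which is precisely the assumption that entity joins relax. The relation contents, the function names (`ship_cost`, `join`, `semijoin_plan`), and the cost measure (number of attribute values shipped) are all hypothetical and chosen for illustration.

```python
# Classical semijoin sketch (illustrative; NOT the paper's extended semijoin).
# Relations are lists of dicts. Transmission cost is approximated, as an
# assumption, by the number of attribute values sent between sites.

def ship_cost(rows):
    """Cost proxy: total number of attribute values transmitted."""
    return sum(len(row) for row in rows)

def join(r, s, key):
    """Equi-join r and s on a shared, compatible key attribute."""
    index = {}
    for row in s:
        index.setdefault(row[key], []).append(row)
    return [{**a, **b} for a in r for b in index.get(a[key], [])]

def semijoin_plan(r, s, key):
    """Join r (at site A) with s (at site B) using a semijoin.

    Step 1: project r on the key and ship the key values to site B.
    Step 2: keep only the tuples of s whose key matches, ship them to site A.
    Step 3: perform the join at site A.
    Returns the join result and the total transmission cost.
    """
    keys = {row[key] for row in r}
    cost = len(keys)                       # step 1: key values sent A -> B
    reduced = [row for row in s if row[key] in keys]
    cost += ship_cost(reduced)             # step 2: reduced s sent B -> A
    return join(r, reduced, key), cost     # step 3: local join at site A

# Hypothetical example data held at two different sites.
r = [{"id": 1, "name": "Lee"}, {"id": 2, "name": "Wang"}]
s = [
    {"id": 2, "salary": 50, "dept": "IS"},
    {"id": 3, "salary": 60, "dept": "CS"},
    {"id": 4, "salary": 70, "dept": "CS"},
    {"id": 5, "salary": 80, "dept": "IS"},
]

result, semijoin_cost = semijoin_plan(r, s, "id")
naive_cost = ship_cost(s)  # shipping all of s to site A without reduction
```

In this toy setting the semijoin ships fewer values than sending the whole relation because few tuples of `s` have a matching key. When keys are incompatible or attribute values conflict across databases, a simple key-equality test like the one in step 2 is no longer sufficient, which is the gap the abstract's extended semijoin is meant to address.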
This article is indexed in ScienceDirect and other databases.