首页 | 本学科首页   官方微博 | 高级检索  
     

高速流环境下近似连续k代表轮廓查询算法
引用本文:朱睿,宋栿尧,王斌,杨晓春,张安珍,夏秀峰.高速流环境下近似连续k代表轮廓查询算法[J].软件学报,2023,34(3):1425-1450.
作者姓名:朱睿  宋栿尧  王斌  杨晓春  张安珍  夏秀峰
作者单位:沈阳航空航天大学 计算机学院, 辽宁 沈阳 110136;东北大学 计算机科学与工程学院, 辽宁 沈阳 110169
基金项目:国家自然科学基金(62102271,62072088,61991404);国家重点研发计划(2020YFB1707901);沈阳市创新人才项目(RC200439)
摘    要:k代表轮廓查询是从传统轮廓查询中衍生出来的一类查询.给定多维数据集合D,轮廓查询从D中找到所有不被其他对象支配的对象,将其返回给用户,便于用户结合自身偏好选择高质量对象.然而,轮廓对象规模通常较大,用户需要从大量数据中进行选择,导致选择速度和质量无法得到保证.与传统轮廓查询相比,k代表轮廓查询从所有轮廓对象中选择“代表性”最强的k个对象返回给用户,有效地解决了传统轮廓查询存在的这一问题.给定滑动窗口W和连续查询q,q监听窗口中的数据.当窗口滑动时,查询q返回窗口中,组合支配面积最大的k个对象.现有算法的核心思想是:实时监测当前窗口中的轮廓对象集合,当轮廓对象集合更新时,算法更新k代表轮廓.然而,实时监测窗口中,轮廓集合的计算代价通常较大.此外,当轮廓集合规模较大时,从中选择k代表轮廓的计算代价是同样巨大的,导致已有算法无法在高速流环境下使用.针对上述问题,提出了ρ-近似k代表轮廓查询.为了支持该查询,提出了查询处理框架PAKRS(predict-basedapproximatekrepresentativeskyline).首先,PAKRS利用高速流的特性对当前窗口进行划分,根据划分结...

关 键 词:轮廓查询  k代表轮廓查询  滑动窗口  分片  高速流
收稿时间:2021/11/25 0:00:00
修稿时间:2022/4/27 0:00:00

Approximate Continuous k Representative Skyline Query Algorithm over High-Speed Streaming Data Environment
ZHU Rui,SONG Fu-Yao,WANG Bin,YANG Xiao-Chun,ZHANG An-Zhen,XIA Xiu-Feng.Approximate Continuous k Representative Skyline Query Algorithm over High-Speed Streaming Data Environment[J].Journal of Software,2023,34(3):1425-1450.
Authors:ZHU Rui  SONG Fu-Yao  WANG Bin  YANG Xiao-Chun  ZHANG An-Zhen  XIA Xiu-Feng
Affiliation:School of Computer Science, Shenyang Aerospace University, Shenyang 110136, China;School of Computer Science and Engineering, Northeastern University, Shenyang 110169, China
Abstract:k representative skyline query is a type of query derived from traditional skyline query. Given a set of d-dimensional dataset D, a skyline query finds all objects in D that are not dominated by other ones, which helps users to select high-quality objects based on their preference. However, the scale of skyline objects may be large in many cases, users have to choose target objects from a large number of objects, leading that both the selection speed and quality cannot be guaranteed. Compared with traditional skyline query, k representative skyline query chooses the most "representative" k objects from all skyline objects, which effectively solves such problem causes by traditional skyline query. Given the sliding window W and a continuous query q, q monitors objects in the window. When the window slides, q returns k skyline objects with the largest group dominance size in the window. The key behind existing algorithms is to monitor skyline objects in the current window. When the skyline set is updated, the algorithm updates k representative skyline set. However, the cost of monitoring skyline set is usually high. When the skyline set scale is large, the computational cost of choosing k representative skyline objects is also high. Thus, existing algorithms cannot efficiently work under high-speed stream environment. This study proposes a query named r-approximate k representative skyline query. In order to support this type of queries, a novel framework is proposed named PAKRS (predict-based approximate k representative skyline). Firstly, PAKRS partitions the current window into a group of sub-windows. Next, the predicted result sets of a few future windows are constructed according to the partition result. In this way, the earliest moment can be predicted when new arrived objects may become skyline objects. Secondly, an index is proposed named r-GRID, which can help PAKRS to select r-approximate k representative skyline with O(k/s+k/m) computational cost under 2-dimensional space, and O(2Ld/m+2Ld/s) computational cost under d-dimensional space (d>2), where L is a little integer smaller than k. Theoretical analysis shows that the computational complexity of PAKRS is lower than the state-of-the-art efforts. Extensive experiments have been conducted to confirm the efficiency and effectiveness of the proposed algorithms. Experimental results show that the running time of PAKRS is about 1/4 times of PBA (prefix-based algorithm), algorithm 1/6 times of GA (greedy algorithm) and about 1/3 times of e-GA (e-constraint greedy algorithm).
Keywords:skyline query  k representative skyline query  sliding window  partition  high-speed streaming
点击此处可从《软件学报》浏览原始摘要信息
点击此处可从《软件学报》下载全文
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号