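A Flink-based parallel algorithm for solving k-dominant skyline bodies
Cite this article:SUN Guo-zhang,HUANG Shan,ALKAM Zabibul,XU Hao-tong,DUAN Xiao-dong.A Flink-based parallel algorithm for solving k-dominant skyline bodies[J].Computer Engineering & Science,2023,45(1):17-27.
Authors:SUN Guo-zhang  HUANG Shan  ALKAM Zabibul  XU Hao-tong  DUAN Xiao-dong
Affiliation:(1.College of Computer Science and Engineering,Dalian Minzu University,Dalian,Liaoning 116600; 2.State Ethnic Affairs Commission Key Laboratory of Big Data Applied Technology (Dalian Minzu University),Dalian,Liaoning 116600; 3.Dalian Key Laboratory of Digital Technology for National Culture (Dalian Minzu University),Dalian,Liaoning 116600)
Funding:National Key Research and Development Program of China (2018YFB1004402)
Abstract:The k-dominant skyline algorithm relaxes the dominance relationship between data points, which makes it better suited to high-dimensional data. The k-dominant skyline body supports multiple users issuing k-dominant skyline queries, but existing algorithms for computing it leave room for improvement in both time efficiency and code extensibility. This paper therefore proposes MKSSOA, an optimized algorithm for solving multi-user k-dominant skyline bodies. MKSSOA stores a candidate set and an intermediate set separately for each user. During the k-dominance check, it uses the order in which data points appear in the two sets to move the points of the candidate set that are not k-dominant skyline points into the corresponding user's intermediate set, where the next user can filter and reuse them. This reduces the number of comparisons between data points, avoids repeated computation, and so improves query efficiency. The paper also proposes MKSPSA, a parallel algorithm for solving multi-user k-dominant skyline bodies, which uses the Apache Flink parallel processing framework to reduce the time spent comparing data points. Theoretical analysis and experimental results show that the proposed algorithms are efficient and handle the multi-user k-dominant skyline problem well.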
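
The relaxed dominance relation that the abstract builds on can be illustrated with a small sketch. It uses the standard definition of k-dominance (a point p k-dominates q if p is no worse than q in at least k dimensions and strictly better in at least one of them) and assumes that smaller values are better. The function names `k_dominates` and `k_dominant_skyline` are chosen here for illustration; this is a naive quadratic baseline, not MKSSOA.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def k_dominates(p: Sequence[float], q: Sequence[float], k: int) -> bool:
    """Return True if p k-dominates q (assumption: smaller is better)."""
    # Dimensions where p is at least as good as q.
    not_worse = sum(a <= b for a, b in zip(p, q))
    # Dimensions where p is strictly better; each one is also "not worse",
    # so a qualifying set of k dimensions exists whenever both tests pass.
    strictly_better = sum(a < b for a, b in zip(p, q))
    return not_worse >= k and strictly_better >= 1


def k_dominant_skyline(points: List[Point], k: int) -> List[Point]:
    """Naive baseline: keep every point that no other point k-dominates."""
    return [p for p in points
            if not any(k_dominates(q, p, k) for q in points)]


points = [(1, 1, 5), (2, 2, 1), (3, 3, 3)]
print(k_dominant_skyline(points, 3))  # conventional skyline (k = d)
print(k_dominant_skyline(points, 2))  # relaxed dominance, smaller result
```

With k equal to the number of dimensions the relation reduces to conventional dominance. Lowering k lets more points dominate one another, which is why the result set shrinks on high-dimensional data.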

Keywords:k-dominant  skyline query  multi-user  Apache Flink  parallel query
Received:2022-09-05
Revised:2022-10-21

A k-dominant skyline body parallel solving algorithm based on Flink
SUN Guo-zhang,HUANG Shan,ALKAM Zabibul,XU Hao-tong,DUAN Xiao-dong.A k-dominant skyline body parallel solving algorithm based on Flink[J].Computer Engineering & Science,2023,45(1):17-27.
Authors:SUN Guo-zhang  HUANG Shan  ALKAM Zabibul  XU Hao-tong  DUAN Xiao-dong
Affiliation:(1.College of Computer Science and Engineering,Dalian Minzu University,Dalian 116600; 2.State Ethnic Affairs Commission Key Laboratory of Big Data Applied Technology (Dalian Minzu University),Dalian 116600; 3.Dalian Key Laboratory of Digital Technology for National Culture (Dalian Minzu University),Dalian 116600,China)
Abstract:The k-dominant skyline algorithm relaxes the dominance relationship between data points and is therefore better suited to high-dimensional data. The k-dominant skyline body supports multiple users issuing k-dominant skyline queries, but existing algorithms for computing it leave room for improvement in both time efficiency and code extensibility. This paper therefore proposes an optimized algorithm for solving multi-user k-dominant skyline bodies. The algorithm stores a candidate set and an intermediate set separately for each user. During the k-dominance check, it uses the order in which data points appear in the two sets to move the points of the candidate set that are not k-dominant skyline points into the corresponding user's intermediate set, where the next user can filter and reuse them. This reduces the number of comparisons between data points, avoids repeated computation, and improves query efficiency. A multi-user k-dominant skyline body parallel solving algorithm is also proposed, which reduces the comparison time of data points by means of the Apache Flink parallel processing framework. Theoretical analysis and experimental results show that the proposed algorithms are efficient and handle the multi-user k-dominant skyline problem well.
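
As a rough illustration of why partitioned, parallel evaluation of k-dominant skylines needs care, the sketch below splits the data into partitions, filters each one locally in a worker thread, and then validates the survivors globally. It is a generic partition-and-verify scheme written in plain Python, not the paper's algorithm and not Flink code; `local_candidates`, `parallel_k_skyline`, and the `workers` parameter are hypothetical names, and smaller values are assumed to be better. Because k-dominance is not transitive, a point discarded locally can still k-dominate a candidate from another partition, so the global step compares candidates against all points of the other partitions, not only against other candidates.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def k_dominates(p: Sequence[float], q: Sequence[float], k: int) -> bool:
    """Return True if p k-dominates q (assumption: smaller is better)."""
    not_worse = sum(a <= b for a, b in zip(p, q))
    strictly_better = sum(a < b for a, b in zip(p, q))
    return not_worse >= k and strictly_better >= 1


def local_candidates(partition: List[Point], k: int) -> List[Point]:
    """Local phase: drop points k-dominated inside their own partition."""
    return [p for p in partition
            if not any(k_dominates(q, p, k) for q in partition)]


def parallel_k_skyline(partitions: List[List[Point]], k: int,
                       workers: int = 4) -> List[Point]:
    """Hypothetical partition-and-verify sketch using a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        local = list(pool.map(lambda part: local_candidates(part, k),
                              partitions))
    result: List[Point] = []
    for i, candidates in enumerate(local):
        # Global phase: check against ALL points of the other partitions,
        # including points that were discarded by their local filter.
        others = [q for j, part in enumerate(partitions) if j != i
                  for q in part]
        result.extend(p for p in candidates
                      if not any(k_dominates(q, p, k) for q in others))
    return result


partitions = [[(1, 1, 5), (2, 2, 1)],
              [(3, 3, 3), (1, 2, 3), (2, 1, 3)]]
print(parallel_k_skyline(partitions, 2))
print(parallel_k_skyline(partitions, 3))
```

In this data set (1, 1, 5) survives the local filter of the first partition, yet (1, 2, 3) 2-dominates it even though (1, 2, 3) is itself eliminated inside the second partition. Merging only the local survivors would therefore return a wrong answer for k = 2.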
Keywords:k-dominant  skyline query  multi-user  Apache Flink  parallel query  