
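Dynamic Data Protection System for Open Big Data Environment (面向开放大数据环境的动态数据保护系统)
Cite this article: 屠要峰, 牛家浩, 王德政, 高洪, 徐进, 洪科, 阳方. 面向开放大数据环境的动态数据保护系统[J]. 软件学报, 2023, 34(3): 1213-1235.
Authors: TU Yao-Feng (屠要峰)  NIU Jia-Hao (牛家浩)  WANG De-Zheng (王德政)  GAO Hong (高洪)  XU Jin (徐进)  HONG Ke (洪科)  YANG Fang (阳方)
Affiliation: State Key Laboratory of Mobile Network and Mobile Multimedia Technology, Shenzhen, Guangdong 518057, China; ZTE Corporation, Nanjing, Jiangsu 210014, China
Funding: National Key Research and Development Program of China (2021YFB3101100)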
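Abstract: Big data has become a fundamental strategic resource for the nation, and open data sharing is the core of China's big data strategy. Cloud-native technology and the lakehouse architecture are reshaping big data infrastructure and driving data sharing and value dissemination. Both the big data industry and big data technology need stronger data security and data sharing capabilities as they develop. However, data security in open environments has become a bottleneck that constrains the development and use of big data technology. In both the open-source big data ecosystem and commercial big data systems, the resulting data security and privacy protection problems are increasingly prominent. A dynamic data protection system in an open big data environment faces challenges in data availability, processing efficiency, and system scalability. This paper proposes BDMasker, a dynamic data protection system for open big data environments. Through precise query analysis and query rewriting based on a query dependency model, BDMasker accurately perceives the original business request without altering it, so the entire dynamic masking process has zero impact on the business. A unified security policy framework for multiple engines extends dynamic data protection capabilities vertically and extends them horizontally across multiple computing engines. The system uses the distributed computing capability of big data execution engines to improve its data protection processing performance. Experimental results show that the precise SQL analysis and rewriting technology proposed in BDMasker is effective and that the system scales and performs well: in the TPC-DS and YCSB benchmarks, overall performance fluctuates within 3%.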

Keywords: big data  data masking  dynamic data masking  SQL rewriting  query dependency
Received: 2022/5/14
Revised: 2022/9/7

Dynamic Data Protection System for Open Big Data Environment
TU Yao-Feng, NIU Jia-Hao, WANG De-Zheng, GAO Hong, XU Jin, HONG Ke, YANG Fang. Dynamic Data Protection System for Open Big Data Environment[J]. Journal of Software, 2023, 34(3): 1213-1235.
Authors: TU Yao-Feng  NIU Jia-Hao  WANG De-Zheng  GAO Hong  XU Jin  HONG Ke  YANG Fang
Affiliation: State Key Laboratory of Mobile Network and Mobile Multimedia Technology, Shenzhen, Guangdong 518057, China; ZTE Corporation, Nanjing, Jiangsu 210014, China
Abstract: Big data has become a fundamental national strategic resource, and the opening and sharing of data is at the core of China's big data strategy. Cloud-native technology and lakehouse architecture are reshaping big data infrastructure and promoting data sharing and value dissemination. The development of the big data industry and its technology requires stronger data security and data sharing capabilities. However, data security in open environments has become a bottleneck that restricts the development and use of big data technology. Data security and privacy protection issues are increasingly prominent in both the open-source big data ecosystem and commercial big data systems. Dynamic data protection systems in open big data environments face challenges in data availability, processing efficiency, and system scalability. This paper proposes BDMasker, a dynamic data protection system for open big data environments. Using precise query analysis and query rewriting based on a query dependency model, it accurately perceives the original business request without changing it, so that the entire dynamic masking process has zero impact on the business. Its unified multi-engine security policy framework enables vertical extension of dynamic data protection capabilities and horizontal extension across multiple computing engines. The distributed computing capability of the big data execution engine is used to improve the system's data protection processing performance. Experimental results show that the precise SQL analysis and rewriting technology proposed in BDMasker is effective, that the system has good scalability and performance, and that overall performance fluctuates within 3% in the TPC-DS and YCSB benchmark tests.
Keywords: big data  data masking  dynamic data masking  SQL rewriting  query dependency
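
The abstract describes dynamic masking through query rewriting that leaves the original business request unchanged. As a rough illustration only, and not BDMasker's query dependency model, the sketch below wraps an unmodified query in an outer SELECT that masks sensitive output columns. The policy dictionary, the function name `rewrite_with_masking`, the `users` table, and the choice of SQLite are all assumptions made for this example.

```python
import sqlite3

# Hypothetical policy: output column name -> SQL masking expression template.
MASK_POLICY = {
    "phone": "substr({col}, 1, 3) || '****' || substr({col}, 8)",
}


def rewrite_with_masking(conn, query, policy):
    """Wrap the untouched original query in an outer SELECT that masks
    any output column named in the policy."""
    # Discover the output column names without fetching any rows.
    cur = conn.execute(f"SELECT * FROM ({query}) LIMIT 0")
    columns = [d[0] for d in cur.description]

    projected = []
    for name in columns:
        quoted = '"' + name.replace('"', '""') + '"'
        if name in policy:
            # Apply the masking expression but keep the original column name.
            projected.append(f"{policy[name].format(col=quoted)} AS {quoted}")
        else:
            projected.append(quoted)

    # The original query text is embedded verbatim as a subquery.
    return f"SELECT {', '.join(projected)} FROM ({query}) AS original_query"


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, phone TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("Alice", "13812345678"), ("Bob", "13987654321")],
    )
    # The predicate is evaluated on raw values inside the subquery;
    # masking is applied only to the returned columns.
    original = "SELECT name, phone FROM users WHERE phone LIKE '138%'"
    masked_sql = rewrite_with_masking(conn, original, MASK_POLICY)
    return conn.execute(masked_sql).fetchall()


if __name__ == "__main__":
    print(demo())
```

Matching on output column names is the weak point of this sketch: a query such as `SELECT phone AS p FROM users` would bypass the policy. Tracing each output column back to the source columns it depends on is the kind of problem a query dependency analysis is meant to address.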