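AAC-Hunter: an efficient algorithm for discovering aggregation algebraic constraints in relational databases
Cite this article: ZHANG Xiaowei, JIANG Dawei, CHEN Ke, CHEN Gang. AAC-Hunter: an efficient algorithm for discovering aggregation algebraic constraints in relational databases[J]. Journal of Computer Applications, 2021, 41(3): 636-642. DOI: 10.11772/j.issn.1001-9081.2020091473
Authors: ZHANG Xiaowei  JIANG Dawei  CHEN Ke  CHEN Gang
Affiliation: 1. College of Computer Science and Technology, Zhejiang University, Hangzhou 310027; 2. Key Laboratory of Big Data Intelligent Computing of Zhejiang Province (Zhejiang University), Hangzhou 310027
Funding: Young Scientists Fund of the National Natural Science Foundation of China
Abstract: To better maintain data integrity in relational databases and to help auditors find non-compliant reimbursement records, AAC-Hunter, an algorithm that automatically discovers aggregation algebraic constraints (AACs), is proposed. An AAC is a fuzzy constraint defined between the aggregation results of two columns in a database, and it holds for most but not all records. AAC-Hunter first enumerates joins, groupings, and algebraic expressions to generate candidate AACs, and then computes for each of these candidate AA...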

Keywords: constraint discovery  aggregation algebraic constraint  relational database  data-driven  auditing
Received: 2020-09-07
Revised: 2020-10-30

AAC-Hunter: efficient algorithm for discovering aggregation algebraic constraints in relational databases
ZHANG Xiaowei, JIANG Dawei, CHEN Ke, CHEN Gang. AAC-Hunter: efficient algorithm for discovering aggregation algebraic constraints in relational databases[J]. Journal of Computer Applications, 2021, 41(3): 636-642. DOI: 10.11772/j.issn.1001-9081.2020091473
Authors: ZHANG Xiaowei  JIANG Dawei  CHEN Ke  CHEN Gang
Affiliation: 1. College of Computer Science and Technology, Zhejiang University, Hangzhou, Zhejiang 310027, China; 2. Key Laboratory of Big Data Intelligent Computing of Zhejiang Province (Zhejiang University), Hangzhou, Zhejiang 310027, China
Abstract: To better maintain data integrity in relational databases and to help auditors find anomalous reimbursement records, this paper proposes AAC-Hunter (Aggregation Algebraic Constraints Hunter), an algorithm that automatically discovers Aggregation Algebraic Constraints (AACs). An AAC is a fuzzy constraint defined between the aggregation results of two columns in a database; it holds for most, but not all, records. AAC-Hunter first enumerates joins, groupings, and algebraic expressions to generate candidate AACs, then calculates the value range set of each candidate, and finally outputs the resulting AACs. Because this basic procedure cannot cope with the performance challenges of massive data, AAC-Hunter applies a set of heuristic rules to shrink the candidate constraint space, and uses optimization strategies based on reusing intermediate results and eliminating trivial candidate AACs to speed up the value range set calculation. Experimental results on the TPC-H and European Soccer datasets show that, compared with a baseline algorithm without the heuristic rules or optimization strategies, AAC-Hunter reduces the constraint discovery space by 95.68% and 99.94% respectively and shortens running time by 96.58% and 92.51% respectively. These results verify the effectiveness of AAC-Hunter and indicate that it can improve the efficiency and capability of auditing applications.
Keywords: constraint discovery  Aggregation Algebraic Constraint (AAC)  relational database  data-driven  auditing
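
The enumerate-then-check pipeline described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not define the value range set precisely, so the sketch assumes it can be approximated by the narrowest interval covering a chosen fraction of groups. The function names (`aggregate`, `tightest_range`, `discover`), the `coverage` and `max_width` parameters, the two algebraic expressions, the use of SUM as the only aggregate, and the sample reimbursement rows are all hypothetical. Join enumeration, the heuristic rules, and the optimization strategies are omitted.

```python
import math
from collections import defaultdict
from itertools import combinations

# Hypothetical expression set; the abstract does not list the expressions used.
EXPRESSIONS = {
    "ratio": lambda a, b: a / b if b != 0 else None,
    "difference": lambda a, b: a - b,
}


def aggregate(rows, group_col, value_col):
    """Compute SUM(value_col) GROUP BY group_col over a list of dict rows."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_col]] += row[value_col]
    return dict(totals)


def tightest_range(values, coverage):
    """Return the narrowest interval holding at least `coverage` of the values."""
    ordered = sorted(values)
    n = len(ordered)
    k = min(n, max(1, math.ceil(coverage * n)))  # number of values to cover
    start = min(range(n - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[start], ordered[start + k - 1]


def discover(rows, group_cols, numeric_cols, coverage=0.75, max_width=0.05):
    """Enumerate (grouping, column pair, expression) candidates and keep those
    whose per-group values fall in a narrow range for most groups (assumed
    stand-in for the paper's value range set)."""
    found = []
    for group_col in group_cols:
        for col_a, col_b in combinations(numeric_cols, 2):
            agg_a = aggregate(rows, group_col, col_a)
            agg_b = aggregate(rows, group_col, col_b)
            for name, expr in EXPRESSIONS.items():
                values = {g: expr(agg_a[g], agg_b[g]) for g in agg_a}
                values = {g: v for g, v in values.items() if v is not None}
                if not values:
                    continue
                lo, hi = tightest_range(values.values(), coverage)
                if hi - lo <= max_width:
                    found.append({
                        "group_by": group_col,
                        "columns": (col_a, col_b),
                        "expression": name,
                        "range": (lo, hi),
                        # Groups outside the range are what an auditor would inspect.
                        "violators": sorted(
                            g for g, v in values.items() if not lo <= v <= hi
                        ),
                    })
    return found


# Made-up reimbursement rows: trip T4 claims more than was invoiced.
ROWS = [
    {"trip": "T1", "claimed": 100, "invoiced": 100},
    {"trip": "T1", "claimed": 50, "invoiced": 50},
    {"trip": "T2", "claimed": 80, "invoiced": 80},
    {"trip": "T3", "claimed": 200, "invoiced": 200},
    {"trip": "T3", "claimed": 30, "invoiced": 30},
    {"trip": "T4", "claimed": 300, "invoiced": 150},
]

for constraint in discover(ROWS, ["trip"], ["claimed", "invoiced"]):
    print(constraint)
```

In this toy setting the constraint holds for three of the four trips, and the remaining trip is reported as a violator, which mirrors the "most but not all records" property and the auditing use case mentioned in the abstract.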
This article is indexed in Wanfang Data and other databases.