
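top-k Aggregation Keyword Search over Relational Databases
Cite this article: Zhang Dongzhan, Su Zhifeng, Lin Ziyu, Xue Yongsheng. top-k Aggregation Keyword Search over Relational Databases[J]. Journal of Computer Research and Development, 2014, 51(4): 918-929.
Authors: Zhang Dongzhan, Su Zhifeng, Lin Ziyu, Xue Yongsheng
Affiliation: (Department of Computer Science, Xiamen University, Xiamen, Fujian 361005) (zdz@xmu.edu.cn)
Funding: Fundamental Research Funds for the Central Universities (2011121049); National Natural Science Foundation of China (61102136, 61202012); Natural Science Foundation of Fujian Province (2011J05156, 2011J05158); Natural Science Foundation of Fujian Province (2013J05099); National Natural Science Foundation of China (61303004)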
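Abstract: Keyword search over relational databases lets users query a relational database conveniently without having to master a structured query language or the database schema. Given a keyword query, existing methods follow the primary-foreign-key relationships in the database to retrieve sets of tuples that contain the keywords. In many real applications, however, the aggregation result of such tuple sets is more valuable to users. This paper studies top-k aggregation keyword search over relational databases and proposes a recursion-based algorithm for enumerating aggregation cells, recursion-based full search (RFS). To obtain better query performance, it designs a new ranking method, a two-dimensional index, and a quick search algorithm, output-based quick search (OQS), so that the top-k aggregation cells can be enumerated efficiently. Extensive experiments on different datasets show that the OQS algorithm achieves good query performance.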

Keywords: aggregation keyword search; relational database; two-dimensional index; aggregation cell; ranking

top-k Aggregation Keyword Search over Relational Databases
Zhang Dongzhan, Su Zhifeng, Lin Ziyu, and Xue Yongsheng. top-k Aggregation Keyword Search over Relational Databases[J]. Journal of Computer Research and Development, 2014, 51(4): 918-929.
Authors: Zhang Dongzhan, Su Zhifeng, Lin Ziyu, and Xue Yongsheng
Affiliation: (Department of Computer Science, Xiamen University, Xiamen, Fujian 361005)
Abstract: Structured query language (SQL) is the classical means of querying relational databases. However, ordinary users who are unfamiliar with SQL and the underlying database schema find it difficult to search for information. In contrast, the keyword search technology used in information retrieval (IR) systems lets users simply enter a set of keywords to get the required results. It is therefore desirable to integrate DB and IR so that users can search relational databases without any knowledge of the database schema or a query language. Given a keyword query, existing approaches use the primary-foreign-key relationships in the database to find individual tuples that match the set of query keywords. In many real applications, however, the aggregation result of those tuples is more useful to users, and the existing methods cannot handle this case. This paper therefore addresses the problem of top-k aggregation keyword search over relational databases. A recursion-based full search algorithm, RFS, is proposed to enumerate all aggregation cells. To achieve higher performance, new ranking techniques, a keyword-tuple-based two-dimensional index, and an output-based quick search algorithm, OQS, are developed to identify the top-k aggregation cells efficiently. Extensive experiments were conducted on two large real datasets, and the experimental results show the benefits of the proposed approach.
Keywords: aggregation keyword search; relational database; two-dimensional index; aggregation cell; ranking
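
To make the idea of an aggregation cell concrete, the following is a minimal brute-force sketch in Python. It is not the paper's RFS or OQS algorithm, and the abstract does not define the cell model or the ranking function. The sketch assumes a single table, a group-by style cell in which each dimension is either fixed to a value or left as the wildcard `*`, and a hypothetical ranking that prefers more specific cells covering fewer rows. The names `matching_cells`, `top_k_cells`, and `ROWS` are illustrative only.

```python
from collections import defaultdict
from itertools import combinations

WILDCARD = "*"  # stands for "any value" in a dimension (assumed notation)


def matching_cells(rows, dims, text_of, keywords):
    """Return {cell: covered_row_count} for cells whose rows jointly hold every keyword."""
    wanted = {k.lower() for k in keywords}
    seen = defaultdict(set)   # cell -> query keywords found in its rows
    size = defaultdict(int)   # cell -> number of rows it covers
    for row in rows:
        hits = wanted & set(text_of(row).lower().split())
        # Each subset of dimensions kept fixed yields one cell covering this row.
        for r in range(len(dims) + 1):
            for fixed in combinations(dims, r):
                cell = tuple(row[d] if d in fixed else WILDCARD for d in dims)
                seen[cell] |= hits
                size[cell] += 1
    return {cell: size[cell] for cell in seen if seen[cell] == wanted}


def top_k_cells(rows, dims, text_of, keywords, k):
    """Rank matching cells with a hypothetical score and return the best k."""
    matched = matching_cells(rows, dims, text_of, keywords)

    def score(item):
        cell, count = item
        fixed = sum(value != WILDCARD for value in cell)
        # Assumed ranking: more fixed dimensions first, then fewer covered rows.
        return (-fixed, count, cell)

    return sorted(matched.items(), key=score)[:k]


# Toy data, invented for illustration only.
ROWS = [
    {"city": "Xiamen", "month": "May", "event": "jazz concert"},
    {"city": "Xiamen", "month": "May", "event": "food festival"},
    {"city": "Xiamen", "month": "June", "event": "film festival"},
    {"city": "Fuzhou", "month": "May", "event": "jazz night"},
]

if __name__ == "__main__":
    for cell, count in top_k_cells(
        ROWS, ("city", "month"), lambda r: r["event"], ["jazz", "festival"], k=2
    ):
        print(cell, count)
```

On the toy rows, no single row contains both "jazz" and "festival", yet the cell (Xiamen, May) covers two rows that contain them jointly, which is the kind of answer a per-tuple keyword search would miss. The sketch enumerates every cell, so its cost grows exponentially with the number of dimensions; it only illustrates the problem setting that an index and a pruned search would need to address.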
This article is indexed in CNKI and other databases.