1.

The role of data warehousing in bioterrorism surveillance

Donald J. John W. Jamie Griffiths Alan R. Stephen James 《Decision Support Systems》2007,43(4):1383

The development of an effective bioterrorism surveillance system requires effective solutions to several critical challenges. The system must support multidimensional historical data, provide real-time surveillance of sensor data, have the capability for pattern recognition to quickly identify abnormal situations, and provide an analytic environment that accelerates investigations by epidemiologists and other responders. The use of real-time or flash data warehousing provides the essential ability to compare unfolding health events with historical patterns of key surveillance indicators. To explore the role of data warehousing in surveillance systems, we study naturally occurring incidents, Florida wildfires from 1996 through 2001, as reasonable facsimiles of bioterrorism attacks. Hospital admissions data on respiratory illnesses during that period are analyzed to uncover patterns that might resemble an airborne biochemical attack. A principal contribution of this research is the adroit use of online analytic processing (OLAP) techniques, along with spatial and statistical analyses, to study the adverse effects of this natural phenomenon. These techniques will provide important capabilities for epidemiologist-in-the-loop surveillance systems, enabling the rapid exploration of unusual situations and guidance for follow-up investigations.

2.

Selecting and using views to compute aggregate queries

Foto Afrati Rada Chirkova 《Journal of Computer and System Sciences》2011,77(6):1079-1107

We consider a workload of aggregate queries and investigate the problem of selecting materialized views that (1) provide equivalent rewritings for all the queries, and (2) are optimal, in that the cost of evaluating the query workload is minimized. We consider conjunctive views and rewritings, with or without aggregation; in each rewriting, only one view contributes to computing the aggregated query output. We look at query rewriting using existing views and at view selection. In the query-rewriting problem, we give sufficient and necessary conditions for a rewriting to exist. For view selection, we prove complexity results. Finally, we give algorithms for obtaining rewritings and selecting views.

3.

Making filters smart in distributed data stream environments

Cheqing Jin Bolin Ding 《Information Sciences》2009,179(9):1348-4753

Monitoring aggregate queries in real time over distributed streaming environments is a great challenge, not only because of the huge data volume and high arrival rate, but also because of limited network transmission bandwidth. Consequently, ensuring approximate results of acceptable quality with economical network consumption becomes one of the most important goals in such scenarios. In this paper, we study how to monitor aggregate queries continuously and efficiently over distributed environments by deploying numerous filters at remote sites, so that only a small part of the incoming data is transmitted to the query site and network resources are saved significantly. We also show how to adjust the parameters of a filter continuously when the incoming data distribution at the corresponding remote site changes. Analysis and extensive experimental results demonstrate that our approach outperforms the existing work.
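
The filtering idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the class `RemoteFilter`, the function `monitor_sum`, and the fixed `width` parameter are hypothetical names chosen for illustration, and the adaptive adjustment of filter parameters mentioned in the abstract is left out. Each site stays silent while its value remains within `width` of the last value it reported, so the coordinator's SUM estimate is off by at most `width` per site.

```python
from dataclasses import dataclass


@dataclass
class RemoteFilter:
    """Hypothetical filter installed at one remote site.

    The site stays silent while its local value remains inside
    [center - width, center + width]; it reports only when the value escapes.
    """
    center: float
    width: float

    def observe(self, value: float) -> bool:
        """Return True if the value must be transmitted to the query site."""
        if abs(value - self.center) <= self.width:
            return False      # suppressed: the cached value is still close enough
        self.center = value   # re-center the filter on the value being reported
        return True


def monitor_sum(streams, width):
    """Simulate a query site tracking an approximate SUM over several sites."""
    filters = [RemoteFilter(center=s[0], width=width) for s in streams]
    cached = [s[0] for s in streams]   # last value reported by each site
    messages = 0
    estimates = []
    for step in range(1, len(streams[0])):
        for i, stream in enumerate(streams):
            if filters[i].observe(stream[step]):
                cached[i] = stream[step]
                messages += 1
        estimates.append(sum(cached))
    return estimates, messages


# Two sites, three updates each; only one update escapes its filter.
estimates, messages = monitor_sum([[10, 11, 12, 20], [5, 5, 6, 5]], width=2)
```

In this toy run six updates arrive at the remote sites but only one is transmitted, while every estimate stays within 2 × 2 = 4 of the true sum.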

4.

Semi-Closed Cube: An Effective Approach to Trading Off Data Cube Size and Query Response Time

Sheng-En Li Shan Wang 《Journal of Computer Science and Technology》2005,20(3):367-372

A data cube occupies a huge amount of disk space when the base table has a large number of attributes. A new type of data cube, the compact data cube (such as the condensed cube and the quotient cube), was proposed to solve this problem. It compresses the data cube dramatically. However, its query cost is so high that it cannot be used in most applications. This paper introduces the semi-closed cube to reduce the size of the data cube while achieving almost the same query response time as the full data cube. The semi-closed cube is a generalization of the condensed cube and the quotient cube and is constructed from a quotient cube. When the query cost of the quotient cube is higher than a given threshold, the semi-closed cube selects some views and picks a fellow for each of them. All the tuples of those views are materialized except those closed by their fellows. To find a tuple of those views, users only need to scan the view and its fellow. Thus, query performance is improved. Experiments were conducted using a real-world data set. The results show that the semi-closed cube is an effective approach to trading off data cube size and query response time.

5.

CineCubes: Aiding data workers gain insights from OLAP queries

《Information Systems》2015

In this paper we demonstrate that it is possible to enrich query answering with a short data movie that gives insights to the original results of an OLAP query. Our method, implemented in an actual system, CineCubes, includes the following steps. The user submits a query over an underlying star schema. Taking this query as input, the system comes up with a set of queries complementing the information content of the original query, and executes them. For each of the query results, we execute a set of highlight extraction algorithms that identify interesting patterns and values in the data of the results. Then, the system visualizes the query results and accompanies this presentation with a text commenting on the result highlights. Moreover, via a text-to-speech conversion the system automatically produces audio for the constructed text. Each combination of visualization, text and audio practically constitutes a movie, which is wrapped as a PowerPoint presentation and returned to the user.

6.

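Memory-based data cube query processing

Yan Mingchun Fang Maohua Chen Hongmei 《Computer Engineering》2004,30(8):75-76,138

With the rapid growth of memory capacity, workstations equipped with gigabytes of memory have appeared. However, current OLAP systems do not make full use of large-capacity RAM. In view of this, the paper proposes a memory-based data cube query processing system. The system uses a two-level index in-memory data structure that makes full use of the limited memory space and effectively organizes the tuples of each cuboid, achieving efficient data cube queries.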

7.

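Online aggregation on data cubes without additional space

Li Hongsong Huang Houkuan 《Journal of Software》2006,17(4):806-813

Previous implementations of online aggregation on data cubes usually require additional space to store the information needed for online aggregation estimates, which greatly degrades the storage and maintenance performance of the data cube. This paper proposes PE (progressively estimate), a QC-Tree-based online aggregation algorithm for range query processing, and HPE (hybrid progressively estimate), a hybrid aggregation algorithm that combines PE with a simple aggregation algorithm. It also proposes MPE (multiple progressively estimate), an online aggregation algorithm that can process multiple range queries simultaneously. Unlike previous online aggregation algorithms, these algorithms need no additional space; instead, they use the aggregate data and semantic relationships stored in the QC-Tree itself to estimate aggregate results. Because the QC-Tree is an extremely efficient storage structure for data cubes, online aggregation on data cubes can be achieved with fairly good performance. Analysis of the algorithms and experimental results show that the proposed algorithms perform well.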

8.

PnP: sequential, external memory, and parallel iceberg cube computation

Ying Chen Frank Dehne Todd Eavis Andrew Rau-Chaplin 《Distributed and Parallel Databases》2008,23(2):99-126

We present “Pipe ’n Prune” (PnP), a new hybrid method for iceberg-cube query computation. The novelty of our method is that it achieves a tight integration of top-down piping for data aggregation with bottom-up a priori data pruning. A particular strength of PnP is that it is efficient for all of the following scenarios: (1) Sequential iceberg-cube queries, (2) External memory iceberg-cube queries, and (3) Parallel iceberg-cube queries on shared-nothing PC clusters with multiple disks. We performed an extensive performance analysis of PnP for the above scenarios with the following main results: In the first scenario PnP performs very well for both dense and sparse data sets, providing an interesting alternative to BUC and Star-Cubing. In the second scenario PnP shows a surprisingly efficient handling of disk I/O, with an external memory running time that is less than twice the running time for full in-memory computation of the same iceberg-cube query. In the third scenario PnP scales very well, providing near linear speedup for a larger number of processors and thereby solving the scalability problem observed for the parallel iceberg-cubes proposed by Ng et al. Research partially supported by the Natural Sciences and Engineering Research Council of Canada. A preliminary version of this work appeared in the International Conference on Data Engineering (ICDE’05).
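
To make the notion of an iceberg-cube query concrete, the following sketch computes every group-by cell whose tuple count reaches a minimum support, using only the bottom-up a priori pruning that the abstract names as one ingredient. It is a simplified BUC-style recursion, not PnP itself (there is no top-down piping, external memory handling, or parallelism), and the function name `iceberg_cube` and the `'*'` placeholder for "all values" are assumptions made for illustration.

```python
from collections import defaultdict


def iceberg_cube(rows, num_dims, min_support):
    """Return {cell: count} for every cell with count >= min_support.

    A cell is a tuple of length num_dims in which '*' means "all values".
    """
    result = {}

    def expand(partition, cell, start_dim):
        # Record the current cell; the caller guarantees it meets min_support.
        result[tuple(cell)] = len(partition)
        for d in range(start_dim, num_dims):
            groups = defaultdict(list)
            for row in partition:
                groups[row[d]].append(row)
            for value, group in groups.items():
                # A priori pruning: if this cell is too small, every more
                # specific cell beneath it is too small as well, so skip it.
                if len(group) >= min_support:
                    cell[d] = value
                    expand(group, cell, d + 1)
                    cell[d] = '*'

    if len(rows) >= min_support:
        expand(rows, ['*'] * num_dims, 0)
    return result


rows = [("a", "x"), ("a", "x"), ("a", "y"), ("b", "x")]
cube = iceberg_cube(rows, num_dims=2, min_support=2)
```

With a minimum support of 2, the cells `('b', '*')`, `('*', 'y')`, and `('a', 'y')` are never materialized, and nothing below `('b', '*')` is even examined.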

9.

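A data warehouse solution for securities companies (total citations: 5; self-citations: 0; citations by others: 5)

Li Changshu Tian Feng 《Computer Engineering》2002,28(5):189-191

This paper introduces data warehousing, online analytical processing, and data mining technologies. Based on a detailed analysis of the application requirements and current state of information systems at securities companies in China, it proposes an overall architecture for a securities company data warehouse and discusses its logical design, data extraction and transformation, and the design of front-end presentation programs.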

10.

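Research and application of an OLAP data warehouse in power grid dispatching decisions (total citations: 7; self-citations: 1; citations by others: 6)

Liu Jin Hu Zheng Tang Xianglong 《Computer Engineering and Design》2005,26(2):296-298,311

Taking a power system as the research background, this paper analyzes and reorganizes the original data sources, builds a power grid dispatching data warehouse, and establishes a data cube with a multidimensional snowflake schema. OLAP and data mining techniques are used to analyze and query the warehouse data quickly from multiple angles and at multiple levels, making load forecasting and dispatching more scientific. The paper also shows that an OLAP data warehouse can provide power grid dispatching managers with effective decision information.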

11.

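Research on a GPU-based parallel method for computing correlation coefficients of multiple data streams (total citations: 2; self-citations: 1; citations by others: 1)

Zhou Yong Wang Hao Cheng Chuntian Guo He 《Application Research of Computers》2010,27(4):1232-1235

To meet the real-time requirements of multiple data stream processing, this paper proposes a four-layer sliding window model that spans the PCIe bus and a GPU-based framework for parallel processing of multiple data streams. Under this framework, statistics for a very large number of sliding real-time data streams can be maintained in parallel, and the correlation coefficient between any two of the streams is computed in parallel with an exact method. Comparison with a CPU-only computation in the same experimental environment confirms that the new method significantly improves real-time computing performance.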
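
The per-pair statistic that this entry maintains can be sketched on a single CPU thread. The example below keeps running sums over a sliding window so that the exact Pearson correlation of two streams is available after every update; it does not reproduce the paper's four-layer window model or its GPU framework, and the class name `WindowedCorrelation` and its methods are hypothetical.

```python
from collections import deque
from math import sqrt


class WindowedCorrelation:
    """Exact Pearson correlation of two streams over the last `window` pairs."""

    def __init__(self, window):
        self.window = window
        self.pairs = deque()
        self.sx = self.sy = self.sxx = self.syy = self.sxy = 0.0

    def _update(self, x, y, sign):
        # Add (sign=+1) or retire (sign=-1) one pair from the running sums.
        self.sx += sign * x
        self.sy += sign * y
        self.sxx += sign * x * x
        self.syy += sign * y * y
        self.sxy += sign * x * y

    def push(self, x, y):
        self.pairs.append((x, y))
        self._update(x, y, +1)
        if len(self.pairs) > self.window:
            old_x, old_y = self.pairs.popleft()
            self._update(old_x, old_y, -1)

    def value(self):
        """Return the correlation, or None if either stream is constant."""
        n = len(self.pairs)
        cov = n * self.sxy - self.sx * self.sy
        var_x = n * self.sxx - self.sx ** 2
        var_y = n * self.syy - self.sy ** 2
        if var_x <= 0 or var_y <= 0:
            return None
        return cov / sqrt(var_x * var_y)


corr = WindowedCorrelation(window=3)
for x, y in [(1, 2), (2, 4), (3, 6)]:
    corr.push(x, y)
perfect = corr.value()   # the two streams are exactly proportional so far
corr.push(4, 1)          # the oldest pair (1, 2) leaves the window
after_slide = corr.value()
```

Each update costs constant time per stream pair, which is the property that makes the computation easy to spread across many pairs in parallel. Running sums of floating-point values can accumulate rounding error on long streams, so a production version would need periodic recomputation or a more stable update rule.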

12.

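Research on implementation techniques for large data warehouses (total citations: 2; self-citations: 0; citations by others: 2)

Chen Huiping Chen Lanfeng Wang Jiandong 《Computer Engineering and Design》2006,27(21):3956-3958,3961

Large data warehouses are an effective way to store massive data, but many problems arise in implementing them. Based on an analysis of these problems, this paper proposes strategies for implementing large data warehouses and studies several key techniques: efficient computation of data cubes, incremental update maintenance, index optimization, failure recovery, schema design, cost models for query optimization, and the definition and management of metadata.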

13.

Application of OLAP to a PDM database for interactive performance evaluation of in-progress product development

Namchul Do 《Computers in Industry》2014

This paper applies online analytic processing (OLAP), a widely accepted database analysis technique, to a product data management (PDM) database to evaluate the performance of in-progress product development. This paper introduces a set of processing key performance indicators (KPIs) that can measure ongoing product development. To convert and analyze operational data in a PDM database with interactive OLAP operations, this study proposes a multidimensional data model specified by product facts with associated dimensions. The model is implemented using a commercial OLAP engine, and applied to a database supported by a prototype PDM system. The OLAP engine allows analysts to interactively evaluate the performance of in-progress product development in a multidimensional data space. This is a far more flexible and efficient approach than other result-oriented static evaluation approaches.

14.

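A cube data model for data warehouses (total citations: 9; self-citations: 1; citations by others: 9)

Wang Jikui Ning Yunhui 《Computer Engineering and Applications》2002,38(5):188-190

Data warehousing and online analytical processing (OLAP) are two of the most significant new technologies in commercial data processing. OLAP applications need to analyze the large volumes of data stored in a data warehouse, and implementing very complex queries with standard relational database technology is quite difficult. Therefore, data in a data warehouse is organized into a cube data model. This paper proposes a simple, intuitive data cube model and an algebra that supports OLAP operations on the cube, providing a concise way to express complex queries.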

15.

Attribute-based evaluation of multiple continuous queries for filtering incoming tuples of a data stream

Hyun-Ho Lee Eun-Won Yun Won-Suk Lee 《Information Sciences》2008,178(11):2416-2432

The filtering of incoming tuples of a data stream should be completed quickly and continuously, which requires strict time and space constraints. In order to guarantee these constraints, the selection predicates of continuous queries are grouped or indexed in most data stream management systems (DSMS). This paper proposes a new scheme called attribute selection construct (ASC). Given a set of continuous queries, an ASC divides the domain of an attribute of a data stream into a set of disjoint regions based on the selection predicates that are imposed on the attribute. Each region maintains the pre-computed matching results of the selection predicates. Consequently, an ASC can collectively evaluate all of its selection predicates at the same time. Furthermore, it can also monitor the overall evaluation statistics, such as its selectivity and tuple dropping ratio, dynamically. For those attributes that are employed to express the selection predicates of the queries, the processing order of their ASCs can significantly influence the overall performance of a multiple query evaluation. The evaluation sequence can be optimized by periodically capturing the run-time tuple dropping ratio of its current evaluation sequence. The performance of the proposed method is analyzed by a series of experiments to identify its various characteristics.
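
The region-splitting idea described in this abstract can be sketched as follows. The sketch assumes every selection predicate is a half-open range `low <= value < high` on one numeric attribute, which is a simplification made here and not stated in the abstract; the class name `AttributeSelectionConstruct` and its methods are hypothetical, and the run-time statistics (selectivity, tuple dropping ratio) are omitted.

```python
from bisect import bisect_right


class AttributeSelectionConstruct:
    """Pre-computes, per disjoint region, which range predicates match."""

    def __init__(self, predicates):
        """predicates: {query_id: (low, high)} meaning low <= value < high."""
        # Every predicate endpoint is a boundary between two regions.
        self.bounds = sorted({b for lo, hi in predicates.values() for b in (lo, hi)})
        # Region 0 lies below bounds[0]; region k (k >= 1) starts at bounds[k - 1].
        # No endpoint falls strictly inside a region, so testing the left edge
        # of a region decides the match for every value in that region.
        self.matches = [frozenset()]
        for left in self.bounds:
            self.matches.append(
                frozenset(q for q, (lo, hi) in predicates.items() if lo <= left < hi)
            )

    def evaluate(self, value):
        """Return the ids of all queries whose predicate accepts `value`."""
        return self.matches[bisect_right(self.bounds, value)]


asc = AttributeSelectionConstruct({"q1": (0, 10), "q2": (5, 20), "q3": (15, 30)})
hits = asc.evaluate(7)   # one binary search evaluates all three predicates
```

A tuple whose lookup returns an empty set satisfies no query on this attribute and can be dropped immediately, which is why the order in which several such structures are consulted matters for overall cost.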

16.

Mapping plant functional types from MODIS data using multisource evidential reasoning (total citations: 1; self-citations: 0; citations by others: 1)

Wanxiao Sun Shunlin Liang Hongliang Fang 《Remote sensing of environment》2008,112(3):1010-1024

Reliable information about the geographic distribution and abundance of major plant functional types (PFTs) around the world is increasingly needed for global change research. Using remote sensing techniques to map PFTs is a relatively recent field of research. This paper presents a method to map PFTs from the Moderate Resolution Imaging Spectroradiometer (MODIS) data using a multisource evidential reasoning (ER) algorithm. The method first utilizes a suite of improved and standard MODIS products to generate evidence measures for each PFT class. The multiple lines of evidence computed from input data are then combined using Dempster's Rule of combination. Finally, a decision rule based on maximum support is used to make classification decisions. The proposed method was tested over the states of Illinois, Indiana, Iowa, and North Dakota, USA where crops dominate. The Cropland Data Layer (CDL) data provided by the United States Department of Agriculture were employed to validate our new PFT maps and the current MODIS PFT product. Our preliminary results suggest that multisource data fusion is a promising approach to improve the mapping of PFTs. For several major PFT classes such as crop, trees, and grass and shrub, the PFT maps generated with the ER method provide greater spatial details compared to the MODIS PFT. The overall accuracies increased for all the four states, with the biggest improvement occurring in Iowa from 51% (MODIS) to 64% (ER). The overall kappa statistic also increased for all the four states, with the biggest improvement occurring in Iowa from 0.03 (MODIS) to 0.38 (ER). The paper concludes with a discussion of several methodological issues pertaining to the further improvement of the ER approach.
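
Dempster's rule of combination, which the abstract uses to fuse evidence, can be shown on a toy example. The mass values and the two-class frame below are invented for illustration and are not taken from the paper; `dempster_combine` is a hypothetical helper name, and picking the single class with the largest combined mass is only a simplified stand-in for the paper's maximum-support decision rule.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions given as {frozenset_of_classes: mass}."""
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b   # the two sources contradict each other
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize so that the remaining masses sum to one.
    return {subset: mass / (1.0 - conflict) for subset, mass in combined.items()}


CROP, GRASS = frozenset({"crop"}), frozenset({"grass"})
EITHER = CROP | GRASS   # ignorance: mass not committed to a single class

source_1 = {CROP: 0.6, EITHER: 0.4}    # illustrative values only
source_2 = {GRASS: 0.5, EITHER: 0.5}   # illustrative values only
fused = dempster_combine(source_1, source_2)

# Simplified decision: the single class with the largest combined mass.
singletons = {s: m for s, m in fused.items() if len(s) == 1}
decision = max(singletons, key=singletons.get)
```

Here the conflicting mass is 0.6 × 0.5 = 0.3, so the surviving products are divided by 0.7, giving crop 3/7, grass 2/7, and 2/7 left uncommitted.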