期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Emerging Cubes: Borders,size estimations and lossless reductions

Sébastien Nedjar Alain Casali Rosine Cicchetti Lotfi Lakhal 《Information Systems》2009

Discovering trend reversals between two data cubes provides users with a novel and interesting knowledge when the real world context fluctuates: What is new? Which trends appear or emerge? Which tendencies are immersing or disappear? With the concept of Emerging Cube, we capture such trend reversals by enforcing an emergence constraint. We resume the classical borders for the Emerging Cube and introduce a new one which optimizes both storage space and computation time, provides a simple characterization of the size of Emerging Cubes, as well as classification and cube navigation tools. We soundly state the connection between the classical and proposed borders by using cube transversals. Knowing the size of Emerging Cubes without computing them is of great interest in particular for adjusting at best the underlying emergence constraint. We address this issue by studying an upper bound and characterizing the exact size of Emerging Cubes. We propose two strategies for quickly estimate their size: one based on analytical estimation, without database access, and one based on probabilistic counting using the proposed borders as the input of the near-optimal algorithm HyperLogLog. Due to the efficiency of the estimation algorithm various iterations can be performed to calibrate at best the emergence constraint. Moreover, we propose reduced and lossless representations of the Emerging Cube by using the concept of cube closure. Finally, we perform experiments for different data distributions in order to measure on one hand the size of the introduced condensed and concise representations and on the other hand the performance (accuracy and computation time) of the proposed estimation method. 相似文献

2.

An efficient method for maintaining data cubes incrementally

Ki Yong Lee Yon Dohn Chung 《Information Sciences》2010,180(6):928-2059

The data cube operator computes group-bys for all possible combinations of a set of dimension attributes. Since computing a data cube typically incurs a considerable cost, the data cube is often precomputed and stored as materialized views in data warehouses. A materialized data cube needs to be updated when the source relations are changed. The incremental maintenance of a data cube is to compute and propagate only its changes, rather than recompute the entire data cube from scratch. For n dimension attributes, the data cube consists of 2ⁿ group-bys, each of which is called a cuboid. To incrementally maintain a data cube with 2ⁿ cuboids, the conventional methods compute 2ⁿdelta cuboids, each of which represents the change of a cuboid. In this paper, we propose an efficient incremental maintenance method that can maintain a data cube using only a subset of 2ⁿ delta cuboids. We formulate an optimization problem to find the optimal subset of 2ⁿ delta cuboids that minimizes the total maintenance cost, and propose a heuristic solution that allows us to maintain a data cube using only delta cuboids. As a result, the cost of maintaining a data cube is substantially reduced. Through various experiments, we show the performance advantages of the proposed method over the conventional methods. We also extend the proposed method to handle partially materialized cubes and dimension hierarchies. 相似文献

3.

中医诊疗决策支持系统中的OLAP与DM融合 总被引：1，自引：0，他引：1

孙亚男宁士勇鲁明羽陆玉昌《计算机工程》2006,32(9):251-252,255

提出了一种基于数据仓库、在线分析处理及数据挖掘的中医诊疗决策支持系统的解决方窠，该方案很好地融合了中医个体化诊疗的特点。结合实际应用详细讨论了中医诊疗数据仓库建立、OLAP实现的关健问题和数据挖掘的应用主题，设计了集成化的中医诊疗决策支持系统的3层体系结构，同时给出了实现案例。相似文献

4.

一种OLAP应用系统的设计和实现

周龙郑诚《微机发展》2006,16(6):101-103

通过对数据仓库和OLAP概念及体系结构的分析,描述了一种OLAP应用系统的设计方案,并介绍了它的具体实现方法。基于数据仓库的查询,一般都是及时特定查询,要在严格的响应时间内执行复杂的查询,遍历百万上亿的记录,同时进行可能很复杂的搜索、连接和汇总的操作。查询的数据吞吐量和响应时间是判断数据仓库性能的重点。CUBE的计算是OLAP及时查询的基础,提高查询的速度需要对OLAP进行预先的计算。文中系统比较了一些计算立方体的算法,并运用到具体的系统当中。相似文献

5.

Contrasting temporal trend discovery for large healthcare databases

Goran Hrovat Gregor Stiglic Peter Kokol Milan Ojsteršek 《Computer methods and programs in biomedicine》2014

With the increased acceptance of electronic health records, we can observe the increasing interest in the application of data mining approaches within this field. This study introduces a novel approach for exploring and comparing temporal trends within different in-patient subgroups, which is based on associated rule mining using Apriori algorithm and linear model-based recursive partitioning. The Nationwide Inpatient Sample (NIS), Healthcare Cost and Utilization Project (HCUP), Agency for Healthcare Research and Quality was used to evaluate the proposed approach. This study presents a novel approach where visual analytics on big data is used for trend discovery in form of a regression tree with scatter plots in the leaves of the tree. The trend lines are used for directly comparing linear trends within a specified time frame. Our results demonstrate the existence of opposite trends in relation to age and sex based subgroups that would be impossible to discover using traditional trend-tracking techniques. Such an approach can be employed regarding decision support applications for policy makers when organizing campaigns or by hospital management for observing trends that cannot be directly discovered using traditional analytical techniques. 相似文献

6.

MSMiner—a developing platform for OLAP

Zhongzhi Youping Qing Lida Xu Shaohui Liu Liangxi Ziyan Jiayou Li Huijing Lei Zhao 《Decision Support Systems》2007,42(4):2016

Since the early 1970s, decision support systems (DSS) have evolved significantly. In this paper, the design and implementation of MSMiner, a developing platform for DSS, is introduced. The system is constructed on a data warehouse and integrated with a number of data mining algorithms. It is well suited for on-line analytical processing (OLAP). The characteristics of MSMiner include the ability to support multiple data sources and data mining strategies, additional organizational flexibility in regard to data and mining strategies, and the powerful expansibility of data mining tasks. 相似文献

7.

OLAP技术在电信领域中的应用

张中平徐佑军《计算机工程与设计》2005,26(7):1950-1952

OLAP（联机分析处理）是一种数据分析技术，它和数据仓库有着密切的联系，详细阐述了OLAP技术在电信领域的应用，以及OALP技术和数据仓库的一些关系。以移动电话业务收入总量为分析主题，确定了分析方法，定义了维度，并构造分析了立方体和星型结构，最后对结果进行了分析。相似文献

8.

The automatic creation of OLAP cube using an MDA approach

下载免费PDF全文

Khadija Letrache Omar El Beggar Mohammed Ramdani 《Software》2017,47(12):1887-1903

The Model‐Driven Architecture (MDA) is an approach that aligns modeling and automation for software development. By applying such an approach to data warehouse (DW) projects, we can minimize a great deal of time and cost. Furthermore, most of OnLine Analytical Processing (OLAP) platforms seem to be like black boxes that provide wizards only to business intelligence developers to create and manipulate OLAP objects without allowing their sustainability and migration from a platform to another. That is why many works in the literature have proposed using the MDA approach in DW projects. However, most of them have mainly focused on the generation of the DW relational model from the conceptual one, and they overlooked the OLAP model and the cube implementation. To deal with this problem, we propose in this paper an MDA solution to automate the process of getting OLAP cube and its implementation through a set of metamodels and automatic transformations among them. In fact, the proposal generates the OLAP and DW relational models (PSMs) from the conceptual one, using also a PDM model that describes the target business intelligence platform. After that, the source code to create the cube is got from both PSM models. For this aim, we define a set of transformation rules implemented using the Atlas transformation language. Finally, a case study will be provided to validate our approach. 相似文献

9.

Exploiting edge semantics in citation graphs using efficient, vertical ARM

Imad Rahal Dongmei Ren Weihua Wu Anne Denton Christopher Besemann William Perrizo 《Knowledge and Information Systems》2006,10(1):57-91

Graphs are increasingly becoming a vital source of information within which a great deal of semantics is embedded. As the size of available graphs increases, our ability to arrive at the embedded semantics grows into a much more complicated task. One form of important hidden semantics is that which is embedded in the edges of directed graphs. Citation graphs serve as a good example in this context. This paper attempts to understand temporal aspects in publication trends through citation graphs, by identifying patterns in the subject matters of scientific publications using an efficient, vertical association rule mining model. Such patterns can (a) indicate subject-matter evolutionary history, (b) highlight subject-matter future extensions, and (c) give insights on the potential effects of current research on future research. We highlight our major differences with previous work in the areas of graph mining, citation mining, and Web-structure mining, propose an efficient vertical data representation model, introduce a new subjective interestingness measure for evaluating patterns with a special focus on those patterns that signify strong associations between properties of cited papers and citing papers, and present an efficient algorithm for the purpose of discovering rules of interest followed by a detailed experimental analysis. Imad Rahal is a newly appointed assistant professor in the Department of Computer Science at the College of Saint Benedict ∣ Saint John's University, Collegeville, MN, and a Ph.D. candidate at North Dakota State University, Fargo, ND. In August 2003, he earned his master's degree in computer science from North Dakota State University. Prior to that, he graduated summa cum laude from the Lebanese American University, Beirut, Lebanon, in February 2001 with a bachelor's degree in computer science. Currently, he is completing the final requirements for his Ph.D. degree in computer science on an NSF ND-EPSCoR doctoral dissertation assistantship with August of 2005 as a projected completion date. He is very active in research, proposal writing, and publications; his research interests are largely in the broad areas of data mining, machine learning, databases, artificial intelligence, and bioinformatics. Dongmei Ren is working for the Database Technology Institute for z/OS, IBM Silicon Valley Lab, San Jose, CA, as a staff software engineer. She holds a Ph.D. degree from North Dakota State University, Fargo, ND, and master's and bachelor's degrees from TianJin University, TianJin, China. She has been a software engineer at DaTang Telecommunications, Beijing, China. Her areas of expertise are outlier analysis, data mining and knowledge discovery, database systems, machine learning, intelligent systems, wireless networks and bioinformatics. She has been awarded the Siemens Scholarship research enhancement for excellent performance in study and research. She is a member of ACM, IEEE. Weihua Wu is a network monitoring & managed services analyst at Hewlett-Packard Co. in Canada. He holds a master's degree from North Dakota State University and a bachelor's degree from Nanjing University, both in computer science. His research areas of interest include data mining, knowledge discovery, data warehousing, information technology, network security, and bioinformatics. He has participated in various projects supported by NSF, DARPA, NASA, USDA, and GSA grants. Anne Denton is an assistant professor in computer science at North Dakota State University. Her research interests are in data mining, knowledge discovery in scientific data, and bioinformatics. Specific interests include data mining of diverse data, in which objects are characterized by a variety of properties such as numerical and categorical attributes, graphs, sequences, time-dependent attributes, and others. She received her Ph.D. in physics from the University of Mainz, Germany, and her M.S. in computer science from North Dakota State University, Fargo, ND. Christopher Besemann received his M.Sc. in computer science from North Dakota State University in Fargo, ND, 2005. Currently, he works in data mining research topics including association mining and relational data mining with recent work in model integration as a research assistant. He is accepted under a fellowship program for Ph.D. study at North Dakota State University. William Perrizo is a professor of computer science at North Dakota State University. He holds a Ph.D. degree from the University of Minnesota, a master's degree from the University of Wisconsin and a bachelor's degree from St. John's University. He has been a research scientist at the IBM Advanced Business Systems Division and the U.S. Air Force Electronic Systems Division. His areas of expertise are data mining, knowledge discovery, database systems, distributed database systems, high speed computer and communications networks, precision agriculture and bioinformatics. He is a member of ISCA, ACM, IEEE, IAAA, and AAAS. 相似文献

10.

数据仓库及OLAP技术在电信业务分析中的应用探讨

刘光榕《电脑编程技巧与维护》2011,(4):32-35

以一个对实时收入和实时话务为基础数据进行的包括诸如时间、产品、渠道、资费等的多维度分析的过程为例,论述了数据仓库及联机分析(OLAP)技术的概念、技术要点及开发实施步骤,探讨了这些技术在电信业务分析中的应用. 相似文献

11.

Finding an efficient rewriting of OLAP queries using materialized views in data warehouses

Chang-Sup Myoung Ho Yoon-Joon 《Decision Support Systems》2002,32(4)

OLAP queries involve a lot of aggregations on a large amount of data in data warehouses. To process expensive OLAP queries efficiently, we propose a new method to rewrite a given OLAP query using various kinds of materialized views which already exist in data warehouses. We first define the normal forms of OLAP queries and materialized views based on the selection and aggregation granularities, which are derived from the lattice of dimension hierarchies. Conditions for usability of materialized views in rewriting a given query are specified by relationships between the components of their normal forms. We present a rewriting algorithm for OLAP queries that can effectively utilize materialized views having different selection granularities, selection regions, and aggregation granularities together. We also propose an algorithm to find a set of materialized views that results in a rewritten query which can be executed efficiently. We show the effectiveness and performance of the algorithm experimentally. 相似文献

12.

基于数据仓库的OLAP在DSS中的应用研究 总被引：14，自引：1，他引：14

陈德军盛翊智陈绵云《计算机工程与应用》2003,39(1):30-31,61

以数据仓库为基础的在线联机分析处理(OLAP)技术是决策支持系统(DSS)中一种新的决策分析方法。该文从DSS的实际需求出发,分析了数据仓库的特征和结构模型,研究了OLAP的特征和体系结构及其实现决策分析的过程,最后给出了应用实例,并探讨了完善DSS结构的研究方向。相似文献

13.

OLAP性能测试方法研究与实现

赵博叶晓俊《计算机研究与发展》2011,48(10)

随着商业智能市场的逐步扩大,联机分析处理(OLAP)系统的使用质量评估已经成为数据库应用的研究热点.作为效用特性的OLAP系统性能评估需要一个性能基准.以OLAP委员会推出的APB-1性能基准为基础,首先设计了面向多维数据库的立方体(Cube)模型以及相应的多维表达式(MDX)查询模板,在Cube模型设计的过程中修改了APB-1基准ROLAP星型模型的不足之处;接着在测试数据一致和测试参数一致的前提下,通过对设计的MOLAP模型查询结果与ROLAP模型查询结果进行对比分析,证明了MOLAP模型及MDX查询模板设计的正确性;然后给出了OLAP性能测试流程,描述了支持ROLAP和MOLAP性能测试的工具框架及其主要模块.最后使用该测试框架在商业数据库管理系统上对ROLAP和MOLAP进行并发查询实践,验证了框架的有效性.提出的方法及技术实现为未来OLAP产品性能的测试和评价提供多维数据模型、业务模型和工具的支持. 相似文献

14.

一种新型HLA仿真结果处理方法

鞠儒生乔海泉陈少卿黄柯棣《计算机仿真》2006,23(6):143-146

针对HLA仿真结果数据日益增多、难以组织管理的状况,利用数据仓库与数据挖掘技术对数据进行处理。采用基于HLA的作战系统仿真为例,设计对应的联邦与联邦成员。针对仿真产生的海量结果数据创建O rac le数据仓库,采用商业智能软件COGNOS抽取数据仓库中的数据,设计制作对应的数据立方体。对立方体中的数据进行分析和挖掘,发现并纠正仿真过程中出现的问题,有效地完成了对HLA仿真结果的评估。论文提出的方法和思路可供广大同行设计参考。相似文献

15.

采用AM插件管理OLAP数据结构的研究

叶德谦叶柠张晶明《计算机工程与设计》2003,24(1):27-29

采用插件接口对象编写的基于微软管理控制台（MMC）的分析管理器（AM）插件可以调用决策支持对象（DSO）来管理OLAP Server中的各种对象,从而大大地提高OLAP数据结构管理的安全性,准确性,灵活性以及快速性,分别介绍了分析管理器插件中使用DSO管理OLAP数据的具体方法。相似文献

16.

分布式空间数据仓库的构建与OLAP服务实现

唐萍《计算机与现代化》2010,(5):44-46

介绍基于分布式数据库技术、网络通信技术、地理信息系统技术的空间数据仓库的设计方法。以福建省沿海地区遗迹保护区为例,搭建了一个分布式的空间数据仓库。在此基础上实现以地区行政级别、类型划分及保护区一般信息为雏度的多粒度数据部署。并以本数据仓库为倒,根据不同的空间联机分析（OLAP）服务请求,动态创建数据立方体,完成OLAP服务并返回空间OLAP结果。相似文献

17.

Online mining of fuzzy multidimensional weighted association rules

Mehmet Kaya Reda Alhajj 《Applied Intelligence》2008,29(1):13-34

This paper addresses the integration of fuzziness with On-Line Analytical Processing (OLAP) based association rules mining. It contributes to the ongoing research on multidimensional online association rules mining by proposing a general architecture that utilizes a fuzzy data cube for knowledge discovery. A data cube is mainly constructed to provide users with the flexibility to view data from different perspectives as some dimensions of the cube contain multiple levels of abstraction. The first step of the process described in this paper involves introducing fuzzy data cube as a remedy to the problem of handling quantitative values of dimensional attributes in a cube. This facilitates the online mining of fuzzy association rules at different levels within the constructed fuzzy data cube. Then, we investigate combining the concepts of weight and multiple-level to mine fuzzy weighted multi-cross-level association rules from the constructed fuzzy data cube. For this purpose, three different methods are introduced for single dimension, multidimensional and hybrid (integrates the other two methods) fuzzy weighted association rules mining. Each of the three methods utilizes a fuzzy data cube constructed to suite the particular method. To the best of our knowledge, this is the first effort in this direction. We compared the proposed approach to an existing approach that does not utilize fuzziness. Experimental results obtained for each of the three methods on a synthetic dataset and on the adult data of the United States census in year 2000 demonstrate the effectiveness and applicability of the proposed fuzzy OLAP based mining approach. OLAP is one of the most popular tools for on-line, fast and effective multidimensional data analysis. In the OLAP framework, data is mainly stored in data hypercubes (simply called cubes). 相似文献

18.

DW技术在HIS中的应用

邹鹏张礼平龚正良《微型电脑应用》2000,16(12):18-20

本文就数据仓库技术在医院管理信息系统中的应用进行研究,阐述了新型医院管理信息系统中辅助决策的构架和动作,描述了其中的数据模型,并探讨了数据仓库的设计方法文化。相似文献

19.

基于MS Analysis Services 的OLAP分析系统模型设计及应用 总被引：1，自引：0，他引：1

杨彬彬郑晓薇《计算机应用与软件》2007,24(8):216-218

首先对OLAP和MS Analysis Services技术进行了讨论,然后提出了一种基于MS Analysis Services的OLAP分析系统模型设计,模型具有通用性、实用性和开放性.最后给出了运用此模型构建的OLAP分析系统实例,阐述了系统开发的步骤.实践证明运用提出的模型构建OLAP分析系统,实现了良好的分析决策功能. 相似文献

20.

MS OLAP数据立方自动增量更新的程序实现

向阳王庆大张迎春《计算机工程》2005,31(20):70-71,129

针对MS OLAP对海量数据立方体进行完全更新非常耗时,以及在它的服务器管理器中实现数据立方体的手动增最更新操作繁琐,需要由熟悉MDX语言的专业人员完成的问题,文章提出了基于事实表上时间戳或其它标志的自动增量更新方案,并给出了C#编写的程序示例。相似文献