Similar Documents
20 similar documents retrieved.
1.
Computer, 2001, 34(12): 76-79
Most operational systems store data in a normalized model in which certain rules eliminate redundancy and simplify data relationships. While beneficial for the online transaction processing (OLTP) workload, this model can inhibit those same OLTP databases from running analytical queries effectively. Because the analytical systems did not need to support the OLTP workload, many developers began preplanning for the answer sets. Preplanning, however, created problems in four areas: creating summary tables of preaggregated data, placing indexes in the system to eliminate scanning large data volumes, putting data into one table instead of having tables that join together, and storing the data in sorted order. All these activities require prior knowledge of the analyses and reports being requested. Unfortunately, most data warehouse implementations ignore the longer-term goals of analysis and flexibility in the rush to provide initial value. Taking time to consider the project's real purpose, then building a correct foundation for it, can assure a better future for the data warehouse. To meet user demands for more timely and flexible analysis, companies can use a step-by-step approach to move from maintaining detailed information to using summary-level data.
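To make the preaggregation idea concrete, here is a minimal sketch in Python/SQLite; all table and column names are hypothetical illustrations, not from the article:

```python
# Sketch of preaggregation: a summary table answers an analytical
# query without scanning the detailed fact table each time.
# All table/column names are hypothetical illustrations.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, day TEXT, amount REAL);
    INSERT INTO sales VALUES
        ('north', '2001-01-01', 10.0),
        ('north', '2001-01-02', 20.0),
        ('south', '2001-01-01', 5.0);

    -- Preaggregated summary table, built ahead of time; this is the
    -- kind of structure that requires knowing the reports in advance.
    CREATE TABLE sales_by_region AS
        SELECT region, SUM(amount) AS total FROM sales GROUP BY region;
""")

# Analytical query served from the summary instead of the detail table.
for region, total in conn.execute("SELECT region, total FROM sales_by_region"):
    print(region, total)
```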

2.
Modern database systems urgently need the ability to support highly scalable transactions and efficient queries simultaneously for real-time applications. One solution is to apply query optimization techniques to online transaction processing (OLTP) systems. Materialized views are often seen as a panacea for reducing query latency; however, they carry a significant maintenance cost that trades away transaction performance. In this paper, we examine the design space and identify several design features for implementing a view on a distributed log-structured merge-tree (LSM-tree), a well-known structure for improving data write performance. As a result, we develop two incremental view maintenance (IVM) approaches on the LSM-tree. One avoids join computation in view maintenance transactions. The other, with two optimizations, decouples view maintenance from transaction processing. Under asynchronous updates, we also provide consistent queries over views. Experiments on the TPC-H benchmark show our methods achieve better performance than straightforward methods on different workloads.
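The core of incremental view maintenance can be sketched in a few lines of Python (a toy model with hypothetical names; the paper's LSM-tree storage and decoupling optimizations are not modeled here):

```python
# Minimal incremental view maintenance (IVM) sketch: a SUM view over a
# base table is patched with each write's delta instead of being
# recomputed from scratch. Names are hypothetical illustrations.
from collections import defaultdict

base = []                      # base table: (key, value) rows
view = defaultdict(float)      # materialized view: key -> SUM(value)

def insert(key, value):
    """Apply a write to the base table and its delta to the view."""
    base.append((key, value))
    view[key] += value         # incremental patch, no full rescan

insert("a", 10.0)
insert("a", 5.0)
insert("b", 7.0)
assert view["a"] == 15.0       # same answer a full GROUP BY would give
print(dict(view))
```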

3.
To evaluate the performance of database applications and database management systems (DBMSs), we usually execute workloads of queries on generated databases of different sizes and then benchmark various measures such as response time and throughput. This paper introduces MyBenchmark, a parallel data generation tool that takes a set of queries as input and generates database instances. Users of MyBenchmark can control the characteristics of the generated data as well as the characteristics of the resulting workload. Applications of MyBenchmark include DBMS testing, database application testing, and application-driven benchmarking. In this paper, we present the architecture and the implementation algorithms of MyBenchmark. Experimental results show that MyBenchmark is able to generate workload-aware databases for a variety of workloads, including query workloads extracted from the TPC-C, TPC-E, TPC-H, and TPC-W benchmarks.
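The underlying idea of workload-aware generation can be illustrated with a toy sketch (not MyBenchmark's actual algorithm): synthesize a table so that a given query predicate matches a chosen fraction of rows.

```python
# Toy sketch of workload-aware data generation: synthesize a table so
# that a target predicate has a requested selectivity. Illustration of
# the idea only, not MyBenchmark's actual algorithm.
import random

def generate_rows(n, selectivity):
    """Return n rows where ~selectivity of them satisfy flag == 1."""
    k = round(n * selectivity)
    rows = [{"flag": 1} for _ in range(k)]
    rows += [{"flag": 0} for _ in range(n - k)]
    random.shuffle(rows)
    return rows

rows = generate_rows(10_000, selectivity=0.25)
matched = sum(r["flag"] == 1 for r in rows)
print(matched / len(rows))   # ~0.25, the cardinality the query expects
```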

4.
BerlinMOD: a benchmark for moving object databases
This document presents a method to generate scalable and representative moving object data (MOD) and two sets of queries for benchmarking spatio-temporal DBMSs. Instead of programming dedicated generator software, we use the existing Secondo DBMS to create the benchmark data. The benchmark is based on a simulation scenario in which the positions of a sample of vehicles are observed for an arbitrary period of time within the street network of Berlin. We demonstrate the data generator's extensibility by showing how to achieve more natural movement-generation patterns and how to disturb the vehicles' positions to create noisy data. As an application and for reference, we also present first benchmarking results for the Secondo DBMS. Whereas the benchmark focuses on range queries, we demonstrate its ability to incorporate new future classes of queries by presenting a preliminary extension handling various nearest-neighbour queries. Such a benchmark is useful in several ways: it provides well-defined data sets and queries for experimental evaluations; it simplifies experimental repeatability; it emphasizes the development of complete systems; and it points out weaknesses in existing systems, motivating further research. Moreover, the BerlinMOD benchmark allows one to compare different representations of the same moving objects.
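A toy sketch of the noisy-movement idea follows (a random walk with Gaussian observation noise; BerlinMOD's actual generator runs inside Secondo and follows Berlin's street network):

```python
# Toy sketch of moving-object data generation with noise: a vehicle's
# true position evolves as a random walk, and observed positions are
# disturbed with Gaussian noise. This only illustrates the "noisy
# observations" idea, not BerlinMOD's street-network simulation.
import random

def simulate_vehicle(steps, noise_std=5.0, seed=42):
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    trajectory = []
    for t in range(steps):
        x += rng.uniform(-10, 10)            # true movement per time unit
        y += rng.uniform(-10, 10)
        obs_x = x + rng.gauss(0, noise_std)  # disturbed (noisy) observation
        obs_y = y + rng.gauss(0, noise_std)
        trajectory.append((t, obs_x, obs_y))
    return trajectory

for t, ox, oy in simulate_vehicle(5):
    print(t, round(ox, 1), round(oy, 1))
```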

5.
Hybrid transactional/analytical processing (HTAP) is a technique that handles both transaction requests and analytical query requests on a single one-stop architecture. HTAP not only eliminates the extract, transform, and load (ETL) process from relational transactional databases to data warehouses, but also supports real-time analysis of the latest transactional data. However, to process OLTP and OLAP simultaneously, an HTAP system must also trade off between system performance and data-analysis freshness, mainly because the high-concurrency, low-latency OLTP access pattern and the bandwidth-intensive, high-latency OLAP access pattern differ and interfere with each other. Currently, mainstream HTAP databases support hybrid transactional/analytical processing mainly through coexisting row and column stores, but since these databases target different business scenarios, their storage architectures and processing techniques vary. This paper first surveys HTAP databases comprehensively, summarizing their main application scenarios, strengths, and weaknesses, and classifying, summarizing, and comparing them by storage architecture. Existing surveys focus on HTAP databases with single row- or column-format storage and on loosely coupled Spark-based HTAP systems, whereas this survey focuses on real-time HTAP databases with coexisting row and column stores. In particular, it distills the key techniques of mainstream HTAP databases in four parts: data organization, data synchronization, query optimization, and resource scheduling. It also summarizes and analyzes HTAP database …

6.
The type of the workload on a database management system (DBMS) is a key consideration in tuning the system. Allocations for resources such as main memory can be very different depending on whether the workload type is Online Transaction Processing (OLTP) or Decision Support System (DSS). A DBMS also typically experiences changes in the type of workload it handles during its normal processing cycle. Database administrators must therefore recognize the significant shifts of workload type that demand reconfiguring the system in order to maintain acceptable levels of performance. We envision intelligent, autonomic DBMSs that have the capability to manage their own performance by automatically recognizing the workload type and then reconfiguring their resources accordingly. In this paper, we present an approach to automatically identifying a DBMS workload as either OLTP or DSS. Using data mining techniques, we build a classification model based on the most significant workload characteristics that differentiate OLTP from DSS and then use the model to identify any change in the workload type. We construct and compare classifiers built from two different sets of workloads, namely the TPC-C and TPC-H benchmarks and the Browsing and Ordering profiles from the TPC-W benchmark. We demonstrate the feasibility and success of these classifiers with TPC-generated workloads and with industry-supplied workloads.
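The classification step can be sketched with a decision tree in scikit-learn (feature names and training values below are invented for illustration; the paper derives its features from real workload characteristics):

```python
# Sketch of OLTP-vs-DSS workload classification with a decision tree.
# Features and values are made up; the paper's feature set comes from
# measured DBMS workload characteristics.
from sklearn.tree import DecisionTreeClassifier

# Features per workload snapshot: [avg rows read per statement,
# ratio of writes to reads, avg number of joins per query]
X = [
    [10,    0.90, 0.1],   # OLTP-like: small reads, write-heavy, few joins
    [25,    0.80, 0.2],   # OLTP-like
    [50000, 0.05, 3.5],   # DSS-like: huge scans, read-mostly, many joins
    [90000, 0.01, 4.0],   # DSS-like
]
y = ["OLTP", "OLTP", "DSS", "DSS"]

clf = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(clf.predict([[70000, 0.02, 2.8]]))   # -> ['DSS']
```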

7.
Database management systems (DBMSs) have served for many years as the data source for financial, educational, web, and other applications. Users connect to the DBMS to update existing records and to retrieve reports by executing workloads that consist of complex queries. To reach a sufficient level of performance, these workloads must be managed. Rapid growth in data, expanding functionality, and changing behavior make database workloads ever more complex and tricky. Every DBMS experiences complex workloads that are difficult for humans to manage; human experts need considerable time to manage a database workload efficiently, and in some cases it becomes impossible and performance degrades. This problem presents new challenges to database practitioners, vendors, and researchers. To achieve a satisfactory level of performance, either the database administrator (DBA) or the DBMS itself must know about workload shifts. Efficient execution and resource allocation depend on the workload type, which may be either On Line Transaction Processing (OLTP) or Decision Support System (DSS). This research introduces a way to manage workloads in DBMSs on the basis of workload type. Its main goal is to manage DBMS workloads through characterization, scheduling, and idleness-detection modules: workload management is performed using case-based-reasoning characterization, fuzzy-logic-based scheduling, and finally detection of CPU idleness. Results are validated through experiments on real-time and benchmark workloads that demonstrate the approach's effectiveness and efficiency.
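A minimal sketch of fuzzy-logic-based scheduling follows (membership functions and weights are invented; the paper's actual rule base is richer):

```python
# Toy fuzzy-logic scheduling sketch: map a query's estimated cost to
# "short"/"long" memberships and derive a priority, so short OLTP-style
# work is preferred over long DSS-style scans. Membership functions and
# weights are invented for illustration.
def membership_short(cost):
    """1.0 below 10 cost units, falling linearly to 0.0 at 100."""
    if cost <= 10:
        return 1.0
    if cost >= 100:
        return 0.0
    return (100 - cost) / 90

def priority(cost):
    short = membership_short(cost)
    long_ = 1.0 - short
    # Weighted defuzzification: short work gets high priority.
    return 10 * short + 1 * long_

queries = [("q1", 5), ("q2", 80), ("q3", 40)]
schedule = sorted(queries, key=lambda q: -priority(q[1]))
print([name for name, _ in schedule])   # q1 first, q2 last
```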

8.
The requirement for anthropocentric, or human-centred decision support is outlined, and the IDIOMS management information tool, which implements several human-centred principles, is described. IDIOMS provides a flexible decision support environment in which applications can be modelled using both ‘objective’ database information, and user-centred ‘subjective’ and contextual information. The system has been tested on several real applications, demonstrating its power and flexibility. IDIOMS (Intelligent Decision-making In On-line Management Systems) is a collaboration between the National Transputer Support Centre, Sheffield University, Strand Software Technologies Ltd., Bristol Transputer Centre and a high street bank, partially funded by the DTI under the Information Engineering Advanced Technology Programme. The project has demonstrated several technical features which are not detailed in this paper, including a multi-user interface allowing dynamic shared access to data; machine learning strategies for three banking applications; a scalable, modular database engine; and realistic transactions being handled while on-line management information queries are made.

9.
Industry, academia, and end users urgently need a big data benchmark to evaluate existing big data systems, improve current techniques, and develop new ones. This paper reviews the main work of recent years on developing big data benchmarks and comparatively analyzes their strengths and weaknesses. On this basis, it proposes a series of considerations for developing new big data benchmarks: 1) to evaluate both the individual component tools of a big data platform and the platform as a whole, component-oriented benchmarks and whole-platform benchmarks are both needed, the latter being an organic combination of the former; 2) besides SQL queries, the workload must include the various complex analytical functions that big data analysis tasks require, covering all kinds of application needs; 3) as for evaluation metrics, besides performance metrics (response time and throughput), other metrics must also be considered, including system scalability, fault tolerance, energy efficiency, and security.

10.
PEGS (Production and Environmental Generic Scheduler) is a generic production scheduler that produces good schedules over a wide range of problems. It is centralised, using search strategies with the Shifting Bottleneck algorithm. We have also developed an alternative distributed approach using software agents. In some cases this reduces run times by a factor of 10 or more. In most cases, the agent-based program also produces good solutions for published benchmark data, and the short run times make our program useful for a large range of problems. Test results show that the agents can produce schedules comparable to the best found so far for some benchmark datasets and actually better schedules than PEGS on our own random datasets. The flexibility that agents can provide for today’s dynamic scheduling is also appealing. We suggest that in this sort of generic or commercial system, the agent-based approach is a good alternative.

11.
In today’s digital information age, companies are struggling with an immense overload of mainly unstructured data. Reducing search times, fulfilling compliance requirements and maintaining information quality represent only three of the challenges that organisations from all industry sectors are faced with. Enterprise content management (ECM) has emerged as a promising approach addressing these challenges. Yet, there are still numerous obstacles to the implementation of ECM technologies, particularly fostered by the fact that the key challenges of ECM adaptation processes are rather organisational than technological. In the present article we claim that the consideration of an organisation’s business process structure is particularly crucial for ECM success. In response to this, we introduce a process-oriented conceptual framework that systematises the key steps of an ECM adoption. The paper suggests that ECM and business process management are two strongly related fields of research.

12.
The RDF-3X engine for scalable management of RDF data
RDF is a data model for schema-free structured information that is gaining momentum in the context of Semantic-Web data, life sciences, and also Web 2.0 platforms. The “pay-as-you-go” nature of RDF and the flexible pattern-matching capabilities of its query language SPARQL entail efficiency and scalability challenges for complex queries including long join paths. This paper presents the RDF-3X engine, an implementation of SPARQL that achieves excellent performance by pursuing a RISC-style architecture with streamlined indexing and query processing. The physical design is identical for all RDF-3X databases regardless of their workloads, and completely eliminates the need for index tuning by means of exhaustive indexes over all permutations of subject-property-object triples and their binary and unary projections. These indexes are highly compressed, and the query processor can aggressively leverage fast merge joins that make excellent use of processor caches. The query optimizer is able to choose optimal join orders even for complex queries, with a cost model that includes statistical synopses for entire join paths. Although RDF-3X is optimized for queries, it also provides good support for efficient online updates by means of a staging architecture: direct updates to the main database indexes are deferred, and instead applied to compact differential indexes which are later merged into the main indexes in a batched manner. Experimental studies with several large-scale datasets with more than 50 million RDF triples and benchmark queries that include pattern matching, many-way star-joins, and long path-joins demonstrate that RDF-3X can outperform the previously best alternatives by one or two orders of magnitude.
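The exhaustive-permutation indexing idea can be sketched in Python (an in-memory toy; RDF-3X's compressed B+-tree indexes and merge joins are not modeled):

```python
# Sketch of exhaustive triple indexing: keep the same triples sorted in
# all six (subject, predicate, object) permutations so any SPARQL-style
# access pattern has a matching sorted index. RDF-3X additionally
# compresses these indexes and scans them with merge joins; this toy
# version only shows the permutation idea.
from itertools import permutations

triples = [
    ("alice", "knows", "bob"),
    ("bob",   "knows", "carol"),
    ("alice", "age",   "34"),
]

# Build one sorted index per permutation of positions (0, 1, 2) = (S, P, O).
indexes = {
    perm: sorted(tuple(t[i] for i in perm) for t in triples)
    for perm in permutations(range(3))
}

# A (?s, knows, ?o) lookup uses the P-first index, (1, 0, 2) = PSO.
pso = indexes[(1, 0, 2)]
print([entry for entry in pso if entry[0] == "knows"])
```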

13.
Cloud computing is increasingly seen as a way to reduce infrastructure costs and add elasticity, and is being used by a wide range of organizations. Cloud data management systems today need to serve a range of different workloads, from analytical read-heavy workloads to transactional (OLTP) workloads. For both the service providers and the users, it is critical to minimize the consumption of resources like CPU, memory, communication bandwidth, and energy, without compromising on service-level agreements, if any. In this article, we develop a workload-aware data placement and replication approach, called SWORD, for minimizing resource consumption in such an environment. Specifically, we monitor and model the expected workload as a hypergraph and develop partitioning techniques that minimize the average query span, i.e., the average number of machines involved in the execution of a query or a transaction. We empirically justify the use of query span as the metric to optimize, for both analytical and transactional workloads, and develop a series of replication and data placement algorithms by drawing connections to several well-studied graph-theoretic concepts. We introduce a suite of novel techniques to achieve high scalability by reducing the overhead of partitioning and query routing. To deal with workload changes, we propose an incremental repartitioning technique that modifies data placement in small steps without resorting to complete repartitioning. We propose the use of fine-grained quorums defined at the level of groups of data items to control the cost of distributed updates, improve throughput, and adapt to different workloads. We empirically illustrate the benefits of our approach through a comprehensive experimental evaluation for two classes of workloads. For analytical read-only workloads, we show that our techniques result in a significant reduction in total resource consumption. For OLTP workloads, we show that our approach improves transaction latencies and overall throughput by minimizing the number of distributed transactions.
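The query-span metric itself is easy to illustrate (a toy computation over a hypothetical placement; SWORD's hypergraph partitioning is not shown):

```python
# Toy computation of average query span: the number of machines whose
# data a query touches, averaged over the workload. SWORD partitions a
# workload hypergraph to minimize exactly this quantity; the placement
# and queries below are hypothetical.
placement = {"a": 0, "b": 0, "c": 1, "d": 2}   # data item -> machine

queries = [
    {"a", "b"},        # span 1: both items on machine 0
    {"a", "c"},        # span 2
    {"a", "c", "d"},   # span 3
]

def span(query):
    return len({placement[item] for item in query})

avg_span = sum(span(q) for q in queries) / len(queries)
print(avg_span)   # 2.0 for this placement
```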

14.
Kunkel, S., Armstrong, B., Vitale, P. IEEE Micro, 1999, 19(3): 56-64
Major performance enhancements in large commercial systems are best achieved when advances in hardware technology are matched with advances in software technology. This article connects recent AS/400 hardware advances with the corresponding approaches used to tune system performance for large online transaction processing (OLTP) workloads. We particularly emphasize those tuning efforts that affect the memory system. OLTP workloads are large and complex, stressing many parts of both the software and the hardware. These workloads quickly expose software bottlenecks caused by contention on software locks. They also have large working sets, populated with hard-to-predict access patterns that make cache miss rates high, causing the processor to spend a significant part of its execution time waiting for memory accesses. In multiprocessor systems, compilers alone have minimal effect on cycles spent in storage latency; other optimizations are needed to affect this portion of the execution time, and many of those require direct involvement of the system software.

15.
MorphoSys reconfigurable hardware for cryptography: the Twofish case
This paper presents the mapping and performance analysis of the Twofish algorithm on MorphoSys. MorphoSys is a reconfigurable architecture that can provide high performance approaching that of custom hardware while preserving a level of flexibility closer to that of general-purpose processors. With today’s high demand for secure data-transfer media, including wired and wireless networks, there is a growing need for real-time implementations of cryptographic algorithms. Twofish, one of the five AES finalists, was chosen because it is a computationally intensive algorithm: it requires lookup tables as well as logical and arithmetic computations, which demand high flexibility and performance, making it well suited for mapping in order to evaluate such hardware.

16.
Massive-scale transactions with critical requirements have become common in emerging businesses, especially in e-commerce. One of the most representative applications is the promotional event run on Alibaba’s platform on special dates, widely anticipated by customers worldwide. Although significant progress has been made in improving the scalability of transactional (OLTP) database systems, the presence of contended operations in workloads remains one of the fundamental obstacles to performance, because the overhead of managing conflicting transactions with concurrency-control mechanisms is proportional to the amount of contention. Consequently, generating contended workloads is essential for evaluating the performance of modern OLTP database systems. Although standard benchmarks provide some ways of simulating contention, e.g., controlling the skew of the transaction distribution, they cannot control the generation of contention quantitatively; worse, the effectiveness of these methods depends on the scale of the data. In this paper we therefore design a scalable, quantitative contention-generation method with fine-grained control of contention granularity. We conduct a comprehensive set of experiments on popular open-source DBMSs, comparing against the latest contention-simulation method, to demonstrate the effectiveness of our generator.
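A hedged sketch of quantitative contention generation (an illustration of the stated goal, not the paper's method): each transaction touches a small hot set with a chosen probability, so the expected contention rate is controlled directly rather than indirectly through skew.

```python
# Toy quantitative contention generator: each transaction accesses a
# key from a small hot set with probability p_hot, giving direct
# control over the expected contention rate independent of table size.
import random

def gen_accesses(n_txns, p_hot, n_hot=10, n_keys=1_000_000, seed=7):
    """Yield one accessed key per transaction; keys < n_hot are 'hot'."""
    rng = random.Random(seed)
    for _ in range(n_txns):
        if rng.random() < p_hot:
            yield rng.randrange(n_hot)            # contended access
        else:
            yield rng.randrange(n_hot, n_keys)    # uncontended access

accesses = list(gen_accesses(100_000, p_hot=0.3))
hot_share = sum(k < 10 for k in accesses) / len(accesses)
print(round(hot_share, 2))   # ~0.3, independent of n_keys (data scale)
```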

17.
Information systems are the glue between people and computers. Both the social and business environments are in a continual, some might say chaotic, state of change, while computer hardware continues to double its performance about every 18 months. This presents a major challenge for information system developers. The term user-friendly is an old one, but one which has come to take on a multitude of meanings. However, in today’s context we might well take a user-friendly system to be one where the technology fits the user’s cognitive models of the activity in hand. This article looks at the relationship between information systems and the changing demands of their users as the underlying theme for the current issue of Cognition, Technology and Work. People, both as individuals and organisations, change. The functionalist viewpoint, which attempts to freeze and inhibit such change, has failed systems developers on numerous occasions. Responding to, and building on, change in the social environment is still a significant research issue for information systems specialists who need to be able to create living information systems.

18.
In traditional database management systems, queries are intended to retrieve data which satisfy crisp criteria. In some cases, this lack of flexibility leads to empty answers. That is one of the reasons why we have been investigating the extension of these systems so that they become able to support imprecise querying capabilities. In this article, the introduction of imprecise queries in a particular nonrelational system (Information Warehouse) is presented. One of the main interesting aspects of this work resides in the specific data model for which the semantics of fuzzy queries first has to be defined. © 1994 John Wiley & Sons, Inc.
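The flavor of imprecise querying can be sketched with a trapezoidal membership function (names and thresholds invented; the article defines its semantics for the Information Warehouse model):

```python
# Toy fuzzy-query sketch: instead of the crisp predicate price <= 100,
# rank rows by a trapezoidal membership for "price around 100 or less".
# The function shape and thresholds are invented for illustration.
def cheap(price):
    """Fully satisfied up to 100, degrades linearly, unsatisfied past 150."""
    if price <= 100:
        return 1.0
    if price >= 150:
        return 0.0
    return (150 - price) / 50

rows = [("p1", 90), ("p2", 110), ("p3", 160), ("p4", 125)]
answers = sorted(((cheap(p), name) for name, p in rows), reverse=True)
# Non-empty, graded answer set instead of an empty crisp one.
print([(name, round(mu, 2)) for mu, name in answers if mu > 0])
```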

19.
In the petroleum industry, new technologies and work processes are currently being developed as an innovation strategy for better, faster and safer drilling. In this article, some features of today’s work processes that contribute to successful operations are presented and discussed. The articulation work involved in handling the transient complexity of operations involves making black-boxed and invisible work processes visible and transparent. It is argued that this articulation work contributes to the organisation’s understanding and knowledge of the drilling processes and the dependencies that exist between different actors. In addition to contributing to ongoing problem solving, the articulation work also contributes to the awareness of possible future events. Following this insight, it is argued that efforts to improve operational efficiency and safety by introducing new tools and work processes should focus not only on the capability of new tools to support decisions and actions by instrumentation and automation, but attention should also be paid to the existing articulation work and its role in the accomplishment of work. In that way, the contributions of today’s articulation work can be strengthened instead of lost, and the outcome of the change processes can be even better than anticipated.

20.
A review of the current air traffic control system is undertaken from the perspective of human-centered design, focusing on the development of today’s system, the problems in today’s system, and the challenges going forward. Today’s system evolved around the operators in the system (mainly air traffic controllers and pilots), rather than being designed from specific engineering analyses. This human-centered focus has helped make air transportation remarkably safe, but has also made the air traffic control system somewhat inscrutable. This opaqueness of how the system operates poses significant problems for current attempts to transform the system into its “next generation” with significantly improved capacity. Advances in human-centered computing research that are required for this transformation work to proceed are discussed, specifically advances in computing the safety of complex human-integrated systems, understanding and measuring situation awareness, and visualizing complex data.
