首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到11条相似文献,搜索用时 0 毫秒
1.
Parallel joins have been widely studied during the past decade and a number of efficient algorithms were presented. While it is known that the performance of these algorithms may suffer greatly in the presence of skewed input data, the work on load balancing schemes for parallel join has been limited. The main contribution of this paper is the development and analysis of a new distributed data structure and an effective load balancing scheme for parallel main memory hash join on NUMA architecture. Multiprocessors based on this architecture are scalable in both size of main memory and number of processors, and provide very high memory bandwidth. The load balancing scheme is based on random probing to avoid the hot spot problems caused by probing sequentially. We have modeled this load balancing scheme both analytically and experimentally. The experiments were run on a BBN TC2000 multiprocessor system  相似文献   

2.
Powerful storage, high performance and scalability are the most important issues for analytical databases. These three factors interact with each other, for example, powerful storage needs less scalability but higher performance, high performance means less consumption of indexes and other materializations for storage and fewer processing nodes, larger scale relieves stress on powerful storage and the high performance processing engine. Some analytical databases (ParAccel, Teradata) bind their performance with advanced hardware supports, some (Asterdata, Greenplum) rely on the high scalability framework of MapReduce, some (MonetDB, Sybase IQ, Vertica) highlight performance on processing engine and storage engine. All these approaches can be integrated into an storage-performance-scalability (S-P-S) model, and future large scale analytical processing can be built on moderate clusters to minimize expensive hardware dependency. The most important thing is a simple software framework is fundamental to maintain pace with the development of hardware technologies. In this paper, we propose a schema-aware on-line analytical processing (OLAP) model with deep optimization from native features of the star or snowflake schema. The OLAP model divides the whole process into several stages, each stage pipes its output to the next stage, we minimize the size of output data in each stage, whether in central processing or clustered processing. We extend this mechanism to cluster processing using two major techniques, one is using NetMemory as a broadcasting protocol based dimension mirror synchronizing buffer, the other is predicate-vector based DDTA-OLAP cluster model which can minimize the data dependency of star-join using bitmap vectors. Our OLAP model aims to minimize network transmission cost (MiNT in short) for OLAP clusters and support a scalable but simple distributed storagemodel for large scale clustering processing. Finally, the experimental results show the speedup and scalability performance.  相似文献   

3.
Schemaless databases, and document-oriented databases in particular, are preferred to relational ones for storing heterogeneous data with variable schemas and structural forms. However, the absence of a unique schema adds complexity to analytical applications, in which a single analysis often involves large sets of data with different schemas. In this paper we propose an original approach to OLAP on collections stored in document-oriented databases. The basic idea is to stop fighting against schema variety and welcome it as an inherent source of information wealth in schemaless sources. Our approach builds on four stages: schema extraction, schema integration, FD enrichment, and querying; these stages are discussed in detail in the paper. To make users aware of the impact of schema variety, we propose a set of indicators inspired by the definition of attribute density. Finally, we experimentally evaluate our approach in terms of efficiency and effectiveness.  相似文献   

4.
This paper shows how trees can be stored in a very compact form, called ‘Bonsai’, using hash tables. A method is described that is suitable for large trees that grow monotonically within a predefined maximum size limit. Using it, pointers in any tree can be represented within 6 + [log2n] bits per node where n is the maximum number of children a node can have. We first describe a general way of storing trees in hash tables, and then introduce the idea of compact hashing which underlies the Bonsai structure. These two techniques are combined to give a compact representation of trees, and a practical methodology is set out to permit the design of these structures. The new representation is compared with two conventional tree implementations in terms of the storage required per node. Examples of programs that must store large trees within a strict maximum size include those that operate on trie structures derived from natural language text. We describe how the Bonsai technique has been applied to the trees that arise in text compression and adaptive prediction, and include a discussion of the design parameters that work well in practice.  相似文献   

5.
Algorithms for parallel memory,II: Hierarchical multilevel memories   总被引:1,自引:0,他引:1  
In this paper we introduce parallel versions of two hierarchical memory models and give optimal algorithms in these models for sorting, FFT, and matrix multiplication. In our parallel models, there areP memory hierarchies operating simultaneously; communication among the hierarchies takes place at a base memory level. Our optimal sorting algorithm is randomized and is based upon the probabilistic partitioning technique developed in the companion paper for optimal disk sorting in a two-level memory with parallel block transfer. The probability of using/times the optimal running time is exponentially small in (log ) logP.A summarized version of this research was presented at the 22nd Annual ACM Symposium on Theory of Computing, Baltimore, MD, May 1990. This work was done while the first author was at Brown University. Support was provided in part by a National Science Foundation Presidential Young Investigator Award with matching funds from IBM, by NSF Research Grants DCR-8403613 and CCR-9007851, by Army Research Office Grant DAAL03-91-G-0035, and by the Office of Naval Research and the Defense Advanced Research Projects Agency under Contract N00014-91-J-4052 ARPA Order 8225. This work was done in part while the second author was at Brown University supported by a Bellcore graduate fellowship and at Bellcore.  相似文献   

6.
Each year many people suffer an accident or illness which leaves them with a permanent memory impairment. As a result they will fail to remember to do things, making it difficult for them to maintain employment or independent leisure activities. Conventional memory aids offer little assistance because their use depends on relatively normal memory functioning. A system designed specifically for memory-impaired people has been developed recently in the USA, however. NeuroPage uses a combination of computing and telecommunications to store and transmit reminders to a pager worn by the user. The system is designed to place minimal demands on memory. This paper presents a case study evaluation of NeuroPage in use by a young man with a severe memory impairment. The pager messages led to a marked increase in the probability that he would carry out his intended tasks. Issues of usability and acceptability of the system to the user and his family are also reported. A number of possible developments to the system are considered, including the incorporation of artificial intelligence and context-sensitivity. Broader issues in the design of memory aids for people with and without memory impairments are also considered, as well as the potential role of memory-impaired people in the design of technology for their own use.  相似文献   

7.
A parallel workload balanced and memory efficient lattice-Boltzmann algorithm for laminar Newtonian fluid flow through large porous media is investigated. It relies on a simplified LBM scheme using a single unit BGK relaxation time, which is implemented by means of a shift algorithm and comprises an even fluid node partitioning domain decomposition strategy based on a vector data structure. It provides perfect parallel workload balance, and its two-nearest-neighbour communication pattern combined with a simple data transfer layout results in 20-55% lower communication cost, 25-60% higher computational parallel performance and 40-90% lower memory usage than previously reported LBM algorithms. Performance tests carried out using scale-up and speed-up case studies of laminar Newtonian fluid flow through hexagonal packings of cylinders and a random packing of polydisperse spheres on two different computer architectures reveal parallel efficiencies with 128 processors as high as 75% for domain sizes comprising more than 5 billion fluid nodes.  相似文献   

8.
Web-facing applications are expected to provide certain performance guarantees despite dynamic and continuous workload changes. As a result, application owners are using cloud computing as it offers the ability to dynamically provision computing resources (e.g., memory, CPU) in response to changes in workload demands to meet performance targets and eliminates upfront costs. Horizontal, vertical, and the combination of the two are the possible dimensions that cloud application can be scaled in terms of the allocated resources. In vertical elasticity as the focus of this work, the size of virtual machines (VMs) can be adjusted in terms of allocated computing resources according to the runtime workload. A commonly used vertical resource elasticity approach is realized by deciding based on resource utilization, named capacity-based. While a new trend is to use the application performance as a decision making criterion, and such an approach is named performance-based. This paper discusses these two approaches and proposes a novel hybrid elasticity approach that takes into account both the application performance and the resource utilization to leverage the benefits of both approaches. The proposed approach is used in realizing vertical elasticity of memory (named as vertical memory elasticity), where the allocated memory of the VM is auto-scaled at runtime. To this aim, we use control theory to synthesize a feedback controller that meets the application performance constraints by auto-scaling the allocated memory, i.e., applying vertical memory elasticity. Different from the existing vertical resource elasticity approaches, the novelty of our work lies in utilizing both the memory utilization and application response time as decision making criteria. To verify the resource efficiency and the ability of the controller in handling unexpected workloads, we have implemented the controller on top of the Xen hypervisor and performed a series of experiments using the RUBBoS interactive benchmark application, under synthetic and real workloads including Wikipedia and FIFA. The results reveal that the hybrid controller meets the application performance target with better performance stability (i.e., lower standard deviation of response time), while achieving a high memory utilization (close to 83%), and allocating less memory compared to all other baseline controllers.  相似文献   

9.
In the past decade, advances in the speed of commodity CPUs have far out-paced advances in memory latency. Main-memory access is therefore increasingly a performance bottleneck for many computer applications, including database systems. In this article, we use a simple scan test to show the severe impact of this bottleneck. The insights gained are translated into guidelines for database architecture, in terms of both data structures and algorithms. We discuss how vertically fragmented data structures optimize cache performance on sequential data access. We then focus on equi-join, typically a random-access operation, and introduce radix algorithms for partitioned hash-join. The performance of these algorithms is quantified using a detailed analytical model that incorporates memory access cost. Experiments that validate this model were performed on the Monet database system. We obtained exact statistics on events such as TLB misses and L1 and L2 cache misses by using hardware performance counters found in modern CPUs. Using our cost model, we show how the carefully tuned memory access pattern of our radix algorithms makes them perform well, which is confirmed by experimental results. Received April 20, 2000 / Accepted June 23, 2000  相似文献   

10.
The semiconductor and thin-film-transistor–liquid-crystal-display (TFT-LCD) industries widely value Automatic Virtual Metrology System (AVMS). AVMS needs to handle a large volume of VM-related data, which may cause poor internal database performance. In general, AVMS adopts efficient but expensive commercial database management systems (DBMSs) to yield good AVMS performance. This usually makes the AVMS construction cost very high. Therefore, the industries require a novel AVMS architecture with lower cost and greater efficiency in database. This paper proposes a novel AVMS architecture based on Main Memory Database (MMDB) technology. Specifically, the MMDB is used to improve the performance bottlenecks of the current Disk Resident Database (DRDB). Also, we design automatic data-backup and automatic data-query sources integration mechanisms to effectively relieve rapidly increased data volume in the original AVMS architecture. In addition, the novel AVMS architecture adopts a free commercial MMDB to significantly reduce total system cost. Integrated testing results show that the proposed AVMS architecture and developed technologies can enable the AVMS to have better data-storage efficiency, superior data-query performance, and lower database cost. The proposed AVMS architecture and research results in this paper can be a useful reference for TFT-LCD manufacturing companies in constructing their own AVM systems. The proposed AVMS architecture can also be applied in the semiconductor and solar-cell industries.  相似文献   

11.
A simplified miniaturized wideband balun design covering (0.38‐3.5 GHz) is presented. The broadband balun structure occupies a small area of 20.7 mm × 20.8 mm. The balun is designed using low loss double‐sided parallel strip lines and is comprised of a two‐stage Wilkinson divider followed by loading, symmetrically, the two output ports. One port with a phase inverter circuit; while a very similar but non‐inverting circuit is placed in the other port for loading‐balance compensation. To realize the balun's function, different smooth transitions have been employed and were accounted for. The fabricated balun circuit demonstrated phase and amplitude imbalance of less than ±5° and ± 0.4 dB, respectively, over the band.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号