1.
2.
The Syte workstation architecture closely couples the graphics system and the processor to improve interactive performance and reduce hardware and software overhead without added support mechanisms.
3.
4.
5.
Yawei Li, Zhiling Lan, Gujrati P., Xian-He Sun. IEEE Transactions on Parallel and Distributed Systems, 2009, 20(4): 460-473
As the scale of parallel systems continues to grow, fault management of these systems is becoming a critical challenge. While existing research mainly focuses on developing or improving fault tolerance techniques, a number of key issues remain open. In this paper, we propose runtime strategies for spare node allocation and job rescheduling in response to failure prediction. These strategies, together with failure predictor and fault tolerance techniques, construct a runtime system called FARS (Fault-Aware Runtime System). In particular, we propose a 0-1 knapsack model and demonstrate its flexibility and effectiveness for reallocating running jobs to avoid failures. Experiments, by means of synthetic data and real traces from production systems, show that FARS has the potential to significantly improve system productivity (i.e., performance and reliability).
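The abstract above names a 0-1 knapsack model for reallocating running jobs but does not give its details. As a rough illustration only, the sketch below solves a generic 0-1 knapsack by dynamic programming. The mapping of jobs to items (node demand as weight, a per-job benefit as value, spare nodes as capacity) and the function name `knapsack_01` are assumptions made for this example, not the formulation used in FARS.

```python
def knapsack_01(values, weights, capacity):
    """Classic 0-1 knapsack: return (best_value, chosen_indices).

    Hypothetical mapping for illustration: each item is a running job,
    weights[i] is the number of nodes it needs, values[i] is the benefit
    of rescheduling it, and capacity is the number of spare nodes.
    """
    n = len(values)
    # best[i][c] = maximum value using only the first i items with capacity c
    best = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        v, w = values[i - 1], weights[i - 1]
        for c in range(capacity + 1):
            best[i][c] = best[i - 1][c]          # skip item i-1
            if w <= c and best[i - 1][c - w] + v > best[i][c]:
                best[i][c] = best[i - 1][c - w] + v  # take item i-1

    # Walk the table backwards to recover which items were taken.
    chosen = []
    c = capacity
    for i in range(n, 0, -1):
        if best[i][c] != best[i - 1][c]:
            chosen.append(i - 1)
            c -= weights[i - 1]
    chosen.reverse()
    return best[n][capacity], chosen


if __name__ == "__main__":
    # Four jobs, six spare nodes (made-up numbers).
    print(knapsack_01([10, 7, 4, 6], [4, 3, 2, 3], 6))  # (14, [0, 2])
```

The table-based solution runs in time proportional to the number of jobs times the capacity, which is usually acceptable when the capacity is a node count.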
6.
Data-intensive problems challenge conventional computing architectures with demanding CPU, memory, and I/O requirements. Experiments with three benchmarks suggest that emerging hardware technologies can significantly boost performance of a wide range of applications by increasing compute cycles and bandwidth and reducing latency.
7.
High-performance reconfigurable computers have the potential to exploit coarse-grained functional parallelism as well as fine-grained instruction-level parallelism through direct hardware execution on FPGAs.
8.
9.
We propose a new approach, called cluster-based search (CBS), for scheduling large task graphs in parallel on a heterogeneous cluster of workstations connected by a high-speed network (e.g., using an ATM switch at OC-3 speed). The CBS algorithm uses a parallel random neighborhood search which works by refining multiple different initial schedules simultaneously using different workstations. The workstations communicate periodically to exchange their best solutions found thus far in order to direct the search to more promising regions in the search space. Heterogeneity of machines is exploited by the biased partitioning of the search space. The parallel random neighborhood search is fault-tolerant in that the workload of a failed workstation is automatically redistributed to other workstations so that the search can continue. We have implemented the CBS algorithm as a core function of our on-going development of SSI middleware for a Sun workstation cluster.
10.
Energy-Efficient Thermal-Aware Task Scheduling for Homogeneous High-Performance Computing Data Centers: A Cyber-Physical Approach (cited by: 4 total, 0 self-citations, 4 by others)
Qinghui Tang, Gupta S.K.S., Varsamopoulos G. IEEE Transactions on Parallel and Distributed Systems, 2008, 19(11): 1458-1472
High Performance Computing data centers have been rapidly growing, both in number and in size. Thermal management of data centers can address dominant problems associated with cooling such as the recirculation of hot air from the equipment outlets to their inlets, and the appearance of hot spots. In this paper, we are looking into assigning the incoming tasks to machines of a data center in such a way as to affect the heat recirculation and make cooling more efficient. Using a low complexity linear heat recirculation model, we formulate the problem of minimizing the peak inlet temperature within a data center through task assignment, consequently leading to minimal cooling power consumption. We also provide two methods to solve the formulation, one that uses a genetic algorithm and the other that uses sequential quadratic programming. We show through formalization that minimizing the peak inlet temperature allows for the lowest cooling power needs. Results from a simulated, small-scale data center show that solving the formulation leads to an inlet temperature distribution that is 2 °C to 5 °C lower compared to other approaches, and achieves about 20%-30% cooling energy savings at moderate data center utilization rates. Moreover, our algorithms consistently outperform MinHR, a recirculation-reducing placement algorithm in the literature.
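To make the objective in the abstract above concrete, the following sketch evaluates a peak inlet temperature under an assumed linear model and picks the task placement that minimizes it. Everything specific here is an assumption for illustration: the model form (inlet temperature = supply temperature + a matrix `D` of recirculation coefficients applied to per-machine power), the two-level idle/busy power draw, and the names `peak_inlet` and `best_assignment`. The search is a plain exhaustive enumeration, swapped in for simplicity; it is neither the genetic algorithm nor the sequential quadratic programming method the abstract mentions, and it only scales to very small cases.

```python
from itertools import combinations


def peak_inlet(D, power, t_supply):
    """Peak inlet temperature under an assumed linear recirculation model.

    D[i][j] is a hypothetical coefficient: the inlet temperature rise at
    machine i per watt drawn by machine j.
    """
    return max(
        t_supply + sum(d * p for d, p in zip(row, power))
        for row in D
    )


def best_assignment(D, n_tasks, p_idle, p_busy, t_supply):
    """Try every way to place n_tasks on distinct machines.

    Returns (peak_temperature, busy_machine_indices) for the placement
    with the lowest peak inlet temperature.
    """
    n = len(D)
    best = None
    for busy in combinations(range(n), n_tasks):
        power = [p_busy if i in busy else p_idle for i in range(n)]
        peak = peak_inlet(D, power, t_supply)
        if best is None or peak < best[0]:
            best = (peak, busy)
    return best


if __name__ == "__main__":
    # Two machines with made-up coefficients; one task to place.
    D = [[0.0, 0.02],
         [0.01, 0.0]]
    print(best_assignment(D, n_tasks=1, p_idle=100, p_busy=300, t_supply=15))
```

In this toy case the task lands on machine 0, because heat from machine 1 recirculates more strongly into machine 0's inlet than the other way around.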
11.
12.
13.
As the prices of commodity workstations go down, clusters of workstations have started to emerge as a viable economic solution for scalable computing. Recent advances in networking technology have made it possible to obtain high-bandwidth connections between applications. However, the interconnect latency between workstation nodes in a cluster remains a serious concern and can prove to be the limiting factor in workstation performance. In this paper, we present the CNI, or cluster network interface, which achieves the twin goals of low latency and high bandwidth. In addition, CNI efficiently supports multiple programming paradigms for programming generality. This is done by functionally coupling the network interface more closely to the CPU without violating the constraints of a standard workstation architecture. CNI results in performance gains for applications, substantially reducing communication overhead and delay.
14.
15.
With increasing richness in features such as personalization of content, Web applications are becoming increasingly complex and hence compute intensive. Traditional approaches for improving performance of static content Web sites have been based on the assumption that static content such as images are network intensive. However, these methods are not applicable to the dynamic content applications which are more compute intensive than static content. This paper proposes a suite of algorithms which jointly optimize the performance of dynamic content applications by reducing the client access times while also minimizing the resource utilization. A server migration algorithm allocates servers on-demand within a cluster such that the client access times are not affected even under sudden overload conditions. Further, a server selection mechanism enables statistical multiplexing of resources across clusters by redirecting requests away from overloaded clusters. We also propose a cluster decision algorithm which decides whether to migrate in additional servers at the local cluster or redirect requests remotely under different workload conditions. Through a combination of analytical modeling, trace-driven simulation over traces from large e-commerce sites and testbed implementation, we explore the performance savings achieved by the proposed algorithms.
16.
The Promise of High-Performance Reconfigurable Computing (cited by: 3 total, 0 self-citations, 3 by others)
Several high-performance computers now use field-programmable gate arrays as reconfigurable coprocessors. The authors describe the two major contemporary HPRC architectures and explore the pros and cons of each using representative applications from remote sensing, molecular dynamics, bioinformatics, and cryptanalysis.
17.
Multi-core CPUs, Clusters, and Grid Computing: A Tutorial (cited by: 1 total, 0 self-citations, 1 by others)
The nature of computing is changing and it poses both challenges and opportunities for economists. Instead of increasing clock
speed, future microprocessors will have “multi-cores” with separate execution units. “Threads” or other multi-processing techniques
that are rarely used today are required to take full advantage of them. Beyond one machine, it has become easy to harness
multiple computers to work in clusters. Besides dedicated clusters, they can be made up of unused lab computers or even your
colleagues’ machines. Finally, grids of computers spanning the Internet are now becoming a reality.
18.
Editorial Introduction
Guest Editor Introduction for the Special Section on Commercial Applications for High-Performance Computing
19.
Jongwook Woo, Jean-Luc Gaudiot, Andrew L. Wendelborn. International Journal of Parallel Programming, 2004, 32(1): 39-76
In this paper, a flow-sensitive, context-insensitive alias analysis in Java is proposed. It is more efficient and precise than previous analyses for C++, and it does not negatively affect the safety of aliased references. To this end, we first present a reference-set alias representation. Second, data-flow equations based on the propagation rules for the reference-set alias representation are introduced. The equations compute alias information more efficiently and precisely than previous analyses for C++. Third, for the constant time complexity of the type determination, a type table is introduced with reference variables and all possible types for each reference variable. Fourth, an alias analysis algorithm is proposed, which uses a popular iterative loop method for an alias analysis. Finally, running times of benchmark codes are compared for reference-set and existing object-pair representation.