1.
An efficient algorithm to remove redundant dependences in simple loops with constant dependences is presented. Dependences constrain the parallel execution of programs and are typically enforced by synchronization instructions, which account for a significant part of the overhead of parallel execution. Some program dependences are redundant because they are covered by other dependences. It is shown that, unlike in single loops, a particular dependence in a nested loop may be redundant at some iterations but not at others, so the redundancy of a dependence need not be uniform over the entire iteration space. A sufficient condition for the uniformity of redundancy in a doubly nested loop is developed.
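The covering idea can be made concrete with a small check on constant dependence (distance) vectors. The sketch below is not the algorithm from the abstract; it is a minimal illustration, under the assumption that dependences are given as integer distance vectors and that a dependence is enforced transitively whenever it can be composed from the remaining vectors (the names covered, dep, and others are invented for the example):

    from itertools import product

    def covered(dep, others, max_coeff=4):
        # `dep` is redundant if it equals a nonnegative integer combination of
        # the other dependence vectors (dep itself must not appear in `others`):
        # synchronizing for those vectors then enforces `dep` transitively.
        dims = len(dep)
        for coeffs in product(range(max_coeff + 1), repeat=len(others)):
            if sum(coeffs) == 0:
                continue
            combo = [sum(c * o[i] for c, o in zip(coeffs, others)) for i in range(dims)]
            if combo == list(dep):
                return True
        return False

    # (2, 2) is covered by two applications of (1, 1); (0, 1) is not covered.
    print(covered((2, 2), [(1, 1), (1, 0)]))   # True
    print(covered((0, 1), [(1, 1), (1, 0)]))   # False

Near the edges of a nested loop's iteration space, the intermediate iterations implied by such a composition may fall outside the bounds, which is exactly why a dependence can be redundant at some iterations and not at others.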
2.
A solution to the problem of partitioning data for distributed-memory machines is discussed. The solution uses a matrix notation to describe array accesses in fully parallel loops, which allows the derivation of sufficient conditions for communication-free partitioning (decomposition) of arrays. A series of examples illustrates the effectiveness of the technique for linear references, the use of loop transformations in deriving the necessary data decompositions, and a formulation that aids in deriving heuristics for minimizing communication when communication-free partitions are not feasible.
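One simple sufficient condition of this kind can be sketched directly from the matrix view of affine references. The code below is an illustration only, not the formulation in the paper: it assumes every reference to an array is written as F @ i + f for iteration vector i, ignores the constant offsets, and merely looks for a data-space hyperplane q such that q @ F is the same for all references, so that matching hyperplane partitions of the data and the iterations require no communication (the function name communication_free_direction is made up for the sketch):

    import numpy as np

    def communication_free_direction(refs):
        # refs: access matrices F (array dims x loop dims) of all references
        # to one array in a fully parallel loop nest.  We need q with
        # q @ (F_k - F_0) = 0 for every k, i.e. q must lie in the left
        # null space of the stacked differences.
        F0 = refs[0]
        if len(refs) == 1:
            return np.eye(F0.shape[0])[0]          # any direction works
        diffs = np.hstack([F - F0 for F in refs[1:]])
        u, s, vt = np.linalg.svd(diffs.T)
        mask = np.concatenate([s, np.zeros(vt.shape[0] - len(s))]) < 1e-9
        null_vecs = vt[mask]
        return null_vecs[0] if len(null_vecs) else None

    # A[i][j] together with the transposed reference A[j][i]: the only
    # communication-free family of partitions runs along anti-diagonals i + j.
    F1 = np.array([[1, 0], [0, 1]])    # A[i][j]
    F2 = np.array([[0, 1], [1, 0]])    # A[j][i]
    print(communication_free_direction([F1, F2]))   # a multiple of [1, 1]

When no such direction exists, some communication is unavoidable, which is where the heuristics mentioned in the abstract come in.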
3.
Complex parallel applications can often be modeled as directed acyclic graphs of coarse-grained application tasks with dependences. These applications exhibit both task and data parallelism, and combining these two (also called mixed parallelism) has been shown to be an effective model for their execution. In this paper, we present an algorithm to compute the appropriate mix of task and data parallelism required to minimize the parallel completion time (makespan) of these applications. In other words, our algorithm determines the set of tasks that should be run concurrently and the number of processors to be allocated to each task. The processor allocation and scheduling decisions are made in an integrated manner and are based on several factors such as the structure of the task graph, the runtime estimates and scalability characteristics of the tasks, and the intertask data communication volumes. A locality-conscious scheduling strategy is used to improve intertask data reuse. Evaluation through simulations and actual executions of task graphs derived from real applications and synthetic graphs shows that our algorithm consistently generates schedules with a lower makespan as compared to Critical Path Reduction (CPR) and Critical Path and Allocation (CPA), two previously proposed scheduling algorithms. Our algorithm also produces schedules that have a lower makespan than pure task- and data-parallel schedules. For task graphs with known optimal schedules or lower bounds on the makespan, our algorithm generates schedules that are closer to the optima than other scheduling approaches.
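The core trade-off between task and data parallelism can be illustrated with a deliberately simplified model. The sketch below is not the scheduling algorithm described in the abstract (which allocates processors over a whole DAG, accounts for communication, and uses locality-conscious placement); it only contrasts running independent tasks one after another on all processors against running them side by side, under an assumed Amdahl-style speedup model (the names runtime, makespan_data_parallel, and makespan_task_parallel are invented for the example):

    def runtime(task, p):
        # Amdahl-style estimate: task = (time_on_one_processor, parallel_fraction).
        # This speedup model is an assumption made only for this illustration.
        t, f = task
        return t * ((1 - f) + f / p)

    def makespan_data_parallel(tasks, P):
        # run the tasks one after another, each using all P processors
        return sum(runtime(t, P) for t in tasks)

    def makespan_task_parallel(tasks, P):
        # run the tasks concurrently on an even split of the processors
        p_each = max(1, P // len(tasks))
        return max(runtime(t, p_each) for t in tasks)

    tasks = [(10.0, 0.95), (10.0, 0.60)]       # (time on 1 processor, parallelizable fraction)
    P = 16
    print(makespan_data_parallel(tasks, P))    # ~5.47: the poorly scaling task wastes processors
    print(makespan_task_parallel(tasks, P))    # 4.75: running the tasks side by side wins here

An uneven processor split guided by each task's scalability usually beats both extremes; searching for that mix in an integrated way, over a whole task graph, is what the abstract's algorithm does.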
4.
Scheduling, in many application domains, involves optimization of multiple performance metrics. For example, application workflows with real-time constraints have strict throughput requirements and also desire a low latency or response time. In this paper, we present a novel algorithm for the scheduling of workflows that act on a stream of input data. Our algorithm focuses on the two performance metrics, latency and throughput, and minimizes the latency of workflows while satisfying strict throughput requirements. We also describe steps to use the above approach to solve the problem of meeting latency requirements while maximizing throughput. We leverage pipelined, task and data parallelism in a coordinated manner to meet these objectives and investigate the benefit of task duplication in alleviating communication overheads in the pipelined schedule for different workflow characteristics. The proposed algorithm is designed for a realistic bounded multi-port communication model, where each processor can simultaneously communicate with at most k distinct processors. Experimental evaluation using synthetic benchmarks as well as those derived from real applications shows that our algorithm consistently produces lower latency schedules that meet throughput requirements, even when previously proposed schemes fail.
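For a linear pipeline over a data stream, throughput is limited by the slowest stage while latency is the end-to-end time for one item, so the two metrics pull in different directions. The following sketch is not the algorithm from the abstract (which handles workflow DAGs, task duplication, and a bounded multi-port communication model); it is a minimal illustration, assuming each stage scales linearly when replicated, of meeting a throughput (period) target first and then reading off the latency (the function name allocate_for_throughput is made up):

    import math

    def allocate_for_throughput(stage_times, required_period, P):
        # Give each stage just enough replicas so that its per-item period
        # (stage time / replicas, assuming linear scaling) meets the target;
        # return (allocation, latency) or None if P processors do not suffice.
        alloc = [max(1, math.ceil(t / required_period)) for t in stage_times]
        if sum(alloc) > P:
            return None
        # latency of one item: it still passes through every stage once
        latency = sum(stage_times)
        return alloc, latency

    print(allocate_for_throughput([4.0, 9.0, 2.0], required_period=3.0, P=8))
    # ([2, 3, 1], 15.0): the throughput target is met with 6 processors

In this simple model, replication does not shorten the latency of a single item; lowering latency under the throughput constraint, for example by data-parallelizing or duplicating stages, is the part the abstract's coordinated approach addresses.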
5.
Parallel algorithms for several common problems, such as sorting and the FFT, involve a personalized exchange of data among all the processors. Past approaches to complete exchange fall into two broad classes: direct exchange and indirect message-combining algorithms. While combining approaches reduce the number of message startups, direct exchange minimizes the volume of data transmitted. This paper presents a family of hybrid algorithms for wormhole-routed 2D meshes that can effectively exploit the complementary strengths of these two approaches to complete exchange. The performance of hybrid algorithms using Cyclic Exchange and Scott's Direct Exchange is studied using analytical models, simulation, and implementation on a Cray T3D system. The results show that hybrids achieve lower completion times than either pure algorithm for a range of mesh sizes, data block sizes, and message startup costs. It is also demonstrated that barriers may be used to enhance performance by reducing message contention, whether or not the target system provides hardware support for barrier synchronization. The analytical models are shown to be useful in selecting the optimum hybrid for any given combination of system parameters (mesh size, message startup time, flit transfer time, and barrier cost) and the problem parameter (data block size).
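The trade-off behind the hybrids can be seen with the usual linear message-time model t = ts + m*tw (startup time plus per-byte time). The cost expressions below are not the analytical models from the paper; they are the textbook estimates for a direct all-to-all personalized exchange and for a log-phase combining exchange, included only to show why startups dominate for small blocks and data volume dominates for large ones (the parameter values are made up):

    import math

    def direct_exchange_time(P, m, ts, tw):
        # every node sends its m-byte block separately to each of the other P - 1 nodes
        return (P - 1) * (ts + m * tw)

    def combining_exchange_time(P, m, ts, tw):
        # log2(P) phases; in each phase a node forwards a combined message of P/2 blocks
        return math.log2(P) * (ts + (P / 2) * m * tw)

    P, ts, tw = 64, 50e-6, 10e-9               # 64 nodes, 50 us startup, 10 ns per byte (made-up values)
    for m in (64, 4096, 262144):               # bytes per node-to-node block
        print(m, direct_exchange_time(P, m, ts, tw), combining_exchange_time(P, m, ts, tw))

Combining wins for the smallest block size and direct exchange for the largest; a hybrid aims to get the best of both regimes, which is consistent with the abstract's finding that hybrids beat either pure algorithm over a range of block sizes and startup costs.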
6.
7.
8.
Cobalt oxide thin films are prepared by the nebulizer spray pyrolysis technique using cobalt chloride as the precursor material. The structural, optical, morphological and electrical properties are investigated as a function of substrate temperature (300–450 °C). X-ray diffraction (XRD) analysis reveals that all the films are polycrystalline, with a cubic structure and preferential orientation along the (111) plane. The optical spectra show that the films are transparent (68 %) in the IR region. The optical band gap values are calculated for the different substrate temperatures. Photoluminescence (PL) spectra of the films indicate indigo, blue and green emission peaks along with an ultraviolet emission peak centered around 368 nm. SEM images reveal small sphere-like structures for the prepared Co3O4 films. The maximum conductivity obtained is 1.48 × 10⁻³ S/cm at 350 °C. The activation energy varies between 0.039 and 0.138 eV as the substrate temperature is varied from 300 to 450 °C.
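The abstract does not state how the activation energies are extracted; the standard route for conductivity-versus-temperature data of this kind is an Arrhenius fit, quoted here only for reference:

    \sigma(T) = \sigma_0 \exp\!\left(-\frac{E_a}{k_B T}\right)
    \qquad\Longrightarrow\qquad
    E_a = -k_B \, \frac{d(\ln \sigma)}{d(1/T)}

so E_a is read off from the slope of ln σ plotted against 1/T.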
9.
Quantum Monte Carlo (QMC) applications perform simulation with respect to an initial state of the quantum mechanical system, which is often captured by using a cubic B-spline basis. This representation is stored as a read-only table of coefficients and accesses to the table are generated at random as part of the Monte Carlo simulation. Current QMC applications, such as QWalk and QMCPACK, replicate this table at every process or node, which limits scalability because increasing the number of processors does not enable larger systems to be run. We present a partitioned global address space approach to transparently managing this data using Global Arrays in a manner that allows the memory of multiple nodes to be aggregated. We develop an automated data management system that significantly reduces communication overheads, enabling new capabilities for QMC codes. Experimental results with QWalk and QMCPACK demonstrate the effectiveness of the data management system. Copyright © 2016 John Wiley & Sons, Ltd.
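The scalability problem and the fix can be pictured with the owner computation that any block-partitioned layout performs on a lookup. The class below is not the Global Arrays implementation described in the abstract; it is a minimal stand-in, with per-node memory simulated by local array slices and invented names (DistributedTable, fetch), showing how a random global index into the read-only coefficient table is translated into an owning node and a local offset:

    import numpy as np

    class DistributedTable:
        # Block-distribute a read-only coefficient table over `nodes` nodes.
        # Each slice in `shards` stands in for the memory of one node; fetch()
        # shows the owner/offset translation a PGAS runtime performs on a lookup.
        def __init__(self, coefficients, nodes):
            self.block = (len(coefficients) + nodes - 1) // nodes
            self.shards = [coefficients[i * self.block:(i + 1) * self.block]
                           for i in range(nodes)]

        def fetch(self, global_index):
            owner = global_index // self.block        # which node holds the entry
            offset = global_index % self.block        # position inside that node's slice
            return owner, self.shards[owner][offset]  # stands in for a one-sided remote get

    coeffs = np.arange(1_000_000, dtype=np.float64)   # stand-in for a B-spline coefficient table
    table = DistributedTable(coeffs, nodes=8)
    print(table.fetch(777_777))                       # (6, 777777.0): held by node 6

Because each node holds only its slice, the table can grow with the number of nodes instead of being limited by a single node's memory, which is the capability the abstract describes.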
10.
A powerful new finite boundary concept of seeking the field solution at a few selected regions in the solution domain is introduced. A highly economical finite boundary method (FBM) is also developed; it greatly reduces the size of the coefficient matrix of the resulting system of simultaneous algebraic equations, requiring less computer memory and less computing time. Fluid flow fields governed by the basic elliptic partial differential equations in two independent variables, Laplace's and Poisson's equations, are mainly considered. The computational merits of the FBM are shown by solving, as an example, a simple representative flow problem, and the relevant computational finite boundary formulae are given in tabular form. The formulae are numerically derived using a generalized method presented here. An added feature of the FBM is that it proves to be equally economical even when the solution is sought over the entire flow domain. For steady-state viscous flows governed by the Navier-Stokes equations, written as a system of two simultaneous partial differential equations (a Poisson equation and the vorticity transport equation), the FBM is doubly economical. The possibility of developing an efficient hybrid computational algorithm for curved problem boundaries, in conjunction with the finite element method, is discussed. The extension of the FBM to transient, non-elliptic problems and to three-dimensional problem fields is also indicated. The FBM is discussed in detail so as to clearly bring out the advantages of the new finite boundary concept.
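For reference, the governing equations named in the abstract take the following standard two-dimensional forms (the abstract itself does not write them out):

    \nabla^2 \phi = 0 \quad \text{(Laplace)}, \qquad
    \nabla^2 \phi = f(x, y) \quad \text{(Poisson)},

    \nabla^2 \psi = -\omega, \qquad
    u\,\frac{\partial \omega}{\partial x} + v\,\frac{\partial \omega}{\partial y}
      = \nu\,\nabla^2 \omega,

where the second pair is the steady stream-function/vorticity form of the Navier-Stokes equations, with u = ∂ψ/∂y and v = −∂ψ/∂x.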