1.
An efficient algorithm to remove redundant dependences in simple loops with constant dependences is presented. Dependences constrain the parallel execution of programs and are typically enforced by synchronization instructions, which account for a significant part of the overhead of parallel execution. Some program dependences are redundant because they are covered by other dependences. It is shown that, unlike in single loops, a particular dependence in a nested loop may be redundant at some iterations but not at others, so the redundancy of a dependence need not be uniform over the entire iteration space. A sufficient condition for the uniformity of redundancy in a doubly nested loop is developed.
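The covering idea can be made concrete with a small check on constant dependence (distance) vectors. The sketch below is not the algorithm from the abstract; it is a minimal illustration, under the assumption that dependences are given as integer distance vectors and that a dependence is enforced transitively whenever it can be composed from the remaining vectors (the names covered, dep, and others are invented for the example):

    from itertools import product

    def covered(dep, others, max_coeff=4):
        # `dep` is redundant if it equals a nonnegative integer combination of
        # the other dependence vectors (dep itself must not appear in `others`):
        # synchronizing for those vectors then enforces `dep` transitively.
        dims = len(dep)
        for coeffs in product(range(max_coeff + 1), repeat=len(others)):
            if sum(coeffs) == 0:
                continue
            combo = [sum(c * o[i] for c, o in zip(coeffs, others)) for i in range(dims)]
            if combo == list(dep):
                return True
        return False

    # (2, 2) is covered by two applications of (1, 1); (0, 1) is not covered.
    print(covered((2, 2), [(1, 1), (1, 0)]))   # True
    print(covered((0, 1), [(1, 1), (1, 0)]))   # False

Near the edges of a nested loop's iteration space, the intermediate iterations implied by such a composition may fall outside the bounds, which is exactly why a dependence can be redundant at some iterations and not at others.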
2.
A solution to the problem of partitioning data for distributed-memory machines is discussed. The solution uses a matrix notation to describe array accesses in fully parallel loops, which allows the derivation of sufficient conditions for communication-free partitioning (decomposition) of arrays. A series of examples illustrates the effectiveness of the technique for linear references, the use of loop transformations in deriving the necessary data decompositions, and a formulation that aids in deriving heuristics for minimizing communication when communication-free partitions are not feasible.
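One simple sufficient condition of this kind can be sketched directly from the matrix view of affine references. The code below is an illustration only, not the formulation in the paper: it assumes every reference to an array is written as F @ i + f for iteration vector i, ignores the constant offsets, and merely looks for a data-space hyperplane q such that q @ F is the same for all references, so that matching hyperplane partitions of the data and the iterations require no communication (the function name communication_free_direction is made up for the sketch):

    import numpy as np

    def communication_free_direction(refs):
        # refs: access matrices F (array dims x loop dims) of all references
        # to one array in a fully parallel loop nest.  We need q with
        # q @ (F_k - F_0) = 0 for every k, i.e. q must lie in the left
        # null space of the stacked differences.
        F0 = refs[0]
        if len(refs) == 1:
            return np.eye(F0.shape[0])[0]          # any direction works
        diffs = np.hstack([F - F0 for F in refs[1:]])
        u, s, vt = np.linalg.svd(diffs.T)
        mask = np.concatenate([s, np.zeros(vt.shape[0] - len(s))]) < 1e-9
        null_vecs = vt[mask]
        return null_vecs[0] if len(null_vecs) else None

    # A[i][j] together with the transposed reference A[j][i]: the only
    # communication-free family of partitions runs along anti-diagonals i + j.
    F1 = np.array([[1, 0], [0, 1]])    # A[i][j]
    F2 = np.array([[0, 1], [1, 0]])    # A[j][i]
    print(communication_free_direction([F1, F2]))   # a multiple of [1, 1]

When no such direction exists, some communication is unavoidable, which is where the heuristics mentioned in the abstract come in.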
3.
Complex parallel applications can often be modeled as directed acyclic graphs of coarse-grained application tasks with dependences. These applications exhibit both task and data parallelism, and combining these two (also called mixed parallelism) has been shown to be an effective model for their execution. In this paper, we present an algorithm to compute the appropriate mix of task and data parallelism required to minimize the parallel completion time (makespan) of these applications. In other words, our algorithm determines the set of tasks that should be run concurrently and the number of processors to be allocated to each task. The processor allocation and scheduling decisions are made in an integrated manner and are based on several factors such as the structure of the task graph, the runtime estimates and scalability characteristics of the tasks, and the intertask data communication volumes. A locality-conscious scheduling strategy is used to improve intertask data reuse. Evaluation through simulations and actual executions of task graphs derived from real applications and synthetic graphs shows that our algorithm consistently generates schedules with a lower makespan as compared to Critical Path Reduction (CPR) and Critical Path and Allocation (CPA), two previously proposed scheduling algorithms. Our algorithm also produces schedules that have a lower makespan than pure task- and data-parallel schedules. For task graphs with known optimal schedules or lower bounds on the makespan, our algorithm generates schedules that are closer to the optima than other scheduling approaches.
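The core trade-off between task and data parallelism can be illustrated with a deliberately simplified model. The sketch below is not the scheduling algorithm described in the abstract (which allocates processors over a whole DAG, accounts for communication, and uses locality-conscious placement); it only contrasts running independent tasks one after another on all processors against running them side by side, under an assumed Amdahl-style speedup model (the names runtime, makespan_data_parallel, and makespan_task_parallel are invented for the example):

    def runtime(task, p):
        # Amdahl-style estimate: task = (time_on_one_processor, parallel_fraction).
        # This speedup model is an assumption made only for this illustration.
        t, f = task
        return t * ((1 - f) + f / p)

    def makespan_data_parallel(tasks, P):
        # run the tasks one after another, each using all P processors
        return sum(runtime(t, P) for t in tasks)

    def makespan_task_parallel(tasks, P):
        # run the tasks concurrently on an even split of the processors
        p_each = max(1, P // len(tasks))
        return max(runtime(t, p_each) for t in tasks)

    tasks = [(10.0, 0.95), (10.0, 0.60)]       # (time on 1 processor, parallelizable fraction)
    P = 16
    print(makespan_data_parallel(tasks, P))    # ~5.47: the poorly scaling task wastes processors
    print(makespan_task_parallel(tasks, P))    # 4.75: running the tasks side by side wins here

An uneven processor split guided by each task's scalability usually beats both extremes; searching for that mix in an integrated way, over a whole task graph, is what the abstract's algorithm does.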
4.
Scheduling, in many application domains, involves optimization of multiple performance metrics. For example, application workflows with real-time constraints have strict throughput requirements and also desire a low latency or response time. In this paper, we present a novel algorithm for the scheduling of workflows that act on a stream of input data. Our algorithm focuses on the two performance metrics, latency and throughput, and minimizes the latency of workflows while satisfying strict throughput requirements. We also describe steps to use the above approach to solve the problem of meeting latency requirements while maximizing throughput. We leverage pipelined, task and data parallelism in a coordinated manner to meet these objectives and investigate the benefit of task duplication in alleviating communication overheads in the pipelined schedule for different workflow characteristics. The proposed algorithm is designed for a realistic bounded multi-port communication model, where each processor can simultaneously communicate with at most k distinct processors. Experimental evaluation using synthetic benchmarks as well as those derived from real applications shows that our algorithm consistently produces lower latency schedules that meet throughput requirements, even when previously proposed schemes fail.
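For a linear pipeline over a data stream, throughput is limited by the slowest stage while latency is the end-to-end time for one item, so the two metrics pull in different directions. The following sketch is not the algorithm from the abstract (which handles workflow DAGs, task duplication, and a bounded multi-port communication model); it is a minimal illustration, assuming each stage scales linearly when replicated, of meeting a throughput (period) target first and then reading off the latency (the function name allocate_for_throughput is made up):

    import math

    def allocate_for_throughput(stage_times, required_period, P):
        # Give each stage just enough replicas so that its per-item period
        # (stage time / replicas, assuming linear scaling) meets the target;
        # return (allocation, latency) or None if P processors do not suffice.
        alloc = [max(1, math.ceil(t / required_period)) for t in stage_times]
        if sum(alloc) > P:
            return None
        # latency of one item: it still passes through every stage once
        latency = sum(stage_times)
        return alloc, latency

    print(allocate_for_throughput([4.0, 9.0, 2.0], required_period=3.0, P=8))
    # ([2, 3, 1], 15.0): the throughput target is met with 6 processors

In this simple model, replication does not shorten the latency of a single item; lowering latency under the throughput constraint, for example by data-parallelizing or duplicating stages, is the part the abstract's coordinated approach addresses.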
5.
Parallel algorithms for several common problems, such as sorting and the FFT, involve a personalized exchange of data among all the processors. Past approaches to complete exchange fall into two broad classes: direct exchange and indirect message-combining algorithms. While combining approaches reduce the number of message startups, direct exchange minimizes the volume of data transmitted. This paper presents a family of hybrid algorithms for wormhole-routed 2D meshes that can effectively exploit the complementary strengths of these two approaches to complete exchange. The performance of hybrid algorithms using Cyclic Exchange and Scott's Direct Exchange is studied using analytical models, simulation, and implementation on a Cray T3D system. The results show that hybrids achieve lower completion times than either pure algorithm for a range of mesh sizes, data block sizes, and message startup costs. It is also demonstrated that barriers may be used to enhance performance by reducing message contention, whether or not the target system provides hardware support for barrier synchronization. The analytical models are shown to be useful in selecting the optimum hybrid for any given combination of system parameters (mesh size, message startup time, flit transfer time, and barrier cost) and the problem parameter (data block size).
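The trade-off behind the hybrids can be seen with the usual linear message-time model t = ts + m*tw (startup time plus per-byte time). The cost expressions below are not the analytical models from the paper; they are the textbook estimates for a direct all-to-all personalized exchange and for a log-phase combining exchange, included only to show why startups dominate for small blocks and data volume dominates for large ones (the parameter values are made up):

    import math

    def direct_exchange_time(P, m, ts, tw):
        # every node sends its m-byte block separately to each of the other P - 1 nodes
        return (P - 1) * (ts + m * tw)

    def combining_exchange_time(P, m, ts, tw):
        # log2(P) phases; in each phase a node forwards a combined message of P/2 blocks
        return math.log2(P) * (ts + (P / 2) * m * tw)

    P, ts, tw = 64, 50e-6, 10e-9               # 64 nodes, 50 us startup, 10 ns per byte (made-up values)
    for m in (64, 4096, 262144):               # bytes per node-to-node block
        print(m, direct_exchange_time(P, m, ts, tw), combining_exchange_time(P, m, ts, tw))

Combining wins for the smallest block size and direct exchange for the largest; a hybrid aims to get the best of both regimes, which is consistent with the abstract's finding that hybrids beat either pure algorithm over a range of block sizes and startup costs.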
6.
7.
8.
Cobalt oxide thin films are prepared by the nebulizer spray pyrolysis technique using cobalt chloride as the precursor material. The structural, optical, morphological and electrical properties are investigated as a function of substrate temperature (300–450 °C). X-ray diffraction (XRD) analysis reveals that all the films are polycrystalline, with a cubic structure and preferential orientation along the (111) plane. The optical spectra show that the films are transparent (68 %) in the IR region. The optical band gap values are calculated for the different substrate temperatures. Photoluminescence (PL) spectra of the films indicate indigo, blue and green emission peaks along with an ultraviolet emission peak centered around 368 nm. SEM images reveal small sphere-like structures for the prepared Co3O4 films. The maximum conductivity obtained is 1.48 × 10⁻³ S/cm at 350 °C. The activation energy varies between 0.039 and 0.138 eV as the substrate temperature is varied from 300 to 450 °C.
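The abstract does not state how the activation energies are extracted; the standard route for conductivity-versus-temperature data of this kind is an Arrhenius fit, quoted here only for reference:

    \sigma(T) = \sigma_0 \exp\!\left(-\frac{E_a}{k_B T}\right)
    \qquad\Longrightarrow\qquad
    E_a = -k_B \, \frac{d(\ln \sigma)}{d(1/T)}

so E_a is read off from the slope of ln σ plotted against 1/T.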
9.
Quantum Monte Carlo (QMC) applications perform simulation with respect to an initial state of the quantum mechanical system, which is often captured by using a cubic B-spline basis. This representation is stored as a read-only table of coefficients and accesses to the table are generated at random as part of the Monte Carlo simulation. Current QMC applications, such as QWalk and QMCPACK, replicate this table at every process or node, which limits scalability because increasing the number of processors does not enable larger systems to be run. We present a partitioned global address space approach to transparently managing this data using Global Arrays in a manner that allows the memory of multiple nodes to be aggregated. We develop an automated data management system that significantly reduces communication overheads, enabling new capabilities for QMC codes. Experimental results with QWalk and QMCPACK demonstrate the effectiveness of the data management system. Copyright © 2016 John Wiley & Sons, Ltd.
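The scalability problem and the fix can be pictured with the owner computation that any block-partitioned layout performs on a lookup. The class below is not the Global Arrays implementation described in the abstract; it is a minimal stand-in, with per-node memory simulated by local array slices and invented names (DistributedTable, fetch), showing how a random global index into the read-only coefficient table is translated into an owning node and a local offset:

    import numpy as np

    class DistributedTable:
        # Block-distribute a read-only coefficient table over `nodes` nodes.
        # Each slice in `shards` stands in for the memory of one node; fetch()
        # shows the owner/offset translation a PGAS runtime performs on a lookup.
        def __init__(self, coefficients, nodes):
            self.block = (len(coefficients) + nodes - 1) // nodes
            self.shards = [coefficients[i * self.block:(i + 1) * self.block]
                           for i in range(nodes)]

        def fetch(self, global_index):
            owner = global_index // self.block        # which node holds the entry
            offset = global_index % self.block        # position inside that node's slice
            return owner, self.shards[owner][offset]  # stands in for a one-sided remote get

    coeffs = np.arange(1_000_000, dtype=np.float64)   # stand-in for a B-spline coefficient table
    table = DistributedTable(coeffs, nodes=8)
    print(table.fetch(777_777))                       # (6, 777777.0): held by node 6

Because each node holds only its slice, the table can grow with the number of nodes instead of being limited by a single node's memory, which is the capability the abstract describes.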
10.
A powerful new finite boundary concept of seeking the field solution at a few selected regions in the solution domain is introduced. A highly economical finite boundary method (FBM) is also developed; it greatly reduces the size of the coefficient matrix of the resulting system of simultaneous algebraic equations, requiring less computer memory and less computing time. Fluid flow fields governed by the basic elliptic partial differential equations in two independent variables, Laplace's and Poisson's equations, are mainly considered. The computational merits of the FBM are shown by solving, as an example, a simple representative flow problem, and the relevant computational finite boundary formulae are given in tabular form. The formulae are numerically derived using a generalized method presented here. An added feature of the FBM is that it proves to be equally economical even when the solution is sought over the entire flow domain. For steady-state viscous flows governed by the Navier-Stokes equations, written as a system of two simultaneous partial differential equations (a Poisson equation and the vorticity transport equation), the FBM is doubly economical. The possibility of developing an efficient hybrid computational algorithm for curved problem boundaries, in conjunction with the finite element method, is discussed. The extension of the FBM to transient, non-elliptic problems and to three-dimensional problem fields is also indicated. The FBM is discussed in detail so as to clearly bring out the advantages of the new finite boundary concept.
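For reference, the governing equations named in the abstract take the following standard two-dimensional forms (the abstract itself does not write them out):

    \nabla^2 \phi = 0 \quad \text{(Laplace)}, \qquad
    \nabla^2 \phi = f(x, y) \quad \text{(Poisson)},

    \nabla^2 \psi = -\omega, \qquad
    u\,\frac{\partial \omega}{\partial x} + v\,\frac{\partial \omega}{\partial y}
      = \nu\,\nabla^2 \omega,

where the second pair is the steady stream-function/vorticity form of the Navier-Stokes equations, with u = ∂ψ/∂y and v = −∂ψ/∂x.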