首页 | 本学科首页   官方微博 | 高级检索  
文章检索
  按 检索   检索词:      
出版年份:   被引次数:   他引次数: 提示:输入*表示无穷大
  收费全文   82篇
  免费   6篇
  国内免费   1篇
化学工业   9篇
金属工艺   2篇
机械仪表   1篇
建筑科学   2篇
能源动力   1篇
轻工业   27篇
无线电   8篇
一般工业技术   4篇
冶金工业   5篇
自动化技术   30篇
  2022年   2篇
  2021年   2篇
  2020年   3篇
  2019年   1篇
  2018年   1篇
  2017年   3篇
  2016年   11篇
  2015年   2篇
  2014年   3篇
  2013年   9篇
  2012年   7篇
  2011年   11篇
  2010年   4篇
  2009年   4篇
  2008年   6篇
  2007年   2篇
  2006年   2篇
  2005年   1篇
  2003年   1篇
  2001年   1篇
  1999年   1篇
  1997年   1篇
  1996年   5篇
  1993年   1篇
  1992年   2篇
  1991年   2篇
  1979年   1篇
排序方式: 共有89条查询结果,搜索用时 31 毫秒
1.
2.
An efficient algorithm to remove redundant dependences in simple loops with constant dependences is presented. Dependences constrain the parallel execution of programs and are typically enforced by synchronization instructions. The synchronization instructions represent a significant part of the overhead in the parallel execution of a program. Some program dependences are redundant because they are covered by other dependences. It is shown that unlike with single loops, in the case of nested loops, a particular dependence may be redundant at some iterations but not redundant at others, so that the redundancy of a dependence may not be uniform over the entire iteration space. A sufficient condition for the uniformity of redundancy in a doubly nested loop is developed  相似文献   
3.
A solution to the problem of partitioning data for distributed memory machines is discussed. The solution uses a matrix notation to describe array accesses in fully parallel loops, which allows the derivation of sufficient conditions for communication-free partitioning (decomposition) of arrays. A series of examples that illustrate the effectiveness of the technique for linear references, the use of loop transformations in deriving the necessary data decompositions, and a formulation that aids in deriving heuristics for minimizing a communication when communication-free partitions are not feasible are presented  相似文献   
4.
Complex parallel applications can often be modeled as directed acyclic graphs of coarse-grained application tasks with dependences. These applications exhibit both task and data parallelism, and combining these two (also called mixed parallelism) has been shown to be an effective model for their execution. In this paper, we present an algorithm to compute the appropriate mix of task and data parallelism required to minimize the parallel completion time (makespan) of these applications. In other words, our algorithm determines the set of tasks that should be run concurrently and the number of processors to be allocated to each task. The processor allocation and scheduling decisions are made in an integrated manner and are based on several factors such as the structure of the task graph, the runtime estimates and scalability characteristics of the tasks, and the intertask data communication volumes. A locality-conscious scheduling strategy is used to improve intertask data reuse. Evaluation through simulations and actual executions of task graphs derived from real applications and synthetic graphs shows that our algorithm consistently generates schedules with a lower makespan as compared to Critical Path Reduction (CPR) and Critical Path and Allocation (CPA), two previously proposed scheduling algorithms. Our algorithm also produces schedules that have a lower makespan than pure task- and data-parallel schedules. For task graphs with known optimal schedules or lower bounds on the makespan, our algorithm generates schedules that are closer to the optima than other scheduling approaches.  相似文献   
5.
Scheduling, in many application domains, involves optimization of multiple performance metrics. For example, application workflows with real-time constraints have strict throughput requirements and also desire a low latency or response time. In this paper, we present a novel algorithm for the scheduling of workflows that act on a stream of input data. Our algorithm focuses on the two performance metrics, latency and throughput, and minimizes the latency of workflows while satisfying strict throughput requirements. We also describe steps to use the above approach to solve the problem of meeting latency requirements while maximizing throughput. We leverage pipelined, task and data parallelism in a coordinated manner to meet these objectives and investigate the benefit of task duplication in alleviating communication overheads in the pipelined schedule for different workflow characteristics. The proposed algorithm is designed for a realistic bounded multi-port communication model, where each processor can simultaneously communicate with at most k distinct processors. Experimental evaluation using synthetic benchmarks as well as those derived from real applications shows that our algorithm consistently produces lower latency schedules that meet throughput requirements, even when previously proposed schemes fail.  相似文献   
6.
Parallel algorithms for several common problems such as sorting and the FFT involve a personalized exchange of data among all the processors. Past approaches to doing complete exchange have taken one of two broad approaches: direct exchange or the indirect message-combining approaches. While combining approaches reduce the number of message startups, direct exchange minimizes the volume of data transmitted. This paper presents a family of hybrid algorithms for wormhole-routed 2D meshes that can effectively utilize the complementary strengths of these two approaches to complete exchange. The performance of hybrid algorithms using Cyclic Exchange and Scott's Direct Exchange are studied using analytical models, simulation, and implementation on a Cray T3D system. The results show that hybrids achieve lower completion times than either pure algorithm for a range of mesh sizes, data block sizes, and message startup costs. It is also demonstrated that barriers may be used to enhance performance by reducing message contention, whether or not the target system provides hardware support for barrier synchronization. The analytical models are shown useful in selecting the optimum hybrid for any given combination of system parameters (mesh size, message startup time, flit transfer time, and barrier cost) and the problem parameter (data block size)  相似文献   
7.
8.
9.
Assignment of military personnel to positions is very demanding, primarily a manual process performed by detailers. Detailers try to satisfy needs and preferences of commands and personnel. In this paper, an analytic hierarchy process (AHP) and two sided matching based Decision Support System is proposed to assist detailers. The DSS is programmed to generate positions’ preferences from position requirement profiles and personnel competence profiles by using analytic hierarchy process and matches personnel to positions by using two-sided matching. The use of the proposed DSS is demonstrated with an example. Also, the effects of preference list length on two-sided matching are examined.  相似文献   
10.
Quantum Monte Carlo (QMC) applications perform simulation with respect to an initial state of the quantum mechanical system, which is often captured by using a cubic B‐spline basis. This representation is stored as a read‐only table of coefficients and accesses to the table are generated at random as part of the Monte Carlo simulation. Current QMC applications, such as QWalk and QMCPACK, replicate this table at every process or node, which limits scalability because increasing the number of processors does not enable larger systems to be run. We present a partitioned global address space approach to transparently managing this data using Global Arrays in a manner that allows the memory of multiple nodes to be aggregated. We develop an automated data management system that significantly reduces communication overheads, enabling new capabilities for QMC codes. Experimental results with QWalk and QMCPACK demonstrate the effectiveness of the data management system. Copyright © 2016 John Wiley & Sons, Ltd.  相似文献   
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号