Similar literature
 20 similar documents found (search time: 31 ms)
1.
This paper presents a mixed-integer programming model for the multi-floor layout design of cellular manufacturing systems (CMSs) in a dynamic environment. A novel aspect of the model is that it concurrently determines the cell formation (CF) and group layout (GL), the interrelated decisions involved in designing a CMS, in order to achieve an optimal (or near-optimal) design for a multi-floor factory over a multi-period planning horizon. Other design aspects are a multi-floor layout that forms cells on different floors, a multi-row layout of equal-area facilities in each cell, flexible reconfiguration of cells during successive periods, distance-based material handling cost, and a machine depot for keeping idle machines. The model covers an extensive set of important manufacturing features used in the design of CMSs. The objective is to minimize the total cost of intra-cell, inter-cell, and inter-floor material handling, machine purchasing, machine processing, machine overhead, and machine relocation. Two numerical examples are solved with the CPLEX software to verify the performance of the model and illustrate its features. Because the problem is NP-hard, an efficient genetic algorithm (GA) with a matrix-based chromosome structure is proposed to derive near-optimal solutions. To assess its computational efficiency against CPLEX, several test problems of different sizes and settings are solved. The results demonstrate the efficiency of the proposed GA in terms of objective function value and computational time.

2.
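The growing demand of scientific and engineering applications for computing performance has driven the rapid development of heterogeneous computing. However, the lack of shared memory between the CPU and accelerators makes heterogeneous programming harder: programmers must explicitly specify how data moves between devices. The global arrays (GA) model, built on the Aggregate Remote Memory Copy Interface (ARMCI), provides a programming environment with asynchronous one-sided communication and shared memory for distributed-memory systems, but the complexity of extending ARMCI means that GA cannot be ported quickly to a given computing platform in a way that exploits that platform's characteristics. CoGA is a heterogeneous extension of the GA model that aims to provide a global array structure for heterogeneous systems made up of CPUs and Intel Xeon Phi (MIC) coprocessors, hiding data-transfer details and thereby simplifying heterogeneous programming. CoGA manages CPU and MIC memory through the Symmetric Communications Interface (SCIF) on the MIC and uses SCIF's remote memory access features to optimize data-transfer performance between the CPU and the MIC. Finally, tests of data-transfer bandwidth, communication latency, and a sparse matrix multiplication problem demonstrate that CoGA is effective and practical in simplifying programming and optimizing data-transfer performance.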

3.
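On distributed-memory parallel machines, the quality of the data layout greatly affects the execution performance of applications. Previous work has generally approximated the automatic data layout optimization problem by decomposing it into two steps, data alignment optimization and data distribution optimization, and has studied their combination only in the one-dimensional case. Building on related work, this paper combines data alignment optimization and data distribution optimization in a single model for the multidimensional case and proposes a unified multidimensional static data layout model. The model avoids heuristic strategies and therefore describes the automatic data layout optimization problem more precisely. At the same time, it gives …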

4.
We present an improved version of the Parallel Programming Interface for Distributed Data with Multiple Helper Servers (PPIDDv2) library, which provides a common application programming interface that is based on the most frequently used functionality of both MPI-2 and GA. Compared with the previous version, the PPIDDv2 library introduces multiple helper servers to facilitate global data structures, and allows programmers to make heavy use of large global data structures efficiently.

Program summary

Program title: PPIDDv2
Catalogue identifier: AEEF_v2_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEEF_v2_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
No. of lines in distributed program, including test data, etc.: 22 997
No. of bytes in distributed program, including test data, etc.: 184 477
Distribution format: tar.gz
Programming language: Fortran, C
Computer: Many parallel systems
Operating system: Various
Has the code been vectorised or parallelised?: Yes. 2–1024 processors used
RAM: 50 Mbytes
Classification: 6.5
External routines: Global Arrays or MPI-2
Catalogue identifier of previous version: AEEF_v1_0
Journal reference of previous version: Comput. Phys. Comm. 180 (2009) 2673
Does the new version supersede the previous version?: Yes
Nature of problem: Many scientific applications require management and communication of data that is global, and the standard MPI-2 protocol provides only low-level methods for the required one-sided remote memory access.
Solution method: The Parallel Programming Interface for Distributed Data (PPIDD) library provides an interface, suitable for use in parallel scientific applications, that delivers communications and global data management. The library can be built either using the Global Arrays (GA) toolkit, or a standard MPI-2 library. This abstraction allows the programmer to write portable parallel codes that can utilise the best, or only, communications library that is available on a particular computing platform.
Reasons for new version: In the previous version, functionality in global data structure was mainly implemented by MPI-2 passive one-sided operations. In real applications which make heavy use of global data structures, very poor performance was observed.
Summary of revisions: Multiple helper servers are introduced to facilitate the manipulation and management of global data structure. Mutual exclusion is also implemented by the help of a data server, and becomes much more robust and efficient. In addition, flexible options are provided to choose different settings for helper servers. Significant improvement has been seen in performance tests.
Running time: Problem-dependent. The test provided with the distribution takes only a few seconds to run.

5.
Data transformation, an important part of report generation, converts the layout of source data into a new layout suitable for presentation. Many report tools have been developed for end-users to specify data transformation. However, current report tools support only a limited set of report layouts. This paper proposes a visual dataflow programming language, called VisualTPL, to resolve this problem. Data transformation is accomplished by writing graphical dataflow programs, which manipulate tables as first-class objects with a set of extendable table operations. A report tool, called VisualTPS, has been developed to offer an easy and intuitive end-user programming environment. Reports with sophisticated layouts can be created through top-down decomposition and incremental development. An evaluation has been conducted to assess end-users' performance with VisualTPL. The results indicated that end-users could learn VisualTPL in a short time and create complicated report layouts by themselves. In comparison with a commercial report tool, VisualTPL offered end-users similar performance and was preferred over the commercial tool.

6.
The virtual cellular manufacturing system (VCMS) is a modern strategy for production facility layout that has attracted considerable attention in recent years. In this system, machines are located in different positions on the shop floor, and virtual cells are a logical grouping of machines, jobs, and workers from the viewpoint of the production control system. These features not only enhance the system's agility but also allow a dynamic reassignment of cells as demand changes. This paper addresses VCMS scheduling problems in which the jobs have different orders on machines and the objective is to minimize the weighted sum of the makespan and the total traveling distance in order to balance the two criteria. The research methodology first consists of a mathematical programming model that takes the production constraints into account in order to describe the characteristics of the VCMS. Second, a basic genetic algorithm (GA), a biogeography-based optimization (BBO) algorithm, an algorithm based on hybridization of BBO and GA, and the BBO algorithm accompanied by a restart phase are developed to solve the VCMS scheduling problems. The developed algorithms are compared with each other, and their performance is evaluated in terms of best solution and computational time as effectiveness and efficiency criteria, respectively. The best of these algorithms is then compared with the state-of-the-art GA from the literature. The results show that the best BBO-based algorithm finds solutions at least as good as those of that GA.

7.
In this paper, the goal is to incorporate qualitative criteria in addition to quantitative criteria into the facility layout design (FLD) problem. To this end, we present an integrated methodology based on the synthetic value of fuzzy judgments and nonlinear programming (SVFJ-NLP). The facility layout patterns (FLPs), together with their performance measures for total material handling cost, are generated by a computer-aided layout-design tool, CRAFT. The performance measures of the second quantitative criterion (construction cost of width walls) are calculated by appraising these FLPs. The SVFJ is then applied to collect the performance measures related to the qualitative criteria, and finally a nonlinear programming (NLP) model is proposed to solve the FLD. Results obtained from a real case study validate the effectiveness of the proposed model.

8.
Recently, several experimental studies have been conducted on block data layout in conjunction with tiling as a data transformation technique to improve cache performance. In this paper, we analyze the cache and translation look-aside buffer (TLB) performance of such alternate layouts (including block data layout and Morton layout) when used in conjunction with tiling. We derive a tight lower bound on TLB performance for standard matrix access patterns, and show that block data layout and Morton layout achieve this bound. To improve cache performance, block data layout is used in concert with tiling. Based on the cache and TLB performance analysis, we propose a data block size selection algorithm that finds a tight range for the optimal block size. To validate our analysis, we conducted simulations and experiments using tiled matrix multiplication, LU decomposition, and Cholesky factorization. For matrix multiplication, simulation results using UltraSparc II parameters show that tiling and block data layout, with a block size given by our block size selection algorithm, reduce TLB misses by up to 93 percent compared with other techniques. The total miss cost is reduced considerably. Experiments on several platforms show that tiling with block data layout achieves up to 50 percent performance improvement over other techniques that use conventional layouts. Morton layout is also analyzed and compared with block data layout. Experimental results show that matrix multiplication using block data layout is up to 15 percent faster than that using Morton data layout.
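
To make the two layouts named in this abstract concrete, the following is a minimal sketch of how a matrix element (i, j) could be mapped to a linear storage offset under block data layout and under Morton (Z-order) layout. The function names, the row-major ordering of blocks, and the assumption that the matrix size is a multiple of the block size are illustrative choices, not details taken from the paper.

```python
def block_layout_index(i, j, n, b):
    """Offset of element (i, j) of an n x n matrix stored as contiguous b x b blocks.

    Assumptions for this sketch: n is a multiple of b, blocks are ordered
    row-major, and elements inside a block are also row-major.
    """
    blocks_per_row = n // b
    block_row, in_row = divmod(i, b)
    block_col, in_col = divmod(j, b)
    block_id = block_row * blocks_per_row + block_col
    return block_id * b * b + in_row * b + in_col


def morton_index(i, j, bits=16):
    """Offset of element (i, j) under Morton (Z-order) layout.

    The bits of j go to the even bit positions and the bits of i to the odd
    positions, so nearby (i, j) pairs tend to land at nearby offsets.
    """
    z = 0
    for k in range(bits):
        z |= ((j >> k) & 1) << (2 * k)
        z |= ((i >> k) & 1) << (2 * k + 1)
    return z
```

With either mapping, all elements of one tile occupy a contiguous range of offsets, which is the property that tiling relies on for cache and TLB locality.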

9.
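Application of an SLP-based genetic algorithm to multi-objective layout design   Total citations: 2, self-citations: 0, citations by others: 2
In logistics layout design, factors such as production cost and the closeness of the relationships between activity units are considered together. On the basis of the SLP method, a multi-objective fitness function is introduced and the concept of multi-objective layout design is proposed. A genetic algorithm that combines the partially mapped crossover (PMX) proposed by Goldberg and Lingle with swap mutation is used to solve for concrete layout designs. An example verifies the feasibility of the algorithm. Finally, numerical analysis is used to discuss the sensitivity and effectiveness of the genetic algorithm in this application.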
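
The two genetic operators named in this abstract, partially mapped crossover (PMX) and swap mutation, can be sketched as follows for a permutation chromosome. This is a generic textbook form of the operators under assumed names (`pmx`, `swap_mutation`) and an assumed encoding (a layout as a permutation of facility labels); the abstract does not give the authors' exact encoding or parameters.

```python
import random


def pmx(parent1, parent2, cut1, cut2):
    """Partially mapped crossover producing one child.

    The child copies parent1[cut1:cut2]; every other position takes the gene
    from parent2, and a gene that would duplicate one in the copied segment is
    replaced by following the segment mapping parent1[k] -> parent2[k].
    """
    size = len(parent1)
    child = [None] * size
    child[cut1:cut2] = parent1[cut1:cut2]
    mapping = {parent1[k]: parent2[k] for k in range(cut1, cut2)}
    for k in list(range(cut1)) + list(range(cut2, size)):
        gene = parent2[k]
        while gene in mapping:  # gene already placed via the copied segment
            gene = mapping[gene]
        child[k] = gene
    return child


def swap_mutation(perm, rng=random):
    """Return a copy of perm with two randomly chosen positions exchanged."""
    a, b = rng.sample(range(len(perm)), 2)
    mutated = list(perm)
    mutated[a], mutated[b] = mutated[b], mutated[a]
    return mutated


p1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
p2 = [9, 3, 7, 8, 2, 6, 5, 1, 4]
child = pmx(p1, p2, 3, 7)  # -> [9, 3, 2, 4, 5, 6, 7, 1, 8]
```

Both operators always return a valid permutation, which is why they suit layout problems where each facility must be placed exactly once.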

10.
Nial is a programming language designed around a mathematical treatment of data as nested arrays. A goal of the research described is to integrate within Nial a functional style of programming based on the theory of arrays with the declarative capabilities of a logic programming environment. This is partially accomplished by storing logic clauses as arrays which can be manipulated using logic clauses. Arrays as terms are considered as part of the syntax of the clauses. The approach to logic programming is based on providing a flexible environment for experimenting with full clausal or Horn clause logic. A variety of predefined control strategies and the capability for user-defined control strategies have been provided. The expressive capabilities of combining logic and functional programming styles provide a suitable language for many application areas. The philosophy and design behind a combined logic/database model used to prototype a knowledge-based systems application are described.

11.
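Fuzzy control of an engine based on a genetic algorithm   Total citations: 2, self-citations: 0, citations by others: 2
Chen Heng, Zhang Yuzhuo, Zuo Xiaoyang. Computer Simulation (《计算机仿真》), 2002, 19(3): 40-42, 47
Fuzzy control does not depend on a mathematical model of the controlled plant, and it has given good results when applied to aero-engine control. However, fuzzy controller design relies heavily on the designer's subjective experience, and the workload is heavy. Genetic algorithms have good search capability. This paper applies a genetic algorithm to aero-engine fuzzy control so that much of the design optimization can be carried out automatically by a program. The engine fuzzy control system optimized by the genetic algorithm is simulated, and the results show that the genetic algorithm not only improves the design of the engine fuzzy controller but also gives the overall control system good performance.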

12.
This paper focuses on the planar storage location assignment problem (PSLAP), which needs to be clearly defined and newly formulated, and for which a solution procedure needs to be developed. The PSLAP can be defined as the assignment of inbound and outbound objects to the storage yard with the aim of minimizing the number of obstructive object moves. The storage yard allows only planar moves of objects. The PSLAP usually occurs in assembly block stockyard operations at a shipyard. This paper formulates the PSLAP as a mathematical programming model, but the problem is NP-hard. The paper therefore uses an efficient genetic algorithm (GA) to solve the PSLAP for real-sized instances. The performance of the proposed mathematical programming model and the developed GA is verified by a number of numerical experiments.

13.
Portability, efficiency, and ease of coding are all important considerations in choosing the programming model for a scalable parallel application. The message-passing programming model is widely used because of its portability, yet some applications are too complex to code in it while also trying to maintain a balanced computation load and avoid redundant computations. The shared-memory programming model simplifies coding, but it is not portable and often provides little control over interprocessor data transfer costs. This paper describes an approach, called Global Arrays (GAs), that combines the better features of both other models, leading to both simple coding and efficient execution. The key concept of GAs is that they provide a portable interface through which each process in a MIMD parallel program can asynchronously access logical blocks of physically distributed matrices, with no need for explicit cooperation by other processes. We have implemented the GA library on a variety of computer systems, including the Intel Delta and Paragon, the IBM SP-1 and SP-2 (all message passers), the Kendall Square Research KSR-1/2 and the Convex SPP-1200 (nonuniform access shared-memory machines), the CRAY T3D (a globally addressable distributed-memory computer), and networks of UNIX workstations. We discuss the design and implementation of these libraries, report their performance, illustrate the use of GAs in the context of computational chemistry applications, and describe the use of a GA performance visualization tool. (An earlier version of this paper was presented at Supercomputing'94.)

14.
A dual-port multiple-input multiple-output (MIMO) dielectric resonator antenna (DRA) for 5 GHz IEEE (802.11a/h/j/n/ac/ax) is discussed in this article. Two prototypes, a single-feed DRA and a dual-feed MIMO DRA, are fabricated, and the measured results are compared with the simulated data. The proposed single-feed DRA and dual-feed MIMO DRA both exhibit wide impedance bandwidth (IBW). The antennas have been fabricated on a Rogers RT Duroid substrate, with a DRA made of Eccostock placed over the substrate. The DRAs are excited by an aperture-coupled feed to achieve wide bandwidth and high efficiency. The measured IBWs of the uniport DRA and the dual-port MIMO DRA are 26.6% (4.75-6.21 GHz) and 27.5% (4.7-6.2 GHz), respectively. The maximum gain of the antenna is 7.4 dBi. The measured results are in good agreement with the simulated data, and the antennas are suitable for WLAN applications. The antennas are also compact, with a substrate area of 32.8 cm².

15.
A vibration isolation system is designed using novel hybrid optimization techniques, where the locations of machines, the locations of isolators, and the layout of the supporting structure are all taken as design variables. Instead of a conventional parametric optimization model, a 0-1 programming model is established to optimize the locations of machines and isolators, so that the time-consuming remeshing procedure and the complicated sensitivity analysis with respect to position parameters can be circumvented. The 0-1 sequence for the position design variables is treated as binary bits so as to greatly reduce the actual number of design variables. In this way the 0-1 program can be solved efficiently using a special version of the genetic algorithm (GA) previously published by the authors. The layout of the supporting structure is optimized using a SIMP-based topology optimization method, where the fictitious elemental densities are taken as design variables ranging from 0 to 1. The influence of the different design variables is first investigated with numerical examples. Then a hybrid multilevel optimization method is proposed and implemented to take all design variables into account simultaneously.
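
One possible reading of "treated as binary bits" in this abstract is that a 0-1 position vector with exactly one selected candidate location is replaced by the binary representation of that location's index, so N candidates need about log2(N) design variables instead of N. The sketch below illustrates that reading only; the function names are hypothetical, and the abstract does not confirm that the authors' encoding works this way.

```python
def one_hot_to_bits(one_hot):
    """Encode a 0-1 selection vector (exactly one 1) as the bits of its index."""
    if sum(one_hot) != 1:
        raise ValueError("expected exactly one selected position")
    index = one_hot.index(1)
    width = max(1, (len(one_hot) - 1).bit_length())
    # Most significant bit first.
    return [(index >> k) & 1 for k in reversed(range(width))]


def bits_to_one_hot(bits, num_candidates):
    """Decode index bits back into a 0-1 selection vector of the given length."""
    index = 0
    for bit in bits:
        index = (index << 1) | bit
    if index >= num_candidates:
        raise ValueError("bit pattern does not name a valid candidate")
    return [1 if k == index else 0 for k in range(num_candidates)]
```

Under this assumed encoding, eight candidate positions for one machine are described by three bits, which is the kind of reduction in design variables the abstract refers to.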

16.
Scientific data is mostly multi-valued, e.g., coordinates, velocities, moments, or feature components, and it comes in large quantities. The data layout of such containers has an enormous impact on the achieved performance. However, layout optimization is very time-consuming and error-prone because container access syntax in standard programming languages is not sufficiently abstract: changing the data layout of a container necessitates syntax changes in all parts of the code where the container is used. Object-oriented languages allow this problem to be solved by hiding the data layout behind a class interface, but the additional coding effort is enormous in comparison to a simple structure. A clever coding pattern, previously presented by the author, significantly reduces the code overhead; however, it relies heavily on advanced features of C++, a language that is not supported on most accelerators. This paper develops a concise macro-based solution that requires only support for structures and unions and can therefore be used in OpenCL, a widely supported programming language for parallel processors. This enables the development of high-performance code without an a priori commitment to a certain layout and makes it possible to optimize the layout subsequently. This feature is used to identify the best data layouts for different processing patterns of multi-valued containers on a multi-GPU system.
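
The idea of hiding the data layout behind one access interface can be illustrated with a small sketch. The paper's solution is macro-based and targets OpenCL; the Python class below is only an analogue written for illustration, with a hypothetical name (`MultiValuedContainer`) and two assumed layouts, array-of-structures ("aos") and structure-of-arrays ("soa").

```python
class MultiValuedContainer:
    """Flat storage for n records with named fields, in AoS or SoA order.

    Calling code uses get/set only, so switching the layout argument does not
    require changes anywhere else.
    """

    def __init__(self, n, fields, layout="aos"):
        if layout not in ("aos", "soa"):
            raise ValueError("layout must be 'aos' or 'soa'")
        self.n = n
        self.fields = tuple(fields)
        self.layout = layout
        self.data = [0.0] * (n * len(self.fields))

    def _offset(self, i, field):
        f = self.fields.index(field)
        if self.layout == "aos":
            # Fields of one record are adjacent: x0 y0 x1 y1 ...
            return i * len(self.fields) + f
        # Values of one field are adjacent: x0 x1 ... y0 y1 ...
        return f * self.n + i

    def get(self, i, field):
        return self.data[self._offset(i, field)]

    def set(self, i, field, value):
        self.data[self._offset(i, field)] = value


def fill(container):
    """Layout-independent client code: writes x and y for two records."""
    container.set(0, "x", 1.0)
    container.set(0, "y", 2.0)
    container.set(1, "x", 3.0)
    container.set(1, "y", 4.0)
    return container
```

The same `fill` function works for both layouts; only the order of values in the underlying flat storage differs, which is what makes it possible to try several layouts without touching the processing code.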

17.
Zippy: A Framework for Computation and Visualization on a GPU Cluster   Total citations: 1, self-citations: 0, citations by others: 1
Due to its high performance/cost ratio, a GPU cluster is an attractive platform for large-scale general-purpose computation and visualization applications. However, the programming model for high-performance general-purpose computation on GPU clusters remains a complex problem. In this paper, we introduce the Zippy framework, a general and scalable solution to this problem. It abstracts GPU cluster programming with a two-level parallelism hierarchy and a non-uniform memory access (NUMA) model. Zippy preserves the advantages of both message-passing and shared-memory models. It employs global arrays (GA) to simplify the communication, synchronization, and collaboration among multiple GPUs. Moreover, it exposes data locality to the programmer for optimal performance and scalability. We present three example applications developed with Zippy: sort-last volume rendering, Marching Cubes isosurface extraction and rendering, and lattice Boltzmann flow simulation with online visualization. They demonstrate that Zippy can ease the development and integration of parallel visualization, graphics, and computation modules on a GPU cluster.

18.
This paper compares a feature transformation method using a genetic algorithm (GA) with two conventional methods for artificial neural networks (ANNs). The GA is incorporated to improve the learning and generalizability of ANNs for stock market prediction. Daily predictions are conducted and prediction accuracy is measured. Comparison of the results achieved by the GA-based feature transformation method with those of the other two feature transformation methods shows that the proposed model performs better. Experimental results show that the proposed approach reduces the dimensionality of the feature space and decreases irrelevant factors for stock market prediction.

19.
Unified Parallel C (UPC) is a parallel extension of ANSI C based on the Partitioned Global Address Space (PGAS) programming model, which provides a shared memory view that simplifies code development while it can take advantage of the scalability of distributed memory architectures. Therefore, UPC allows programmers to write parallel applications on hybrid shared/distributed memory architectures, such as multi-core clusters, in a more productive way, accessing remote memory by means of different high-level language constructs, such as assignments to shared variables or collective primitives. However, the standard UPC collectives library includes a reduced set of eight basic primitives with quite limited functionality. This work presents the design and implementation of extended UPC collective functions that overcome the limitations of the standard collectives library, allowing, for example, the use of a specific source and destination thread or defining the amount of data transferred by each particular thread. This library fulfills the demands made by the UPC developers community and implements portable algorithms, independent of the specific UPC compiler/runtime being used. The use of a representative set of these extended collectives has been evaluated using two applications and four kernels as case studies. The results obtained confirm the suitability of the new library to provide easier programming without trading off performance, thus achieving high productivity in parallel programming to harness the performance of hybrid shared/distributed memory architectures in high performance computing.

20.
Dynamic programming (DP) is a popular technique used to solve combinatorial search and optimization problems. This paper focuses on one type of DP, called nonserial polyadic dynamic programming (NPDP). Owing to the nonuniform data dependencies of NPDP, it is difficult to exploit either parallelism or locality. Worse still, the emerging multi/many-core architectures with small on-chip memory make these issues more challenging. In this paper, we address the challenges of exploiting the fine-grain parallelism and locality of NPDP on multicore architectures. We describe a latency-tolerant model and a percolation technique for programming on multicore architectures. On an algorithmic level, both parallelism and locality benefit from a specific data dependence transformation of NPDP. Next, we propose a parallel pipelining algorithm that decomposes computation operators and percolates data through the memory hierarchy to create just-in-time locality. In order to predict the execution time, we formulate an analytical performance model of the parallel algorithm. The parallel pipelining algorithm achieves not only high scalability on the 160-core IBM Cyclops64 but also portable performance across the 8-core Sun Niagara and the quad-core Intel Clovertown.
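
For readers unfamiliar with the term, a commonly cited instance of nonserial polyadic DP is the matrix-chain ordering recurrence, in which each table entry depends on pairs of entries at many earlier levels rather than only on the previous one. The sketch below shows the plain sequential recurrence to make that dependence pattern visible; it is a standard textbook example chosen for illustration, not the parallel pipelining algorithm described in the abstract.

```python
def matrix_chain_cost(dims):
    """Minimum scalar multiplications to multiply a chain of matrices.

    Matrix t has shape dims[t] x dims[t + 1]. Entry m[i][j] combines two
    subproblems (polyadic) taken from intervals of any shorter length
    (nonserial): m[i][k] and m[k + 1][j] for every split point k.
    """
    n = len(dims) - 1
    if n <= 1:
        return 0
    m = [[0] * n for _ in range(n)]
    for length in range(2, n + 1):          # interval length
        for i in range(n - length + 1):     # interval start
            j = i + length - 1
            m[i][j] = min(
                m[i][k] + m[k + 1][j] + dims[i] * dims[k + 1] * dims[j + 1]
                for k in range(i, j)
            )
    return m[0][n - 1]
```

All entries of the same interval length are independent of each other, but each one reads a row segment and a column segment of the table, which is the nonuniform access pattern that makes locality and parallelism hard to exploit together.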
