首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 0 毫秒
1.
Summary Conservation laws are useful in the investigation of physical systems. There exists a special conservation law for priority queueing systems: a preferential treatment given to one class of jobs is afforded at the expense of other jobs. This fundamental relationship has been proved previously for G/G/1 queueing systems. In the present paper the conservation law is extended to G/G/m queueing systems and a useful application of this extension is given.  相似文献   

2.
3.
A hybrid method is presented for domain decomposition employing a graph theoretical algorithm and a neural network optimization model. The method uses graph theory and neural networks for decomposition of structured finite element meshes. Examples are presented to illustrate the efficiency of the developed method.  相似文献   

4.
Heterogeneous multiprocessor systems, where commodity multicore processors are coupled with graphics processing units (GPUs), have been widely used in high performance computing (HPC). In this work, we focus on the design and optimization of Computational Fluid Dynamics (CFD) applications on such HPC platforms. In order to fully utilize the computational power of such heterogeneous platforms, we propose to design the performance-critical part of CFD applications, namely the linear equation solvers, in a hybrid way. A hybrid linear solver includes both one CPU version and one GPU version of code for solving a linear equations system. When a hybrid linear equation solver is invoked during the CFD simulation, the CPU portion and the GPU portion will be run on corresponding processing devices respectively in parallel according to the execution configuration. Furthermore, we propose to build functional performance models (FPMs) of processing devices and use FPM-based heterogeneous decomposition method to distribute workload between heterogeneous processing devices, in order to ensure balanced workload and optimized communication overhead. Efficiency of this approach is demonstrated by experiments with numerical simulation of lid-driven cavity flow on both a hybrid server and a hybrid cluster.  相似文献   

5.
Two major approximate techniques have been proposed for the analysis of general closed queueing networks, namely the aggregation method and Marie's method. The idea of the aggregation technique is to replace a subsystem (a subnetwork) by a flow equivalent single-server with load-dependent service rates. The parameters of the equivalent server are obtained by analyzing the subsystem in isolation as a closed system with different populations. The idea of Marie's method is also to replace a subsystem by an equivalent exponential service station with load-dependent service rates. However, in this case, the parameters of the equivalent server are obtained by analyzing the subsystem in isolation under a load-dependent Poisson arrival process. Moreover, in Marie's case, the procedure is iterative.

In this paper we provide a general and unified view of these two methods. The contributions of this paper are the following. We first show that their common principle is to partition the network into a set of subsystems and then to define an equivalent product-form network. To each subsystem is associated a load-dependent exponential station in the equivalent network. We define a set of rules in order to partition any general closed network with various features such as general service time distributions, pupulation constraints, finite buffers, state-dependent routing. We then show that the aggregation method and Marie's method are two ways of obtaining the parameters of the equivalent network associated with a given partition. Finally, we provide a discussion pertaining to the comparison of the two methods with respect to their accuracy and computational complexity.  相似文献   


6.
The paper considers the problem of the qualitative analysis of complex switched server queueing networks. Such networks can be used to model various flexible manufacturing, communications, and computer systems. We introduce the concept of regularizability for such systems and obtain a necessary and sufficient condition for a switched server queueing network to be regularizable.  相似文献   

7.
Congestion is ever present in most practical situations. We describe a methodology for approximate analysis of open state-dependent M/G/c/c queueing networks in which the service rate is subject to congestion, that is, it is a function of the number of customers in the system. Important performance measurements are easily computed with high accuracy, such as the blocking probability, throughput, expected number of customers in the system (known also as work-in-process), and expected waiting time. The methodology forms a basic building block useful in many practical applications and contexts. Computational results demonstrate that the methodology provides accurate results in many topological configurations as well as in the analysis of network evacuation problems in high-rise buildings.  相似文献   

8.
9.
We introduce a new solution technique for closed product-form queueing networks that generalizes the Method of Moments (MoM), a recently proposed exact algorithm that is several orders of magnitude faster and memory efficient than the established Mean Value Analysis (MVA) algorithm. Compared to MVA, MoM recursively computes higher-order moments of queue lengths instead of mean values, an approach that remarkably reduces the computational costs of exact solutions, especially on models with large numbers of jobs.In this paper, we show that the MoM recursion can be generalized to include multiple recursive branches that evaluate models with different numbers of queues, a solution approach inspired by the Convolution algorithm. Combining the approaches of MoM and Convolution simplifies the evaluation of normalizing constants and leads to large computational savings with respect to the recursive structure originally proposed for MoM.  相似文献   

10.
Analytical lower and upper bounds for the throughput of closed queueing networks with single and delay (infinite) servers are studied in this paper. The numerical evaluation of these bounds requires a small number of significant operations which is independent of the population N. This is in contrast to the exact computation of the throughput which requires at least O(N) operations as N tends to infinity. The bounds are given by simple closed-form analytical expressions and may be more suitable for various performance studies than the algorithmical form of the exact solution.In this paper, the previously known balanced-job bounds are generalized to networks containing delay servers (terminals) and a hierarchy of bounds is obtained for single and multiple class networks. For the single class network, further new bounds are derived: lower and upper bounds that require the evaluation of one square root and an upper bound that requires a constant number of exponentiations. This upper bound does not employ the balancing of server loadings and is especially useful for asymptotic analysis in the case of a large number of customers N.  相似文献   

11.
In this paper, the simulation analysis of an automated material handling system (AMHS) for a photobay in a 300 mm wafer fab was analyzed, considering the effects of the dispatching rules. Discrete-event simulation models were developed in an e-M Plant to study this system. Currently, the combination of the shortest distance with the nearest vehicle (SD_NV) and the first-encounter first-served (FEFS) dispatching rule was used in this system. In order to improve the system performance, a hybrid push/pull (PP) dispatching rule was proposed. The simulation results reveal a substantial improvement of the AMHS performance and reduced the WIP and cycle time as a consequence of implementing a PP dispatching rule.  相似文献   

12.
N-body codes are routinely used for simulation studies of physical systems, e.g. in the fields of computational astrophysics and molecular dynamics. Typically, they require only a moderate amount of run-time memory, but are very demanding in computational power. A detailed analysis of an N-body code performance, in terms of the relative weight of each task of the code, and how this weight is influenced by software or hardware optimisations, is essential in improving such codes. The approach of developing a dedicated device, GRAPE [J. Makino, M. Taiji, Scientific Simulations with Special Purpose Computers, Wiley, New York, 1998], able to provide a very high performance for the most expensive computational task of this code, has resulted in a dramatic performance leap. We explore on the performance of different versions of parallel N-body codes, where both software and hardware improvements are introduced. The use of GRAPE as a ‘force computation accelerator’ in a parallel computer architecture, can be seen as an example of a hybrid architecture, where special purpose device boards help a general purpose (multi)computer to reach a very high performance.  相似文献   

13.
In this paper, we propose the use of a hybrid algorithm for the inversion of 3D Alternate Current (AC) resistivity logging measurements. The forward problem is solved using a goal-oriented self-adaptive hp-Finite Element Method (hp-FEM) that provides exponential convergence of the numerical error with respect to the mesh size. The inverse problem is solved using a Hierarchical Genetic Search (HGS) coupled with a Broyden–Fletcher–Goldfar–Shanno (BFGS) method. Individuals from the genetic populations represent the resistivity of the formation layers. The fitness function is estimated based on hp-FEM results. The hybrid method controls the accuracy of evaluation of particular individuals, as well as the accuracy of the genetic coding. After finding those regions where the fitness function has small values, the local search method by means of BFGS algorithm is executed. The paper is concluded with numerical results for the hybrid algorithm.  相似文献   

14.
With Moore’s law supplying billions of transistors on-chip, embedded systems are undergoing a transition from single-core to multi-core to exploit this high transistor density for high performance. However, the optimal layout of these multiple cores along with the memory subsystem (caches and main memory) to satisfy power, area, and stringent real-time constraints is a challenging design endeavor. The short time-to-market constraint of embedded systems exacerbates this design challenge and necessitates the architectural modeling of embedded systems to reduce the time-to-market by expediting target applications to device/architecture mapping. In this paper, we present a queueing theoretic approach for modeling multi-core embedded systems that provides a quick and inexpensive performance evaluation both in terms of time and resources as compared to the development of multi-core simulators and running benchmarks on these simulators. We verify our queueing theoretic modeling approach by running SPLASH-2 benchmarks on the SuperESCalar simulator (SESC). Results reveal that our queueing theoretic model qualitatively evaluates multi-core architectures accurately with an average difference of 5.6% as compared to the architectures’ evaluations from the SESC simulator. Our modeling approach can be used for performance per watt and performance per unit area characterizations of multi-core embedded architectures, with varying number of processor cores and cache configurations, to provide a comparative analysis.  相似文献   

15.
This paper describes a hybrid steepest descent method to decrease over time any given convex cost function while keeping the optimization variables in any given convex set. The method takes advantage of the properties of hybrid systems to avoid the computation of projections or of a dual optimum. The convergence to a global optimum is analyzed using Lyapunov stability arguments. A discretized implementation and simulation results are presented and analyzed. This method is of practical interest to integrate real-time convex optimization into embedded controllers thanks to its implementation as a dynamical system, its simplicity, and its low computation cost.  相似文献   

16.
K.  A. 《Performance Evaluation》2003,51(2-4):137-152
In this paper we develop approximation models for feed-forward networks of G/G/1/N queues. We use Linear Algebra Queueing Theory (LAQT) techniques to create reduced state space representations for the queue departure processes. Reduced state space departure processes are presented where the first three moments and the correlation decay are mapped to a two state process. A three state process is also presented matching the first five moments and the first three lag autocorrelations. Numerical examples of end-to-end performance for high-speed communications networks with correlated arrival traffic are presented. The results are compared with simulation models and other approximation methods.  相似文献   

17.
Two classes of performance bounds for separable queueing networks are presented, one for single-chain networks and one for multichain networks. Unlike most bounds for single-chain networks, our bounds are not based upon the consideration of balanced networks. Instead, they are obtained by assuming mean queue lengths to be proportional to server loads; hence, they are called proportional bounds. Proportional bounds are tighter than balanced bounds because individual server loads are retained as parameters in a bound's formula. For the same reason, they require more computational effort than balanced bounds. We also show how proportional bounds are related to balanced bounds. Next we present generalized bounds that are calculated iteratively over sequences of population sizes; our method extends that of Eager and Sevcik (1983). These generalized bounds are shown to have a nested property. Furthermore, we present optimal population sequences, over all sequences of the same length, for getting the tightest upper and lower bounds. The other emphasis of this paper is on performance bounds for networks with many closed chains and many service centers. Bounding techniques are especially important for multichain networks since the computation time and space requirements are often so large that an exact solution is not feasible. Models of communication networks typically have many routing chains which are characterized by a sparseness property. In the computation of our performance bounds for multichain networks, we improve their accuracy by making use of routing information and exploiting the sparseness property.  相似文献   

18.
In this paper, we formulate a recursive relation for the marginal probabilities of a closed network with K customers in terms of the same network with K ? 1 customers. Mean-value analysis (MVA) is an application of this relation, together with Little's formula. It is shown that the convolution method, too, can be based on the same recursive result. This leads to a new convolution algorithm called normalized convolution algorithm (NCA), which like MVA works entirely with probabilities and throughputs rather than with quantities such as normalization contacts. NCA avoids a difficult problem, the occurrence of floating-point over-flows in the original convolution algorithm.We shall also solve a numerical stability problem found in MVA. Finally, we show how MVA and the convolution algorithm can be combined in the same problem to yield a hybrid method retaining the best properties of both methods.  相似文献   

19.
This paper describes an approach to carry out performance analysis of parallel embedded applications. The approach is based on measurement, but in addition, the idea of driving the measurement process (application instrumentation and monitoring) by a behavioral model is introduced. Using this model, highly comprehensible performance information can be collected. The whole approach is based on this behavioral model, one instrumentation method and two tools, one for monitoring and the other for visualization and analysis. Each of these is briefly described, and the steps to carry out performance analysis using them are clearly defined. They are explained by means of a case study. Finally, one method to evaluate the intrusiveness of the monitoring approach is proposed, and the intrusiveness results for the case study are presented.  相似文献   

20.
There are generally two main directions for the investigation and development of parallel manipulators, namely macro/meso stream and micro/nano stream, in which the former one has been thoroughly investigated in recent decades, while the latter one still remains many performance related open issues that significantly affect their application potentials in critical situations such as high-precision automated cell manipulation. Improving the overall performance of parallel manipulators is the bridge to connect the academia and industry for the great development and real-world application. This research is to develop a novel methodology called performance decomposition and integration for governing the design optimization process of complicated micromanipulator. A new five degrees-of-freedom (DOF) compliant hybrid parallel micromanipulator which is configured with five identical PSS limbs and one constraining UPU limb is proposed as a case study. The performance visualization, finite element analysis, and dimensional optimization are implemented. The proposed methodology is applicable for the design improvement of different kinds of compliant/parallel mechanisms.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号