
1.

Performance evaluation and optimal design for FPGA-based digit-serial DSP functions

Hanho Lee Gerald E. Sobelman 《Computers & Electrical Engineering》2003,29(2):357-377

As field programmable gate array (FPGA) technology has steadily improved, FPGAs are now viable alternatives to other technology implementations for high-speed classes of digital signal processing (DSP) applications. Digit-serial DSP architectures have been an effective implementation method for FPGAs. In this work, a method of implementing digit-serial DSP architectures on FPGAs is presented, and their performance is evaluated with the objective of finding and developing the most efficient digit-serial DSP architectures on FPGAs. This paper discusses area costs and operational delays of the various digit-serial DSP functions and presents the area/delay models on Xilinx XC4000-series FPGAs. These area/delay models can make predictions of performance and hardware resource utilization before a lengthy layout and synthesis process is undertaken. The results show that the area/delay models proposed here are valid and the digit-serial DSP designs are promising candidates for efficient FPGA implementations.

2.

Performance evaluation of job-shop systems using timed event-graphs (cited by: 1; self-citations: 0; citations by others: 1)

Hillion H.P. Proth J.-M. 《Automatic Control, IEEE Transactions on》1989,34(1):3-9

Timed event-graphs, a special class of timed Petri nets, are used for modelling and analyzing job-shop systems. The modelling allows the steady-state performance of the system to be evaluated under a deterministic and cyclic production process. Given any fixed processing times, the productivity (i.e., production rate) of the system can be determined from the initial state. It is shown in particular that, given any desired product mix, it is possible to start the system with enough jobs in process so that some machines will be fully utilized in steady-state. These machines are called bottleneck machines, since they limit the throughput of the system. In that case, the system works at the maximal rate and the productivity is optimal. The minimal number of jobs in process allowing optimal functioning of the system is further specified as an integer linear programming problem. An efficient heuristic algorithm is developed to obtain a near-optimal solution.
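The link between the initial state and the production rate can be illustrated with a small sketch. It assumes the classical result for strongly connected timed event graphs (not spelled out in the abstract itself): the steady-state cycle time is the maximum, over all elementary circuits, of the total firing time on the circuit divided by the number of tokens it holds. The function name, data layout and the two-machine example are hypothetical.

```python
def cycle_time(times, places):
    """Cycle time of a strongly connected timed event graph.

    times:  dict mapping each transition name to its firing time.
    places: list of (source, target, tokens) triples, one per place.
    Assumption: cycle time = max over elementary circuits of
    (sum of firing times on the circuit) / (tokens on the circuit).
    """
    successors = {}
    for src, dst, tokens in places:
        successors.setdefault(src, []).append((dst, tokens))
    best = 0.0

    def dfs(start, node, visited, time_sum, token_sum):
        nonlocal best
        for nxt, tokens in successors.get(node, []):
            if nxt == start:
                circuit_tokens = token_sum + tokens
                if circuit_tokens == 0:
                    raise ValueError("token-free circuit: the graph deadlocks")
                best = max(best, time_sum / circuit_tokens)
            elif nxt not in visited and nxt > start:
                # Only walk through names larger than the start so that each
                # circuit is enumerated once, from its smallest transition.
                dfs(start, nxt, visited | {nxt},
                    time_sum + times[nxt], token_sum + tokens)

    for t in sorted(times):
        dfs(t, t, {t}, times[t], 0)
    return best

# Hypothetical shop: operations "a" (2 time units) and "b" (3 time units).
# Self-loops with one token model machines that handle one job at a time;
# the place from "b" back to "a" holds the jobs in process.
times = {"a": 2.0, "b": 3.0}
one_job = [("a", "b", 0), ("b", "a", 1), ("a", "a", 1), ("b", "b", 1)]
two_jobs = [("a", "b", 0), ("b", "a", 2), ("a", "a", 1), ("b", "b", 1)]
slow = cycle_time(times, one_job)
fast = cycle_time(times, two_jobs)
```

With one job in process the job circuit dominates; with two jobs the machine performing "b" becomes the bottleneck, which is the situation the abstract describes as optimal productivity. The brute-force circuit enumeration is only practical for small graphs.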

3.

Performance evaluation of concurrent systems using Petri nets

Jan Magott 《Information Processing Letters》1984,18(1):7-13

4.

Constructing suboptimal schedules using directed search

V. L. Kochnev G. P. Tarasov 《Cybernetics and Systems Analysis》1972,8(4):576-580

5.

Performance evaluation of supercomputers using HPCC and IMB Benchmarks

Subhash Saini Robert Ciotti Brian T.N. Gunney Thomas E. Spelce Alice Koniges Don Dossa Panagiotis Adamidis Rolf Rabenseifner Sunil R. Tiyyagura Matthias Mueller 《Journal of Computer and System Sciences》2008,74(6):965-982

The HPC Challenge (HPCC) Benchmark suite and the Intel MPI Benchmark (IMB) are used to compare and evaluate the combined performance of processor, memory subsystem and interconnect fabric of five leading supercomputers: SGI Altix BX2, Cray X1, Cray Opteron Cluster, Dell Xeon Cluster, and NEC SX-8. These five systems use five different networks (SGI NUMALINK4, Cray network, Myrinet, InfiniBand, and NEC IXS). The complete set of HPCC Benchmarks is run on each of these systems. Additionally, we present Intel MPI Benchmarks results to study the performance of 11 MPI communication functions on these systems.

6.

Performance evaluation of GAIA supercomputer using NPB multi-zone benchmarks

Hongsuk Yi 《Computer Physics Communications》2011,(1):263-265

GAIA is a recently developed IBM POWER6 supercomputer consisting of 24 SMP compute nodes with 64-way processors each, and it is currently ranked 393 on the Top500 supercomputer list published in November 2009. In this paper, we present the performance characteristics of GAIA evaluated by interconnecting the 24 computing nodes with a low-latency InfiniBand network. We evaluate the performance of the new dual-core Power 595 system in terms of increased problem size using the multi-zone versions of the NAS Parallel Benchmarks.

7.

Performance evaluation of cloud computing platforms using statistical methods

Gültekin Ataş Vehbi Cagri Gungor 《Computers & Electrical Engineering》2014

Cloud computing is a very attractive research topic. Many studies have examined the infrastructure as a service and software as a service aspects of cloud computing; however, few studies have focused on platform as a service (PaaS). According to recent reports, demand for enterprise PaaS solutions will increase continuously. However, different sectors require different types of PaaS applications and computing resources. Therefore, an evaluation and ranking framework for PaaS solutions according to application needs is required. To address this need, this study presents the most essential aspects of PaaS solutions and provides a framework for evaluating the performance of PaaS providers. It also proposes a suitable set of benchmarking algorithms that can help determine the most appropriate PaaS provider based on different resource needs and application requirements. Performance evaluations of three well-known cloud computing PaaS providers were conducted using the analytic hierarchy process and the logic scoring of preference methods.
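The analytic hierarchy process named in this abstract turns a matrix of pairwise comparisons into criterion weights. The following is a minimal sketch, assuming the row geometric-mean approximation (a common shortcut, not necessarily the variant used in the study). The criteria and comparison values are hypothetical and chosen only for illustration.

```python
import math

def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method.

    pairwise[i][j] states how strongly criterion i is preferred over
    criterion j, so pairwise[j][i] is expected to be 1 / pairwise[i][j].
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]

# Hypothetical criteria: CPU, memory, storage (illustrative values only).
comparisons = [
    [1.0, 2.0, 6.0],
    [1 / 2, 1.0, 3.0],
    [1 / 6, 1 / 3, 1.0],
]
weights = ahp_weights(comparisons)
```

Because this example matrix is perfectly consistent, the weights come out at approximately 0.6, 0.3 and 0.1. Real comparison matrices are rarely consistent, and a full AHP workflow would also check a consistency ratio before using the weights.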

8.

Performance evaluation of FMIG clustering using fuzzy validity indexes

Monia Tlili Thouraya Ayadi Tarek M. Hamdani Adel M. Alimi 《Soft Computing - A Fusion of Foundations, Methodologies and Applications》2015,19(12):3515-3528

9.

Performance evaluation of high-speed interconnects using dense communication patterns

Rod Fatoohi Ken Kardys Sumy Koshy Soundarya Sivaramakrishnan Jeffrey S. Vetter 《Parallel Computing》2006,32(11-12):794

We study the performance of high-speed interconnects using a set of communication micro-benchmarks. The goal is to identify certain limiting factors and bottlenecks with these interconnects. Our micro-benchmarks are based on dense communication patterns with different communicating partners and varying degrees of these partners. We tested our micro-benchmarks on five platforms: an IBM system of 68-node 16-way Power3, interconnected by an SP switch2; another IBM system of 264-node 4-way Power PC 604e, interconnected by an SP switch; a Compaq cluster of 128-node 4-way ES40/EV67 processor, interconnected by a Quadrics interconnect; an Intel cluster of 16-node dual-CPU Xeon, interconnected by a Quadrics interconnect; and a cluster of 22-node Sun Ultra Sparc, interconnected by an Ethernet network. Our results show many limitations of these networks including the memory contention within a node as the number of communicating processors increased and the limitations of the network interface for communication between multiple processors of different nodes.

10.

Performance analysis of active schedules in identical parallel machine

Changjun WANG Yugeng XI 《控制理论与应用(英文版)》2007,5(3):239-243

Active schedule is one of the most basic and popular concepts in production scheduling research. For identical parallel machine scheduling with jobs’ dynamic arrivals, the tight performance bounds of active schedules under the measurement of four popular objectives are respectively given in this paper. A similar analysis method and conclusions can be generalized to static identical parallel machine and single machine scheduling problems.

11.

Optimizing overall loop schedules using prefetching and partitioning

Fei Chen O'Neil T.W. Sha E.H.-M. 《Parallel and Distributed Systems, IEEE Transactions on》2000,11(6):604-614

In this paper, a method combining the loop pipelining technique with data prefetching, called Partition Scheduling with Prefetching (PSP), is proposed. In PSP, the iteration space is first divided into regular partitions. Then a two-part schedule, consisting of the ALU and memory parts, is produced and balanced to produce high throughput. These two parts are executed simultaneously, and hence, the remote memory latencies are overlapped. We study the optimal partition shape and size so that a well-balanced overall schedule can be obtained. Experiments on DSP benchmarks show that the proposed methodology consistently produces optimal or near optimal solutions.

12.

Optimized evaluation of a large sum of functions using a three-grid approach

Franz Schreier 《Computer Physics Communications》2006,174(10):783-792

The efficient evaluation of numerous values of functions that vary rapidly only in a small part of the region of interest is presented. An optimized algorithm using a sequence of grids with increasing resolution is developed. The algorithm does not make any assumptions about special properties of the function to be evaluated, e.g., symmetry. An additional speed-up is obtained by exploiting the asymptotic behaviour of the functions to be summed. Two applications from high resolution atmospheric radiative transfer modelling in the infrared and microwave are presented. In a third example asymmetric Rautian line shapes important for high resolution molecular spectroscopy are considered. Computational gains by more than two orders of magnitude with relative errors less than 10⁻³ have been achieved.

13.

Knowledge representation and integration for portfolio evaluation using linear belief functions

《IEEE transactions on systems, man, and cybernetics. Part A, Systems and humans : a publication of the IEEE Systems, Man, and Cybernetics Society》2006,36(4):774-785

This paper proposes a linear belief function (LBF) approach to evaluate portfolio performance. By drawing on the notion of LBFs, an elementary approach to knowledge representation in expert systems is proposed. It is shown how to use basic matrices to represent market information and financial knowledge, including complete ignorance, statistical observations, subjective speculations, distributional assumptions, linear relations, and empirical asset-pricing models. The authors then appeal to Dempster's rule of combination to integrate the knowledge for assessing the overall belief of portfolio performance and updating the belief by incorporating additional evidence. An example of three gold stocks is used to illustrate the approach.

14.

State-based scheduling with tree schedules: analysis and evaluation

Madhukar Anand Sebastian Fischmeister Insup Lee Linh T. X. Phan 《Real-Time Systems》2012,48(4):430-462

Distributed real-time systems require bounded communication delays and achieve them by means of a predictable and verifiable control mechanism for the communication medium. Real-time bus arbitration mechanisms control access to the medium and guarantee bounded communication delays. These arbitration mechanisms can be static dispatch tables or dynamic, algorithmic approaches. In this work, we introduce a real-time bus arbitration mechanism called tree schedules that takes the best parts of both sides: It can be analyzed like static dispatch tables, and it provides a certain degree of flexibility similar to algorithmic approaches. We present tree schedules as a framework to specify real-time traffic and introduce mechanisms to analyze it. We discuss how tree schedules can capture application-specific behavior in a time-triggered state-based supply model by means of conditional branching built into the model. We present analysis results for this model specifically aiming at schedulability in fixed and dynamic priority schemes and waiting time analysis. Finally, we demonstrate the advantages of state-based supply over stateless supply by means of two case studies.

15.

A system-centric metric for the evaluation of online job schedules

Uwe Schwiegelshohn 《Journal of Scheduling》2011,14(6):571-581

The paper discusses a rarely used metric that is well suited to evaluate online schedules for independent jobs on massively parallel processors. The metric is based on the total weighted completion time objective with the weight being the resource consumption of the job. Although every job contributes to the objective value, the metric exhibits many properties that are similar to the properties of the makespan objective. For this metric, we particularly address nonclairvoyant online scheduling of sequential jobs on parallel identical machines and prove an almost tight competitive factor of 1.25 for nondelay schedules. For the extension of the problem to rigid parallel jobs, we show that no constant competitive factor exists. However, if all jobs are released at time 0, List Scheduling in descending order of the degree of parallelism guarantees an approximation factor of 2.
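The metric described here can be sketched for the simplest case. The example assumes sequential jobs that are all released at time 0 (the paper's online setting with release dates is not modelled), and it assumes that the resource consumption of a sequential job equals its processing time on one processor. The function name and job data are hypothetical.

```python
import heapq

def resource_weighted_completion(processing_times, machines):
    """Return sum(weight * completion time) for a greedy nondelay schedule.

    Assumptions: all jobs are released at time 0, each job runs on a single
    machine, and its weight (resource consumption) equals its processing time.
    Jobs start in list order on whichever machine becomes free first.
    """
    free_at = [0.0] * machines      # min-heap of times at which machines idle
    heapq.heapify(free_at)
    total = 0.0
    for p in processing_times:
        start = heapq.heappop(free_at)
        completion = start + p
        heapq.heappush(free_at, completion)
        total += p * completion     # weight p times completion time
    return total

jobs = [3.0, 1.0, 2.0, 2.0]         # hypothetical processing times
metric = resource_weighted_completion(jobs, machines=2)
```

On two machines the four jobs complete at times 3, 1, 3 and 5, so the metric is 3·3 + 1·1 + 2·3 + 2·5 = 26. Large jobs that finish late dominate the sum, which hints at why the metric behaves similarly to the makespan.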

16.

Performance evaluation of large-scale object recognition system using bag-of-visual words model

Min-Uk Kim Kyoungro Yoon 《Multimedia Tools and Applications》2015,74(7):2499-2517

17.

Performance evaluation of incremental training method for face recognition using PCA (cited by: 1; self-citations: 0; citations by others: 1)

Ch. Satyanarayana D. M. Potukuchi L. Pratap Reddy 《Journal of Real-Time Image Processing》2007,1(4):311-327

Relevance of ‘face recognition’ (FR) in the modern world requirements is presented as a case of human machine interaction. Physical conditions that influence the face recognition process regarding the facial features, illumination changes and viewing angles etc. are discussed. Face recognition process predominantly depends on machine perception i.e. information through an array of pixels with respect to the facial image. Details of eigenface approach through the involvement of contemporary algebraic and statistical analysis are revisited. Methodology involved in the Principal Component Analysis and advantages of exposing the data to incremental training (using PCA) are discussed. A model for the implementation of IPCA over the face databases is proposed to estimate its performance for the face recognition process. Performance of the present model is studied in the domain of Euclidean distance, decay parameter, recognition rate, eigenvalues and overall computational time. Present IPCA model administered over standard ORL, FERET databases along with that over the JNTU face database with large number of face images revealed relative performance. The merit of present IPCA is inferred through enhanced recognition rate and reduced complexity (in the algorithm), intelligent eigenvectors and lesser computational time. The results are presented in the wake of the body of data available with other methods.

18.

Performance evaluation of a multithreaded RTS using a synchronous reactive model

A. Valderruten V. M. Gulías J. Mosquera J. S. Jorge 《Control Engineering Practice》1999,7(12):771-1539

Synchronous reactive modelling provides an optimal framework for the modular decomposition of programs that engage in complex patterns of deterministic interaction, such as many real-time and communication entities. This paper presents an approach which includes performance modelling techniques in the synchronous reactive modelling method supported by ESTEREL. It defines a methodology based on timing and probabilistic quantitative constructs that complete the synchronous reactive models. A monitoring mechanism allows the computation of performance results during the simulation. This methodology is applied to study a multithreaded runtime system for a distributed functional programming language. Performance metrics are computed and validated with experimental results.

19.

Performance evaluation of circuit switched multistage interconnection networks using a hold strategy

Hsiao S.-H. Chen C.Y.R. 《Parallel and Distributed Systems, IEEE Transactions on》1992,3(5):632-640

The performance evaluation of processor-memory communications for multiprocessor systems using circuit switched interconnection networks with a hold strategy is performed. Message size and processor processing time are considered and shown to have a significant effect on the overall system performance. A closed queuing network model is proposed such that only (n+2) states are required by the proposed model, in contrast to (n²+3n+4)/2 states needed in previous studies, where n is the number of stages of the multistage interconnection network. Since a closed-form solution is obtained, the behavior of a complete cycle of memory access through multistage interconnection networks can be accurately analyzed and various performance bounds can be obtained.
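The two state counts quoted in this abstract can be compared numerically. The short sketch below only evaluates the formulas (n+2) and (n²+3n+4)/2; the function names are hypothetical and nothing about the queuing model itself is implemented.

```python
def states_proposed(n):
    """State count of the proposed model for an n-stage network: n + 2."""
    return n + 2

def states_previous(n):
    """State count cited for previous studies: (n^2 + 3n + 4) / 2.

    n^2 + 3n = n(n + 3) is always even, so integer division is exact.
    """
    return (n * n + 3 * n + 4) // 2

# (stages, proposed, previous) for a few network depths.
comparison = [(n, states_proposed(n), states_previous(n)) for n in (2, 4, 8, 16)]
for n, proposed, previous in comparison:
    print(f"n={n}: proposed={proposed}, previous={previous}")
```

For n = 2, 4, 8 and 16 stages the proposed model needs 4, 6, 10 and 18 states against 7, 16, 46 and 154, so the gap grows linearly versus quadratically in the number of stages.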

20.

Performance evaluation of parallel systems by using unbounded generalized stochastic Petri nets

Granda M. Drake J.M. Gregorio J.A. 《IEEE transactions on pattern analysis and machine intelligence》1992,18(1):55-71

Methods of calculating efficiently the performance measures of parallel systems by using unbounded generalized stochastic Petri nets are presented. An explosion in the number of states to be analyzed occurs when unbounded places appear in the model. The state space of such nets is infinite, but it is possible to take advantage of the natural symmetries of the system to aggregate the states of the net and construct a finite graph of lumped states which can easily be analyzed. With the methods developed, the unbounded places introduce a complexity similar to that of safe places of the net. These methods can be used to evaluate models of open parallel systems in which unbounded places appear; systems which are k-bounded but are complex and have large values of k can also be evaluated in an appropriate way. From the steady-state solution of the model, it is possible to obtain automatically the performance measures of parallel systems represented by this type of net.