
1.

Partitioning Processor Arrays under Resource Constraints

Jürgen Teich Lothar Thiele Lee Z. Zhang 《The Journal of VLSI Signal Processing》1997,17(1):5-20

A single integer linear programming model for optimally scheduling partitioned regular algorithms is presented. The herein presented methodology differs from existing methods in the following capabilities: 1) Not only constraints on the number of available processors and communication capabilities are taken into account, but also local memories and constraints on the size of available memories. 2) Different types of processors can be handled. 3) The size of the optimization model (number of integer variables) is independent of the size of the tiles to be executed. Hence, 4) the number of integer variables in the optimization model is greatly reduced such that problems of relevant size can be solved in practical execution time.

2.

A Method of Solving Redundancy Optimization Problems

Misra K. B. 《Reliability, IEEE Transactions on》1971,(3):117-120

The redundancy optimization problem is formulated as an integer programming problem of zero-one type variables. The solution is obtained making use of an algorithm due to Lawler and Bell. Objective function and constraints can be any arbitrary functions. Three different variations of the optimization problem are considered. The formulation is easy and the solution is convenient on a digital computer. The size of the problem that can be solved is not restricted by the number of constraints.
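
To make the formulation in this abstract concrete, here is a minimal sketch of a redundancy allocation problem solved by plain exhaustive enumeration. It is not the Lawler and Bell algorithm the abstract refers to; the series-of-parallel-stages reliability model, the single budget constraint, and the names `system_reliability` and `best_allocation` are assumptions chosen for illustration.

```python
from itertools import product
from math import prod


def system_reliability(r, x):
    """Reliability of stages in series, where stage i has x[i] parallel copies
    of a component with reliability r[i] (an assumed, textbook-style model)."""
    return prod(1.0 - (1.0 - ri) ** xi for ri, xi in zip(r, x))


def best_allocation(r, c, budget, max_copies=4):
    """Enumerate every integer allocation with 1..max_copies copies per stage
    and keep the most reliable one whose total cost fits the budget."""
    best_x, best_rel = None, -1.0
    for x in product(range(1, max_copies + 1), repeat=len(r)):
        cost = sum(ci * xi for ci, xi in zip(c, x))
        if cost > budget:
            continue  # violates the (assumed) cost constraint
        rel = system_reliability(r, x)
        if rel > best_rel:
            best_x, best_rel = x, rel
    return best_x, best_rel
```

Enumeration grows exponentially with the number of stages, which is the cost that implicit-enumeration methods for zero-one programs aim to avoid.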

3.

Replicated module allocation in LAN-based concurrent processing systems

June S. Park Chin-yuan Ho Timothy J. Lowe 《Telecommunication Systems》1994,3(3):295-318

We consider a problem involving the design of a system for concurrent processing of application software using multiple processors on a local area network. The task control-flow graph which graphically describes the software logic is allowed to be an arbitrary directed multigraph. We establish equations of flow conservation which arise in the execution of modules on the set of interconnected processors. Incorporating these equations, we develop a mixed integer programming model to find an optimal allocation of program modules, with possible replications, to the set of capacitated processors. The objective is to minimize the total interprocessor communication cost and module execution cost subject to the capacity constraints of processors and the broadcast channel. The decisions involved are: how many copies of each module should be maintained; how to allocate module copies across processors; and how to distribute invocations of each module across its copies on different processors. We report numerical results from solving the model.

4.

System-Level Synthesis Using Evolutionary Algorithms (total citations: 3; self-citations: 0; citations by others: 3)

Tobias Blickle Jürgen Teich Lothar Thiele 《Design Automation for Embedded Systems》1998,3(1):23-58

In this paper, we consider system-level synthesis as the problem of optimally mapping a task-level specification onto a heterogeneous hardware/software architecture. This problem requires (1) the selection of the architecture (allocation) including general purpose and dedicated processors, ASICs, busses and memories, (2) the mapping of the specification onto the selected architecture in space (binding) and time (scheduling), and (3) the design space exploration with the goal to find a set of implementations that satisfy a number of constraints on cost and performance. Existing methodologies often consider a fixed architecture, perform the binding only, do not reflect the tight interdependency between binding and scheduling, do not consider communication (tasks and resources), or require long run-times preventing design space exploration, or yield only one implementation with optimal cost. Here, a model is introduced that handles all mentioned requirements and allows the task of system-synthesis to be specified as an optimization problem. The application and adaptation of an Evolutionary Algorithm to solve the tasks of optimization and design space exploration is described.

5.

On gracefully degrading multiprocessors with multistage interconnection networks

Koren I. Koren Z. 《Reliability, IEEE Transactions on》1989,38(1):82-89

The behavior of a multiprocessing system with a multistage interconnection network is studied in the presence of faulty components. Measures for the connectivity and performance of these systems are proposed, including the average number of operational paths, the average number of accessible processors and memories, the average number of fault-free processors (memories) that are connected to an accessible memory (processor), the bandwidth, and the processing power of the system. Based on these measures, a tight upper bound for the maximal fully connected system is suggested. The gracefully degrading system is then compared, through some numerical examples, to a system whose faulty components are repaired upon failure. Based on these comparisons, the anticipated reduction in system performance can be estimated and consequently, appropriate maintenance policies can be determined.

6.

MULTIPAR: behavioral partition for synthesizing multiprocessor architectures

Yunn-Yen Chen Yu-Chin Hsu Chung-Ta King 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1994,2(1):21-32


7.

An Aerospace TT&C Resource Allocation Optimization Algorithm for Dynamic Resource Reservation


梁军陈学军刘建平原东阳罗清青《电讯技术》2022,62(12)

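To address aerospace tracking, telemetry, and command (TT&C) resource allocation, a combinatorial optimization problem with numerous and complicated constraints, a TT&C resource allocation optimization algorithm that supports dynamic resource reservation is proposed. Specifically, considering the uniqueness constraints on TT&C equipment and spacecraft executing tasks as well as time-window conflict constraints, a 0-1 integer programming model based on atomic task scheduling is established. A solution framework that decouples practical requirements from the solution algorithm is designed, and a backtrackable parallel best-first search algorithm is obtained based on the idea of maximizing the utilization of TT&C resources. Simulation results show that the proposed algorithm dynamically reserves more, and more important, TT&C equipment in a more balanced way across the four major domestic TT&C regions (east, west, south, and north).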

8.

A Method for Constructing High-Dimensional Constellations Based on Lattice Theory


吴昊张建秋宋汉斌《电子学报》2014,42(9):1672-1679

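The gain index of a constellation is a term from lattice theory; it can be decomposed into the coding gain of the lattice and the shaping gain of the constellation boundary. This paper formulates the maximization of the constellation gain index as a series of optimization problems, with the geometric properties of the constellation as constraints. Because the desired constellation can be obtained by solving these optimization problems, the proposed method can serve as a general method for constructing high-dimensional constellations. Whereas existing algorithms are suitable only when the number of constellation points is small, the proposed method can easily construct high-dimensional constellations with a large number of points. Simulation results show that when the number of constellation points is small, the symbol error rate of the constellations constructed by the proposed method is very close to the optimum; when the number of points is large, the constructed constellations achieve a lower symbol error rate than conventional constellations based on the integer lattice.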

9.

Test-access mechanism optimization for core-based three-dimensional SOCs

Xiaoxia Wu Krishnendu Chakrabarty Yuan Xie 《Microelectronics Journal》2010,41(10):601-615

Embedded cores in a core-based system-on-chip (SOC) are not easily accessible via chip I/O pins. Test-access mechanisms (TAMs) and test wrappers (e.g., the IEEE Standard 1500 wrapper) have been proposed for the testing of embedded cores in a core-based SOC in a modular fashion. We show that such a modular testing approach can also be used for emerging three-dimensional integrated circuits based on through-silicon vias (TSVs). Core-based SOCs based on 3D IC technology are being advocated as a means to continue technology scaling and overcome interconnect-related bottlenecks. We present an optimization technique for minimizing the post-bond test time for 3D core-based SOCs under constraints on the number of TSVs, the TAM bitwidth, and thermal limits. The proposed optimization method is based on a combination of integer linear programming, LP-relaxation, and randomized rounding. It considers the Test Bus and TestRail architectures, and incorporates wire-length constraints in test-access optimization. Simulation results are presented for the ITC'02 SOC Test Benchmarks and the test times are compared to those obtained when methods developed earlier for two-dimensional ICs are applied to 3D ICs. The test time dependence on various 3D parameters (e.g., 3D placement, the number of layers, thermal constraints, and the number of TSVs) is also studied.
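
The randomized-rounding step named in this abstract can be illustrated with a short sketch. It assumes that an LP relaxation has already produced a fractional core-to-TAM assignment (the solver is not shown), and it ignores the TSV, bitwidth, thermal, and wire-length constraints of the actual method. The function names and the makespan-style test-time model are hypothetical.

```python
import random


def makespan(assignment, test_times, num_tams):
    """Test time of an assignment: the load of the busiest TAM, assuming cores
    on one TAM are tested one after another (a simplified model)."""
    load = [0.0] * num_tams
    for core, tam in enumerate(assignment):
        load[tam] += test_times[core]
    return max(load)


def randomized_rounding(fractional, test_times, trials=100, seed=0):
    """fractional[i][j] is the LP-relaxed share of core i placed on TAM j
    (each row sums to 1). Each trial assigns core i to TAM j with probability
    fractional[i][j]; the best integral assignment over all trials is kept."""
    rng = random.Random(seed)
    num_tams = len(fractional[0])
    best_assignment, best_time = None, float("inf")
    for _ in range(trials):
        assignment = [rng.choices(range(num_tams), weights=row)[0]
                      for row in fractional]
        t = makespan(assignment, test_times, num_tams)
        if t < best_time:
            best_assignment, best_time = assignment, t
    return best_assignment, best_time
```

Repeating the rounding and keeping the best result is one common way to use the fractional solution as a sampling distribution; a constrained variant would also have to reject or repair infeasible samples.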

10.

Directly performance-constrained template-based layout retargeting and optimization for analog integrated circuits

Lihong Zhang Zheng Liu 《Integration, the VLSI Journal》2011,44(1):1-11

Due to intrinsic intricacy, layout parasitics exhibit a significant impact on the performance of analog integrated circuits. In this paper a directly performance-constrained parasitic-aware automatic layout retargeting and optimization algorithm is presented. Unlike the conventional sensitivity analysis, a general central-difference based scheme using any simulator for sensitivity computation is deployed. We propose a piecewise sensitivity model to enforce more accurate sensitivity computation during parasitic optimization. Moreover, mixed-integer performance constraints due to parasitics are included in the formulated mixed integer nonlinear programming problem rather than through either indirect parasitic-bound constraints or inaccurate worst-case sensitivities. A graph technique and mixed-integer nonlinear programming are effectively combined to solve the formulated parasitic optimization problem. The automatically generated target layouts can satisfy performance constraints to ensure the desired specifications. The experimental results show that the proposed algorithm can achieve effective retargeting of analog circuits with less layout area and significant reduction in execution time.
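
The central-difference idea mentioned in this abstract can be sketched in a few lines. Here `performance` stands in for a call to a circuit simulator; that callable, the parameter vector layout, and the step size `h` are assumptions for illustration, and the paper's piecewise sensitivity model is not reproduced.

```python
def central_difference_sensitivity(performance, params, index, h=1e-6):
    """Estimate d(performance)/d(params[index]) by evaluating the black-box
    performance function at params[index] + h and params[index] - h."""
    up = list(params)
    down = list(params)
    up[index] += h
    down[index] -= h
    return (performance(up) - performance(down)) / (2.0 * h)
```

Because only function evaluations are needed, the same routine works with any simulator that maps parasitic values to a performance number, at the cost of two simulations per parameter.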

11.

Scalable Programming Models for Massively Multicore Processors

McCool M.D. 《Proceedings of the IEEE. Institute of Electrical and Electronics Engineers》2008,96(5):816-831

Including multiple cores on a single chip has become the dominant mechanism for scaling processor performance. Exponential growth in the number of cores on a single processor is expected to lead in a short time to mainstream computers with hundreds of cores. Scalable implementations of parallel algorithms will be necessary in order to achieve improved single-application performance on such processors. In addition, memory access will continue to be an important limiting factor on achieving performance, and heterogeneous systems may make use of cores with varying capabilities and performance characteristics. An appropriate programming model can address scalability and can expose data locality while making it possible to migrate application code between processors with different parallel architectures and variable numbers and kinds of cores. We survey and evaluate a range of multicore processor architectures and programming models with a focus on GPUs and the Cell BE processor. These processors have a large number of cores and are available to consumers today, but the scalable programming models developed for them are also applicable to current and future multicore CPUs.

12.

A new set of linear constraints for broad-band time domain element space processors

Meng Er Cantoni A. 《Antennas and Propagation, IEEE Transactions on》1986,34(3):320-329

A new set of linear constraints for designing broad-band time domain element space antenna array processors is presented. The set of linear constraints is used to ensure that a desired look-direction response of the processor over a frequency band of interest can be closely approximated. The design technique is posed in such a way that three types of presteering can be handled: no presteering, coarse presteering, and exact presteering. The elimination of presteering time delays or the possibility to use coarse presteering is an attractive feature in a digital implementation of antenna array processors. The relationship that the new processor has to other broad-band processors is also established. Furthermore, the approach enables various types of errors and mismatches between signal model and actual scenario to be incorporated in the problem formulation.

13.

Scalable Architecture for SoC Video Encoders

Tero Kangas Timo D. Hämäläinen Kimmo Kuusilinna 《The Journal of VLSI Signal Processing》2006,44(1-2):79-95

Evolving video coding standards demand functional flexibility for implementations, not only at design time but also after fabrication. This paper presents a System-on-Chip design approach with a feasible combination of performance, scalability, programmability, area efficiency, and design time effort for a video encoder. The encoder is based on a homogeneous master-slave processor architecture. Each slave encodes a part of the frame in the Single Program Multiple Data (SPMD) data parallel model. Both shared and distributed memory architectures are presented. Design effort is reduced by identical program codes, automated assembly of software and hardware modules independent of the number and type of processors, as well as our flexible on-chip communication network called Heterogeneous IP Block Interconnection (HIBI). A case study implementation with two to ten simple ARM7 processors, 32-bit HIBI bus and non-optimized processor-independent software gives the performance from 6 to 53 fps for QCIF. The whole encoder area ranges from 173 to 770 kgates excluding the memories. The relation scales reasonably well to systems with more powerful processors and optimized code. The optimization of the communication network shows that with more than six slaves even a serial HIBI connection with 100 MHz speed is feasible. HIBI and the parallelization approach allow exploration and optimization of the communication both at the application and architecture layers. Tero Kangas, MSc ’01, Tampere University of Technology (TUT). Since 1999 he has been working as a research scientist in the Institute of Digital and Computer Systems (DCS) at TUT. Currently he is working towards his PhD degree and his main research topics are system architectures and SoC design methodologies in multimedia applications. Kimmo Kuusilinna, PhD ’01, TUT. His main research interests include system-level design and verification, interconnection networks, and parallel memories. Currently he is working as a senior research engineer at the Nokia Research Center. Timo D. Hämäläinen, MSc ’93, PhD ’97, TUT. He acted as a senior research scientist and project manager at TUT in 1997-2001. He was nominated to full professor at TUT/Institute of Digital and Computer Systems in 2001. He heads the DACI research group that focuses on three main lines: wireless local area networking and wireless sensor networks, high-performance DSP/HW based video encoding, and interconnection networks with design flow tools for heterogeneous SoC platforms.

14.

Neural network optimization for redundancy allocation

V. V. Vinod S. Ghose 《Microelectronics Reliability》1994,34(1)

System reliability optimization problems such as redundancy allocation are hard to solve exactly. Neural networks offer an alternative computational model for obtaining good approximate solutions for such problems. In this paper we present a neural network for solving the redundancy allocation problem for an n-stage parallel redundant system with separable objective function and constraints. The problem is formulated as a 0–1 integer programming problem and solved using the network. The performance of the network compares favourably with that of the best fit algorithm. The number of iterations taken by the network increases very slowly with increase in number of variables. Hence the network can easily solve large problems.

15.

Energy Cooperation with Sleep Mechanism in Renewable Energy Assisted Cellular HetNets

Ahmed Faran Naeem Muhammad Ejaz Waleed Iqbal Muhammad Anpalagan Alagan Haneef Muhammad 《Wireless Personal Communications》2021,116(1):105-124

The emerging fifth generation (5G) and beyond radio access networks are expected to be extremely dense and heterogeneous as compared to the current networks, involving a large number of different classes of base stations (BSs), namely macro, micro, femto and pico BSs. Among several performance requirements 5G and beyond systems aim to achieve, energy efficiency is one of the crucial requirements. In order to achieve energy-efficient design in dense heterogeneous 5G networks, various approaches in terms of resource allocation, off-loading techniques, hardware solutions and energy harvesting are being considered. In this regard, this paper develops an energy usage optimization framework in a cellular heterogeneous network (HetNet) consisting of a central macro-BS and a number of micro-BSs, equipped with renewable energy sources (RESs) such as solar panels and wind turbines. The proposed framework incorporates an energy cooperation mechanism along with a sleep mechanism (BS ON/OFF switching), in which the BSs having lean traffic are put into a sleep mode and their traffic load gets shared by the central BS. The surplus harvested energy from RESs of the sleeping BSs can then be sold back to the grid. An optimization problem for maximizing the utilization of RES and minimizing the usage of the traditional sources, such as utility and generator, is formulated and this mixed integer non-linear programming problem is solved through an interior point method. The presented results for various HetNet sizes demonstrate the significant savings in the energy cost with the proposed RES-enabled HetNet sleep mechanism model over the conventional approaches.


16.

Quality of Service Aware Reliable Task Scheduling in Vehicular Cloud Computing

Tamal Adhikary Amit Kumar Das Md. Abdur Razzaque Ahmad Almogren Majed Alrubaian Mohammad Mehedi Hassan 《Mobile Networks and Applications》2016,21(3):482-493

Vehicular Cloud Computing (VCC) facilitates real-time execution of many emerging user and intelligent transportation system (ITS) applications by exploiting under-utilized on-board computing resources available in nearby vehicles. These applications have heterogeneous time criticality, i.e., they demand different Quality-of-Service levels. In addition, the mobility of the vehicles makes the problem of scheduling different application tasks on the vehicular computing resources a challenging one. In this article, we have formulated the task scheduling problem as a mixed integer linear program (MILP) optimization that increases the computation reliability while reducing the job execution delay. Vehicular on-board units (OBUs), manufactured by different vendors, have different architectures and computing capabilities. We have exploited the MapReduce computation model to address the problem of resource heterogeneity and to support computation parallelization. Performance of the proposed solution is evaluated in network simulator version 3 (ns-3) by running MapReduce applications in an urban road environment, and the results are compared with the state-of-the-art works. The results show that significant performance improvements in terms of reliability and job execution time can be achieved by the proposed task scheduling model.

17.

Software Pipeline–Based Partitioning Method with Trade‐Off between Workload Balance and Communication Optimization


Kai Huang Siwen Xiu Min Yu Xiaomeng Zhang Rongjie Yan Xiaolang Yan Zhili Liu 《ETRI Journal》2015,37(3):562-572

For a multiprocessor System‐on‐Chip (MPSoC) to achieve high performance via parallelism, we must consider how to partition a given application into different components and map the components onto multiple processors. In this paper, we propose a software pipeline–based partitioning method with cyclic dependent task management and communication optimization. During task partitioning, simultaneously considering computation load balance and communication optimization can cause interference, which leads to performance loss. To address this issue, we formulate their constraints and apply an integer linear programming approach to find an optimal partitioning result, one that requires a trade‐off between these two factors. Experimental results on a reconfigurable MPSoC platform demonstrate the effectiveness of the proposed method, with 20% to 40% performance improvements compared to a traditional software pipeline–based partitioning method.

18.

High-level software synthesis for the design of communication systems

Ritz S. Pankert M. Zivojinovic V. Meyr H. 《Selected Areas in Communications, IEEE Journal on》1993,11(3):348-358

A synthesis environment that targets software programmable architectures such as digital signal processors (DSPs) is presented. These processors are well suited for implementation of real-time signal processing systems with medium throughput requirements. Techniques that tightly couple the synthesis environment to an existing communication system simulator are also presented. This enables a seamless transition between the simulation and implementation design level of communication systems. Special focus is on optimization techniques for mapping data flow oriented block diagrams onto DSPs. The combination of different mapping and optimization strategies allows comfortable synthesis of real-time code that is highly adapted to application-specific needs imposed by constraints on memory space, sampling rate, or latency. Thus, tradeoff analysis is supported by efficient interactive or automatic exploration of the design space. All presented concepts are illustrated by the design of a phase synchronizer with automatic gain control on a floating-point DSP.

19.

A Generalized Technique for Register Counting and its Application to Cost-Optimal DSP Architecture Synthesis

Kazuhito Ito Keshab K. Parhi 《The Journal of VLSI Signal Processing》1997,16(1):57-72

In this paper we propose a generalized technique to count the required number of registers in a schedule which supports overlapped scheduling and can be applied to the case where a general digit-serial data format is used. This technique is integrated into an integer linear programming (ILP) model for time-constrained scheduling. In the ILP model, appropriate processors of certain data formats are chosen from a library of processors and data format converters are automatically inserted between processors of different data formats if necessary. Then the required number of registers for each data format is evaluated correctly by the proposed technique. Hence an optimal architecture for a given digital signal processing algorithm is synthesized where the cost of registers as well as the cost of processors and data format converters are minimized. It is shown that including the cost of registers in the synthesis task as proposed in this paper leads to up to 12.8% savings in the total cost of the synthesized architecture when compared with synthesis performed without including the register cost in the total cost.
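
As a rough illustration of register counting under overlapped scheduling, the sketch below folds variable lifetimes modulo the iteration period and takes the peak number of simultaneously live values. It assumes word-level (bit-parallel) data with integer time steps and does not model the digit-serial formats or the ILP integration described in the abstract; the function name and the lifetime representation are hypothetical.

```python
def min_registers(lifetimes, period):
    """lifetimes: list of (produced, last_used) time steps for each value,
    live during steps produced .. last_used - 1.
    period: iteration period of the overlapped (periodic) schedule.
    Returns the peak number of live values in any time slot modulo period."""
    usage = [0] * period
    for produced, last_used in lifetimes:
        for t in range(produced, last_used):
            usage[t % period] += 1  # fold successive iterations onto one period
    return max(usage)
```

For example, the lifetimes `[(0, 3), (1, 2), (2, 5)]` need 3 registers with an iteration period of 3 but only 2 with a period of 6, since a shorter period makes more iterations overlap.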

20.

Novel dynamic merged load technology

《Solid-State Circuits, IEEE Journal of》1985,20(2):537-541

Two new device concepts for dynamic ratioless inverter logic circuits are presented. Very high circuit density is achieved by replacing the traditional MOS dynamic load transistor with a novel load element which is merged with the switching transistor. Both device types can be implemented with a relatively standard double polysilicon CMOS process and are ideally suited for very low-power digital signal processors, serial memories and correlators, and digital image processors.