1.

Behavioral-level synthesis of heterogeneous BISR reconfigurable ASIC's

Guerra L.M. Potkonjak M. Rabaey J.M. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1998,6(1):158-167

In this paper, behavioral-level synthesis techniques are presented for the design of reconfigurable hardware. The techniques are applicable for synthesis of several classes of designs, including: (1) design for fault tolerance against permanent faults, (2) design for improved manufacturability, and (3) design of application specific programmable processors (ASPPs), that is, processors designed to perform any computation from a specified set on a single implementation platform. This paper focuses on design techniques for efficient built-in self-repair (BISR), and thus directly addresses the first two applications. Previous BISR techniques have been based on replacing a failed module with a backup of the same type. We present new heterogeneous BISR methodologies which remove this constraint and enable replacement of a module with a spare of a different type. The approach is based on the flexibility of behavioral-level synthesis to explore the design space. Two behavioral synthesis techniques are developed; the first method is through assignment and scheduling, and the second utilizes transformations. Experimental results verify the effectiveness of the approaches.

2.

Synthesis of Hard Real-Time Application Specific Systems

Chunho Lee Miodrag Potkonjak Wayne Wolf 《Design Automation for Embedded Systems》1999,4(4):215-242

This paper presents a system level approach for the synthesis of hard real-time multitask application specific systems. The algorithm takes into account task precedence constraints among multiple hard real-time tasks and targets a multiprocessor system consisting of a set of heterogeneous off-the-shelf processors. The optimization goal is to select a minimal cost multi-subset of processors while satisfying all the required timing and precedence constraints. There are three design phases: resource allocation, assignment, and scheduling. Since the resource allocation is a search for a minimal cost multi-subset of processors, we adopted an A* search based technique for the first synthesis phase. A variation of the force-directed optimization technique is used to assign a task to an allocated processor. The final scheduling of a hard real-time task is done by the task level scheduler, which is based on the Earliest Deadline First (EDF) scheduling policy. Our task level scheduler incorporates force-directed scheduling methodology to address the situations where EDF is not optimal. The experimental results on a variety of examples show that the approach is highly effective and efficient.
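The EDF policy named in the abstract above can be illustrated with a small simulation. This is only a sketch of plain preemptive EDF on a single processor with integer time slots; it does not reproduce the paper's task level scheduler or its force-directed extension, and the function name `edf_schedule` and the task tuple layout are assumptions made for illustration.

```python
import heapq


def edf_schedule(tasks):
    """Simulate preemptive EDF on one processor in unit time slots.

    tasks: list of (name, release, exec_time, deadline) with integer times
           and exec_time >= 1 (hypothetical layout, chosen for this sketch).
    Returns (timeline, missed): timeline is a list of (slot_start, name),
    missed lists the tasks that finished after their deadline.
    """
    pending = sorted(tasks, key=lambda t: t[1])  # order by release time
    ready = []  # min-heap of (deadline, name, remaining_time)
    timeline, missed = [], []
    t, i = 0, 0
    while i < len(pending) or ready:
        # Move every task released by time t into the ready heap.
        while i < len(pending) and pending[i][1] <= t:
            name, _, exec_time, deadline = pending[i]
            heapq.heappush(ready, (deadline, name, exec_time))
            i += 1
        if not ready:
            t = pending[i][1]  # processor idles until the next release
            continue
        # Run the ready task with the earliest deadline for one slot.
        deadline, name, remaining = heapq.heappop(ready)
        timeline.append((t, name))
        t += 1
        if remaining > 1:
            heapq.heappush(ready, (deadline, name, remaining - 1))
        elif t > deadline:
            missed.append(name)
    return timeline, missed


if __name__ == "__main__":
    demo = [("A", 0, 2, 5), ("B", 1, 1, 2), ("C", 0, 1, 10)]
    print(edf_schedule(demo))
```

In the demo, task B is released at time 1 with a tighter deadline than A, so it preempts A for one slot before A resumes.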

3.

A Non-Overlapped Modulo Scheduling Algorithm Based on an Optimized Backtracking Model

Mingxing Tan Xianhua Liu Jiyu Zhang Xu Cheng 《Acta Electronica Sinica》2012,40(8):1681-1686

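Software pipelining exploits instruction-level parallelism by restructuring loop bodies, and modulo scheduling is a widely used class of software pipelining algorithms. Traditional modulo scheduling algorithms typically produce overlapping variable live ranges and increased register pressure, which makes them unsuitable for embedded processors. Targeting the characteristics of embedded processors, this paper builds an optimized backtracking model and, based on it, proposes a non-overlapped modulo scheduling algorithm for embedded processors (NOn-Overlapped Iterative Modulo Scheduling, NOOI for short). NOOI uses loop-carried anti-dependences to eliminate overlapping variable live ranges, and uses a backtracking model with dependence and resource constraints to resolve node conflicts, thereby improving the effectiveness of modulo scheduling. Experimental results show that NOOI effectively improves the modulo scheduling success rate and the loop initiation interval, and improves program performance.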

4.

Multimedia Signal Processors: An Architectural Platform with Algorithmic Compilation

Yen-Kuang Chen S.Y. Kung 《The Journal of VLSI Signal Processing》1998,20(1-2):181-204

Novel algorithmic features of multimedia applications and advances in VLSI technologies are driving forces behind the new multimedia signal processors. We propose an architecture platform which could provide high performance and flexibility, and would require less external I/O and memory access. It is comprised of array processors to be used as the hardware accelerator and RISC cores to be used as the basis of the programmable processor. It is a hierarchical and scalable architecture style which facilitates the hardware-software codesign of multimedia signal processing circuits and systems. While some control-intensive functions can be implemented using programmable CPUs, other computation-intensive functions can rely on hardware accelerators. To compile multimedia algorithms, we also present an operation placement and scheduling scheme suitable for the proposed architectural platform. Our scheme addresses data reusability and exploits local communication in order to avoid the memory/communication bandwidth bottleneck, which leads to faster program execution. Our method shows promising performance: a linear speed-up of 16 times can be achieved for the block-matching motion estimation algorithm and the true motion tracking algorithm, which underlie many multimedia applications (e.g., MPEG-2 and MPEG-4).

5.

A Specification Refinement Methodology for Power Efficient Partitioning of Data-Dominated Algorithms Within Performance Constraints

K. Masselos K. Danckaert F. Catthoor N. Zervas C.E. Goutis H. De Man 《The Journal of VLSI Signal Processing》2000,26(3):291-317

6.

Synthesis of Native Mode Self-Test Programs

Jian Shen Jacob A. Abraham 《Journal of Electronic Testing》1998,13(2):137-148

Recent studies show that at-speed functional tests are better for finding realistic defects than tests executed at lower speeds. This advantage has led to growing interest in design for at-speed tests. In addition, time-to-market requirements dictate development of tests early in the design process. In this paper, we present a new methodology for synthesis of at-speed self-test programs for microprocessors. Based on information about the instruction set, this high-level test generation methodology can generate instruction sequences that exercise all the functional capabilities of complex processors. Modern processors have large memory modules, register files and powerful ALUs with comprehensive operations, which can be used to generate and control built-in tests and to evaluate the response of the tests. Our method exploits the functional units to compress and check the test response at chip internal speeds. No hardware test pattern generators or signature analyzers are needed, and the method reduces area overhead and performance impact as compared to current BIST techniques. A novel test instruction insertion technique is introduced to activate the control/status inputs and internal modules related to them. The new methodology has been applied to an example processor much more complex than any benchmark circuit used in academia today. The results show that our approach is very effective in achieving high fault coverage and automation in at-speed self-test generation for microprocessor-like circuits.

7.

Peak Temperature Minimization via Task Allocation and Splitting for Heterogeneous MPSoC Real-Time Systems

Junlong Zhou Jianming Yan Jing Chen Tongquan Wei 《Journal of Signal Processing Systems》2016,84(1):111-121

With the continued scaling of CMOS devices, the exponential increase in power density has strikingly elevated the temperature of on-chip systems. Thus, thermal-aware design has become a pressing research issue in computing systems, especially for real-time embedded systems with limited cooling techniques. In this paper, the authors formulate the thermal-aware real-time multiprocessor system-on-chip (MPSoC) task allocation and scheduling problem, present a task-to-processor assignment heuristic that improves the thermal profiles of tasks, and propose a task splitting policy that reduces the on-chip peak temperature. The thermal profiles of tasks are improved via task mapping by minimizing task steady state temperatures, and the task splitting technique is applied to reduce the peak temperature by enabling the alternation of hot task execution and slack time. The proposed algorithms explicitly exploit thermal characteristics of both tasks and processors to minimize the peak temperature without incurring significant overheads. Extensive simulations of benchmarking tasks were performed to validate the effectiveness of the proposed algorithms. Experimental results have shown that the task steady state temperature achieved by the proposed algorithm is 3.57 °C lower on average as compared to the benchmarking schemes, and the peak temperature of the proposed algorithm can be up to 11.5% lower than that of the benchmarking schemes.

8.

High-efficiency memory BISR with two serial RA stages using spare memories

Kang I. Jeong W. Kang S. 《Electronics letters》2008,44(8):515-517

As technology has become more advanced, the density of memory has increased greatly. This development has led to the need for a high-efficiency redundancy analysis (RA) algorithm to improve yield rate. Presented is a new methodology that can achieve high-efficiency repair against faults in memory. Experimental results show that the proposed built-in self-repair (BISR) method performs well.

9.

Resource-constrained loop list scheduler for DSP algorithms

Ching -Yi Wang Keshab K. Parhi 《The Journal of VLSI Signal Processing》1995,11(1-2):75-96

We present a new algorithm for resource-constrained scheduling for digital signal processing (DSP) applications when the number of processors is fixed and the objective is to obtain a schedule with the minimum iteration period. This type of scheduling is best suited for moderate speed applications where conservation of area and power is more important than speed. We define and make use of new graph-dependent constraints to obtain a lower bound estimate on the iteration period for any data-flow graph. By satisfying these constraints before performing the scheduling task, we can restrict the design space and can generate valid schedules in less time than previously reported. The graph-dependent constraints provide a more accurate lower bound estimate on the iteration period than previously published results. This new scheduling algorithm exploits the iterative nature of DSP algorithms and uses an iterative-loop based scheduling approach. This resource scheduling algorithm has been incorporated in the Minnesota ARchitecture Synthesis (MARS) system. Our approach exploits inter-iteration and intra-iteration precedence constraints and incorporates implicit retiming and pipelining to generate optimal and near optimal schedules. This research was supported by the Advanced Research Projects Agency under grant number F33615-93-C-1309 and the Office of Naval Research under contract number N00014-91-J-1008.
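For readers unfamiliar with lower bounds on the iteration period, the sketch below computes the classical iteration bound of a data-flow graph: the maximum, over all loops, of the loop's total computation time divided by its number of delays. This is the textbook bound, not the graph-dependent constraints introduced in the paper above, and it ignores the resource limit. The function name `iteration_bound` and the input layout are assumptions for illustration; times and delay counts are assumed to be integers and node names comparable strings.

```python
from fractions import Fraction


def iteration_bound(node_time, edges):
    """Classical iteration bound of a data-flow graph.

    node_time: {node: integer computation time}
    edges:     list of (src, dst, delays) with integer delay counts
    Returns the bound as a Fraction, or None if the graph has no loop.
    Enumerates simple cycles by DFS, so it is only meant for small graphs.
    """
    adj = {}
    for src, dst, delays in edges:
        adj.setdefault(src, []).append((dst, delays))

    best = None

    def dfs(start, node, time, delays, visited):
        nonlocal best
        for nxt, d in adj.get(node, []):
            if nxt == start:
                total = delays + d
                if total == 0:
                    raise ValueError("delay-free loop: graph is not computable")
                ratio = Fraction(time, total)
                if best is None or ratio > best:
                    best = ratio
            elif nxt not in visited and nxt > start:
                # Only visit nodes larger than start, so each cycle is
                # enumerated once, from its smallest node.
                dfs(start, nxt, time + node_time[nxt], delays + d, visited | {nxt})

    for start in sorted(node_time):
        dfs(start, start, node_time[start], 0, {start})
    return best


if __name__ == "__main__":
    times = {"a": 1, "b": 2, "c": 1}
    graph = [("a", "b", 0), ("b", "a", 1), ("b", "c", 0), ("c", "a", 2)]
    print(iteration_bound(times, graph))
```

In the demo graph, loop a-b has ratio 3/1 and loop a-b-c has ratio 4/2, so no schedule can achieve an iteration period below 3 time units regardless of how many processors are available.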

10.

Algorithm transformation methods to reduce the overhead of software-based fault tolerance techniques

José Rodrigo Azambuja Gustavo Brown Fernanda Lima Kastensmidt Luigi Carro 《Microelectronics Reliability》2014

This paper introduces a framework that tackles the costs in area and energy consumed by methodologies like spatial or temporal redundancy with a different approach: given an algorithm, we find a transformation in which part of the computation involved is transformed into memory accesses. The precomputed data stored in memory can then be protected by applying traditional and well established ECC algorithms to provide fault tolerant hardware designs. At the same time, the transformation increases the performance of the system by reducing its execution time, which is then used by customized software-based fault tolerant techniques to protect the system without any degradation when compared to its original form. Application of this technique to key algorithms in an MP3 player, combined with a fault injection campaign, shows that this approach increases fault tolerance up to 92%, without any performance degradation.

11.

Invasive weed optimization based scheduling for digital microfluidic biochip operations

《Integration, the VLSI Journal》2021

Digital Microfluidic Biochips (DMFBs) based on electro-wetting-on-dielectric (EWOD) technology are a class of lab-on-a-chip (LOC) devices. DMFBs can efficiently carry out biochemical analysis and have many advantages over the traditional laboratory system. DMFBs offer miniaturization, automation, and programmability. Resource-constrained scheduling is the first and vital step of fluidic-level synthesis of DMFBs, while the other two are placement and routing of droplets. Scheduling DMFB operations is a constrained optimization problem which is NP-Complete. We propose an invasive weed optimization (IWO) algorithm based scheduling for the synthesis of DMFBs. The IWO algorithm is a nature-inspired meta-heuristic algorithm. The proposed algorithm can be used for the offline synthesis of DMFBs, where solution quality is more important than execution time. Each weed in the proposed algorithm represents a potential candidate solution for the scheduling problem. To calculate the fitness of individual weeds, we propose an algorithm based on Heterogeneous Earliest Finish Time (HEFT), which incorporates resource binding, scheduling, and a greedy module selection mechanism for bio-assay operations. Weeds (solutions) update their positions (priorities) by the colonization behavior of weeds. Simulation results show that the proposed IWO outperforms the existing iterative improvement based algorithms and optimal ILP based algorithms for the offline synthesis of DMFBs.

12.

A Study of Array Processor Synthesis from Behavioral-Level DSP Algorithm Descriptions

Chao Xu Zengyan Zhang 《Acta Electronica Sinica》1993,21(5):1-9

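This paper presents a high-level system synthesis flow for recursive DSP algorithms, implemented as a systolic processor array architecture. Starting from a behavioral-level description of the DSP algorithm in FDDL, compilation and partitioning produce a Data Dependency Graph. Spatial mapping and temporal scheduling of this graph then yield the algorithm's Signal Flow Graph, which is retimed to generate a systolic array. Finally, datapath synthesis and controller synthesis are carried out for the processing elements, and the processing elements are placed. The paper also presents the algorithmic and optimization strategies for each design stage and gives synthesis results.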

13.

Simultaneous Dynamic Voltage Scaling of Processors and Communication Links in Real-Time Distributed Embedded Systems

Luo J. Jha N. K. Peh L.-S. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2007,15(4):427-437

Dynamic voltage scaling has been widely acknowledged as a powerful technique for trading off power consumption and delay for processors. Recently, variable-frequency (and variable-voltage) parallel and serial links have also been proposed, which can save link power consumption by exploiting variations in the bandwidth requirement. This provides a new dimension for power optimization in a distributed embedded system connected by a voltage-scalable interconnection network. At the same time, it imposes new challenges for variable-voltage scheduling as well as flow control. First, the variable-voltage scheduling algorithm should be able to trade off the power consumption and delay jointly for both processors and links. Second, for the variable-frequency network, the scheduling algorithm should not only consider the real-time constraints, but should also be consistent with the underlying flow control techniques. In this paper, we address joint dynamic voltage scaling for variable-voltage processors and communication links in such systems. We propose a scheduling algorithm for real-time applications that captures both data flow and control flow information. It performs efficient routing of communication events through multihops, as well as efficient slack allocation among heterogeneous processors and communication links to maximize energy savings, while meeting all real-time constraints. Our experimental study shows that, on average, joint voltage scaling on processors and links can achieve 32% less power compared with voltage scaling on processors alone.

14.

A Generalized Technique for Register Counting and its Application to Cost-Optimal DSP Architecture Synthesis

Kazuhito Ito Keshab K. Parhi 《The Journal of VLSI Signal Processing》1997,16(1):57-72

In this paper we propose a generalized technique to count the required number of registers in a schedule which supports overlapped scheduling and can be applied to the case where a general digit-serial data format is used. This technique is integrated into an integer linear programming (ILP) model for time-constrained scheduling. In the ILP model, appropriate processors of certain data formats are chosen from a library of processors and data format converters are automatically inserted between processors of different data formats if necessary. Then the required number of registers for each data format is evaluated correctly by the proposed technique. Hence an optimal architecture for a given digital signal processing algorithm is synthesized where the cost of registers as well as the cost of processors and data format converters are minimized. It is shown that including the cost of registers in the synthesis task, as proposed in this paper, leads to savings of up to 12.8% in the total cost of the synthesized architecture when compared with synthesis performed without including the register cost in the total cost.

15.

Stability-based algorithms for high-level synthesis of digital ASICs

Nourani M. Papachristou C. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2000,8(4):431-435

This paper presents new algorithms for the scheduling and allocation phases in high-level synthesis under time and resource constraints. This is achieved by formulating these problems in terms of Liapunov's stability theorem using a transformation technique between the design space and the dynamic system space. These algorithms are based on moves in the design space, which correspond to the moves toward the equilibrium point in the dynamic system space. The scheduling algorithm (MFS) takes care of mutually exclusive operations, loop folding, multicycle operations, chained operations, and pipelining (structural and functional). The mixed scheduling-allocation algorithm (MFSA) can handle all of the above scheduling applications as well as simultaneously performing allocation of functional units, registers, and interconnects while minimizing the overall cost.

16.

A List Scheduling Algorithm for Heterogeneous Computing Systems Using Task Prioritization with an Improved Predict Cost Matrix

Yu Yao Yukun Song Guowei Yang Ying Huang Duoli Zhang 《Journal of Electronics & Information Technology》2023,45(1):125-133

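Improving the efficiency with which heterogeneous computing systems execute applications depends heavily on effective scheduling algorithms. This paper proposes a new list scheduling algorithm called improved predict priority task and optimistic processor selection scheduling (IPPOSS). It reduces the schedule length by introducing the backward predicted cost of tasks in the task prioritization phase. Compared with existing work, this paper uses an improved predict cost matrix (IPCM) to prioritize tasks more reasonably, which yields better solutions in the processor selection phase while maintaining quadratic time complexity. IPCM accounts for the various computation and communication factors in the task prioritization phase, and produces a reasonable priority list more readily than the predict cost matrix (PCM) proposed by predict priority task scheduling (PPTS). Analysis of experimental results on directed acyclic graphs (DAGs) of randomly generated applications and of real-world applications shows that IPPOSS outperforms related algorithms.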

17.

A game theoretic approach for power optimization during behavioral synthesis

Murugavel A.K. Ranganathan N. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2003,11(6):1031-1043

In this paper, we describe a new methodology based on game theory for minimizing the average power of a circuit during scheduling and binding in behavioral synthesis. The problems are formulated as auction-based noncooperative finite games for which solutions are proposed based on the Nash equilibrium. In the scheduling algorithm, a first-price sealed-bid auction approach is used while, for the binding algorithm, each functional unit in the datapath is modeled as a player bidding for executing an operation with the estimated power consumption as the bid. Further, the techniques of functional unit sharing, path balancing, and register assignment are incorporated within the binding algorithm for power reduction. The combined scheduling and binding algorithm is formulated as a single noncooperative auction game with the functional units in the datapath modeled as players bidding for executing the operation in a particular control cycle. The proposed algorithms yield power reduction without any increase in area overhead and only a slight increase in the latency for some of the benchmark circuits. Experimental results indicate that the proposed game theoretic solution for binding yields an improvement of 13.9% over the linear programming (LP) method, while the scheduling and the combined scheduling and binding algorithms yield average improvements of 6.3% and 11.8%, respectively, over the integer-linear programming (ILP) approach.

18.

Next Generation Wireless Mobile System Efficient, Fair, Class Based Packet Scheduling Algorithm

Hussain Mohammed Pj Radcliffe 《Wireless Personal Communications》2013,72(4):1969-1991

Efficient utilization of network resources is a key goal for emerging broadband wireless access systems (BWAS). This is a complex goal to achieve due to the heterogeneous service nature and diverse quality of service (QoS) requirements of various applications that BWAS support. Packet scheduling is an important activity that affects BWAS QoS outcomes. This paper proposes a novel packet scheduling mechanism that improves QoS in mobile wireless networks which exploit IP as a transport technology for data transfer between BWAS base stations and mobile users at the radio transmission layer. In order to improve BWAS QoS, the new packet scheduling algorithm makes changes at both the IP and the radio layers. The new packet scheduling algorithm exploits handoff priority scheduling principles and takes into account buffer occupancy and channel conditions. The packet scheduling mechanism also incorporates the concept of fairness. Performance results were obtained by computer simulation and compared with well-known algorithms. Results show that by exploiting the new packet scheduling algorithm, the transport system is able to provide a low handoff packet drop rate, low packet forwarding rate, low packet delay and ensure fairness amongst the users of different services.

19.

An Efficient Memory Built-In Self-Repair Method Based on a Hash Table

Xufeng Guo Fang Yu Zhongli Liu 《Acta Electronica Sinica》2013,41(7):1371-1377

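Existing memory built-in self-repair (BISR) methods either compare addresses sequentially, which is inefficient, or compare them in parallel, which consumes high power; neither suits memories with a large number of faults. This paper therefore proposes an efficient memory BISR method that stores the addresses of single-cell faults, which make up the majority of faults, in a hash table, exploiting the hash table's fast lookup to speed up address comparison. A memory repaired with this method completes address comparison within one clock cycle, so the performance of the repaired memory is not affected at all. This matches the widely used CAM-based methods, while offering a clear advantage in power consumption. Computer simulations show that, for a 512×512×8-bit memory under the same redundancy overhead, the repair rate of the proposed method is on average 32.25% higher than that of the ESP method.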
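The idea of keeping faulty cell addresses in a hash table so that each access needs only one bucket lookup can be modelled in software as follows. This is a behavioral sketch only, not the hardware design from the paper: the class name `FaultAddressTable`, the modulo hash, and the fixed number of entries per bucket (`ways`) are all assumptions chosen for illustration.

```python
class FaultAddressTable:
    """Software model of a hash-indexed table of faulty cell addresses.

    Each bucket holds at most `ways` entries, so a lookup inspects a
    bounded number of entries (assumed here to stand in for the
    single-cycle comparison of a hardware table).
    """

    def __init__(self, num_buckets, ways=2):
        self.num_buckets = num_buckets
        self.ways = ways
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, addr):
        # Hypothetical hash: low-order address bits via modulo.
        return addr % self.num_buckets

    def add_fault(self, addr, spare):
        """Record that faulty address `addr` is repaired by spare cell `spare`.

        Returns False if the bucket is full, i.e. this fault cannot be
        repaired through the table in this simplified model.
        """
        bucket = self.buckets[self._index(addr)]
        if any(a == addr for a, _ in bucket):
            return True  # already recorded
        if len(bucket) >= self.ways:
            return False
        bucket.append((addr, spare))
        return True

    def remap(self, addr):
        """Return ("spare", index) for a repaired address, else ("main", addr)."""
        for a, spare in self.buckets[self._index(addr)]:
            if a == addr:
                return ("spare", spare)
        return ("main", addr)


if __name__ == "__main__":
    table = FaultAddressTable(num_buckets=4, ways=1)
    print(table.add_fault(5, spare=0))  # stored in bucket 1
    print(table.add_fault(9, spare=1))  # also maps to bucket 1, which is full
    print(table.remap(5), table.remap(6))
```

A full bucket is the price of a bounded lookup: in this simplified model, a fault whose bucket is already occupied cannot be recorded and would have to be handled by some other repair resource.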

20.

Simulation and Analysis of Packet Scheduling Algorithms for OFDM Systems

Ping Liu Cheng Zhang Jin Che 《Communications Technology》2012,45(3):10-12

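For cellular orthogonal frequency-division multiplexing (OFDM) systems, packet scheduling is an important factor in improving system performance. A packet scheduling algorithm provides resource allocation and multiplexing for packet services. This paper introduces the principles of packet scheduling and several classic scheduling algorithms, and compares their fairness and throughput through Matlab simulation. The simulations show that the proportional fair (PF) algorithm balances system throughput against fairness and achieves good results.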
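The proportional fair rule mentioned in the abstract above is commonly stated as: in each slot, serve the user with the largest ratio of instantaneous achievable rate to exponentially averaged past throughput. The sketch below implements that textbook rule in Python rather than Matlab; it is not the simulation from the paper, and the function name `proportional_fair`, the averaging window `tc`, and the small initial average are assumptions for illustration.

```python
def proportional_fair(rates, tc=10.0, init_avg=1e-6):
    """Textbook proportional fair scheduling over a sequence of slots.

    rates:    rates[t][u] is the achievable rate of user u in slot t
    tc:       averaging window of the throughput filter (assumed value)
    init_avg: small positive starting average, to avoid division by zero
    Returns (served, avg): the user chosen in each slot and the final
    averaged throughput of each user.
    """
    num_users = len(rates[0])
    avg = [init_avg] * num_users
    served = []
    for slot in rates:
        # Pick the user with the best rate relative to its own history.
        chosen = max(range(num_users), key=lambda u: slot[u] / avg[u])
        served.append(chosen)
        # Exponentially weighted update; unserved users receive rate 0.
        for u in range(num_users):
            got = slot[u] if u == chosen else 0.0
            avg[u] = (1.0 - 1.0 / tc) * avg[u] + got / tc
    return served, avg


if __name__ == "__main__":
    # User 0 always has the better channel; a max-rate scheduler would
    # serve only user 0, whereas PF alternates between the two users.
    demo_rates = [[2.0, 1.0]] * 4
    print(proportional_fair(demo_rates, tc=2.0)[0])
```

The demo illustrates the throughput and fairness trade-off: the weaker user is still served once its averaged throughput falls far enough behind.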