Similar literature
1.
Transformation techniques are usually applied to get optimal execution rates in parallel and/or pipeline systems. The retiming technique is a common and valuable tool in optimizing 1-D signal processing applications, represented by flow graphs. Such a transformation can maximize the parallelism of a loop body. Few results on retiming have been obtained for multidimensional (MD) systems. The article develops a novel framework, which consists of an MD retiming technique that considers the final schedule as part of the optimization process. To the author's knowledge, this is the first retiming algorithm on general MD flow graphs.

2.
In this paper we present an optimal and a heuristic approach to solve the binding problem which occurs in high-level synthesis of digital systems. The optimal approach is based on an integer linear programming (ILP) formulation. Given that such an approach is not practical for large problems, we then derive a heuristic from the ILP formulation which produces very good solutions on the order of seconds. The heuristic is based on a network flow model and also considers floorplanning during the design process to minimize the interconnection area.

3.
Parallelization of Digital Signal Processing (DSP) software is an important trend in Multiprocessor System-on-Chip (MPSoC) implementation. The performance of DSP systems composed of parallelized computations depends on the scheduling technique, which must in general allocate computation and communication resources for competing tasks, and ensure that data dependencies are satisfied. In this paper, we formulate a new type of parallel task scheduling problem called Parallel Actor Scheduling (PAS) for MPSoC mapping of DSP systems that are represented as Synchronous Dataflow (SDF) graphs. In contrast to traditional SDF-based scheduling techniques, which focus on exploiting graph level (inter-actor) parallelism, the PAS problem targets the integrated exploitation of both intra- and inter-actor parallelism for platforms in which individual actors can be parallelized across multiple processing units. We first address a special case of the PAS problem in which all of the actors in the DSP application or subsystem being optimized are parallel actors (i.e., they can be parallelized to exploit multiple cores). For this special case, we develop and experimentally evaluate a two-phase scheduling framework with three work flows that involve particle swarm optimization (PSO) — PSO with a mixed integer programming formulation, PSO with simulated annealing, and PSO with a fast heuristic based on list scheduling. Then, we extend our scheduling framework to support the general PAS problem, which considers both parallel actors and sequential actors (actors that cannot be parallelized) in an integrated manner. We demonstrate that our PAS-targeted scheduling framework provides a useful range of trade-offs between synthesis time requirements and the quality of the derived solutions. 
We also demonstrate the performance of our scheduling framework from two aspects: simulations on a diverse set of randomly generated SDF graphs, and implementations of an image processing application and a software defined radio benchmark on a state-of-the-art multicore DSP platform.

4.
In integer linear programming (ILP), formulating a “good” model is of crucial importance to solving that model. In this paper, we begin with a mathematical analysis of the structure of the assignment, timing, and resource constraints in high-level synthesis, and then evaluate the structure of the scheduling polytope described by these constraints. We then show how the structure of the constraints can be exploited to develop a well-structured ILP formulation, which can serve as a solid theoretical foundation for future improvement. As a start in that direction, we also present two methods to further tighten the formulation. The contribution of this paper is twofold: 1) it provides the first in-depth formal analysis of the structure of the constraints, and it shows how to exploit that structure in a well-designed ILP formulation, and 2) it shows how to further improve a well-structured formulation by adding new valid inequalities.

5.
This paper presents novel techniques for computing the minimum number of memory locations in statically scheduled digital signal processing (DSP) programs. Two related problems are considered. In the first problem, we compute the minimum number of memory locations required for a scheduled program assuming that no circuit transformations (such as pipelining and retiming) are to be performed after scheduling. For this problem, we consider memory minimization for the operation-constrained, processor-constrained and unconstrained memory models which represent various restrictions on how data can be allocated to memory. Then we consider the second problem, where memory minimization for a scheduled program is considered simultaneously with retiming using a variation of the retiming problem referred to as the minimum physical storage location (MPSL) retiming. While both problems consider memory minimization for scheduled programs, the second problem minimizes memory using retiming whereas the first problem performs no retiming. The scheduling results obtained from the MARS design system are used to compare memory requirements in the context of both of these problems. Our experiments show that MARS performs an optimal retiming for the schedule it generates. These memory requirements are then compared with an integer linear programming solution to the scheduling problem which is optimal under the unconstrained memory model. It is concluded that the schedule obtained by the MARS system achieves optimality or near-optimality with respect to register minimization.

6.
The incessant market demand for ever higher processor performance called for a continuous increase of clock frequencies as well as an impressive evolution of the microarchitecture. In this paper, we focus on the latter, highlighting major microarchitectural improvements that were introduced to more effectively utilize instruction level parallelism (ILP) in commercial performance-oriented microprocessors. We show that designers increased the throughput of the microarchitecture at the ILP level essentially by successively introducing temporal, issue, and intra-instruction parallelism, in such a way that exploiting parallelism along one dimension compelled them to introduce parallelism along a new dimension as well to further increase performance. In addition, each basic technique used to implement parallel operation along a certain dimension inevitably caused processing bottlenecks in the microarchitecture, whose elimination led to the introduction of innovative auxiliary techniques. On the other hand, the auxiliary techniques applied allow the basic technique of parallel operation to reach its limits, prompting the debut of a new dimension of parallel operation in the microarchitecture. The sequence of basic and auxiliary techniques devised to increase the efficiency of microarchitectures constitutes a fascinating framework for the evolution of microarchitectures, as presented in our paper.

7.
Reducing multicast traffic load for cellular networks using ad hoc networks (total citations: 3; self-citations: 0; citations by others: 3)
There has been extensive recent research on integrating cellular networks and ad hoc networks to overcome the limitations of cellular networks. Although several schemes have been proposed to use such hybrid networks to improve the performance of individual multicast groups, they do not address quality of service (QoS) issues when multiple groups are present. This paper, on the other hand, considers an interesting scenario of hybrid networks when an ad hoc network cannot accommodate all the groups and a base station has to select a subset of groups to optimize its bandwidth savings and maximize the utilization of the ad hoc network while providing QoS support for multicast users. In this paper, a network model for multicast admission control that takes wireless interference into account is developed, the group selection problem is formulated as a multidimensional knapsack problem, and an integer linear programming (ILP) formulation and a polynomial-time dynamic algorithm are proposed. A distributed implementation of the dynamic algorithm in real systems is also examined. Simulation studies demonstrate that the dynamic algorithm is able to achieve very competitive performance under various conditions, in comparison with the optimal solution computed by the ILP approach.

8.
On the routing and wavelength assignment in multifiber WDM networks (total citations: 1; self-citations: 0; citations by others: 1)
This paper addresses the problem of routing and wavelength assignment (RWA) in multifiber WDM networks with limited resources. Given a traffic matrix, the number of fibers per link, and the number of wavelengths a fiber can support, we seek to maximize the carried traffic of connections. We formulate the problem as an integer linear program (ILP), and show that the lightpaths selected by this formulation can indeed be established by properly configuring the optical switches. An upper bound on the carried traffic can be computed by solving the linear programming (LP)-relaxation of the ILP formulation. It is shown that this bound can be also computed exactly, and in polynomial-time, by solving a significantly simplified LP which considers only one wavelength. The bound can, thus, easily scale to an arbitrarily large number of wavelengths. Furthermore, we demonstrate that any instance of the RWA problem is also an instance of the more general maximum coverage problem. This allows us to take a greedy algorithm for maximum coverage and obtain an algorithm which provides solutions for the RWA problem that are guaranteed to be within a factor of (1-(1/e)) of the optimal solution. Each iteration of the greedy algorithm selects a set of lightpaths that realizes, using one wavelength, the maximum number of connection requests not previously realized. Computational results confirm the high efficiency of our proposed algorithm.
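The greedy rule in the abstract above can be illustrated with a minimal sketch of the generic maximum coverage heuristic. This is not the paper's RWA algorithm: there, each iteration must itself find the best set of lightpaths for one wavelength, whereas this sketch assumes the candidate sets are given up front. The names `greedy_max_coverage`, `candidates`, and `k` are hypothetical, with `k` standing in for the number of wavelengths.

```python
from typing import Dict, Hashable, List, Set, Tuple


def greedy_max_coverage(
    candidates: Dict[Hashable, Set[Hashable]], k: int
) -> Tuple[List[Hashable], Set[Hashable]]:
    """Choose at most k candidate sets, always taking the set that
    covers the most elements not covered so far (generic greedy rule)."""
    covered: Set[Hashable] = set()
    chosen: List[Hashable] = []
    remaining = dict(candidates)  # copy so the caller's dict is untouched

    for _ in range(k):
        if not remaining:
            break
        # Candidate with the largest number of newly covered elements.
        best = max(remaining, key=lambda name: len(remaining[name] - covered))
        if not remaining[best] - covered:
            break  # no candidate adds anything new
        chosen.append(best)
        covered |= remaining.pop(best)

    return chosen, covered


if __name__ == "__main__":
    # Hypothetical instance: each set lists the connection requests that
    # one wavelength configuration could realize.
    demo = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5, 6, 7}, "D": {1}}
    print(greedy_max_coverage(demo, 2))
```

For the standard maximum coverage problem, this rule is known to cover at least a (1-(1/e)) fraction of what the best choice of k sets covers, which is the guarantee the abstract cites.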

9.
The problem of minimizing dynamic power consumption by scaling down the supply voltage of computational elements off critical paths is widely addressed in the literature for the case of combinational designs. The problem is NP-hard in general. To address the problem in the case of synchronous sequential digital designs, one needs to move some registers while applying voltage scaling. Moving these registers shifts some computational elements from critical paths, and can be done by basic retiming. Integrating basic retiming and supply voltage scaling to address this NP-hard problem cannot in general be done in polynomial run time. In this paper, we propose to first apply a guided retiming and then to apply supply voltage scaling on the retimed design. We devise new polynomial time algorithms to realize this guided retiming, and the supply voltage scaling on the retimed design. Also, we show that the problem in the case of combinational designs is not NP-hard for some combinational circuits with certain structure, and give a polynomial time algorithm to solve it optimally. Methods to determine lower bounds on the optimal reduction of dynamic power consumption are also provided. Experimental results on known benchmarks have shown that the proposed approach can reduce dynamic power consumption by as much as 61% for single-phase designs with minimal clock period. Also, they have shown that it can solve the problem optimally, and produce converter-free designs with reduced dynamic power consumption. For large circuits from the ISCAS'89 benchmark suite, the proposed algorithms run in 15 s to 1 h.

10.
11.
Static routing and wavelength assignment (RWA) is usually formulated as an optimization problem with the objective of minimizing wavelength usage (MWU). Existing solution methodologies for the MWU problem are usually based on a two-step approach, where routing and wavelength assignment are done independently. Though this approach can reduce computational cost, the optimality of the solution is compromised. We propose a novel tabu search (TS) algorithm, which considers routing and wavelength assignment jointly without increasing the computational complexity. The performance of the proposed TS algorithm is compared with the integer linear programming (ILP) method, which is known to solve the MWU to optimality. The results for both small and large networks show that our proposed TS algorithm works almost as well as the ILP solution and is much more computationally efficient.

12.
This paper addresses the problem of maximizing the lifetime of directional wireless sensor networks, i.e., networks where sensors can monitor targets in an angular sector only and not all the targets around them. These sectors usually do not overlap, and each sensor can monitor at most one sector at a time. An exact method is proposed using a column generation scheme where a two-level strategy, consisting of a genetic algorithm and an integer linear programming (ILP) approach, is used to solve the auxiliary problem. The role of the ILP approach is limited to either escaping from local optima or proving the optimality of the current solution. Computational results clearly show the advantage of the proposed approach over a column generation approach that solves the auxiliary problem through the ILP approach alone, as the proposed approach is several times faster.

13.
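To make full use of the computing resources provided by multiprocessor platforms, an application must be mapped onto the different processors in an appropriate way, so as to exploit as much of the application's concurrency as possible and meet its strict real-time requirements. This paper proposes a concurrency graph to quantify and model the concurrency among an application's tasks, together with an algorithm for constructing the concurrency graph based on self-timed scheduling. The task mapping problem is transformed into a graph partitioning problem, and the concurrency graph partitioning problem is then modeled as a pure 0-1 integer linear programming model and solved to optimality with an ILP solver. The proposed method is evaluated on a large number of randomly generated synchronous dataflow graphs and on a set of real applications, and the experimental results show that it outperforms existing algorithms.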

14.
While in the last decade image and video processing (IVP) have gradually moved from special purpose computer architectures based on massive parallelism (MP) to general purpose computer architectures based on instruction-level parallelism (ILP), a new challenge is now to be faced by the IVP community, namely the application of IVP also in small-size embedded systems (e.g., video players, smart cameras, digital diaries, etc.) based on ILP processors. Because of the requirements of low size, weight, and power consumption, these embedded systems do not take advantage of processors that feature advanced dynamic code optimization mechanisms such as those based on instruction reordering and register renaming. On the other hand, the compile time techniques of present generation compilers do not appear to be aggressive enough to exploit the massive parallelism of IVP tasks in ILP architectures, thus leading to inefficient programs. This paper analyzes the efficiency of IVP programs on ILP CPUs. In particular it presents: (1) a reference model for the efficient design and implementation of highly parallel programs, such as the ones of the IVP domain; (2) an analysis of the inefficiencies of IVP programs implemented on ILP processors; and (3) a set of techniques, deriving from the reference model, that overcome these inefficiencies. These techniques are based on a novel computing paradigm called bucket processing.

15.
Multicast applications such as IPTV, video conferencing, telemedicine and online multiplayer gaming are expected to be major drivers of Internet traffic growth. The disparity between the bandwidth offered by a wavelength and the bandwidth requirement of a multicast connection can be tackled by grooming multiple low bandwidth multicast connections into a high bandwidth wavelength channel or light-tree. Light-trees are known to be especially suited for networks that carry ample multicast traffic. In this paper, we propose new algorithms to address the problem of multicast traffic grooming. In particular, an Integer Linear Programming (ILP) formulation is proposed for optimal assignments of hop constrained light-trees for multicast connections so that network throughput can be maximized. Hop constrained light-trees improve the scalability of the approach by reducing the search space of the ILP formulation. Since solving the ILP problem is very time consuming for realistically large networks, we are motivated to propose a heuristic algorithm with a polynomial complexity, called Dividable Light-Tree Grooming (DLTG) algorithm. This algorithm is based on grooming traffic to constrained light-trees and also divides a light-tree to smaller constrained light-trees on which traffic is groomed for better resource utilization. Simulations show that the proposed DLTG heuristic performs better than other algorithms. It achieves network throughputs which are very close to the ILP formulation results, but with far lower running times.

16.
We discuss the problem of designing translucent optical networks composed of restorable, transparent subnetworks interconnected via transponders. We develop an integer linear programming (ILP) formulation for partitioning an optical network topology into subnetworks, where the subnetworks are determined subject to the constraints that each subnetwork satisfies size limitations, and it is two-connected. A greedy heuristic partitioning algorithm is proposed for planar network topologies. We use section restoration for translucent networks where failed connections are rerouted within the subnetwork which contains the failed link. The network design problem of determining working and restoration capacities with section restoration is formulated as an ILP problem. Numerical results show that fiber costs with section restoration are close to those with path restoration for mesh topologies used in this study. It is also shown that the number of transponders with the translucent network architecture is substantially reduced compared to opaque networks.

17.
Optical wavelength division multiplexing (WDM) rings are being deployed to support SONET/SDH self-healing rings. In such systems, multiple SONET/SDH self-healing rings are realized over a single physical optical ring through wavelength division multiplexing. The cost of such a system is dominated by the SONET add/drop multiplexers (ADMs). To minimize the system cost, algorithms must be developed to assign wavelengths to lightpaths in the system so that the number of ADMs required is minimized. This problem of optimal wavelength assignment to minimize the number of SONET ADMs is known to be NP-hard. Existing heuristic algorithms for this problem include the assign first heuristic, the iterative matching heuristic and the iterative merging heuristic. In this paper, we develop an integer linear programming (ILP) formulation for this problem, propose a new wavelength assignment heuristic, and evaluate the existing and the newly proposed heuristics using the ILP formulation. We conclude that the performance of the newly proposed heuristic is very close to optimal.

18.
We develop new fast algorithms for 2-D integer circular convolutions and 2-D number theoretic transforms (NTT). These new algorithms, which offer improved computational complexity, are constructed based on polynomial transforms over Zp; these transforms are Fourier-like transforms over Zp, which is the integral domain of polynomial forms over Zp[x]. Having defined such polynomial transforms over Zp, we prove several necessary and sufficient conditions for their existence. We then apply the existence conditions to recognize two applicable polynomial transforms over Zp. One is for p equal to Mersenne numbers and the other for Fermat numbers. Based on these two transforms, referred to as Mersenne number polynomial transforms (MNPT) and Fermat number polynomial transforms (FNPT), we develop fast algorithms for 2-D integer circular convolutions, 2-D Mersenne number transforms, and 2-D Fermat number transforms. As compared to the conventional row-column computation of 2-D NTT for 2-D integer circular convolutions and 2-D NTT, the new algorithms give rise to reduced computational complexities by saving more than 25% or 42% in numbers of operations for multiplying 2^i, i ≥ 1; these percentages of savings also grow with the size of the 2-D integer circular convolutions or the 2-D NTT.
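For reference, the operation that these fast algorithms accelerate can be written directly from its definition. The sketch below is the plain direct-sum 2-D integer circular convolution modulo p, with a cost that grows as the fourth power of the side length for square inputs. It is not the MNPT or FNPT algorithm from the paper. The function name `circular_convolve_2d` and the choice of modulus 257 (the Fermat number 2^8 + 1) are assumptions made for illustration.

```python
from typing import List

Matrix = List[List[int]]


def circular_convolve_2d(x: Matrix, h: Matrix, modulus: int) -> Matrix:
    """Direct 2-D circular convolution of two equally sized integer
    arrays, with every output entry reduced modulo `modulus`."""
    n_rows, n_cols = len(x), len(x[0])
    y = [[0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            acc = 0
            for i in range(n_rows):
                for j in range(n_cols):
                    # Indices wrap around in both dimensions.
                    acc += x[i][j] * h[(r - i) % n_rows][(c - j) % n_cols]
            y[r][c] = acc % modulus
    return y


if __name__ == "__main__":
    x = [[1, 2], [3, 4]]
    shift = [[0, 1], [0, 0]]  # circularly shifts each row by one column
    print(circular_convolve_2d(x, shift, 257))
```

A direct implementation like this is mainly useful as a correctness check for a faster transform-based routine on small inputs.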

19.
The increasing trend in the number of cores on a single chip has led to scalability and bandwidth issues in bus-based communication. Network-on-chip (NoC) techniques have emerged as a solution that provides much-needed flexibility and scalability in the era of multi-cores. This article presents an optimal integer linear programming (ILP) formulation and a simulated annealing (SA) solution to thermal and power-aware test scheduling of cores in an NoC-based SoC using multiple clock rates. The methods have been implemented and results on various benchmarks are presented.

20.
The primary goal of this paper is to show that a clever use of redundant number systems in some parts of designs can significantly increase their speed, without noticeably increasing their area and power consumption. This can be achieved by automatically using, in the same design, redundant (e.g., carry save or borrow save) as well as non-redundant (i.e., conventional) number systems: this approach can be called mixed arithmetic. This implies specific constraints in the scheduling process. We propose an integer linear programming (ILP) formulation. It finds an optimal solution for examples of reasonable sizes. In some cases, the ILP computational delay may become huge. To solve this problem, we introduce a general solution, based on a constraint graph partitioning. This leads to an ILP formulation partitioning. This partitioning approach can be used for other similar problems in synthesis, also formulated as ILPs.
