Similar Documents
 20 similar documents found.
1.
Taking a multi-module (partitioned) datapath as the target architecture for high-level synthesis, this work gives a complete definition of a high-level synthesis algorithm that performs operation scheduling and data-flow-graph partitioning simultaneously, and proposes an effective heuristic solution method. Compared with the conventional architecture, the partitioned structure removes global interconnect delay from the critical path and can therefore effectively shorten the clock period and improve circuit performance. Experimental results verify the effectiveness of the method.

2.
王磊  魏少军 《半导体学报》2004,25(4):383-387
Taking a multi-module (partitioned) datapath as the target architecture for high-level synthesis, this work gives a complete definition of a high-level synthesis algorithm that performs operation scheduling and data-flow-graph partitioning simultaneously, and proposes an effective heuristic solution method. Compared with the conventional architecture, the partitioned structure removes global interconnect delay from the critical path and can therefore effectively shorten the clock period and improve circuit performance. Experimental results verify the effectiveness of the method.
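A toy sketch of the joint scheduling-and-partitioning idea (not the authors' algorithm or target datapath model): a small data-flow graph is list-scheduled under a per-step resource limit while each operation is greedily assigned to the module that already holds most of its predecessors, so that fewer edges become global wires. The DFG, module count, and resource limit are invented.

```python
# Hypothetical joint list scheduling and partition assignment for a small DFG:
# operations are scheduled ASAP under a per-step resource limit, and each op is
# placed in the module (partition) already holding most of its predecessors,
# so fewer edges cross module boundaries (global wires).
DFG = {"a": [], "b": [], "c": ["a", "b"], "d": ["c"], "e": ["b"], "f": ["d", "e"]}
NUM_MODULES, OPS_PER_STEP = 2, 2

def schedule_and_partition(dfg):
    step_of, module_of, load = {}, {}, [0] * NUM_MODULES
    remaining = dict(dfg)
    step = 0
    while remaining:
        ready = [op for op, preds in remaining.items()
                 if all(p in step_of and step_of[p] < step for p in preds)]
        for op in ready[:OPS_PER_STEP]:
            step_of[op] = step
            # prefer the module holding the most predecessors, break ties by load
            score = lambda m: (-sum(module_of.get(p) == m for p in dfg[op]), load[m])
            m = min(range(NUM_MODULES), key=score)
            module_of[op], load[m] = m, load[m] + 1
            del remaining[op]
        step += 1
    return step_of, module_of

print(schedule_and_partition(DFG))
```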

3.
Temperature affects not only the performance but also the power, reliability, and cost of an embedded system. This paper proposes a temperature-aware task allocation and scheduling algorithm for MPSoC embedded systems. Thermal-aware heuristics are developed, and a temperature-aware floorplanning tool is used to reduce the peak temperature and achieve a thermally even distribution while meeting real-time constraints. The paper investigates both power-aware and thermal-aware approaches to task allocation and scheduling. The experimental results show that the thermal-aware approach outperforms the power-aware schemes in terms of maximal and average temperature reductions. To the best of our knowledge, this is the first MPSoC task allocation and scheduling algorithm that takes temperature into consideration.
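As a toy illustration of thermal-aware allocation in general (not the algorithm or thermal model used in the paper), the sketch below greedily places each task on the currently coolest core under a crude linear heat-up/cool-down model; core count, thermal constants, and task parameters are assumed.

```python
# Hypothetical greedy thermal-aware allocator: always place the next ready task
# on the coolest core. The thermal model (linear heat-up per unit work, fixed
# cooling per idle time unit) is a deliberate simplification for illustration.
from dataclasses import dataclass

@dataclass
class Core:
    cid: int
    temp: float = 45.0          # current temperature, deg C (assumed starting value)
    busy_until: float = 0.0     # time at which the core becomes free

def allocate(tasks, num_cores=4, heat_per_unit=0.8, cool_per_unit_time=0.3):
    """tasks: list of (task_id, execution_time); returns list of (task_id, core_id, start)."""
    cores = [Core(i) for i in range(num_cores)]
    schedule = []
    clock = 0.0
    for tid, exec_time in tasks:
        # let idle cores cool down up to the current clock
        for c in cores:
            idle = max(0.0, clock - c.busy_until)
            c.temp = max(25.0, c.temp - cool_per_unit_time * idle)
        # thermal-aware choice: coolest core, ties broken by earliest availability
        c = min(cores, key=lambda c: (c.temp, c.busy_until))
        start = max(clock, c.busy_until)
        c.busy_until = start + exec_time
        c.temp += heat_per_unit * exec_time
        schedule.append((tid, c.cid, start))
        clock = start            # the next task is considered from this point on
    return schedule

print(allocate([("t1", 5), ("t2", 3), ("t3", 8), ("t4", 2)]))
```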

4.
High temperature has become a major problem for system-on-chip testing. In order to reduce the test application time while keeping the temperatures of the cores under test within safe ranges, a thermal-aware test scheduling technique is required. This paper presents an approach that minimizes the test application time and, at the same time, prevents the temperatures of cores under test from exceeding given limits. We employ test set partitioning to divide test sets into shorter test sequences, and add cooling periods between test sequences so that overheating is avoided. Moreover, test sequences from different test sets are interleaved so that the cooling periods and the bandwidth of the test bus can be used for test data transportation, which reduces the test application time. The test scheduling problem is formulated as a combinatorial optimization problem, and constraint logic programming (CLP) is used to build the optimization model and find the optimal solution. As the optimization time of the CLP-based approach grows exponentially with problem size, we also propose a heuristic that generates longer test schedules but requires substantially shorter optimization time. Experimental results show the efficiency of the proposed approach.
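The interleaving idea can be pictured with the rough sketch below (this is neither the paper's CLP model nor its heuristic): partitioned test sequences from several cores share one test bus, and a core that would exceed its temperature limit is skipped until it has cooled. All thermal numbers are invented.

```python
# Hypothetical interleaved thermal-aware test schedule on a single test bus.
# Each core's test set is pre-partitioned into short sequences; a sequence is
# dispatched only if the core stays below the temperature limit, otherwise the
# core cools while other cores use the bus.
def interleave_tests(test_sets, t_limit=85.0, heat=6.0, cool=2.0, ambient=40.0):
    """test_sets: {core: number_of_sequences}; returns (schedule, total_bus_slots)."""
    temp = {core: ambient for core in test_sets}
    remaining = dict(test_sets)
    schedule, slot = [], 0
    while any(remaining.values()):
        # pick the coolest core that still has sequences and fits under the limit
        ready = [c for c, n in remaining.items()
                 if n > 0 and temp[c] + heat <= t_limit]
        if ready:
            core = min(ready, key=lambda c: temp[c])
            schedule.append((slot, core))
            remaining[core] -= 1
            temp[core] += heat                   # applying one sequence heats the core
        else:
            schedule.append((slot, "cooling"))   # bus idles: every core is too hot
        # all cores not being tested in this slot cool down a little
        for c in temp:
            if not ready or c != core:
                temp[c] = max(ambient, temp[c] - cool)
        slot += 1
    return schedule, slot

sched, slots = interleave_tests({"core0": 4, "core1": 3, "core2": 5})
print(slots, sched[:6])
```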

5.
Non-volatile memories (NVMs) show great potential to replace DRAM as the main memory in many embedded systems because of their attractive characteristics such as low cost, high density, and low energy consumption. However, the problem of asymmetric read and write costs has to be addressed before the advantages of NVM can be fully exploited: a write operation is much more expensive than a read operation on NVMs. Existing loop optimization techniques cannot be used effectively with non-volatile main memory because they do not consider this feature. In this paper, we propose an efficient loop scheduling algorithm, Rotation with Maximum Bipartite Matching (RMBM), to address the problem of expensive write operations on non-volatile main memory for chip multiprocessors (CMPs). It achieves high parallelism for a loop and, at the same time, reduces the number of write operations on NVM. The experimental results show that the RMBM algorithm reduces the number of write activities on NVM by 34.5% on average compared with the traditional rotation scheduling algorithm. Execution time is reduced by 20.5% and energy consumption by 15.03% on average. In other words, the average lifetime of NVM can be extended more than twofold using the proposed technique.
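The maximum-bipartite-matching building block named in the title can be illustrated with a classical augmenting-path matcher (this is not the RMBM algorithm itself); here it pairs hypothetical write-hot loop variables with a small number of DRAM buffer slots, so that unmatched variables would remain in NVM. The adjacency data are illustrative.

```python
# Generic augmenting-path maximum bipartite matching, used here to pair
# write-hot loop variables (left side) with scarce DRAM buffer slots (right
# side); unmatched variables stay in NVM. This is the classical matching
# routine only, not the paper's RMBM loop scheduling algorithm.
def max_bipartite_matching(adj, num_right):
    """adj[u] = list of right vertices that left vertex u may be matched to."""
    match_right = [-1] * num_right              # right vertex -> left vertex (or -1)

    def try_assign(u, seen):
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                if match_right[v] == -1 or try_assign(match_right[v], seen):
                    match_right[v] = u
                    return True
        return False

    matched = 0
    for u in range(len(adj)):
        if try_assign(u, [False] * num_right):
            matched += 1
    return matched, match_right

# 3 hypothetical write-hot variables, 2 DRAM slots; adjacency encodes which
# slots each variable could occupy given size/alignment constraints.
count, assignment = max_bipartite_matching(adj=[[0, 1], [0], [1]], num_right=2)
print(count, assignment)   # 2, [1, 0]: slot 0 holds variable 1, slot 1 holds variable 0
```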

6.
韩曙  刘明业 《电子学报》2000,28(11):12-15
This paper proposes an array partitioning algorithm suited to fast address mapping. The algorithm is general and simple. The address-mapping circuits it constructs have low hardware cost, fast operation, and high address-space utilization, making the approach applicable not only to high-level synthesis of memories but also to manual design of memory systems.
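As a toy example of the kind of mapping such a partitioning yields (not the paper's algorithm), the sketch below spreads a 2-D array cyclically over a power-of-two number of banks, so that bank select and in-bank offset reduce to a mask and a shift, which is what keeps the mapping hardware cheap; the array shape and bank count are assumed.

```python
# Hypothetical cyclic partitioning of a 2-D array over 2**k banks. Because the
# bank count is a power of two, bank selection is a mask and the in-bank offset
# is a shift, i.e. the address-mapping "circuit" is just cheap bit operations.
NUM_BANKS_LOG2 = 2                    # 4 banks (assumed)
NUM_BANKS = 1 << NUM_BANKS_LOG2
COLS = 16                             # row-major array width (assumed)

def map_address(i, j):
    linear = i * COLS + j             # linear row-major address
    bank = linear & (NUM_BANKS - 1)   # low bits select the bank
    offset = linear >> NUM_BANKS_LOG2 # remaining bits are the in-bank offset
    return bank, offset

# Elements of one row spread cyclically over the banks, enabling parallel access.
print([map_address(0, j) for j in range(6)])
# [(0, 0), (1, 0), (2, 0), (3, 0), (0, 1), (1, 1)]
```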

7.
Bergida  S. Shavitt  Y. 《IEEE network》2007,21(4):46-50
Two-rate SLAs are becoming increasingly popular in today's Internet, allowing a customer to save money by paying one price for committed traffic and a much lower price for additional traffic that is not guaranteed. These types of SLAs are suggested for all types of traffic, from best effort to QoS-constrained applications. In access networks, where these SLAs are prevalent, shared-memory switches are a common architectural feature. Thus, dimensioning and managing shared memory queues for multiple priorities, each with two levels of guarantees, becomes an interesting challenge. We present a simple analysis of a multi-priority, multi-discard-level system controlled by a buffer occupancy threshold policy aimed at assuring SLA compliance for conforming (i.e., committed) traffic and performance maximization for nonconforming (i.e., excess) traffic. Our analysis shows how to calculate the different system parameters: total buffer size, threshold position, and offered load, and how these control performance for the committed and excess traffic. Our suggested design enables high SLA compliance for conforming traffic and performance maximization for nonconforming traffic.
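A minimal sketch of a buffer occupancy threshold policy of the kind analyzed above, with arbitrary parameters: committed packets are admitted whenever the shared buffer has room, while excess packets are admitted only while occupancy stays below the threshold.

```python
# Hypothetical threshold admission control for a shared buffer holding two
# drop levels: committed (in-profile) and excess (out-of-profile) packets.
class ThresholdBuffer:
    def __init__(self, total_size=100, excess_threshold=60):
        self.total_size = total_size          # total shared buffer, in packets
        self.excess_threshold = excess_threshold
        self.occupancy = 0

    def admit(self, committed: bool) -> bool:
        """Return True if the arriving packet is queued, False if discarded."""
        if committed:
            limit = self.total_size           # committed traffic may use the whole buffer
        else:
            limit = self.excess_threshold     # excess traffic stops at the threshold
        if self.occupancy < limit:
            self.occupancy += 1
            return True
        return False

    def depart(self):
        self.occupancy = max(0, self.occupancy - 1)

buf = ThresholdBuffer()
arrivals = [True, False, True, False]         # committed / excess mix
print([buf.admit(c) for c in arrivals])
```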

8.
Available energy has become a critical design issue for increasingly complex real-time embedded systems. Phase Change Memory (PCM), with high density and low idle power, has recently been extensively studied as a promising alternative to DRAM. A hybrid PCM-DRAM main memory architecture has been proposed to leverage the low power of PCM and the high speed of DRAM. In this paper, we propose energy-aware real-time task scheduling strategies for hybrid PCM-DRAM based embedded systems. Given that execution time varies depending on whether a task is loaded into PCM or DRAM, we re-design static table-driven scheduling for a set of fixed tasks, as well as the Rate-Monotonic (RM) and Earliest Deadline First (EDF) scheduling policies for periodic task sets. Furthermore, since the actual execution time can be much shorter than the worst-case execution time, we propose online schedulers that migrate tasks between PCM and DRAM to optimize energy consumption by utilizing the slack time left by completed tasks. All the proposed algorithms minimize the number of task migrations from PCM to DRAM by ensuring that aperiodic tasks are not migrated and that each periodic task instance is migrated at most once. Experimental results show that our scheduling algorithms satisfy the real-time constraints and significantly reduce energy consumption.
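As a generic illustration of the EDF side only (not the paper's table-driven schedules or migration machinery), the sketch below dispatches jobs in earliest-deadline order and charges a longer execution time but a lower energy rate when a job is assumed to be placed in PCM; all timing and energy numbers are invented.

```python
# Hypothetical earliest-deadline-first dispatch where each job's execution time
# and energy rate depend on whether its code/data are placed in DRAM or PCM.
import heapq

# (name, placement, exec_time, relative_deadline, release_time) -- invented numbers
JOBS = [
    ("j1", "DRAM", 2, 6, 0),
    ("j2", "PCM", 3, 8, 0),    # PCM placement: slower but cheaper energy per tick
    ("j3", "DRAM", 1, 4, 1),
]
ENERGY_PER_TICK = {"DRAM": 1.0, "PCM": 0.6}

def edf_run(jobs):
    ready, t, energy, order = [], 0, 0.0, []
    pending = sorted(jobs, key=lambda j: j[4])            # by release time
    while pending or ready:
        while pending and pending[0][4] <= t:
            name, mem, c, d, r = pending.pop(0)
            heapq.heappush(ready, (r + d, name, mem, c))  # key = absolute deadline
        if not ready:
            t = pending[0][4]                             # idle until the next release
            continue
        deadline, name, mem, c = heapq.heappop(ready)
        t += c                                            # run the job to completion (non-preemptive sketch)
        energy += c * ENERGY_PER_TICK[mem]
        order.append((name, t, t <= deadline))            # finish time, deadline met?
    return order, energy

print(edf_run(JOBS))
```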

9.
The paper proposes a novel heuristic technique for integrated hardware-software partitioning, hardware design space exploration, and scheduling. The technique maps an application specified as a task graph onto a heterogeneous architecture with the objective of minimizing the latency of the task graph subject to an area constraint on the hardware coprocessor. The technique uses an iterative approach in which the partitioner decides the processor mapping and HW design points of some tasks; the scheduler then simultaneously decides the processor mapping, HW design point, and schedule time of the remaining tasks. The tight coupling between the two design stages allows them to produce superior-quality designs in fewer iterations. The technique accounts for the time overheads due to inter-processor/intra-processor communication and shared memory access conflicts, and can therefore be used for both communication-intensive and computation-intensive applications. It also considers the dynamic reconfiguration capability of the hardware coprocessor, performing a trade-off analysis and mapping hardware tasks to mutually exclusive temporal segments if this results in lower latency. The effectiveness of the technique is demonstrated by a case study of the JPEG image compression algorithm, comparison with an optimal ILP-based approach, and experimentation with synthetic graphs.
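The partitioning decision in isolation can be sketched with a simple greedy rule (the actual technique couples it with scheduling and design-space exploration): tasks move to hardware in order of latency saving per unit area until the coprocessor area budget is exhausted. The task data are illustrative.

```python
# Hypothetical greedy HW/SW partitioning: move to the hardware coprocessor the
# tasks with the best latency saving per unit area until the area budget runs out.
def partition(tasks, area_budget):
    """tasks: list of (name, sw_time, hw_time, hw_area); returns (hw_set, sw_set)."""
    ranked = sorted(tasks, key=lambda t: (t[1] - t[2]) / t[3], reverse=True)
    hw, sw, used = [], [], 0
    for name, sw_t, hw_t, area in ranked:
        if sw_t > hw_t and used + area <= area_budget:
            hw.append(name)
            used += area
        else:
            sw.append(name)
    return hw, sw

tasks = [("dct", 40, 8, 300), ("quant", 15, 6, 120), ("huffman", 25, 20, 400)]
print(partition(tasks, area_budget=450))   # (['dct', 'quant'], ['huffman'])
```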

10.
Parallel simulation generally needs efficient, reliable, and order-preserving communication. In this article, a zero-copy, reliable, and order-preserving intra-node message passing approach named ZeROshm is proposed. This mechanism partitions shared memory into segments assigned to processes for receiving messages. Each segment consists of two levels of index, L1 and L2, that record the order of messages in the host segment, and the processes also read from and write to the segments directly accord...

11.
GNU/Hurd is the new-generation operating system kernel designed by GNU to replace the Unix kernel. Hurd is built on the Mach 3.0 microkernel architecture, and Hurd and Mach each have their own distinct advantages and characteristics. This paper first briefly introduces the relationship between Hurd and Mach and the advantages and features of Hurd as a new-generation kernel. It then focuses on the memory management of the Mach microkernel and the interfaces Mach provides for memory management. Finally, it discusses how to implement shared memory on top of the Mach microkernel, presenting and analyzing three different implementation schemes.

12.
To address the scheduling of radar tasks with arbitrary periods (reconnaissance, jamming, detection, etc.) in a multifunction netted radar system, a variable-parameter task scheduling method is proposed. First, the characteristics of radar tasks are analyzed and a variable-parameter radar task model is established. Next, the scheduling time-window width and slot width of each radar in the network are set so that the periods of tasks the networked radar can execute are as close as possible to the periods of the existing tasks. Then, exploiting the arbitrary-period property of radar tasks, task periods are adjusted under the principle of minimizing the period change, guaranteeing that multiple tasks on the same radar do not overlap and remain periodic. Finally, a heuristic algorithm completes the scheduling with the objective of maximizing the average membership degree of task dwell times. Simulation results and a worked example demonstrate the effectiveness of the method.
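A tiny numeric illustration of the period-adjustment step only (not the full scheduling method): each task's arbitrary period is snapped to the nearest integer multiple of the radar's slot width, minimizing the change, so that tasks remain periodic on the slotted timeline. The slot width and periods are made up.

```python
# Hypothetical period adjustment: snap each task period to the nearest integer
# multiple of the radar's scheduling slot width, minimizing the period change.
def adjust_periods(periods_ms, slot_ms):
    adjusted = []
    for p in periods_ms:
        k = max(1, round(p / slot_ms))        # nearest multiple, at least one slot
        adjusted.append((p, k * slot_ms))
    return adjusted

# tasks with arbitrary periods (ms) snapped to a 2.5 ms slot grid
print(adjust_periods([7.2, 10.1, 3.9, 26.0], slot_ms=2.5))
# [(7.2, 7.5), (10.1, 10.0), (3.9, 5.0), (26.0, 25.0)]
```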

13.

Mobile edge computing (MEC) is a promising technology that has the potential to meet the latency requirements of next-generation mobile networks. Since MEC servers have limited resources, an orchestrator uses a scheduling algorithm to decide where and when each task should execute so that its quality of service (QoS) is achieved. The scheduling algorithm should use the fewest resources required to meet the service demands. In this paper, we develop a two-level cooperative scheduling algorithm with a centralized orchestrator layer. The first scheduling level schedules tasks locally on MEC servers, while the second level resides at the orchestrator and assigns tasks to a neighboring base station or to the cloud. Tasks are served in accordance with their priority, which is determined by latency and required throughput. We also present a resource optimization algorithm for determining resource distribution in the system in order to ensure satisfactory service availability at minimum cost. The resource optimization algorithm has two variants that can be employed depending on the traffic model: one is used when the traffic is uniformly distributed, the other when the traffic load is unbalanced among base stations. Numerical results show that the cooperative task scheduling model outperforms the non-cooperative model. Furthermore, the results show that the proposed scheduling algorithm performs better than other well-known scheduling algorithms, such as shortest-job-first and earliest-deadline-first scheduling.
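A simplified sketch of the two-level idea (not the paper's algorithm or priority formula): each task is first offered to its local MEC server, and the orchestrator falls back to a neighboring server or the cloud when the local server cannot meet the deadline. Server capacities, latencies, and the priority expression are assumptions.

```python
# Hypothetical two-level scheduler: level 1 tries the local MEC server,
# level 2 (the orchestrator) falls back to a neighbor server or the cloud.
SERVERS = {"local": {"free_at": 0.0, "rate": 4.0},
           "neighbor": {"free_at": 0.0, "rate": 4.0},
           "cloud": {"free_at": 0.0, "rate": 10.0, "extra_latency": 50.0}}

def schedule(tasks, now=0.0):
    """tasks: list of (name, work, deadline_ms, required_throughput)."""
    # higher priority = tighter deadline and higher throughput demand (assumed formula)
    tasks = sorted(tasks, key=lambda t: t[2] / (1.0 + t[3]))
    placement = []
    for name, work, deadline, _thr in tasks:
        for tier in ("local", "neighbor", "cloud"):
            s = SERVERS[tier]
            start = max(now, s["free_at"])
            finish = start + work / s["rate"] + s.get("extra_latency", 0.0)
            if finish <= now + deadline:
                s["free_at"] = finish
                placement.append((name, tier, finish))
                break
        else:
            placement.append((name, "rejected", None))
    return placement

print(schedule([("ar", 40, 12, 8), ("sensor", 4, 5, 1),
                ("video", 40, 30, 8), ("backup", 200, 500, 0.1)]))
```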


14.
Variable-length encoding can considerably decrease code size in VLIW processors by reducing the number of bits wasted on encoding No Operations (NOPs). A processor may have different instruction templates in which different execution slots are implicitly NOPs, but not all combinations of NOPs may be supported by the instruction templates. The efficiency of the NOP encoding can be improved by having the compiler place NOPs in such a way that the usage of implicit NOPs is maximized. Two different methods of optimizing the use of implicit NOP slots are evaluated: (a) prioritizing function units that have fewer implicit NOPs associated with them, and (b) a post-pass to the instruction scheduler that utilizes the slack of the schedule by rescheduling operations with slack into different instruction words so that the available instruction templates are better utilized. Three different methods for selecting the basic blocks on which to apply FU prioritization are also analyzed: always; always outside inner loops; and outside inner loops only in those basic blocks where testing showed it decreased code size. The post-pass optimizer alone saved an average of 2.4% and a maximum of 10.5% of instruction memory, without performance loss. Prioritizing function units only in those basic blocks where it helped gave best-case instruction memory savings of 10.7% and average savings of 3.0%, in exchange for an average 0.3% slowdown. Applying both optimizations together gave a best-case code size decrease of 12.2% and an average of 5.4%, while performance decreased on average by 0.1%.
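Template selection on its own can be sketched as below (this is separate from the two scheduler optimizations evaluated in the paper): given which issue slots of an instruction word are used, pick the cheapest template whose implicit-NOP slots are all unused. The templates, slot counts, and bit widths are invented.

```python
# Hypothetical variable-length VLIW encoding: each template lists which of the
# 4 issue slots are implicit NOPs and how many bits the template costs.
TEMPLATES = [
    ("full",     frozenset(),          128),   # all 4 slots explicitly encoded
    ("nop_mem",  frozenset({3}),        96),   # slot 3 (memory unit) is an implicit NOP
    ("nop_23",   frozenset({2, 3}),     64),
    ("nop_123",  frozenset({1, 2, 3}),  32),
]

def best_template(used_slots, num_slots=4):
    """Pick the cheapest template whose implicit-NOP slots are all unused;
    unused slots not covered by the template are encoded as explicit NOPs."""
    empty = frozenset(range(num_slots)) - frozenset(used_slots)
    legal = [(bits, name) for name, implicit, bits in TEMPLATES if implicit <= empty]
    return min(legal)              # the "full" template is always legal

print(best_template({0, 1}))       # (64, 'nop_23'): slots 2 and 3 become implicit NOPs
print(best_template({0, 2}))       # (96, 'nop_mem'): slot 3 implicit, slot 1 explicit NOP
```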

15.
In wireless sensor networks, collecting data from the sensor nodes without data loss is a major challenge. Nodes are deployed statically and relay data to the base station, which leads to energy drain at the nodes near the base station, since these nodes must constantly relay data. Collecting data from the sensor nodes with a mobile node or element without data loss is termed scheduling of the Mobile Element (ME). The proposed approach consists of three phases. In the first phase, the nodes are clustered hierarchically according to their geographical region. In the second phase, only the nodes within each cluster that are in the active state are visited by the mobile element; quad-tree based partitioning is performed to schedule the ME's visits to the active nodes within each cluster. In the third phase, the ME visits only the boundary-near nodes, and its speed is varied based on the simplex method so that data loss is minimized.
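The quad-tree step of the second phase can be pictured with a plain recursive partition of the active nodes' coordinates (a generic quad-tree, not the paper's complete scheme); leaves holding few enough nodes could then serve as stops on the mobile element's tour. Region size and capacity are assumed.

```python
# Hypothetical quad-tree partition of active sensor positions: split a square
# region into four quadrants until each leaf holds at most `capacity` nodes;
# the leaf centers could then serve as visiting points for the mobile element.
def quadtree(points, x0, y0, size, capacity=2):
    if len(points) <= capacity or size <= 1e-6:
        return [((x0 + size / 2, y0 + size / 2), points)]   # (leaf center, its nodes)
    half = size / 2
    leaves = []
    for dx in (0, half):
        for dy in (0, half):
            quadrant = [(x, y) for x, y in points
                        if x0 + dx <= x < x0 + dx + half and y0 + dy <= y < y0 + dy + half]
            leaves.extend(quadtree(quadrant, x0 + dx, y0 + dy, half, capacity))
    return leaves

active = [(5, 5), (6, 7), (40, 42), (41, 44), (43, 41), (90, 10)]
for center, members in quadtree(active, 0, 0, 100):
    if members:
        print(center, members)
```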

16.
We address the problem of congestion resolution in optical packet switching (OPS). We consider a fairly generic all-optical packet switch architecture with a feedback optical buffer consisting of fiber delay lines (FDLs). Two alternatives of switching granularity are addressed for a switch operating in a slotted transfer mode: switching at the slot level (i.e., fixed-length packets of a single slot) or at the burst level (variable-length packets that are integer multiples of the slot length). For both cases, we show that in spite of the limited queuing resources, acceptable packet-loss performance can be achieved with reasonable hardware resources given an appropriate design of the time/wavelength scheduling algorithms. Depending on the switching unit (slots or bursts), an adapted scheduling algorithm needs to be deployed to exploit the bandwidth and buffer resources most efficiently.
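A toy version of slot-level contention resolution with a feedback FDL buffer (not the paper's scheduling algorithms): a packet that finds its output wavelength busy is recirculated through the shortest delay line that lands it on a free slot, and dropped if none does. The delay-line lengths and horizon are assumptions.

```python
# Hypothetical FDL assignment for a slotted optical packet switch: an arriving
# packet whose output wavelength is busy in the current slot is recirculated
# through the shortest delay line that postpones it to a free slot.
FDL_DELAYS = [0, 1, 2, 4]                     # available delays, in slots (assumed)

def resolve(arrivals, horizon=16):
    """arrivals: list of (packet_id, arrival_slot, wavelength)."""
    busy = set()                              # occupied (slot, wavelength) pairs
    decisions = []
    for pkt, slot, wl in arrivals:
        for d in FDL_DELAYS:
            t = slot + d
            if t < horizon and (t, wl) not in busy:
                busy.add((t, wl))
                decisions.append((pkt, "delay %d" % d))
                break
        else:
            decisions.append((pkt, "dropped"))   # no delay line resolves the contention
    return decisions

# five packets contend for wavelength 0 in slot 3: a-d use delays 0/1/2/4,
# e finds every reachable slot busy and is dropped
print(resolve([("a", 3, 0), ("b", 3, 0), ("c", 3, 0), ("d", 3, 0), ("e", 3, 0)]))
```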

17.
In contemporary multi-core systems, memory is shared among a number of concurrent threads. Memory contention and interference are becoming increasingly severe, incurring problems such as performance degradation, unfair resource sharing, and priority inversion. In this paper, we address the challenge of improving performance and fairness for concurrent threads while minimizing energy consumption in main memory. We propose PUMA, a novel solution that reduces memory contention and interference by judiciously partitioning threads among cores and allocating each core exclusive memory banks and bandwidth based on thread characteristics. Our results demonstrate that PUMA improves both performance and fairness while reducing energy consumption significantly compared to existing memory management approaches.
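A compact illustration of bank partitioning in general (PUMA's actual policy is more involved): threads are split into memory-intensive and latency-sensitive groups and each group receives a disjoint share of the DRAM banks, so the groups never contend for the same bank. Thread statistics, the MPKI threshold, and the bank count are invented.

```python
# Hypothetical bank partitioning: classify threads by memory intensity and give
# each group a disjoint share of the banks proportional to its size (a crude
# heuristic), so intensive threads cannot interfere with latency-sensitive ones.
def partition_banks(threads, num_banks=16, intensive_mpki=10.0):
    """threads: {name: misses_per_kilo_instruction (MPKI)}."""
    intensive = sorted(t for t, m in threads.items() if m >= intensive_mpki)
    light = sorted(t for t, m in threads.items() if m < intensive_mpki)
    n_intensive = round(num_banks * len(intensive) / len(threads))
    banks = list(range(num_banks))
    return {"intensive": (intensive, banks[:n_intensive]),
            "latency_sensitive": (light, banks[n_intensive:])}

threads = {"stream": 42.0, "graph": 18.0, "ui": 1.5, "codec": 3.0}
print(partition_banks(threads))
# intensive threads get banks 0..7, latency-sensitive threads get banks 8..15
```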

18.
Over the last 20 years, the performance gap between CPU and memory has been steadily increasing. As a result, a variety of techniques have been devised to hide that gap, from intermediate fast memories (caches) to various prefetching and memory management techniques for manipulating the data held in these caches. In this paper we propose a new memory management technique that exploits access-pattern information available at compile time by prefetching certain data elements before they are explicitly requested by the CPU, and by keeping certain data in local memory across a number of iterations. To better exploit the locality of reference present in loop structures, our technique also partitions the memory and restricts execution to one partition at a time, so that data are reused at much smaller time intervals than under the usual execution order. These combined approaches, a new set of memory instructions together with memory partitioning, lead to improvements in total execution time of approximately 25% over existing methods.
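The two ideas, keeping data in a small local partition and fetching the next partition ahead of use, can be illustrated on a simple pairwise-sum loop (a behavioural sketch in software, not the proposed memory instructions); the tile size and the loop itself are assumptions.

```python
# Hypothetical illustration: data is processed in partitions (tiles) small
# enough to live in a fast local buffer, and the next tile is "prefetched"
# (copied into the buffer) before it is needed, so the inner loop only touches
# the local copy instead of the large backing array.
TILE = 256                                    # elements per partition (assumed)

def tiled_sum_of_pairs(data):
    """Sum data[i] + data[i+1] over the array, one tile at a time."""
    total = 0
    local = list(data[:TILE + 1])             # initial tile plus one halo element
    for start in range(0, len(data) - 1, TILE):
        nxt = data[start + TILE:start + 2 * TILE + 1]    # prefetch of the next tile
        for i in range(min(TILE, len(data) - 1 - start)):
            total += local[i] + local[i + 1]  # all accesses stay in the local buffer
        local = list(nxt)                     # the prefetched tile becomes current
    return total

data = list(range(10_000))
assert tiled_sum_of_pairs(data) == sum(data[i] + data[i + 1] for i in range(len(data) - 1))
print(tiled_sum_of_pairs(data))
```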

19.
赵新胜  鞠涛  尤肖虎 《电子学报》2005,33(7):1173-1176
Considering the wideband wireless channel characteristics and streaming-media traffic features of beyond-3G (B3G) mobile communication systems, this paper analyzes the conventional packet scheduling algorithms applicable to the high-speed downlink shared channel, and proposes a Priority-Based Fairness Scheduling (PBFS) algorithm for streaming-media services that improves system throughput. The algorithm dynamically adjusts each mobile user's transmission priority according to the user's channel quality and the QoS requirements of the traffic, and thereby determines the scheduling scheme of the downlink shared channel. A simplified form of the algorithm, S-PBFS, is also given. Simulation results show that, compared with conventional scheduling algorithms, S-PBFS achieves high radio channel utilization and low implementation complexity under packet delay constraints.
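A generic sketch in the spirit of priority-based fair downlink scheduling (this is not the PBFS or S-PBFS priority formula from the paper): in every transmission interval the user with the highest priority, here instantaneous channel rate divided by smoothed served throughput and weighted by an assumed QoS urgency, gets the shared channel.

```python
# Hypothetical priority-based fair downlink scheduler: every transmission time
# interval (TTI), serve the user with the highest priority; priority rises with
# instantaneous channel rate and QoS urgency, and falls with throughput already
# received, which is the proportional-fair flavour of fairness assumed here.
import random

random.seed(1)
USERS = {"u1": 1.0, "u2": 2.0, "u3": 1.0}    # QoS urgency weights (assumed)

def run(ttis=6):
    avg_tput = {u: 1e-3 for u in USERS}       # small epsilon avoids division by zero
    log = []
    for _ in range(ttis):
        rate = {u: random.uniform(1.0, 10.0) for u in USERS}   # fake channel feedback
        priority = {u: USERS[u] * rate[u] / avg_tput[u] for u in USERS}
        chosen = max(priority, key=priority.get)
        log.append(chosen)
        # exponentially smoothed average of the throughput actually served
        for u in USERS:
            served = rate[u] if u == chosen else 0.0
            avg_tput[u] = 0.9 * avg_tput[u] + 0.1 * served
    return log

print(run())
```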

20.
This paper analyzes the principles of the conventional constant modulus algorithm (CMA) for blind equalization in military communications. To overcome its drawbacks of slow convergence, susceptibility to local minima, and the pronounced trade-off between convergence speed and steady-state residual error, a variable step-size algorithm based on kurtosis and the memory gradient is proposed. Theoretical analysis and simulation show that the improved algorithm achieves better equalization than the conventional one.
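The constant-modulus update itself, with a step size that shrinks as the equalizer output's kurtosis approaches the constellation's target value, can be sketched as follows; the simple kurtosis-gap step rule stands in for the paper's kurtosis/memory-gradient law, and all signal parameters are illustrative.

```python
# Hypothetical variable step-size CMA: the classical constant-modulus error
# drives the tap update, while the step size is scaled by how far the output
# kurtosis still is from the constellation's target kurtosis (a stand-in for
# the paper's kurtosis/memory-gradient rule).
import numpy as np

def vs_cma(x, num_taps=11, mu0=1e-3, target_kurtosis=1.0, blocks=50, block_len=200):
    w = np.zeros(num_taps, dtype=complex)
    w[num_taps // 2] = 1.0                      # center-spike initialization
    r2 = 1.0                                    # CMA dispersion constant for unit-modulus symbols
    for b in range(blocks):
        seg = x[b * block_len : b * block_len + block_len + num_taps]
        if len(seg) < block_len + num_taps:
            break
        y = np.array([w.conj() @ seg[n : n + num_taps][::-1] for n in range(block_len)])
        kurt = np.mean(np.abs(y) ** 4) / (np.mean(np.abs(y) ** 2) ** 2 + 1e-12)
        mu = mu0 * min(4.0, abs(kurt - target_kurtosis))   # larger mismatch -> larger step
        for n in range(block_len):
            xn = seg[n : n + num_taps][::-1]
            yn = w.conj() @ xn
            e = yn * (np.abs(yn) ** 2 - r2)                # constant-modulus error
            w -= mu * np.conj(e) * xn
    return w

# QPSK symbols through a mild two-tap channel, then blindly equalized
rng = np.random.default_rng(0)
s = np.exp(1j * np.pi / 2 * rng.integers(0, 4, 20000) + 1j * np.pi / 4)
x = np.convolve(s, [1.0, 0.35 + 0.2j]) + 0.01 * rng.standard_normal(len(s) + 1)
print(np.round(np.abs(vs_cma(x)), 3))
```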
