20 similar documents found (search time: 15 ms)
1.
2.
Shao Z., Wang M., Chen Y., Xue C., Qiu M., Yang L. T., Sha E. H.-M. 《Circuits and Systems II: Express Briefs, IEEE Transactions on》2007,54(5):445-449
In this brief, we propose a novel real-time loop-scheduling technique that minimizes energy consumption via dynamic voltage scaling (DVS) for applications with loops, taking transition overhead into account. We design one algorithm, dynamic voltage loop scheduling (DVLS), that integrates loop scheduling with DVS. In DVLS, we repeatedly regroup a loop based on rotation scheduling and reduce the energy by DVS as much as possible within a timing constraint. We conduct experiments on a set of digital signal processing benchmarks. The experimental results show that DVLS achieves substantial energy savings compared with the traditional time-performance-oriented scheduling algorithm.
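As a rough illustration of the DVS idea in the abstract above (not the DVLS algorithm itself, which also regroups the loop by rotation scheduling), the sketch below picks the lowest-energy voltage level that still meets a timing constraint once a transition overhead is charged. It assumes the common dynamic-energy model E = C · V² · cycles; the `Level` type, the voltage/frequency values, and the default capacitance are hypothetical.

```python
from typing import NamedTuple, Optional, Sequence


class Level(NamedTuple):
    voltage: float    # volts (hypothetical values)
    frequency: float  # cycles per second (hypothetical values)


def dynamic_energy(cycles: int, level: Level, capacitance: float = 1e-9) -> float:
    """Dynamic energy under the assumed model E = C * V^2 * cycles."""
    return capacitance * level.voltage ** 2 * cycles


def pick_level(cycles: int, deadline: float, levels: Sequence[Level],
               transition_time: float = 0.0) -> Optional[Level]:
    """Return the lowest-energy level whose run time plus the
    voltage-transition overhead fits within the deadline, or None."""
    feasible = [lv for lv in levels
                if cycles / lv.frequency + transition_time <= deadline]
    if not feasible:
        return None
    return min(feasible, key=lambda lv: dynamic_energy(cycles, lv))


# Hypothetical discrete voltage/frequency pairs.
LEVELS = [Level(1.0, 200e6), Level(1.2, 400e6), Level(1.5, 600e6)]
```

With 4,000,000 cycles, a 12 ms deadline, and a 1 ms transition overhead, the 200 MHz level is too slow, so the sketch settles on the 1.2 V level rather than the fastest one.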
3.
Jiayi Du, Yan Wang, Qingfeng Zhuge, Jingtong Hu, Edwin H.-M. Sha 《Journal of Signal Processing Systems》2013,71(3):261-273
Non-volatile memories (NVMs) show great potential for replacing DRAM as the main memory in many embedded systems because of their attractive characteristics such as low cost, high density, and low energy consumption. However, the problem of asymmetric read and write costs has to be addressed before the advantages of NVM can be fully exploited: a write operation on NVM is much more costly than a read operation. Existing loop optimization techniques cannot be used effectively with non-volatile main memory because they do not consider this asymmetry. In this paper, we propose an efficient loop scheduling algorithm, the Rotation with Maximum Bipartite Matching (RMBM) algorithm, to address the problem of expensive write operations on non-volatile main memory for chip multiprocessors (CMPs). It achieves high parallelism for a loop and, at the same time, reduces the number of write operations on NVM. The experimental results show that the RMBM algorithm reduces the number of write activities on NVM by 34.5% on average compared with the traditional rotation scheduling algorithm. The execution time is reduced by 20.5%, and the energy consumption is also reduced by 15.03% on average using the RMBM algorithm. In other words, the average lifetime of NVM can be extended by more than 2 times using the proposed technique.
4.
《Industrial Informatics, IEEE Transactions on》2008,4(3):164-174
5.
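Wireless mobile networks are characterized by unstable performance and susceptibility to external interference, so real-time scheduling must account for the effects of network variation. Changes in network performance directly alter network latency, that is, the time a task spends on network transmission, which in turn affects the schedulability of the task set. Most traditional scheduling algorithms do not consider network factors. This paper therefore proposes a hybrid task scheduling strategy suited to environments with unstable network performance. Test results show that the strategy helps improve the schedulability of hybrid real-time tasks when network performance changes dynamically.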
6.
ZHAO Yong, CHEN Liang, LI Youfu, TIAN Wenhong 《中国通信》2014,(12):125-140
Many Task Computing (MTC) is a new class of computing paradigm in which the aggregate number of tasks, quantity of computing, and volumes of data may be extremely large. With the advent of cloud computing and the big data era, scheduling and executing large-scale computing tasks efficiently and allocating resources to tasks reasonably have become quite challenging problems. To improve both task execution and resource utilization efficiency, we present a task scheduling algorithm with resource attribute selection, which can select the optimal node to execute a task according to its resource requirements and the fitness between the resource node and the task. Experimental results show significant improvement in execution throughput and resource utilization compared with three other algorithms and four scheduling frameworks. In the scheduling algorithm comparison, the throughput is 77% higher than that of the Min-Min algorithm and the resource utilization can reach 91%. In the scheduling framework comparison, the throughput (with work-stealing) is at least 30% higher than that of the other frameworks and the resource utilization reaches 94%. The scheduling algorithm can serve as a good model for practical MTC applications.
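For readers unfamiliar with the Min-Min baseline named in the abstract above, here is a minimal sketch of that classic heuristic (not the paper's resource-attribute-selection algorithm). It assumes an estimated-time-to-compute matrix `etc[t][m]` for task `t` on machine `m`; the function name and the tie-breaking order (lowest task index, then lowest machine index) are choices made for this example.

```python
def min_min(etc):
    """Min-Min heuristic: repeatedly schedule the task whose best
    achievable completion time is smallest, on the machine achieving it.

    etc[t][m] is the estimated time to compute task t on machine m.
    Returns (assignment dict task -> machine, makespan).
    """
    n_machines = len(etc[0])
    ready = [0.0] * n_machines          # time at which each machine is free
    unscheduled = set(range(len(etc)))
    assignment = {}
    while unscheduled:
        # Smallest completion time over all (task, machine) pairs; this equals
        # the minimum over tasks of each task's minimum completion time.
        finish, task, machine = min(
            (ready[m] + etc[t][m], t, m)
            for t in unscheduled
            for m in range(n_machines)
        )
        assignment[task] = machine
        ready[machine] = finish
        unscheduled.remove(task)
    return assignment, max(ready)
```

For example, with `etc = [[4, 6], [3, 5], [8, 2]]`, task 2 is placed first on machine 1, then tasks 1 and 0 go to machine 0, giving a makespan of 7.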
7.
Hui Liu, Zili Shao, Meng Wang, Junzhao Du, Chun Jason Xue, Zhiping Jia 《Journal of Signal Processing Systems》2009,57(2):249-262
In this paper, we combine coarse-grained software pipelining with DVS (Dynamic Voltage/Frequency Scaling) for optimizing energy consumption of stream-based multimedia applications on multi-core embedded systems. By exploiting the potential of multi-core architecture and the characteristic of streaming applications, we propose a two-phase approach to solve the energy minimization problem for periodic dependent tasks on multi-core processors with discrete voltage levels. With our approach, in the first phase, we propose a coarse-grained task-level software pipelining algorithm called RDAG to transform the periodic dependent tasks into a set of independent tasks based on the retiming technique (Leiserson and Saxe, Algorithmica 6:5–35, 1991). In the second phase, we propose two DVS scheduling algorithms for energy minimization. For single-core processors, we propose a pseudo-polynomial algorithm based on dynamic programming that can achieve optimal solution. For multi-core processors, we propose a novel scheduling algorithm called SpringS which works like a spring and can effectively reduce energy consumption by iteratively adjusting task scheduling and voltage selection. We conduct experiments with a set of benchmarks from E3S (Dick 2008) and TGFF () based on the power model of the AMD Mobile Athlon4 DVS processor. The experimental results show that our technique can achieve 12.7% energy saving compared with the algorithms in Zhang et al. (2002) on average.
8.
Aaraj N., Ravi S., Raghunathan A., Jha N. K. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2007,15(3):296-308
In this paper, we propose an efficient and secure embedded processing architecture that addresses various challenges involved in using face-based biometrics for authenticating a user to an embedded system. Our paper considers the use of robust face verifiers (PCA-LDA, Bayesian), and analyzes the computational workload involved in running their software implementations on an embedded processor. We then present a suite of hardware and software enhancements to accelerate these algorithms: fixed-point arithmetic, various code optimizations, generic custom instructions and dedicated coprocessors, and exploitation of parallel processing capabilities in multiprocessor systems-on-chip (SoCs). We also identify attacks targeted against the authentication process, and develop security measures to ensure the integrity of biometric code/data. We evaluated the proposed architectures in the context of popular open-source software implementations of face authentication algorithms running on a commercial embedded processor (Xtensa from Tensilica). Our paper shows that fast, in-system verification is possible even in the context of many resource-constrained embedded systems. We also demonstrate that the security of the authentication process for the given attack model can be achieved with minimum hardware overheads.
9.
In this paper, we present an approach to hardware-software partitioning for real-time embedded systems. Hardware and software components are modeled at the system level, so that cost and performance tradeoffs can be studied early in the design process and a large design space can be explored. A feasibility factor is introduced to measure the possibility of a real-time system being feasible, and is used as both a constraint and an attribute during the optimization process. An imprecise value function is employed to model the tradeoffs among multiple performance attributes. Optimal partitioning is achieved through the use of an existing computer-aided design tool. We demonstrate the application of our approach through the design of an example embedded system.
10.
11.
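This paper proposes a new scheduling mechanism for open systems, two-dimensional priority real-time scheduling, which assigns priorities not only to tasks but also to scheduling policies. The execution order of tasks is determined jointly by the scheduling-policy priority and the task priority. The mechanism resolves the inability of traditional priority scheduling to separate mechanism from scheduling policy, and it also improves efficiency. The CPU bandwidth control strategy introduced in this mechanism makes it possible to build real-time systems with hard real-time, soft real-time, or hybrid real-time goals as needed, simplifies task schedulability analysis, and can provide different QoS to users with different privileges or levels. The scheduling architecture is efficient and highly open, with broad applicability and easy extensibility.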
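The ordering rule in the abstract above can be sketched as a two-key priority queue. The abstract does not say how the two priorities are combined, so this sketch assumes a lexicographic rule (scheduling-policy priority first, task priority second) with smaller numbers meaning higher priority; the `Task` type, field names, and example values are hypothetical, and CPU bandwidth control is not modeled.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Task:
    policy_priority: int            # priority of the task's scheduling policy
    task_priority: int              # priority of the task within that policy
    name: str = field(compare=False)


def run_order(tasks):
    """Return task names in dispatch order under the assumed
    lexicographic (policy priority, task priority) rule."""
    heap = list(tasks)
    heapq.heapify(heap)
    return [heapq.heappop(heap).name for _ in range(len(heap))]
```

For instance, if a hard real-time policy is given priority 0 and a soft real-time policy priority 1, every hard task is dispatched before any soft task, and tasks within each policy are ordered by their own priority.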
12.
To rapidly explore the design space of a real-time embedded system, it is essential to be able to efficiently analyze the timing behaviors of different system architectures. This includes not only determining if a design can satisfy all the timing constraints but also comparing the timing performance of different designs for tradeoff purposes. Understanding the exact timing behavior of a large system can be computationally prohibitive. Previous work in this area has mostly focused on producing a yes/no answer to the schedulability of a system architecture under the worst-case scenario. This not only often leads to overly pessimistic designs, but also provides no insight into how to rank different architectural designs with respect to their timing performance. In this paper, we present several metrics that may be used to measure the timing performance of a design. The metrics were analyzed using workloads from both real-world task systems and randomly generated task systems. A superior metric has been identified through analysis of large sets of experiments. We also show, through an example, how this metric can be used effectively during a design exploration process.
13.
Tian Qiao, Li Jingmei, Xue Di, Wu Weifei, Wang Jiaxiang, Chen Lei, Wang Juzhen 《Mobile Networks and Applications》2020,25(4):1518-1527
The problem of task communication overhead being higher than the task execution time has a direct negative impact on the makespan of task scheduling in...
14.
Mangalampalli Sudheer, Swain Sangram Keshari, Mangalampalli Vamsi Krishna 《Wireless Personal Communications》2022,126(3):2231-2247
Task scheduling is one of the important aspects of cloud computing. The primary objective of task scheduling is to effectively map tasks on to the...
15.
Senyo Apewokin, Brian Valentine, Jee Choi, Linda Wills, Scott Wills 《Journal of Signal Processing Systems》2011,62(1):65-76
Current trends in microprocessor design integrate several autonomous processing cores onto the same die. These multicore architectures are particularly well-suited for computer vision applications, where it is typical to perform the same set of operations repeatedly over large datasets. These memory- and computation-intensive applications can reap tremendous performance and accuracy benefits from concurrent execution on multi-core processors. However, cost-sensitive embedded platforms place real-time performance and efficiency demands on techniques to accomplish this task. Furthermore, parallelization and partitioning techniques that allow the application to fully leverage the processing capabilities of each computing core are required for multi-core embedded vision systems. In this paper, we evaluate background modeling techniques on a multicore embedded platform, since this process dominates the execution and storage costs of common video analysis workloads. We introduce a new adaptive backgrounding technique, multimodal mean, which balances accuracy, performance, and efficiency to meet embedded system requirements. Our evaluation compares several pixel-level background modeling techniques in terms of their computation and storage requirements, and functional accuracy for three representative video sequences, across a range of processing and parallelization configurations. We show that the multimodal mean algorithm delivers accuracy comparable to that of the best alternative (Mixture of Gaussians) with a 3.4× improvement in execution time and a 50% reduction in required storage for optimal block processing on each core. In our analysis of several processing and parallelization configurations, we show how this algorithm can be optimized for embedded multicore performance, resulting in a 25% performance improvement over the baseline processing method.
16.
Frans List 《电子产品世界》2004,(22):98-100
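Today, flash memory is widely used in consumer electronics ranging from mobile phones to smart cards and set-top boxes, where it stores critical information such as system programs, user options, system configuration, and security codes while power is switched off or the battery is disconnected. Although discrete flash chips can provide 256 Mb or more of storage in many applications, in applications that need 32 Mb or less of non-volatile memory, integrating flash together with the embedded processor, peripherals, and other memories on the main logic chip is more advantageous for reasons of cost and low pin count.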
17.
Richard P. Kleihorst, René J. Van Der Vleuten 《The Journal of VLSI Signal Processing》2000,24(1):31-41
Hybrid video compression schemes such as MPEG2 and H.263 use an image memory for motion-compensated coding. In VLSI implementations, this image is usually stored in external RAM because of its large size. To reduce the overall system costs, we propose to compress the image by a factor of 4 to 5 before storage, which then enables embedding of the image memory on the encoder IC itself. The proposed encoder architecture remains in the DCT-domain, so motion estimation and compensation are now performed from this domain. To control and guarantee the actual storage, scalable compression is used. A hardware implementation is feasible and worthwhile compared to traditional encoders with no noticeable loss in performance.
18.
19.
One of the important issues in the design of future generation high-speed networks is the provision of real-time services to different types of traffic with various time constraints. In this paper we study the problem of providing real-time service to hard and soft real-time messages in Wavelength-Division-Multiplexing (WDM) optical networks. We propose a set of scheduling algorithms which prioritize and manage message transmissions in single-hop WDM passive star networks based on specific message time constraints. In particular, we develop time-based priority schemes for scheduling message transmissions in order to increase the real-time performance of a WDM network topology. We formulated an analytical model and conducted extensive discrete-event simulations to evaluate the performance of the proposed algorithms. We compared their performance with that of the state-of-the-art WDM scheduling algorithms, which typically do not consider the time constraints of the transmitted messages. This study suggests that when scheduling real-time messages in WDM networks, one has to consider not only the problem of resource allocation in the network but also the problem of sequencing messages based on their time constraints.
20.
Currently available application frameworks that target the automatic design of real-time embedded software are poor at integrating functional and non-functional requirements for mobile and ubiquitous systems. In this work, we present the internal architecture and design flow of a newly proposed framework called Verifiable Embedded Real-Time Application Framework (VERTAF), which integrates three techniques, namely software component-based reuse, formal synthesis, and formal verification. The proposed architecture for VERTAF is component-based, which allows plug-and-play for the scheduler and the verifier. The architecture is also easily extensible because reusable hardware and software design components can be added. Application examples developed using VERTAF demonstrate significantly reduced relative design effort, which shows how high-level reuse of software components combined with automatic synthesis and verification increases design productivity.