Similar literature
 20 similar documents found (search time: 15 ms)
1.

Achieving high performance in task-parallel runtime systems, especially with high degrees of parallelism and fine-grained tasks, requires tuning a large variety of behavioral parameters according to program characteristics. In the current state of the art, this tuning is generally performed in one of two ways: either by a group of experts who derive a single setup that achieves good, but not optimal, performance across a wide variety of use cases, or by monitoring a system's behavior at runtime and responding to it. The former approach invariably fails to achieve optimal performance for programs with highly distinct execution patterns, while the latter induces overhead and cannot affect parameters that need to be set at compile time. To mitigate these drawbacks, we propose a set of novel static compiler analyses specifically designed to determine program features that affect the optimal settings for a task-parallel execution environment. These features include the parallel structure of task spawning, the granularity of individual tasks, the memory size of the closure required for task parameters, and an estimate of the stack size required per task. Based on the results of these analyses, various runtime system parameters are then tuned at compile time. We have implemented this approach in the Insieme compiler and runtime system, and evaluated its effectiveness on a set of 12 task-parallel benchmarks running with 1 to 64 hardware threads. Across this entire space of use cases, our implementation achieves a geometric mean performance improvement of 39%. To illustrate the impact of our optimizations, we also provide a comparison to current state-of-the-art task-parallel runtime systems, including OpenMP, Cilk, HPX, and Intel TBB.


2.
Programmable controllers (PCs), conceived in 1968 by a team of engineers at G. M.'s Hydramatics Division, have experienced a revolutionary development and gone through four generations of design in a decade. The exponential increase in functional capability was accompanied by a similar development trend in the I/O area, from passive I/O modules to intelligent I/O and distributed I/O systems. As these programmable controllers became larger, their chip count increased substantially and, thus, their mean time between failures (MTBF) was lowered despite the use of LSI and highly reliable components (?-values substantially above industrial grade) with elevated temperature ratings (85°C) and adherence to conservative component stress levels.

3.
In this paper, we present an approach to hardware-software partitioning for real-time embedded systems. Hardware and software components are modeled at the system level, so that cost and performance tradeoffs can be studied early in the design process and a large design space can be explored. A feasibility factor is introduced to measure the possibility of a real-time system being feasible, and is used as both a constraint and an attribute during the optimization process. An imprecise value function is employed to model the tradeoffs among multiple performance attributes. Optimal partitioning is achieved through the use of an existing computer-aided design tool. We demonstrate the application of our approach through the design of an example embedded system.

4.
Increasing soft error rates in semiconductor devices manufactured in newer technologies necessitate the use of fault-tolerant techniques such as Roll-back Recovery with Checkpointing (RRC). As RRC introduces time overhead that increases the completion (execution) time, time constraints (deadlines) might be violated. This is a drawback for a class of computer systems where correct operation is defined not only by providing the correct outcome of an operation but also by ensuring that the deadlines are met. These computer systems are referred to as real-time systems (RTSs). In general, RTSs are classified as soft or hard RTSs depending on the consequences of violating the deadlines. For soft RTSs, where the consequences of violating the deadlines are not very severe, research has focused on optimizing RRC and shown that it is possible to find the optimal number of checkpoints such that the average execution time (AET) is minimal. While minimal AET is important for soft RTSs, it is more important to provide a high probability that deadlines are met for hard RTSs, where the consequences of violating the deadlines may be catastrophic. Hence, there is a need for probabilistic guarantees that jobs employing RRC complete before a given deadline. Traditionally, AET analysis has been used for soft RTSs, and worst-case execution time (WCET) analysis along with schedule feasibility has been used for hard RTSs. In this paper we introduce a reliability metric, Level of Confidence (LoC), which is equally applicable to both soft and hard RTSs. LoC is used as a metric to evaluate to what extent a deadline is met. The main contributions of this paper are as follows. First, we present a mathematical framework for the evaluation of LoC when RRC is employed. Second, we provide a proof to verify the correctness of the proposed expression. Third, in the context of hard RTSs, we provide a method to obtain the optimal number of checkpoints that maximizes the LoC. Fourth, in the context of soft RTSs, where the maximal LoC may not be needed but some LoC requirement must still be met, we present an optimization method for RRC that finds the number of checkpoints that results in the minimal completion time while that completion time satisfies a given LoC requirement. Fifth, we use the proposed framework to evaluate and compare probabilistic guarantees when RRC is optimized towards soft RTSs.

5.
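Two-dimensional priority real-time scheduling for open systems   Cited by: 3 (self-citations: 1, citations by others: 3)
Tan Pengliu, Jin Hai, Zhang Minghu. Acta Electronica Sinica, 2006, 34(10): 1773-1777
This paper proposes a new scheduling mechanism for open systems, two-dimensional priority real-time scheduling, which assigns priorities not only to tasks but also to scheduling policies. The execution order of a task is determined jointly by its scheduling-policy priority and its task priority. The mechanism resolves the inability of traditional priority scheduling to separate the mechanism from the scheduling policy, and it also improves efficiency. The CPU bandwidth control strategy introduced in this mechanism makes it possible to build hard real-time, soft real-time, or hybrid real-time systems as required, simplifies task schedulability analysis, and can provide different QoS levels to users with different privileges or ranks. The scheduling architecture is efficient, highly open, widely applicable, and easy to extend.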
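
The ordering rule in this abstract, where execution order depends on both the scheduling-policy priority and the task priority, can be illustrated with a lexicographic sort. This is only a sketch under assumptions: the abstract does not state how the two priorities are combined, so the code assumes the policy priority is compared first and the task priority second, with a lower number meaning a higher priority. The names `Task` and `dispatch_order` and the sample tasks are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    policy_priority: int  # priority of the task's scheduling policy (assumed: lower = higher)
    task_priority: int    # priority of the task within its policy (assumed: lower = higher)


def dispatch_order(tasks):
    """Order tasks by policy priority first, then by task priority (assumed rule)."""
    return sorted(tasks, key=lambda t: (t.policy_priority, t.task_priority))


# Hypothetical ready queue mixing three scheduling policies.
ready = [
    Task("logger", policy_priority=2, task_priority=1),
    Task("control_loop", policy_priority=0, task_priority=3),
    Task("sensor_read", policy_priority=0, task_priority=1),
    Task("video", policy_priority=1, task_priority=2),
]

print([t.name for t in dispatch_order(ready)])
```

Under this assumed rule, a task in a higher-priority policy always runs before any task in a lower-priority policy, regardless of the task-level priorities.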

6.
To rapidly explore the design space of a real-time embedded system, it is essential to be able to efficiently analyze the timing behaviors of different system architectures. This includes not only determining if a design can satisfy all the timing constraints but also comparing the timing performance of different designs for tradeoff purposes. Understanding the exact timing behavior of a large system can be computationally prohibitive. Previous work in this area has mostly focused on producing a yes/no answer to the schedulability of a system architecture under the worst-case scenario. This not only often leads to overly pessimistic designs, but also provides no insight into how to rank different architectural designs with respect to their timing performance. In this paper, we present several metrics that may be used to measure the timing performance of a design. The metrics were analyzed using workloads from both real-world task systems and randomly generated task systems. A superior metric has been identified through analysis of large sets of experiments. We also show, through an example, how this metric can be used effectively during a design exploration process.

7.
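Emergency-channel-oriented real-time applications are one class of applications for real-time CORBA systems, but the current Real-time CORBA specification cannot fully support their requirements. The real-time CORBA system RTORBUS is improved so that it supports emergency-channel-oriented applications. The improved RTORBUS supports emergency channel switching and provides a dynamic adjustment mechanism that improves switching efficiency and avoids the priority inversion that arises when a switch raises a priority beyond the system limit. Simulation results show that the improved RTORBUS meets the intended design requirements.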

8.
In the era of Big Data, the typical architecture of distributed real-time stream processing systems is the combination of Flume, Kafka, and Storm. As a distributed messaging system, Kafka offers horizontal scalability and high throughput, and is mainly deployed to address the speed mismatch between message producers and consumers. When using Kafka, we need to quickly receive data sent by producers and quickly send data to consumers. Therefore, the performance of Kafka is of critical importance to the performance of the whole stream processing system. In this paper, we propose an improved design of real-time stream processing systems, and focus on improving Kafka's data loading process. We use Kafka cat to transfer data from the source to the Kafka topic directly, which can reduce network transmission. We also utilize a memory file system to accelerate the process of data loading, which can address the bottleneck and performance problems caused by disk I/O. Extensive experiments are conducted to evaluate the performance, and they show the superiority of our improved design.

9.
In optical and wireless communications systems, the goal is to reach data rates of 10 Gbps or above. To support such extremely high data rates, the physical layer generally uses orthogonal frequency division multiplexing (OFDM) modulation. Unlike serial transmission of symbols, OFDM modulation transmits data over many parallel sub-carriers, which helps to provide high bandwidth. Field programmable gate arrays (FPGAs) and digital signal processors (DSPs) are usually employed to process OFDM blocks in real time. However, FPGAs and DSPs are not cost-effective, and they are difficult to adapt to new standards. One of the most computationally intensive functions in OFDM systems is the fast Fourier transform (FFT) computation process. This paper aims to accelerate the FFT process to achieve high communication throughput in real time. Two parallel approaches are implemented for two different NVIDIA graphics processing unit (GPU) architectures. To obtain the best performance values, several optimizations are implemented. Our general purpose graphics processing unit (GPGPU)-based FFT computation achieves up to 24 Gbps throughput in real time.

10.
Current trends in microprocessor design integrate several autonomous processing cores onto the same die. These multicore architectures are particularly well-suited for computer vision applications, where it is typical to perform the same set of operations repeatedly over large datasets. These memory- and computation-intensive applications can reap tremendous performance and accuracy benefits from concurrent execution on multicore processors. However, cost-sensitive embedded platforms place real-time performance and efficiency demands on techniques to accomplish this task. Furthermore, parallelization and partitioning techniques that allow the application to fully leverage the processing capabilities of each computing core are required for multicore embedded vision systems. In this paper, we evaluate background modeling techniques on a multicore embedded platform, since this process dominates the execution and storage costs of common video analysis workloads. We introduce a new adaptive backgrounding technique, multimodal mean, which balances accuracy, performance, and efficiency to meet embedded system requirements. Our evaluation compares several pixel-level background modeling techniques in terms of their computation and storage requirements, and functional accuracy for three representative video sequences, across a range of processing and parallelization configurations. We show that the multimodal mean algorithm delivers accuracy comparable to the best alternative (Mixture of Gaussians) with a 3.4× improvement in execution time and a 50% reduction in required storage for optimal block processing on each core. In our analysis of several processing and parallelization configurations, we show how this algorithm can be optimized for embedded multicore performance, resulting in a 25% performance improvement over the baseline processing method.

11.
Security for Industrial Communication Systems   Cited by: 2 (self-citations: 0, citations by others: 2)
Modern industrial communication networks are increasingly based on open protocols and platforms that are also used in the office IT and Internet environment. This reuse facilitates development and deployment of highly connected systems, but also makes the communication system vulnerable to electronic attacks. This paper gives an overview of IT security issues in industrial automation systems which are based on open communication systems. First, security objectives, electronic attack methods, and the available countermeasures for general IT systems are described. General security objectives and best practices are listed. Particularly for the TCP/IP protocol suite, a wide range of cryptography-based secure communication protocols is available. The paper describes their principles and scope of application. Next, we focus on industrial communication systems, which have a number of security-relevant characteristics distinct from office IT systems. Confidentiality of transmitted data may not be required; however, data and user authentication, as well as access control, are crucial for the mission-critical and safety-critical operation of the automation system. As a result, modern industrial automation systems, if they include security measures at all, emphasize various forms of access control. The paper describes the status of relevant specifications and implementations for a number of standardized automation protocols. Finally, we illustrate the application of security concepts and tools by brief case studies describing security issues in the configuration and operation of substations, plants, or for remote access.

12.
13.
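A fast energy consumption estimation method for distributed systems and its application   Cited by: 1 (self-citations: 1, citations by others: 0)
Su Yajuan, Wei Shaojun. Acta Electronica Sinica, 2005, 33(9): 1706-1709
This paper proposes FAEE, a fast energy estimation algorithm for real-time distributed systems that contain variable-voltage processing elements, and uses it to improve a low-energy allocation method. Compared with existing methods, it obtains almost the same optimization results while reducing CPU time by about two orders of magnitude.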

14.
Real-Time Dynamic Voltage Loop Scheduling for Multi-Core Embedded Systems   Cited by: 1 (self-citations: 0, citations by others: 1)
In this brief, we propose a novel real-time loop-scheduling technique that minimizes energy consumption via dynamic voltage scaling (DVS) for applications with loops, taking transition overhead into account. One algorithm, dynamic voltage loop scheduling (DVLS), is designed to integrate with DVS. In DVLS, we repeatedly regroup a loop based on rotation scheduling and decrease the energy by DVS as much as possible within a timing constraint. We conduct experiments on a set of digital signal processing benchmarks. The experimental results show that DVLS achieves large energy savings compared with the traditional time-performance-oriented scheduling algorithm.

15.
In this paper we describe an event timer which was designed to be used in a neurophysiological laboratory. The timer is used with an LSI 11/03 computer and interfaces with the computer through a standard Digital Equipment Corporation DRV 11 interface board. The time of occurrence of pulses on up to 15 different input lines can be recorded with an accuracy determined by the time base of the timer, which can be varied from 1 to 5000 μs. In order to record events that occur simultaneously on different channels or in very rapid succession, we employ a first-in, first-out (FIFO) register as a buffer. An input scanner allows one timer to be used for timing events that occur on several input channels. This device may be useful in other applications in which the time of occurrence of multiple events needs to be accurately recorded.

16.
The integration of physical systems and processes with networked computing has led to the emergence of a new generation of engineered systems, called Cyber-Physical Systems (CPS). These systems are large networked systems of systems, in which a component system may itself be a grid. In this paper we survey the current state of the art of CPS security, identify the issues surrounding secure control, and investigate the extent to which context information may be used to improve security and survivability of CPS.

17.
This paper presents a novel approach to computing tight upper bounds on the processor utilization for general real-time systems where tasks are composed of subtasks and precedence constraints may exist among subtasks of the same task. By careful analysis of preemption effects among tasks, the problem is formulated as a set of linear programming (LP) problems. Observations are made to reduce the number of LP problem instances required to be solved, which greatly improves the computation time of the utilization bounds. Furthermore, additional constraints are allowed to be included under certain circumstances to improve the quality of the bounds.

18.
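This paper studies multi-task priority assignment for single-processor embedded real-time systems. First, the deadline-monotonic (DM) priority assignment method is chosen to assign priorities to the tasks into which the acquisition and stimulation software of an avionics relay system is divided. Then, the schedulability of this task set is analyzed and computed theoretically in combination with the priority-based preemptive scheduling algorithm of VxWorks, and a general scheduling design process for VxWorks-based data acquisition and stimulation software is given. Finally, based on this scheduling algorithm...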
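
The two steps named in this abstract, deadline-monotonic priority assignment followed by a schedulability check under priority-based preemptive scheduling, can be sketched with the standard textbook response-time analysis. This is an illustration only, not the paper's own computation: it assumes independent periodic tasks on one processor with relative deadlines no longer than their periods, ignores blocking and context-switch costs, and models nothing specific to VxWorks. The task names and timing values are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PeriodicTask:
    name: str
    wcet: int      # worst-case execution time C
    period: int    # period T
    deadline: int  # relative deadline D (assumed D <= T)


def deadline_monotonic(tasks):
    """Return tasks from highest to lowest priority: shorter deadline first."""
    return sorted(tasks, key=lambda t: t.deadline)


def response_time(task, higher):
    """Iterate R = C + sum(ceil(R / T_j) * C_j) over higher-priority tasks.

    Returns the worst-case response time, or None if it exceeds the deadline.
    """
    r = task.wcet
    while True:
        interference = sum(math.ceil(r / h.period) * h.wcet for h in higher)
        nxt = task.wcet + interference
        if nxt > task.deadline:
            return None  # deadline would be missed
        if nxt == r:
            return r     # fixed point reached
        r = nxt


def analyse(tasks):
    """Map each task name to its response time under DM priorities."""
    ordered = deadline_monotonic(tasks)
    return {t.name: response_time(t, ordered[:i]) for i, t in enumerate(ordered)}


# Hypothetical task set (time units are arbitrary).
task_set = [
    PeriodicTask("acquire", wcet=1, period=5, deadline=4),
    PeriodicTask("stimulate", wcet=2, period=10, deadline=8),
    PeriodicTask("report", wcet=3, period=20, deadline=20),
]

print(analyse(task_set))
```

A task set passes this check when no entry in the result is `None`, meaning every task's worst-case response time fits within its deadline.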

19.
Currently available application frameworks that target the automatic design of real-time embedded software are poor at integrating functional and non-functional requirements for mobile and ubiquitous systems. In this work, we present the internal architecture and design flow of a newly proposed framework called Verifiable Embedded Real-Time Application Framework (VERTAF), which integrates three techniques, namely software component-based reuse, formal synthesis, and formal verification. The proposed architecture for VERTAF is component-based, which allows plug-and-play for the scheduler and the verifier. The architecture is also easily extensible because reusable hardware and software design components can be added. Application examples developed using VERTAF demonstrate significantly reduced relative design effort, which shows how high-level reuse of software components combined with automatic synthesis and verification increases design productivity.

20.
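Research on security policies for information systems   Cited by: 1 (self-citations: 0, citations by others: 1)
Li Shoupeng, Sun Hongbo. Acta Electronica Sinica, 2003, 31(7): 977-980
Security policy is the key to information system security. Information system security presupposes that the security policy is complete, correct, and consistent. The complexity of a security policy is closely related to the complexity of the system itself. A security policy must be enforced effectively. This paper studies the enforcement, requirements, and consistency of security policies, and gives a consistency theorem and a consistency checking method for access control policies.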
