Similar Articles
20 similar articles found.
1.
肖鹏  胡志刚  屈喜龙 《通信学报》2015,36(1):149-158
As data centers grow in scale, high energy consumption has become a major concern in high-performance computing. To address the high energy consumption of data-intensive workflows, this paper proposes a "virtual data access node" method for quantitatively estimating the data-access energy overhead of workflow tasks, and on this basis designs a "minimum energy consumption path" heuristic. By incorporating this heuristic into the classic HEFT and CPOP algorithms, two energy-aware scheduling algorithms (HEFT-MECP and CPOP-MECP) are designed and implemented. Experimental results show that the minimum-energy-path heuristic effectively reduces the energy overhead of data-access operations and adapts well to large data-intensive workflow tasks.
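To make the idea concrete, below is a minimal, hypothetical sketch of an HEFT-style list scheduler whose ranking and processor choice include an estimated data-access energy term, in the spirit of the virtual-data-access-node heuristic; the function names, the energy model, and the inputs `exec_cost`, `access_energy` and `comm_cost` are illustrative assumptions, not the paper's HEFT-MECP definition.

```python
# Hedged sketch: an HEFT-like list scheduler with a data-access energy term.
def upward_rank(dag, avg_cost, access_energy, comm_cost):
    """dag maps every task to its (possibly empty) list of successors."""
    rank = {}

    def visit(t):
        if t not in rank:
            succ = max((comm_cost.get((t, s), 0.0) + visit(s) for s in dag[t]),
                       default=0.0)
            # The per-task cost includes an estimated data-access energy overhead.
            rank[t] = avg_cost[t] + access_energy[t] + succ
        return rank[t]

    for t in dag:
        visit(t)
    return rank

def energy_aware_schedule(dag, procs, exec_cost, access_energy, comm_cost):
    """HEFT-style ordering by decreasing upward rank, then each task is mapped
    to the processor with the lowest estimated (execution + data-access) energy."""
    avg_cost = {t: sum(exec_cost[t][p] for p in procs) / len(procs) for t in dag}
    rank = upward_rank(dag, avg_cost, access_energy, comm_cost)
    schedule = {}
    for t in sorted(dag, key=rank.get, reverse=True):
        schedule[t] = min(procs, key=lambda p: exec_cost[t][p] + access_energy[t])
    return schedule
```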

2.
We present a holistic approach to estimation that uses rough sets theory to determine a similarity template and then compute a runtime estimate using identified similar applications. We tested the technique in two real-life data-intensive applications: data mining and high-performance computing.

3.
A run-time reconfigurable multiply-accumulate (MAC) architecture is introduced. It can be easily reconfigured to trade bitwidth for array size (thus maximizing the utilization of available hardware); process signed-magnitude, unsigned or 2's complement data; make use of part of its structure or adapt its structure based on the specified throughput requirements and the anticipated computational load. The proposed architecture consists of a reconfigurable multiplier, a reconfigurable adder, an accumulation unit, and two units for data representation conversion and incoming and outgoing data stream transfer. Reconfiguration can be done dynamically by using only a few control bits, and the main component modules can operate independently from each other. Therefore, they can be enabled or disabled according to the required function each time. Comparison results in terms of performance, area and power consumption prove the superiority of the proposed reconfigurable module over existing realizations in a quantitative and qualitative manner.

4.
The performance of computation-intensive digital signal processing applications running on parallel systems is highly dependent on communication delays imposed by the parallel architecture. In order to obtain a more compact task/processor assignment, a scheduling algorithm considering the communication time between processors needs to be investigated. Such applications usually contain iterative or recursive segments that are modeled as communication sensitive data flow graphs (CS-DFGs), where nodes represent computational tasks and edges represent dependencies between them. Based on the theorems derived, this paper presents a novel efficient technique called cyclo-compaction scheduling, which is applied to a CS-DFG to obtain a better schedule. This new method takes into account the data transmission time, loop carried dependencies, and the target architecture. It implicitly uses the retiming technique (loop pipelining) and a task remapping procedure to allocate processors and to iteratively improve the parallelism while handling the underlying communication and resource constraints. Experimental results on different architectures demonstrate that this algorithm yields significant improvement over existing methods. For some applications, the final schedule length is less than one half of its initial length.

5.
To meet the growing demand for observation and analysis in power systems based on the Internet of Things (IoT), machine learning technology has been adopted to handle data-intensive power electronics applications in the IoT. By feeding previous power electronic data into the learning model, accurate information is drawn and the quality of IoT-based power services is improved. Generally, such data-intensive electronic applications with machine learning are split into numerous data/control-constrained tasks by workflow technology. The efficient execution of this data-intensive Power Workflow (PW) needs massive computing resources, which are available in the cloud infrastructure. Nevertheless, the execution efficiency of PW decreases due to inappropriate sub-task and data placement. In addition, power consumption explodes due to massive data acquisition. To address these challenges, a PW placement method named PWP is devised. Specifically, Non-dominated Sorting Differential Evolution (NSDE) is used to generate placement strategies. Simulation experiments show that PWP achieves the best trade-off among data acquisition time, power consumption, load distribution and privacy preservation, confirming that PWP is effective for the placement problem.
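As an illustration of the evolutionary placement search, the sketch below runs a plain differential-evolution loop over sub-task-to-node assignments; it scalarizes the objectives into a single user-supplied `cost_fn` instead of performing the paper's non-dominated sorting, and all names and parameters are assumptions.

```python
# Hedged sketch: differential evolution over a continuous encoding of a discrete
# placement, with a weighted-sum objective standing in for non-dominated sorting.
import random

def de_placement(num_tasks, num_nodes, cost_fn, pop_size=30, gens=100, F=0.5, CR=0.9):
    """cost_fn(placement) -> scalar cost, where placement[i] is the node index
    assigned to sub-task i. Lower is better."""
    def decode(vec):
        # Map each continuous gene in [0, 1) to a node index.
        return [min(int(g * num_nodes), num_nodes - 1) for g in vec]

    pop = [[random.random() for _ in range(num_tasks)] for _ in range(pop_size)]
    fit = [cost_fn(decode(ind)) for ind in pop]

    for _ in range(gens):
        for i in range(pop_size):
            a, b, c = random.sample([j for j in range(pop_size) if j != i], 3)
            # DE/rand/1 mutation plus binomial crossover, clipped to [0, 1).
            trial = []
            for k in range(num_tasks):
                if random.random() < CR:
                    v = pop[a][k] + F * (pop[b][k] - pop[c][k])
                    trial.append(min(max(v, 0.0), 0.999))
                else:
                    trial.append(pop[i][k])
            f = cost_fn(decode(trial))
            if f < fit[i]:          # greedy one-to-one selection
                pop[i], fit[i] = trial, f

    best = min(range(pop_size), key=fit.__getitem__)
    return decode(pop[best]), fit[best]
```

In such a simplification, `cost_fn` would combine data-acquisition time, power consumption, load distribution and a privacy penalty with whatever weighting the designer chooses.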

6.
We present a framework for integrated scheduling of continuous media (CM) and other applications. The framework, called ARC scheduling, consists of a rate-controlled on-line CPU scheduler, an admission control interface, a monitoring module, and a rate adaptation interface. ARC scheduling allows threads to reserve CPU time for guaranteed progress. It provides firewall protection between threads such that the progress guarantee to a thread is independent of how other threads actually make scheduling requests. Rate adaptation allows a CM application to adapt its rate to changes in its execution environment. We have implemented the framework as an extension to Solaris 2.3. We present experimental results which show that ARC scheduling is highly effective for integrated scheduling of CM and other applications in a general purpose workstation environment. ARC scheduling is a key component of an end system architecture we have designed and implemented to support networking with quality of service guarantees. In particular, it enables protocol threads to make guaranteed progress.

7.
This article presents a systematic approach to hardware/software codesign targeting data-intensive applications. It focuses on application processes that can be represented as directed acyclic graphs (DAGs) and use a synchronous dataflow (SDF) model, the most popular form of dataflow employed in DSP systems. The codesign system is based on the UltraSONIC reconfigurable platform, a system designed jointly at Imperial College and the SONY Broadcast Laboratory, and is modeled as a loosely coupled structure consisting of a single instruction processor and multiple reconfigurable hardware elements. The paper also introduces and demonstrates a task-based hardware/software codesign environment specialized for real-time video applications. Both the automated partitioning and scheduling environment and the task manager program help to provide a fast, robust route for supporting demanding applications in the codesign system.

8.
Bursts consist of a varying number of asynchronous transfer mode (ATM) cells corresponding to a datagram. Here, we generalize weighted fair queueing to a burst-based algorithm with preemption. The new algorithm enhances the performance of the switch service for real-time applications, and it preserves the quality of service guarantees. We study this algorithm theoretically and via simulations.
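A minimal sketch of the underlying bookkeeping, assuming the usual weighted-fair-queueing virtual-finish-time rule applied at burst granularity; preemption and the letter's exact virtual-time definition are omitted, and the class and method names are hypothetical.

```python
# Hedged sketch: burst-level WFQ where each burst gets a virtual finish time and
# bursts are served in increasing finish-time order.
import heapq

class BurstWFQ:
    def __init__(self):
        self.virtual_time = 0.0          # simplified system virtual time
        self.last_finish = {}            # per-flow finish time of the previous burst
        self.queue = []                  # (finish_time, seq, flow_id, burst_cells)
        self._seq = 0

    def enqueue(self, flow_id, burst_cells, weight):
        """burst_cells: number of ATM cells in the burst; weight: flow's share."""
        start = max(self.virtual_time, self.last_finish.get(flow_id, 0.0))
        finish = start + burst_cells / weight
        self.last_finish[flow_id] = finish
        heapq.heappush(self.queue, (finish, self._seq, flow_id, burst_cells))
        self._seq += 1

    def dequeue(self):
        """Serve the queued burst with the smallest virtual finish time."""
        if not self.queue:
            return None
        finish, _, flow_id, burst_cells = heapq.heappop(self.queue)
        self.virtual_time = max(self.virtual_time, finish)  # coarse V(t) update
        return flow_id, burst_cells
```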

9.
We present the InSyn algorithm for high-level synthesis of DSP applications. InSyn combines allocation and scheduling of functional, storage, and interconnect units into a single phase and uses the following unique optimizations. (i) The concept of register states (free, busy, and undecided) is used for optimizing registers in a partial schedule where lifetimes of data values are not yet available. (ii) Reusable data values and broadcast are used to alleviate bus contention. (iii) InSyn can alternate between performance-guided and resource-guided measures. For example, InSyn can forgo its priority in favor of completing partially evaluated paths when the availability of allocated registers becomes low. (iv) InSyn can selectively increase the execution time of noncritical operations to alleviate bus contention. (v) InSyn can optimize and trade off distinct resource sets (functional units, interconnect, and registers) concurrently, leading to more area-delay-efficient designs. (vi) InSyn utilizes estimation tools for resource allocation, design space pruning, and evaluation of synthesized designs. The experiments show that the features incorporated in InSyn result in very good designs.

10.
This letter presents packet scheduling disciplines based on application utility functions and network traffic measurements. The disciplines support different classes of adaptive applications over the Internet, providing differentiation, fairness, and dynamic allocation of network resources. They are composed of a decision procedure, where a fairness criterion based on utility functions is used, and a measurement procedure, which considers the statistics of packet arrivals and departures. The underlying algorithm is then applied to emulate proportional differentiated services, and it is shown, via simulation, that it outperforms the best alternative algorithms published in the literature.
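For intuition, the fragment below shows a measurement-driven decision rule often used to emulate proportional delay differentiation: serve the backlogged class whose head-of-line waiting time, normalized by its differentiation parameter, is largest. The letter's actual utility-function criterion and measurement procedure are not reproduced; the inputs are assumptions.

```python
# Hedged sketch of a proportional-delay-differentiation style decision rule.
def pick_class(now, queues, delta):
    """queues: {class_id: FIFO list of packet arrival times};
    delta: {class_id: differentiation parameter, smaller = higher priority}."""
    backlogged = [c for c, q in queues.items() if q]
    if not backlogged:
        return None
    # Normalized head-of-line waiting time plays the role of a per-class utility gap.
    return max(backlogged, key=lambda c: (now - queues[c][0]) / delta[c])
```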

11.
Modern information technology has been progressively utilized to store and distribute large amounts of healthcare data in order to reduce costs and improve medical facilities. In this context, the emergence of e-Health clouds offers novel opportunities, such as easy and remote access to medical data. However, this achievement also produces plenty of new risks and challenges, such as how to provide integrity, security, and confidentiality for highly susceptible e-Health data. Among these challenges, authentication is a major issue: it ensures that susceptible medical data in clouds are not available to illegitimate participants. Smart cards, passwords, and biometrics are the three authentication factors that together provide high security. Numerous three-factor ECC-based authentication protocols for e-Health clouds have been presented so far. However, most of these protocols have serious security flaws and produce high computation and communication overheads. Therefore, we introduce a novel protocol for the e-Health cloud that preserves user anonymity and thwarts major attacks such as offline password guessing, impersonation, and stolen smart card attacks. Moreover, we evaluate our protocol through formal security analysis using the Random Oracle Model (ROM). The analysis shows that our proposed protocol is more efficient than many existing protocols in terms of computation and communication costs. Thus, our proposed protocol is proved to be more efficient, robust and secure.

12.
In this paper, we propose and analyze a methodology for providing absolute differentiated services for real-time applications. We develop a method that can be used to derive delay bounds without specific information on flow population. With this new method, we are able to successfully employ a utilization-based admission control approach for flow admission. This approach does not require explicit delay computation at admission time and, hence, is scalable to large systems. We assume the underlying network to use static-priority schedulers. We design and analyze several priority assignment algorithms and investigate their ability to achieve higher utilization bounds. Traditionally, schedulers in differentiated services networks assign priorities on a class-by-class basis, with the same priority for each class on each router. In this paper, we show that relaxing this requirement, that is, allowing different routers to assign different priorities to classes, achieves significantly higher utilization bounds.
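A minimal sketch of what utilization-based admission control can look like, assuming the per-class utilization bounds derived in the paper are already available as inputs; the data structures and names are hypothetical.

```python
# Hedged sketch: admit a flow only if the class utilization on every hop of its
# path stays below the precomputed bound that guarantees the class delay.
def admit(flow_rate, path, class_id, link_capacity, class_util, util_bound):
    """path: list of link ids; class_util[link][class_id]: current utilization;
    util_bound[class_id]: safe utilization bound for the class."""
    for link in path:
        new_util = class_util[link][class_id] + flow_rate / link_capacity[link]
        if new_util > util_bound[class_id]:
            return False                      # would violate the delay-bound guarantee
    # Commit the reservation only after every hop passes the test.
    for link in path:
        class_util[link][class_id] += flow_rate / link_capacity[link]
    return True
```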

13.
Garrett  M.W. 《IEEE network》1996,10(3):6-14
This article derives a rationale for the service architecture of the ATM Forum's Traffic Management 4.0 specification. This model distinguishes a small number of general ways to provide quality of service (QoS) which are appropriate for different classes of applications. We construct the set of ATM service categories by first analyzing the QoS and traffic requirements for a reasonably comprehensive list of applications. The most important application properties and the complexity of the related network mechanisms are used to structure the services. This method has the desirable property that the number of service categories does not expand rapidly with the introduction of new applications. We also discuss packet scheduling as the key component for realizing such a set of services, and report on an experimental realization of a fair queuing scheduler.

14.
IEEE 802.16 networks are designed based on the differentiated services concept to provide better Quality of Service (QoS) support for a wide range of applications, from multimedia to typical web services, and therefore they require a fair and efficient scheduling scheme. However, this issue is not addressed in the standard. In this paper we present a new fair scheduling scheme which fulfills the negotiated QoS parameters of different connections while providing fairness among the connections of each class of service. This scheme models scheduling as a knapsack problem, where a fairness parameter reflecting the specific requirements of the connections is defined to be used in the optimization criterion. The proposed scheduler is evaluated through simulation in terms of delay, throughput and fairness index. The results show fairness of the scheduling scheme to all connections while the network guarantees for those connections are fulfilled.
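The knapsack view can be sketched as follows, assuming each connection's slot request and fairness-weighted value have already been computed; the dynamic program below is a generic 0/1 knapsack, not the paper's exact formulation.

```python
# Hedged sketch: per-frame scheduling as a 0/1 knapsack over connections.
def knapsack_schedule(requests, values, capacity):
    """requests[i]: slots requested by connection i; values[i]: fairness-weighted
    value of serving it; returns the set of connections granted this frame."""
    n = len(requests)
    # dp[c] = (best total value, chosen connection set) using at most c slots.
    dp = [(0.0, frozenset()) for _ in range(capacity + 1)]
    for i in range(n):
        w, v = requests[i], values[i]
        for c in range(capacity, w - 1, -1):
            cand_value = dp[c - w][0] + v
            if cand_value > dp[c][0]:
                dp[c] = (cand_value, dp[c - w][1] | {i})
    return dp[capacity][1]
```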

15.
In the last decades, transistor scaling has driven an extraordinary evolution in electronics. The ability to squeeze millions of transistors onto a single chip makes it possible to pack incredible computational power into a very small area. Many computational systems are still based on the Von Neumann architecture, where computational units and memory blocks are two separate entities. Nanometer-sized transistors enable the development of incredibly fast logic units that nevertheless cannot work at full speed due to limitations in data transfer from memory. To further evolve electronic circuits, innovative architectural solutions must be developed to overcome the main limitations of current systems. In this work, we present an architectural implementation of the Logic-In-Memory (LIM) concept, which we characterize by considering three data-intensive benchmarks: odd-even sort, the integral image and the binomial filter. The architecture is synthesized on a 28 nm CMOS technology and validated by comparing it to a previous version of the LIM structure and to conventional architectures, showing an impressive increase in performance in terms of speed gain and power consumption reduction.

16.
马凯 《电子测试》2016,(20):61-62
Scheduling, managing, and recording track and field events is complex work that demands high accuracy and timeliness. Traditional manual operation can no longer meet these requirements: it wastes effort and manpower, manual arrangement and management is prone to error, and the efficiency and quality of the work cannot be guaranteed. With the arrival of the Internet information age, network and database technologies can be used to design a track and field event arrangement and management system that transforms traditional track and field management, raises its level, greatly saves the manpower, effort, and material resources of the organizers, and avoids the common mistakes of manual management.

17.
A physical distribution system has a number of optimization problems. Most of them are combinatorial problems, to which conventional mathematical programming methods can hardly be applied. This paper reports on two applications of the genetic algorithm (GA) to physical distribution scheduling problems that arise at real physical distribution centers. The developed GA schedulers took the place of conventional schedulers coded with rule-based technologies. The advantages of introducing GA schedulers into the physical distribution system are as follows: (1) the GA becomes a general problem-solver engine; once this engine is developed, only the interfaces for the applications need to be developed; and (2) the fitness functions required by the GA force the physical distribution schedulers to perform approximate performance estimation, which was not taken into consideration when the rule-based schedulers were used. The two schedulers discussed were implemented at real distribution centers and brought much efficiency to their management.
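The "general problem-solver engine" idea can be sketched with a plain permutation-encoded genetic algorithm in which only the fitness function and the decoding of a chromosome into a schedule are application-specific; the operators and parameters below are generic textbook choices, not those of the reported schedulers.

```python
# Hedged sketch: generic permutation GA; only `fitness` is application-specific.
import random

def ga_schedule(num_jobs, fitness, pop_size=40, gens=200, mut_rate=0.2):
    """fitness(order) -> value to maximize, where order is a permutation of jobs."""
    def make():
        p = list(range(num_jobs))
        random.shuffle(p)
        return p

    def crossover(a, b):
        # Order-preserving crossover: copy a slice from parent a, fill the rest from b.
        i, j = sorted(random.sample(range(num_jobs), 2))
        child = [None] * num_jobs
        child[i:j] = a[i:j]
        rest = [g for g in b if g not in child[i:j]]
        k = 0
        for idx in range(num_jobs):
            if child[idx] is None:
                child[idx] = rest[k]
                k += 1
        return child

    def mutate(p):
        if random.random() < mut_rate:
            i, j = random.sample(range(num_jobs), 2)
            p[i], p[j] = p[j], p[i]
        return p

    pop = [make() for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]              # truncation selection
        children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return max(pop, key=fitness)
```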

18.
A scheduling-interval analysis algorithm for beam dwell scheduling in digital array radar
To address the beam dwell scheduling problem in digital array radar, a scheduling algorithm based on scheduling-interval analysis is studied. The algorithm jointly analyzes all dwell tasks that request execution within one scheduling interval and interleaves pulses during scheduling. The scheduling criteria take full account of each task's working-mode priority and deadline, and task drop rate, achieved value rate, and system time utilization are used as evaluation metrics. Simulation results show that the modified-deadline criterion mainly emphasizes task urgency while the modified working-mode-priority criterion mainly emphasizes task importance, whereas the deadline-then-priority and priority-then-deadline criteria strike a better balance between the two and outperform the other criteria in overall performance.
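A tiny illustration of the composite criteria being compared, assuming hypothetical `deadline` and `priority` fields on each dwell request; interval analysis and pulse interleaving are not modeled.

```python
# Hedged sketch: order dwell requests within one scheduling interval by either
# (deadline, priority) or (priority, deadline).
def order_requests(requests, criterion="deadline_then_priority"):
    """requests: list of dicts with 'deadline' and 'priority' (larger = more important)."""
    if criterion == "deadline_then_priority":
        key = lambda r: (r["deadline"], -r["priority"])   # urgency first
    else:  # "priority_then_deadline"
        key = lambda r: (-r["priority"], r["deadline"])   # importance first
    return sorted(requests, key=key)
```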

19.
Packet scheduling for OFDMA based relay networks
The combination of relay networks with orthogonal frequency division multiple access (OFDMA) has been proposed as a promising solution for next-generation wireless systems. Considering different traffic classes and user quality of service (QoS), three efficient scheduling algorithms are introduced for such networks. The round-robin (RR) algorithm in relay networks serves as a performance benchmark. Numerical results show that the proposed algorithms achieve significant improvement in system throughput and reduce the system packet loss rate compared with RR and with a system without relaying (a traditional network). Furthermore, comparisons are carried out among the three proposed algorithms.

20.
In this paper, we consider the problem of hard-real-time (HRT) multiprocessor scheduling of embedded streaming applications modeled as acyclic dataflow graphs. Most of the hard-real-time scheduling theory for multiprocessor systems assumes independent periodic or sporadic tasks. Such a simple task model is not directly applicable to dataflow graphs, where nodes represent actors (i.e., tasks) and edges represent data-dependencies. The actors in such graphs have data-dependency constraints and do not necessarily conform to the periodic or sporadic task models. In this work, we prove that the actors in acyclic Cyclo-Static Dataflow (CSDF) graphs can be scheduled as periodic tasks. Moreover, we provide a framework for computing the periodic task parameters (i.e., period and start time) of each actor, and handling sporadic input streams. Furthermore, we define formally a class of CSDF graphs called matched input/output (I/O) rates graphs, which represents more than 80% of streaming applications. We prove that strictly periodic scheduling is capable of achieving the maximum achievable throughput of an application for matched I/O rates graphs. Therefore, hard-real-time schedulability analysis can be used to determine the minimum number of processors needed to schedule matched I/O rates applications while delivering the maximum achievable throughput. This can be of great use for system designers during the Design Space Exploration (DSE) phase.
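A small sketch of one way to derive periods for an acyclic (C)SDF graph, assuming the repetition vector q and worst-case execution times C are given: with L = lcm(q), setting T_i = (L / q_i) * ceil(max_j(q_j C_j) / L) makes q_i T_i identical for all actors and guarantees T_i >= C_i. This is a simplified derivation in the spirit of the framework; start-time computation and sporadic-input handling from the paper are omitted.

```python
# Hedged sketch: compute strictly periodic actor periods from the repetition
# vector q and worst-case execution times (WCETs).
from math import gcd, ceil

def lcm(values):
    out = 1
    for v in values:
        out = out * v // gcd(out, v)
    return out

def periods(q, wcet):
    """q: {actor: repetition count}; wcet: {actor: worst-case execution time}."""
    L = lcm(q.values())
    # Common per-iteration workload bound, rounded up to a multiple of L.
    scale = ceil(max(q[a] * wcet[a] for a in q) / L)
    return {a: (L // q[a]) * scale for a in q}

# Hypothetical three-actor chain: q = {'src': 3, 'f': 2, 'snk': 1} with
# wcet = {'src': 2, 'f': 5, 'snk': 4} gives L = 6, scale = ceil(10/6) = 2 and
# periods {'src': 4, 'f': 6, 'snk': 12}; the ceiling of the total utilization
# (2/4 + 5/6 + 4/12, about 1.67, so 2 processors) lower-bounds the processor
# count needed to sustain this throughput.
```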
