Similar literature
1.
The paper addresses the integration of hybrid cloud with mobile applications. The challenge in hybrid mobile cloud resource provisioning is the trade-off among energy consumption, the performance delivered to users, and the utilization of resources such as processing power and network. The proposed elastic hybrid mobile cloud resource provisioning model is jointly optimized to improve the mobile user experience within the constraints of available resources and user QoS requirements. The paper presents the system utility of a hybrid cloud system comprising local cloud and public cloud infrastructure. From the perspectives of both mobile applications and cloud providers, the proposed system utility is optimized to improve the performance of mobile applications and the utilization of cloud resources. The proposed elastic hybrid mobile cloud resource provisioning algorithm includes two sub-algorithms. A series of experiments is conducted to evaluate and validate the performance of the proposed algorithm, and the comparison results and analyses are discussed. The experimental results show an improvement over previous work.

2.
We introduce a coefficient update procedure into existing batch and online dictionary learning algorithms. We first propose a coefficient-updated version of the Method of Optimal Directions (MOD) dictionary learning algorithm (DLA). The MOD algorithm with coefficient updates yields a computationally expensive dictionary learning iteration with a high convergence rate. Second, we present a periodically coefficient-updated version of the online Recursive Least Squares (RLS)-DLA, in which the data are used sequentially to gradually improve the learned dictionary. The developed algorithm provides a periodic update improvement over the RLS-DLA, and we call it the Periodically Updated RLS Estimate (PURE) algorithm for dictionary learning. The performance of the proposed DLAs in synthetic dictionary learning and image denoising settings demonstrates that the coefficient update procedure improves the dictionary learning ability.

3.
A dynamic control model is considered for processing computationally intensive tasks that allow arbitrary data parallelization in a heterogeneous computing system. Although the execution times of the tasks are not known, their deadlines must be agreed upon and specified. Guaranteed worst-case estimates of the completion time of each task are obtained. The model makes it possible to account for and minimize the maximum possible overrun of the assigned deadlines for any set of original tasks.

4.
Embedded systems often have conflicting constraints, such as energy and time, which considerably complicate their design. In this context, this work proposes a mechanism for supporting design decisions on the energy consumption and performance of embedded system applications. To demonstrate the practical usability of the proposed methodology, a real case study as well as customized examples are presented. The estimates obtained through the conceived model are 93% close to the respective measurements obtained from the real hardware platform.

5.
We explore optimization strategies and the resulting performance of two stream-based video applications, video texture and color tracker, on a cluster of SMPs. The two applications are representative of a class of emerging applications, which we call “stream-based applications”, that are sensitive to both the latency of individual results and overall throughput. Such applications require non-trivial parallelization techniques in order to improve both latency and throughput, given that the stream data emanates from a limited set of sources (exactly one in the two applications studied) and that the distribution of the data cannot be done a priori. We suggest techniques that address in a coordinated fashion the problems of data distribution and work partitioning. We believe the two problems are related and need to be addressed together. We have parallelized the two applications using the Stampede cluster programming system, which provides abstractions for implementing time- and throughput-sensitive applications elegantly and efficiently. For the Video Textures application we show that we can achieve a speedup of 24.26 on a 112-processor cluster. For the Color Tracker application, where latency is more crucial, we identify the extent of data parallelism that ensures that the slowest member of the pipeline is no longer the bottleneck for achieving a decent frame rate.

6.
In this paper, we address the problem of supporting stateful workflows following a Function-as-a-Service (FaaS) model in edge networks. In particular, we focus on the problem of data transfer, which can be a performance bottleneck due to the limited speed of communication links in some edge scenarios, and we propose three different schemes: a pure FaaS implementation; StateProp, i.e., propagation of the application state throughout the entire chain of functions; and StateLocal, i.e., a solution where the state is kept local to the workers that run functions and retrieved only as needed. We then extend the proposed schemes to the more general case of applications modeled as Directed Acyclic Graphs (DAGs), which cover a broad range of practical applications, e.g., in the Internet of Things (IoT) area. Our contribution is validated via a prototype implementation. Experiments in emulated conditions show that applying the data locality principle significantly reduces the volume of network traffic required and improves the end-to-end delay performance, especially with local caching on edge nodes and low link speeds.

7.
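Jiang Yanhua. Computer Engineering and Design (《计算机工程与设计》), 2011, 32(10): 3428-3430, 3476
A new method for predicting task execution time in a grid environment is proposed and described. The method needs no historical data and lets users predict task execution time accurately before submitting a grid job. The prediction module uses R scripts as its basic tool and combines three techniques: static analysis, benchmark database parsing, and a compiler-based approach. Because the R software makes it easy to write task code of many types from the command line, the module is adaptive, ...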

8.
Execution profiles are important in analyzing the performance of computer programs on a given computer system. However, accurate and complete profiles are difficult to arrive at for programs that follow the client-server model of computing, as in the popular X Window System. In X Window applications, considerable computation is invoked at the display server and this computation is an important part of the overall execution profile. The profiler presented in this paper generates meaningful profiles for X Window applications by estimating the time spent in servicing the messages in the display server. The central idea is to analyze a protocol-level trace of the interaction between the application and the display server and thereby construct an execution profile from the trace and a set of metrics about the target display server. Experience using the profiler for examining bottlenecks is presented.

9.
Neural Computing and Applications - This paper presents a fault detection system for photovoltaic standalone applications based on Gaussian Process Regression (GPR). The installation is a...

10.
In this paper, we propose a method for estimating the execution time of a PLC (Programmable Logic Controller) program using the information available at the design stage. So far, the usual practice has been to check the execution time after the PLC application program has been fully developed. If the developed PLC program does not meet the required response time, it is modified interactively. This situation can be avoided to a great extent with our proposed estimation method. We calculate the maximum theoretical execution time and later refine it using application statistics. Preliminary applications of our method show encouraging results.

11.
Nowadays, real-time embedded applications have to cope with an increasing demand for functionality, which requires increasing processing capabilities. To this end, real-time systems are being implemented on top of high-performance multicore processors that run multithreaded periodic workloads by allocating threads to individual cores. In addition, to improve both performance and energy savings, the industry is introducing new multicore designs, such as ARM’s big.LITTLE, that include heterogeneous cores in the same package. A key issue in improving energy savings in multicore embedded real-time systems and reducing the number of deadline misses is to accurately estimate the execution time of the tasks considering the supported processor frequencies. Two main aspects make this estimation difficult. First, the running threads compete with one another for shared resources. Second, almost all current microprocessors implement Dynamic Voltage and Frequency Scaling (DVFS) regulators to dynamically adjust the voltage/frequency at run-time according to the workload behavior. Existing execution time estimation models rely on off-line analysis or on the assumption that the task execution time scales linearly with the processor frequency, which can introduce significant deviations since the memory system uses a different power supply. In contrast, this paper proposes the Processor–Memory (Proc–Mem) model, which dynamically predicts the distinct task execution times depending on the implemented processor frequencies. A power-aware EDF (Earliest Deadline First)-based scheduler using the Proc–Mem approach has been evaluated and compared against the same scheduler using a typical Constant Memory Access Time model, namely CMAT. Results on a heterogeneous multicore processor show that the average deviation of Proc–Mem is only 5.55% with respect to the actual measured execution time, while the average deviation of the CMAT model is 36.42%. These results translate into significant energy savings, 18% on average and up to 31% in some mixes, in comparison to CMAT for a similar number of deadline misses.
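Why the linear-scaling assumption can mislead is easy to see in a small sketch. The Python below is a generic illustration and is not taken from the paper: it assumes, hypothetically, that a task's time splits into a compute part that scales inversely with core frequency and a memory part that does not scale at all. All function names and numbers are invented for illustration.

```python
def linear_estimate(t_ref: float, f_ref: float, f_new: float) -> float:
    """Assume the whole execution time scales inversely with core frequency."""
    return t_ref * f_ref / f_new


def split_estimate(t_cpu_ref: float, t_mem: float, f_ref: float, f_new: float) -> float:
    """Scale only the compute part; treat memory time as frequency-independent.

    This split is an assumption made for illustration only.
    """
    return t_cpu_ref * f_ref / f_new + t_mem


# Hypothetical task: 10 ms at 2.0 GHz, of which 4 ms is spent waiting on memory.
t_linear = linear_estimate(10.0, 2.0, 1.0)    # whole 10 ms scaled to 1.0 GHz
t_split = split_estimate(6.0, 4.0, 2.0, 1.0)  # only the 6 ms compute part scaled
```

Under these made-up numbers the two estimates differ by 4 ms at the lower frequency, which is the kind of gap a frequency-aware scheduler would have to absorb as slack or as a missed deadline.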

12.
This note presents some new time update formulas for certain types of lattice algorithms used in autoregressive modeling of stationary time series. The new formulas enable the computation of the autoregressive coefficients in a number of operations per time step proportional to the model order.

13.
To accommodate the high demand for performance in smartphones, mobile cloud computing techniques, which aim to enhance a smartphone’s performance by utilizing powerful cloud servers, have been suggested. Among such techniques, execution offloading, which migrates a thread between a mobile device and a server, is often employed. In such execution offloading techniques, it is typical to decide dynamically which part of the code is to be offloaded by means of decision-making algorithms. To achieve optimal offloading performance, however, the gain and cost of offloading must be predicted accurately for such algorithms. Previous works made little effort to do this because an accurate prediction is usually expensive. Thus, in this paper, we introduce novel techniques to automatically generate accurate and efficient method-wise performance predictors for mobile applications and empirically show that they enhance the performance of offloading.
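The gain-versus-cost decision described in this abstract can be sketched generically. The Python below is an illustration under stated assumptions, not the predictors proposed in the paper: it offloads a method only when the predicted remote time plus the state-transfer time beats the predicted local time. The function and parameter names are hypothetical.

```python
def should_offload(local_ms: float, remote_ms: float,
                   state_bytes: int, bandwidth_bytes_per_ms: float,
                   rtt_ms: float = 0.0) -> bool:
    """Return True when offloading is predicted to finish sooner than local execution."""
    if bandwidth_bytes_per_ms <= 0:
        return False  # no usable link, so stay local
    # Cost of offloading: one round trip plus the time to ship the state.
    transfer_ms = rtt_ms + state_bytes / bandwidth_bytes_per_ms
    return remote_ms + transfer_ms < local_ms
```

The decision is only as good as the `local_ms` and `remote_ms` predictions fed into it, which is why inaccurate predictors can turn offloading into a net loss.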

14.
Summary: As a generalization of the multi-layer perceptron (MLP), the circular back-propagation neural network (CBP) possesses better adaptability. An improved version of the CBP (the ICBP) is presented in this paper. Despite having fewer adjustable weights, the ICBP has better adaptability than the CBP, which is consistent with the well-known Occam's razor principle for model selection. In its application to time series, considering both structural changes and correlations of the time series itself, we introduce the principle of discounted least squares (DLS) into the CBP and the ICBP, respectively, and further investigate their predictive capacity. Introducing DLS improves the predictive performance of both on a benchmark time series data set. Finally, a comparison of experimental results shows that the ICBP with DLS (DLS-ICBP) has better predictive performance than DLS-CBP. Supported by the Natural Science (grant No. BK2002092) and QingLan Project foundations of Jiangsu Province and the Returnee Foundation of China.

15.
Designing real-time systems is a challenging task, and many conflicting issues arise in the process. Among them, the most fundamental is choosing appropriate values for task parameters such as task periods, deadlines, and computation times, which directly influence system feasibility. Task periods and deadlines are generally known at the design stage and remain fixed throughout; task computation times, however, fluctuate significantly. Better quality of service or higher system utilization requires higher task computation values, but this flexibility comes at the price of system infeasibility. To the best of our knowledge, no optimal solution exists for extracting the optimal task computation times in a given range so that the overall system remains feasible under a specific scheduling algorithm. In this paper, we present a generalized bound on task schedulability, defined as a nonlinear inequality $h_i \le 0$ in the space of the execution times $c_i$. Based on this bound, the problem of adjusting task execution times, which determines the optimum $c_i$ for better system performance while still meeting all temporal requirements, is addressed by solving a standard nonlinear constrained optimization problem. Simulations on synthetic task sets are presented to compare the performance of our work with the most celebrated result, i.e., the LL-bound by Liu and Layland (J. ACM 20(1):40–61, 1973).
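For reference, the LL-bound used as the baseline in this abstract is the classical utilization test for rate-monotonic scheduling: n periodic tasks are schedulable if their total utilization does not exceed n(2^(1/n) − 1). A minimal Python sketch follows. The task values are made up for illustration, and the paper's own generalized bound is not reproduced here.

```python
def ll_bound(n: int) -> float:
    """Liu-Layland utilization bound for n periodic tasks under rate-monotonic scheduling."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return n * (2 ** (1.0 / n) - 1)


def passes_ll_test(tasks) -> bool:
    """tasks: list of (execution_time, period) pairs.

    The test is sufficient but not necessary: a task set that fails it
    may still be schedulable.
    """
    utilization = sum(c / t for c, t in tasks)
    return utilization <= ll_bound(len(tasks))


# Hypothetical task set: (execution time, period), total utilization 0.65.
example_tasks = [(1.0, 4.0), (1.0, 5.0), (2.0, 10.0)]
```

Because the test only bounds total utilization, it says how much room there is to raise execution times in aggregate, but not how to distribute that room across individual tasks.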

16.
To conserve space and power as well as to harness high performance in embedded systems, high utilization of the hardware is required. This can be facilitated through dynamic adaptation of the silicon resources in reconfigurable systems in order to realize various customized kernels as execution proceeds. Fortunately, the encountered reconfiguration overheads can be estimated. Therefore, if the scheduling of time-consuming kernels also considers the reconfiguration overheads, an overall performance gain can be obtained. We present our policy, experiments, and performance results of customizing and reconfiguring Field-Programmable Gate Arrays (FPGAs) for embedded kernels. Experiments involving EEMBC (EDN Embedded Microprocessor Benchmarking Consortium) and MiBench embedded benchmark kernels show high performance using our main policy, when considering reconfiguration overheads. Our policy reduces the required reconfigurations by more than 50% as compared to brute-force solutions, and performs within 25% of the ideal execution time while conserving 60% of the FPGA resources. Alternative strategies to reduce the reconfiguration overhead are also presented and evaluated.

17.
The steady-state simulators used for on-line performance prediction and on-line optimization in crude distillation units are often sensitive to small variations in the feed composition, which is specified in terms of a True Boiling Point (TBP) vs. volume percent distilled curve. The exact feed TBP is often not available during plant operation. In addition, stratification of raw crude oil into layers in the large tank farm sections causes severe operating problems in terms of the stability of the column. If the feed TBP can be predicted on-line, the necessary feedforward action can considerably reduce the operating problems. A model has been developed for backcalculation of the feed TBP using measured plant parameters. A heat balance is performed around an envelope encompassing the rectifying section of the fractionator and is followed by the calculation of Equilibrium Flash Vaporization (EFV) temperatures at six different locations of the column, which are correlated with the corresponding feed TBP temperatures. The second part of model tuning consists of calculating model parameters in the form of point efficiencies so as to minimize the discrepancy between the simulator-predicted and measured column parameters, which arises from modelling approximations such as the assumption of phase equilibria at each stage and the use of imperfect thermodynamic correlations. The simulator results, after tuning, were found to match the plant measurements within two percent in all the cases investigated. The simulator output was used to predict various product properties using a Property Prediction package, and these were also found to match well with laboratory measurements. Both the backcalculation of the feed TBP and the efficiency tuning need to be implemented on-line for inferential control and supervisory optimization.

18.
Today’s embedded systems are exposed to variations in load demand due to complex software applications, dynamic hardware platforms, and the impact of the run-time environment. When these variations are large, and efficiency is required, adaptive on-line resource managers may be deployed on the system to control its resource usage. An often neglected problem is whether these resource managers are stable, meaning that the resource usage is controlled under all possible scenarios. In this paper we develop mathematical models for real-time embedded systems and we derive conditions which, if satisfied, lead to stable systems. For the developed system models, we also determine bounds on the worst case response times of tasks. We also give an intuition of what stability means in a real-time context and we show how it can be applied for several resource managers. We also discuss how our results can be extended in various ways.

19.
Despite using multiple concurrent processors, a typical high‐performance parallel application is long‐running, taking hours, even days to arrive at a solution. To modify a running high‐performance parallel application, the programmer has to stop the computation, change the code, redeploy, and enqueue the updated version to be scheduled to run, thus wasting not only the programmer's time, but also expensive computing resources. To address these inefficiencies, this article describes how dynamic software updates (DSU) can be used to modify a parallel application on the fly, thus saving the programmer's time and using expensive computing resources more productively. The net effect of updating parallel applications dynamically can reduce the total time that elapses between posing a problem and arriving at a solution, otherwise known as time‐to‐discovery. To explore the benefits of dynamic updates for high performance applications, this article takes a two‐pronged approach. First, we describe our experiences of building and evaluating a system for dynamically updating applications running on a parallel cluster. We then review a large body of literature describing the existing state of the art in DSU and point out how this research can be applied to high‐performance applications. Our experimental results indicate that DSU have the potential to become a powerful tool in reducing time‐to‐discovery for high‐performance parallel applications. Copyright © 2010 John Wiley & Sons, Ltd.

20.
Predicting resource consumption and run time of computational workloads is crucial for efficient resource allocation, or cost and energy optimization. In this paper, we evaluate various machine learning techniques to predict the execution time of computational jobs. For experiments we use datasets from two application areas: scientific workflow management and data processing in the ALICE experiment at CERN. We apply a two-stage prediction method and evaluate its performance. Other evaluated aspects include: (1) comparing performance of global (per-workflow) versus specialized (per-job) models; (2) impact of prediction granularity in the first stage of the two-stage method; (3) using various feature sets, feature selection, and feature importance analysis; (4) applying symbolic regression in addition to classical regressors. Our results provide new valuable insights on using machine learning techniques to predict the runtime behavior of computational jobs.
