Similar articles
 Found 20 similar articles (search time: 0 ms)
1.
Provenance is information about the origin and creation of data. In data science and engineering for cloud environments, such information is useful and sometimes critical. In data analytics, making data-driven decisions requires tracing back history and reproducing final or intermediate results, and even tuning models and adjusting parameters in real time. In the cloud in particular, users need to evaluate the trustworthiness of data and pipelines. In this paper, we propose LogProv, a solution that provides these functions for big data provenance. It requires renovating data pipelines or parts of the big data software infrastructure to generate structured logs for pipeline events, and it stores data and logs separately in cloud storage. The data are explicitly linked to the logs, which implicitly record pipeline semantics. Semantic information can be retrieved from the logs easily because they are well defined and structured beforehand. We implemented and deployed LogProv in Nectar Cloud, in association with Apache Pig and the Hadoop ecosystem, and adopted Elasticsearch to provide the query service. LogProv was evaluated empirically through case studies. The results show that LogProv is efficient: the performance overhead is no more than 10%, queries are answered within 1 second, trustworthiness is marked clearly, and there is no impact on the data processing logic of the original pipelines.

2.
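To address load imbalance in grid environments, a hierarchical dynamic load-balancing mechanism is proposed. The mechanism uses a stochastic service model to describe the characteristics of grid task flows and the dynamic load state of resources, and reduces the intra-site load-balancing problem to a goal-constrained programming problem. The effectiveness of the hierarchical mechanism is analyzed theoretically, and an algorithm for solving the optimization problem is designed. Simulation results show that the hierarchical load-balancing algorithm outperforms the earlier RBA and DBA algorithms in average response time and system throughput.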

3.
The problem of redistributing the work load on parallel computers is considered. An optimal redistribution algorithm, which minimises the Euclidean norm of the migrating load, is derived. The relationship between this algorithm and some existing algorithms is discussed and the convergence of the new algorithm is studied. Finally, numerical results on randomly generated graphs as well as on graphs related to real meshes are given to demonstrate the effectiveness of the new algorithm. © 1998 John Wiley & Sons, Ltd.
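
The abstract above does not state its formulation, so the following is only a sketch of how a minimum-Euclidean-norm redistribution is commonly posed. The symbols are assumptions introduced here: $b$ is the vector of current loads, $\bar b$ the average load, $A$ the vertex-edge incidence matrix of the processor graph ($+1$ at the sending end of an edge, $-1$ at the receiving end), and $x$ the load moved along each edge.

```latex
\begin{aligned}
&\min_{x}\ \tfrac12\, x^{\mathsf T} x
  \quad\text{subject to}\quad A x = b - \bar b\,\mathbf{1}, \\
&\text{stationarity of } \tfrac12 x^{\mathsf T}x - \lambda^{\mathsf T}\!\left(Ax - (b - \bar b\,\mathbf 1)\right)
  \ \Rightarrow\ x = A^{\mathsf T}\lambda, \\
&\text{hence}\quad L\lambda = b - \bar b\,\mathbf{1}, \qquad L = A A^{\mathsf T}, \\
&\text{and the load sent along edge } (i,j) \text{ is } x_{ij} = \lambda_i - \lambda_j .
\end{aligned}
```

Here $L$ is the graph Laplacian. For a connected graph the system is solvable because the right-hand side sums to zero, and $\lambda$ is determined up to an additive constant that cancels in $x_{ij}$.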

4.
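刘滨, 石峰. 《计算机工程与设计》 (Computer Engineering and Design), 2007, 28(6): 1327-1329, 1333
To address load imbalance in homogeneous multiprocessor systems, a distributed, sender-initiated dynamic load-balancing algorithm is proposed. The algorithm uses CPU queue length to measure processor load, uses process execution time to select the load suitable for migration, and uses a fairly complete message mechanism to propagate processor load states and load-balancing requests. It is suited to compute-intensive tasks. Experimental results verify the effectiveness of the algorithm.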

5.
Multimedia Tools and Applications

6.
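An improved load-balancing algorithm based on dynamic feedback   Total citations: 12, self-citations: 0, citations by others: 12
Load balancing is an important problem in cluster system research, and the load-balancing algorithm is the core of cluster task allocation. This paper introduces the load-balancing algorithms in LVS, discusses the shortcomings of commonly used algorithms, and, after analyzing their respective strengths and weaknesses, proposes an improved feedback-based load-balancing algorithm. The algorithm introduces a load redundancy parameter to describe the load of cluster nodes more accurately and, while taking the real load and processing capacity of service nodes into account, keeps the load balancer's task-allocation algorithm as simple as possible. Test results show that the algorithm outperforms static algorithms.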

7.
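Because traditional physical cluster systems cannot respond flexibly to large-scale Internet applications, a comprehensive load-balancing mechanism for virtual machine clusters in cloud environments is proposed. The method first periodically collects metrics for each virtual machine node in the cluster, including CPU, memory, number of connections, response time, and the load of the hosting physical machine. It then computes each node's comprehensive load as a weighted combination of these metrics and derives the node's weight. Finally, a scheduler distributes task requests accordingly. This addresses the uneven load of traditional cluster systems and their inability to adapt to changing network conditions. Experimental results show that, compared with the weighted round-robin (WRR) and weighted least-connection (WLC) scheduling schemes, the mechanism maintains a lower response time under high concurrency and can add or remove virtual machines in real time according to the cluster's comprehensive load, usually balancing the whole cluster within 5 s.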
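
The weighted-load step described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the metric weights, the assumption that every metric is already normalized to [0, 1], the rule `weight = 1 - load`, and all names (`METRIC_WEIGHTS`, `NodeMetrics`, `pick_node`) are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical metric weights (sum to 1); the abstract does not give the real values.
METRIC_WEIGHTS = {"cpu": 0.3, "mem": 0.2, "conn": 0.2, "resp": 0.2, "host": 0.1}


@dataclass
class NodeMetrics:
    """One sampling round for a VM node; every metric is assumed normalized to [0, 1]."""
    name: str
    cpu: float    # CPU utilization
    mem: float    # memory utilization
    conn: float   # connection count relative to an assumed maximum
    resp: float   # response time relative to an assumed maximum
    host: float   # load of the hosting physical machine


def composite_load(m: NodeMetrics) -> float:
    """Weighted sum of the collected metrics."""
    return sum(w * getattr(m, key) for key, w in METRIC_WEIGHTS.items())


def node_weight(m: NodeMetrics) -> float:
    """Assumed rule: the lighter the composite load, the larger the scheduling weight."""
    return max(0.0, 1.0 - composite_load(m))


def pick_node(nodes: list[NodeMetrics]) -> NodeMetrics:
    """Send the next request to the node with the largest weight."""
    return max(nodes, key=node_weight)
```

For example, given one node with every metric at 0.9 and another with every metric at 0.2, `pick_node` selects the second. A production scheduler would more likely spread requests in proportion to the weights rather than always choosing the single best node.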

8.
We present Stratosphere, an open-source software stack for parallel data analysis. Stratosphere brings together a unique set of features that allow the expressive, easy, and efficient programming of analytical applications at very large scale. Stratosphere’s features include “in situ” data processing, a declarative query language, treatment of user-defined functions as first-class citizens, automatic program parallelization and optimization, support for iterative programs, and a scalable and efficient execution engine. Stratosphere covers a variety of “Big Data” use cases, such as data warehousing, information extraction and integration, data cleansing, graph analysis, and statistical analysis applications. In this paper, we present the overall system architecture design decisions, introduce Stratosphere through example queries, and then dive into the internal workings of the system’s components that relate to extensibility, programming model, optimization, and query execution. We experimentally compare Stratosphere against popular open-source alternatives, and we conclude with a research outlook for the coming years.

9.
Trends in big data analytics   Total citations: 1, self-citations: 0, citations by others: 1
One of the major applications of future generation parallel and distributed systems is in big-data analytics. Data repositories for such applications currently exceed exabytes and are rapidly increasing in size. Beyond their sheer magnitude, these datasets and associated applications’ considerations pose significant challenges for method and software development. Datasets are often distributed and their size and privacy considerations warrant distributed techniques. Data often resides on platforms with widely varying computational and network capabilities. Considerations of fault-tolerance, security, and access control are critical in many applications (Dean and Ghemawat, 2004; Apache Hadoop). Analysis tasks often have hard deadlines, and data quality is a major concern in yet other applications. For most emerging applications, data-driven models and methods, capable of operating at scale, are as-yet unknown. Even when known methods can be scaled, validation of results is a major issue. Characteristics of hardware platforms and the software stack fundamentally impact data analytics. In this article, we provide an overview of the state-of-the-art and focus on emerging trends to highlight the hardware, software, and application landscape of big-data analytics.

10.
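To address the extra communication overhead incurred during dynamic load balancing in honeynets, the characteristics of honeynet dynamic load balancing are first analyzed and a mathematical model of dynamic load balancing based on minimum communication overhead is established. A new method that uses a genetic algorithm to solve the problem is then designed and implemented. Experimental tests show that, compared with a greedy algorithm, the genetic algorithm obtains load-allocation schemes with lower communication overhead, further reducing the number of load migrations in honeynet dynamic load balancing and lowering the extra communication overhead.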

11.
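To execute tasks efficiently without prior knowledge in big data analytics systems such as Hadoop and Spark, CRWScheduler, a task scheduler based on cumulative work (CRW), is designed. The scheduler moves tasks between a low-weight queue and a high-weight queue according to their CRW. When allocating resources to a job, it considers both the queue the job is in and the job's instantaneous resource usage, which significantly improves system performance without any prior knowledge of jobs. A CRWScheduler prototype was implemented on Apache Hadoop YARN. Experiments on a 28-node benchmark cluster show that, compared with YARN's fair scheduling mechanism, job flow time (JFT) is reduced by 21% on average and the 95th-percentile JFT by up to 35%, with further performance gains when cooperating with a task-level scheduler.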
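
The abstract above gives no details of the queue-switching rule, so the sketch below is purely illustrative. It assumes a single threshold on cumulative work that moves a job from the high-weight queue to the low-weight queue, and a priority that divides the queue weight by the job's current resource usage. The constants, the `Job` class, and the scoring formula are all hypothetical and are not taken from CRWScheduler.

```python
from dataclasses import dataclass

HIGH_WEIGHT = 4.0       # hypothetical weight of the high-weight queue
LOW_WEIGHT = 1.0        # hypothetical weight of the low-weight queue
CRW_THRESHOLD = 100.0   # hypothetical cumulative-work threshold for demotion


@dataclass
class Job:
    name: str
    crw: float = 0.0   # cumulative work observed so far (no prior knowledge needed)
    in_use: int = 0    # containers the job currently holds

    @property
    def queue_weight(self) -> float:
        # Assumed rule: jobs that have done little work stay in the high-weight queue.
        return HIGH_WEIGHT if self.crw < CRW_THRESHOLD else LOW_WEIGHT


def record_work(job: Job, amount: float) -> None:
    """Accumulate the work a job has consumed; this may change its queue."""
    job.crw += amount


def next_job(jobs: list[Job]) -> Job:
    """Give the next free container to the job with the best queue-weight-to-usage ratio."""
    return max(jobs, key=lambda j: j.queue_weight / (1 + j.in_use))
```

Under these assumptions a newly arrived job outranks a long-running one, but a job that already holds many containers can still lose out to a demoted job that holds none, which mirrors the abstract's point that both the queue and the instantaneous resource usage matter.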

12.
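Because the distributed architecture of traditional networks makes it difficult for load-balancing techniques to meet requirements for low cost, high flexibility, and adaptive adjustment, an SDN-based load-balancing algorithm for data center networks is proposed. First, a weight is assigned to each path according to its current load and link load fluctuation, and this weight serves as the basis for path selection. Second, a load-balancing degree is defined to measure the network load state. Finally, the size range of the flows that need scheduling is further restricted to ensure efficient flow scheduling. Simulation results show that, compared with other algorithms, the proposed algorithm effectively improves network resource utilization and balances load across the whole network.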

13.
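With the growth of networked database applications, the load-balancing problem of distributed database systems has become prominent. Most current distributed database management systems have no load-balancing function of their own and rely on the operating system's load-balancing mechanism. As a result, system load is evaluated at a fine granularity and the overhead of load transfer increases. This paper discusses the elements of a dynamic load-balancing strategy and, for distributed database systems, proposes transaction queue length as the load-evaluation criterion and gives a dynamic load-balancing strategy and algorithm.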

14.
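A dynamic load-balancing algorithm for intrusion detection systems   Total citations: 2, self-citations: 0, citations by others: 2
Current intrusion detection requires not only pattern matching but also protocol anomaly detection. A new dynamic load-balancing algorithm with a two-layer structure is proposed: network traffic is first partitioned by service type, each partition is then distributed a second time, and the corresponding protocol anomaly detection is applied to each traffic type. Without sacrificing system performance, the algorithm effectively improves the detection efficiency of the network intrusion detection system and lowers the false-positive rate. It also adapts to changes in network traffic and lowers the false-negative rate.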

15.
In this paper we consider the application of accelerated techniques in order to increase the rate of convergence of the diffusive iterative load balancing algorithms. In particular, we compare the application of Semi-Iterative, Second Degree and Variable Extrapolation techniques on the basic diffusion method for various types of network graphs.
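
For orientation, the basic (unaccelerated) diffusion method that the abstract above builds on can be sketched as below. This is a generic first-order diffusion iteration, not the paper's code: each node repeatedly exchanges a fraction `alpha` of the load difference with each neighbour. The function names and the stopping rule are assumptions, and the accelerated variants named in the abstract are not shown.

```python
def diffusion_step(load, neighbors, alpha):
    """One first-order diffusion sweep: x_i <- x_i + alpha * sum_j (x_j - x_i)."""
    return [
        load[i] + alpha * sum(load[j] - load[i] for j in neighbors[i])
        for i in range(len(load))
    ]


def diffuse(load, neighbors, alpha, tol=1e-6, max_iters=10_000):
    """Iterate until the spread between the heaviest and lightest node is at most tol.

    `neighbors` must describe an undirected graph (j in neighbors[i] iff
    i in neighbors[j]) so that the total load is conserved.
    Returns the final loads and the number of sweeps performed.
    """
    for iteration in range(max_iters):
        if max(load) - min(load) <= tol:
            return load, iteration
        load = diffusion_step(load, neighbors, alpha)
    return load, max_iters


# Usage: a ring of four processors with all load initially on node 0.
ring = [[1, 3], [0, 2], [1, 3], [0, 2]]
balanced, sweeps = diffuse([8.0, 0.0, 0.0, 0.0], ring, alpha=0.25)
```

On this ring the loads approach 2.0 on every node while the total stays at 8.0. The choice `alpha=0.25` is only an example; the iteration converges only for sufficiently small `alpha`, and the admissible range depends on the graph.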

16.
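Existing load-balancing routing algorithms for LEO satellite networks suffer from high control overhead, untimely route updates, and uneven allocation by the traffic-adjustment mechanism. To address these problems, DRLB, a dynamic routing algorithm for LEO satellite networks based on load balancing, is proposed. A new routing mechanism is designed from the path records of satellite nodes and a backward-Agent reading strategy, which yields the dynamic satellite topology. The packet format of the forward Agent is analyzed and redundant fields are removed to reduce network overhead. A forward-Agent destination-selection strategy is built from the data transmission interval to improve route-update efficiency, and the traffic-adjustment factor is improved to account for the uneven traffic distribution across satellite latitudes, giving better load balancing. Simulation results show that, compared with the SDRZ-MA algorithm, DRLB performs better in terms of satellite-to-ground control overhead and average end-to-end delay.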

17.
Summary: This paper explores and applies the concept of cooperation to the load balancing problem in a computer network. We discuss an analytical model and propose a scheme which can be classified as distributed, dynamic, and stochastic. In the case of a homogeneous network, we guarantee that the load is balanced and no communication cost or information exchange is necessary.

18.
A new Some-Read-Any-Write (SRAW) fault-tolerant algorithm for redundant services is presented that allows a system to adjust to failures dynamically in order to maintain availability and improve performance. SRAW is based upon dynamic and active load balancing. By introducing a dynamic and active load balancing scheme into redundant services, not only can the processing speed of requests be greatly improved, but load balancing can also be achieved simply and efficiently. Integrated with the consistency protocol in this paper, SRAW can also be applied to state services. The performance of the SRAW algorithm is also analyzed, and comparisons with other fault-tolerant algorithms, especially with RAWA, indicate that SRAW efficiently improves the performance of redundant services while guaranteeing system availability.

19.
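A dynamic load-balancing algorithm with QoS guarantees in LTE networks   Total citations: 1, self-citations: 0, citations by others: 1
This paper studies dynamic load-balancing algorithms in 3GPP LTE networks that account for different quality-of-service (QoS) requirements. Load imbalance between cells affects users with different QoS requirements in different ways. For users with a guaranteed rate requirement, load imbalance leads to a higher new-call blocking rate; for users without a rate requirement, it leads to severely degraded throughput for edge users in busy cells. The load-balancing problems of these two classes of users are tightly coupled across the network and are hard to analyze with a single unified objective function. A corresponding multi-objective optimization problem is therefore formulated. Its objective functions are a load-balance index function for users with QoS requirements and a total utility function for users without QoS requirements, both taken over the whole network, and its constraints are the actual physical resources of the cells and the users' QoS requirements. After the complexity of the problem is analyzed, a real-time, low-complexity, low-overhead distributed load-balancing framework is proposed, comprising QoS-guaranteed hybrid scheduling, QoS-aware load-balancing handover, and call admission control. Finally, system-level simulation results show that the proposed framework achieves good load balancing: it significantly reduces the new-call blocking rate of users with QoS requirements and, at the cost of a slight loss in the total throughput of users without QoS requirements, substantially increases the actual throughput of edge users in busy cells.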

20.
The emergence of big data analytics (BDA) has posed opportunities as well as multiple challenges to business practitioners, who have called for research on the behavioural factors underlying BDA adoption at the individual level. The purpose of this study is to extend the information systems (IS) research on storytelling and to explore the role and characteristics of deliberate storytelling in individual‐level BDA adoption. This case study used the grounded theory approach to extract qualitative data from 24 interviews, field notes, and documentary data. The explicit contributions of the study to the literature include (a) increasing our understanding of the facilitating role of deliberate storytelling in individual‐level BDA adoption, (b) identifying four deliberate storytelling patterns and seven underlying corporate stories disseminated by organizations to influence individual behaviour, and (c) defining the core characteristics of effective deliberate storytelling. This study has multiple implications for business practitioners and demonstrates how deliberate storytelling can be used as a facilitating mechanism in daily business practice.
