Similar literature
1.
Cloud computing has emerged as a main approach for managing huge volumes of distributed data in areas such as scientific operations and engineering experiments. In this regard, data replication in Cloud environments is a key strategy that reduces response time and improves reliability. One of the main features of a distributed environment is that data are replicated at various sites so that popular data are more available. Whenever a site does not hold a needed data file, it has to fetch the file from other locations. A parallel download approach is therefore applied to reduce download time: it enables a user to get different parts of a file from several sites simultaneously. In this work, we present a data replication strategy for Cloud systems, named the Dynamic Popularity aware Replication Strategy (DPRS), which leverages data access behavior. DPRS replicates only a small number of frequently requested data files, following the 80/20 idea. It determines the site to which a file is replicated based on the number of requests, free storage space, and site centrality. We introduce a parallel downloading approach that replicates data segments and downloads the replicated fragments in parallel to enhance overall performance. Using the CloudSim simulator, we evaluate effective network usage, mean job execution time, hit ratio, total number of replications, and percentage of storage filled. Extensive experiments demonstrate the effectiveness of DPRS under most access patterns.
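
As an illustration of the 80/20 idea mentioned in this abstract, the following minimal sketch picks the most-requested share of files as replication candidates. The abstract does not state the actual DPRS selection rule, so the function name `select_popular_files`, the `top_percent` parameter, and the tie-breaking by file name are assumptions made here for illustration. Site selection by free storage space and centrality is not modeled.

```python
def select_popular_files(request_counts, top_percent=20):
    """Return the most-requested `top_percent` of files (hypothetical rule).

    request_counts maps a file name to its number of requests.
    At least one file is returned whenever any file is known.
    """
    if not request_counts:
        return []
    # Rank by descending request count; break ties by name for determinism.
    ranked = sorted(request_counts, key=lambda f: (-request_counts[f], f))
    # Integer ceiling of len(ranked) * top_percent / 100.
    k = max(1, (len(ranked) * top_percent + 99) // 100)
    return ranked[:k]
```

With five known files and the default of 20 percent, only the single most-requested file would be chosen for replication.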

2.
In recent years, grid technology has grown so fast that it is now used in many scientific experiments and research centers. A large number of storage elements and computational resources are combined to form a grid, which gives shared access to extra computing power. In particular, a data grid deals with data-intensive applications and provides intensive resources across widely distributed communities. Data replication is an efficient way of distributing replicas across a data grid, making it possible to access the same data at different locations. Replication reduces data access time and improves system performance. In this paper, we propose a new dynamic data replication algorithm, named PDDRA, that improves on traditional algorithms. The algorithm is based on one assumption: members of a VO (Virtual Organization) have similar interests in files. Based on this assumption and on file access history, PDDRA predicts the future needs of grid sites and pre-fetches a sequence of files to the requesting grid site, so that the next time the site needs a file, it is available locally. This considerably reduces access latency, response time, and bandwidth consumption. PDDRA consists of three phases: storing file access patterns; requesting a file and performing replication and pre-fetching; and replacement. The algorithm was tested using OptorSim, a grid simulator developed by the European Data Grid project. The simulation results show that the proposed algorithm performs better than other algorithms in terms of job execution time, effective network usage, total number of replications, hit ratio, and percentage of storage filled.
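
The prediction step described in this abstract can be pictured with a small sketch that counts which file tends to follow which in an access history and suggests the most frequent successors for pre-fetching. This is not the PDDRA data structure, which the abstract does not specify. The names `build_successors` and `predict_prefetch` and the parameter `k` are hypothetical.

```python
from collections import Counter, defaultdict


def build_successors(history):
    """Count, for each file, which files were requested right after it."""
    successors = defaultdict(Counter)
    for current, following in zip(history, history[1:]):
        successors[current][following] += 1
    return successors


def predict_prefetch(successors, requested, k=2):
    """Return up to k files most often requested right after `requested`."""
    counts = successors.get(requested, Counter())
    return [name for name, _ in counts.most_common(k)]
```

A site that has just requested a file could pre-fetch the returned candidates so that they are local when the next request arrives.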

3.
Grid computing, in which a network of computers is integrated to create a very fast virtual computer, is becoming ever more prevalent. Examples include the TeraGrid and Planet-lab.org, as well as applications on the existing Internet that take advantage of the unused computing and storage capacity of idle desktop machines, such as Kazaa, SETI@home, Climateprediction.net, and Einstein@home. With many alternative computers available, each with varying extra capacity, and each of which may connect to or disconnect from the grid at any time, it may make sense to send the same task to more than one computer. The application can then use the output of whichever computer finishes the task first. Thus, the important issue of dynamically assigning tasks to individual computers is complicated in grid computing by the option of assigning multiple copies of the same task to different computers. We show that, under fairly mild and often reasonable conditions, maximizing task replication stochastically maximizes the number of task completions by any time. That is, it is better to do the same task on as many computers as possible than to assign different tasks to individual computers. We show that maximal task replication is optimal when tasks have identical size and processing times have an NWU (New Worse than Used; defined later) distribution. Computers may be heterogeneous and their speeds may vary randomly, as is the case in grid computing environments. We also show that maximal task replication, along with a cμ rule, stochastically maximizes the successful task completion process when task processing times are exponential and depend on both the task and the computer, and tasks have different probabilities of completing successfully.

4.
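adPD: A Speed-Adaptive Dynamic Parallel Download Technique (cited 5 times: 0 self-citations, 5 by others)
After reviewing existing parallel download algorithms, this paper proposes adPD, a new speed-adaptive dynamic parallel download mechanism. By dynamically assigning download tasks of different sizes to connections of different speeds, adPD adapts well to changes in connection speed, allocates the download workload in proportion to speed, and makes full use of the available bandwidth. In addition, by dividing the file into blocks of variable size, adPD keeps the number of data requests as low as possible and shortens the idle time spent waiting on requests, which both reduces the load on the serving nodes and increases download speed. Finally, experimental results are used to analyze the practical performance of adPD and confirm that it is an efficient parallel download algorithm.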
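
The speed-proportional allocation described above can be sketched as follows. This is only an illustration of dividing the remaining bytes in proportion to measured connection speeds. It is not the adPD algorithm itself, and the function name `allocate_by_speed`, the use of integer speeds, and the rule that the fastest connection absorbs rounding leftovers are all assumptions.

```python
def allocate_by_speed(remaining_bytes, speeds):
    """Split remaining_bytes across connections in proportion to speed.

    speeds holds one measured integer speed per connection (for example
    in KB/s). Returns one byte count per connection; the counts always
    sum to remaining_bytes.
    """
    total = sum(speeds)
    if total <= 0:
        raise ValueError("at least one connection must have positive speed")
    # Integer share for each connection, rounded down.
    shares = [remaining_bytes * s // total for s in speeds]
    # Bytes lost to rounding go to the fastest connection (assumed rule).
    fastest = max(range(len(speeds)), key=lambda i: speeds[i])
    shares[fastest] += remaining_bytes - sum(shares)
    return shares
```

Calling the function again with fresh speed measurements after each round would let the assignment follow changes in connection speed.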

5.
The co-allocation architecture was developed to enable parallel transfer of files from multiple replicas stored on different servers. Several co-allocation strategies have been coupled and used to exploit the different transfer rates of the various client-server links and to address dynamic rate fluctuations by dividing files into multiple blocks of equal size. This paper presents a dynamic file transfer scheme for the co-allocation architecture, called the dynamic adjustment strategy (DAS), for concurrently transferring a file from multiple replicas stored on multiple servers within a data grid. The scheme removes the performance obstacle caused by faster servers waiting idly in co-allocation based file transfers and therefore reduces file transfer time. A tool with a user-friendly interface for managing replicas and downloads in a data grid environment is also described. Experimental results show that DAS achieves high file transfer speed and reduces the time cost of reassembling data blocks.

6.
Essential requirements for the use of computers in mineral exploration are: (1) an adaptable storage-retrieval system, (2) adequate data files, (3) a minimum establishment in terms of staff and organization, (4) integration of computer filing systems with paper filing systems so that they complement one another, and (5) understanding by users of the application of computer techniques to geological information. Data files applicable to mineral exploration fall into three categories: Field-Data files, Bibliographic-Index files, and Mineral-Deposit files. An increasing proportion of data files used by exploration companies will be acquired from government organizations and service agencies. Computer techniques must be used selectively. They become more applicable as: (1) projects last longer, (2) the amount of exploration work per unit area increases, and (3) the amount of previously generated information becomes greater. Limitations include: (1) nonapplicability of computer techniques to some types of exploration, (2) expense of file creation, (3) short time span of many exploration projects, and (4) need to maintain a balance between a computer facility and the exploration department it is designed to serve. Causes of disappointment include preoccupation with statistical techniques at the expense of storage and retrieval, the nonselective application of computers, and failure to attain a minimum organizational establishment. Although individual computer-based approaches may lead directly to exploration targets which would not have been detected without computers, the essential advantage of a computer facility lies in its ability to improve the performance of each exploration geologist by increasing the quantity and quality of available data.

7.
Conventional remote data access middleware usually provides client applications with either a pre-staging scheme or an on-demand access scheme to fetch data. The pre-staging scheme uses parallel downloads to fetch a complete input file from multiple data sources, even when only a tiny file fragment is required. Such a transfer scheme wastes data transmission time and storage space. In contrast, the on-demand scheme downloads only the required data blocks from a single data source and does not fully utilize the downstream bandwidth of the computing nodes. This paper presents a middleware called 'Spigot' that lets legacy (grid-unaware) applications access remote data transparently through native I/O function calls. Spigot uses the on-demand concept to avoid unnecessary data transfer and adopts a co-allocation download algorithm to improve data transfer performance. Moreover, it uses a pre-fetching strategy to reduce data waiting time by overlapping data acquisition and data processing. It also provides the client application with its own user-level cache, which is advantageous because a larger cache space is available than with the kernel-level cache. Further, it is easy to maintain data consistency between Spigot nodes. The experimental results indicate that Spigot reduces data waiting time more than either the pre-staging or the on-demand access scheme. Copyright © 2010 John Wiley & Sons, Ltd.

8.
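A Parallel Pipelined Sn Sweeping Algorithm for Solving the Neutron Transport Equation on Unstructured Grids (cited 11 times: 4 self-citations, 7 by others)
The discontinuous finite element discrete ordinates (Sn) method is widely used to solve high-dimensional, time-dependent neutron transport equations numerically. It involves discretization of the geometric mesh space, the velocity phase space, and the neutron energy groups, so its computational cost is very high. Based on unstructured grids, this paper proposes a parallel pipelined Sn sweeping algorithm built on domain decomposition. By designing domain decomposition methods and queue insertion algorithms with different intrinsic degrees of parallelism and different communication surface-to-volume ratios, speedups of more than 72 and 78 were obtained for two different physical models using 92 and 256 CPUs, respectively, of two parallel machines. Scalability analysis shows that the performance of the algorithm depends strongly on the point-to-point communication latency of the parallel machine.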

9.
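When a computer-simulated clinical case training and examination system is in use, the client often needs to download many large data files and mixed audio-video files online, so system response speed is a key issue. This paper studies technical schemes for implementing multithreading in RIA and proposes an optimized method for multithreaded parallel download of large data files and mixed audio-video files on multi-core computers. Algorithm analysis and experimental results show that the proposed multithreaded parallel download technique accelerates online downloading of the modules of the computer-simulated case system and significantly improves system performance.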

10.
The application of data-management systems in economic geology is becoming increasingly user oriented and machine independent. Many mineral-deposit computer systems remain data dependent or fixed formatted, but an efficient data system with a proper data-manipulation language needs specific data structures, free format, and a prospective package of generalized modules for the user. Data management primarily requires clear formulation by the geoscientist in cooperation with the system analyst to serve many activities, including problem solving, data content, and decision-making. File-management systems in economic geology involve file definitions, data validation, data loading, and updating and retrieval. The structuring of a data file seems to be the key to a successful and long-term file-management system in applied geosciences. Many commercially available file-management systems should be applied in economic geology because the functions which modern computers can perform in assisting file-management development do not essentially differ from those of other data-processing undertakings. The only specific problem in the geosciences is the location and graphic form of most geoscience data, presented as point, lineal, areal, and spatial (three-dimensional) input/output information. Data-base management systems provide interconnection between files themselves as well as between different items in different files. A few typical file-management or data-base management systems for geoscience and resources data are compared in table form. The paper concludes with the problem of choosing and developing an effective data-base management system for application in economic geology.

11.
Data Grids enable the sharing, selection, and connection of a wide variety of geographically distributed computational and storage resources for content needed by large-scale data-intensive applications such as high-energy physics, bioinformatics, and virtual astrophysical observatories. In Data Grids, co-allocation architectures were developed to enable parallel downloads of data sets from selected replica servers. Because the Internet is usually the underlying network of a grid, network bandwidth is the main factor affecting file transfers between clients and servers. In this paradigm, some challenges remain to be solved: reducing differences in finish times between selected replica servers, avoiding traffic congestion that results from transferring the same blocks over different links between servers and clients, and managing network performance variations among parallel transfers. In this paper, we propose the Anticipative Recursively Adjusting Mechanism (ARAM) scheme to adjust the workloads on selected replica servers and handle unpredictable variations in the network performance of those servers. Our algorithm uses the finish rates of previously assigned transfers to anticipate the bandwidth status for the next section, adjust workloads accordingly, and reduce file transfer times in grid environments. Our approach is useful in grid environments with unstable network links. It not only reduces idle time wasted waiting for the slowest server, but also decreases file transfer completion times. Copyright © 2010 John Wiley & Sons, Ltd.

12.
Rendering is a crucial process in the production of computer-generated animation movies. It executes a computer program to transform 3D models into a series of still images, which are eventually sequenced into a movie. Due to the size and complexity of 3D models, rendering is a tedious, time-consuming, and unproductive task on a single machine. Accordingly, animation rendering is commonly carried out in a distributed computing environment where numerous computers execute in parallel to speed up the rendering process. Along with the distribution of computing, data dissemination to all computers also needs mechanisms that allow large 3D models to be moved efficiently to those distributed computers, so as to reduce the time and cost of animation production. This paper presents and evaluates the BitTorrent file system (BTFS) for improving the communication performance of distributed animation rendering. BTFS provides an efficient, secure, and transparent distributed file system that decouples applications from the complicated communication mechanism. By disseminating data in a peer-to-peer manner and using a local cache, rendering time can be reduced. A performance comparison using a production-grade 3D animation shows that BTFS outperforms traditional distributed file systems by more than 3 times in our test configuration.

13.
Loop scheduling on parallel and distributed systems has been thoroughly investigated in the past. However, none of these studies considered the multi-core architecture of emerging grid systems. Although many studies have proposed employing the hybrid MPI and OpenMP programming model to exploit different levels of parallelism on a distributed system with multi-core computers, none of them was aimed at parallel loop self-scheduling. Therefore, this paper investigates how to employ the hybrid MPI and OpenMP model to design a parallel loop self-scheduling scheme adapted to the multi-core architecture of emerging grid systems. Three applications with different features are implemented and evaluated to demonstrate the effectiveness of the proposed scheduling approach. The experimental results show that the proposed approach outperforms previous work for the three applications, with speedups ranging from 1.13 to 1.75.

14.
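Cloud storage is an important part of cloud computing technology. It covers both the choice of storage location and the transfer of files, and file transfer includes uploading and downloading. As an important component of storage, transfer has a considerable effect on storage efficiency. Recent research on cloud storage has concentrated on the efficiency of data storage and data transfer. To address the low transfer efficiency when large numbers of streaming media files are uploaded to a cloud storage server, this paper proposes THU, a transfer mechanism for large numbers of streaming media files in a private cloud environment. The mechanism rests on the idea that, for a given cloud platform environment and transfer client, there exists a file size fk: files smaller than fk are losslessly packed into a number of files of size fk for transfer, while files larger than fk are split into a number of files of size fk. This takes less time than packing or splitting into files of other sizes. A large number of streaming media FTP transfer experiments were carried out in a private cloud environment. The results show that such a value fk exists: when files are packed to this size, the total time spent on packing, unpacking, and transfer is at a relatively optimized level, which demonstrates the correctness and effectiveness of the THU mechanism.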
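
The pack-or-split idea around the size fk can be sketched as a simple planning step. This is a hedged illustration only: the abstract gives neither the packing algorithm of THU nor the way fk is found, so the function name `plan_transfers`, the greedy first-fit packing in input order, and the (name, offset, length) piece format are assumptions.

```python
def plan_transfers(file_sizes, fk):
    """Plan transfers so that no transferred unit exceeds fk bytes.

    file_sizes maps a file name to its size in bytes.
    Returns (pieces, bundles):
      pieces  - (name, offset, length) slices of files larger than fk
      bundles - lists of small file names packed together, each bundle
                holding at most fk bytes in total
    """
    if fk <= 0:
        raise ValueError("fk must be positive")
    pieces, bundles = [], []
    current, current_size = [], 0
    for name, size in file_sizes.items():
        if size > fk:
            # Large file: cut into consecutive slices of at most fk bytes.
            for offset in range(0, size, fk):
                pieces.append((name, offset, min(fk, size - offset)))
        else:
            # Small file: start a new bundle if this one would overflow.
            if current and current_size + size > fk:
                bundles.append(current)
                current, current_size = [], 0
            current.append(name)
            current_size += size
    if current:
        bundles.append(current)
    return pieces, bundles
```

In the paper fk is determined experimentally for a given platform and client. Here it is simply passed in as a parameter.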

15.
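Research on an Implementation Method for a Cloud Computing Architecture (cited 8 times: 0 self-citations, 8 by others)
This paper proposes a cloud computing architecture, studies how cloud applications are implemented under this architecture, and validates it with a model system called Cloud Brain. The implementation of the Cloud Brain system introduces cloud parallel storage technology to achieve parallel upload and parallel download of files, overcoming the load imbalance and transfer bottlenecks of earlier storage servers.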

16.
We derive cost formulae for three different parallelisation techniques for training both supervised and unsupervised networks. These formulae are parameterised by properties of the target computer architecture. It is therefore possible to decide both which technique is best for a given parallel computer and which parallel computer best suits a given technique. One technique, exemplar parallelism, is far superior for almost all parallel computer architectures. The formulae also take into account optimal batch learning as the overall training approach. Cost predictions are made for several of today's popular parallel computers.

17.
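In vehicular networks (Internet of Vehicles), vehicles can download file data from roadside units (RSUs) over wireless links. However, high vehicle mobility, large RSU deployment spacing, and limited transmission range significantly restrict the amount of data a vehicle can download. A cooperative data dissemination scheme is therefore proposed. It uses the available resources of vehicles traveling in both directions and of RSUs to assist the target vehicle's download, so that the target vehicle can still obtain the required file data in dark areas (DAs) not covered by RSUs. On this basis, taking into account how contention-based access and transmission among resource nodes and the forwarding completion time affect data transmission, a theoretical analysis framework is established to describe the data transmission process. Simulation results show that, compared with same-direction and reverse-direction assisted download mechanisms, the scheme increases the amount of data downloaded by the target vehicle, reduces the impact of transmission interruptions within DAs, and improves DA utilization.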

18.
PeiZong Lee 《Parallel Computing》1995,21(12):1895-1923
It is widely accepted that distributed memory parallel computers will play an important role in solving computation-intensive problems. However, the design of an algorithm in a distributed memory system is time-consuming and error-prone, because a programmer is forced to manage both parallelism and communication. In this paper, we present techniques for compiling programs on distributed memory parallel computers. We will study the storage management of data arrays and the execution schedule arrangement of Do-loop programs on distributed memory parallel computers. First, we introduce formulas for representing data distribution of specific data arrays across processors. Then, we define communication cost for some message-passing communication operations. Next, we derive a dynamic programming algorithm for data distribution. After that, we show how to improve the communication time by pipelining data, and illustrate how to use data-dependence information for pipelining data. Jacobi's iterative algorithm and the Gauss elimination algorithm for linear systems are used to illustrate our method. We also present experimental results on a 32-node nCUBE-2 computer.

19.
The service capacities of a source peer at different times in a peer-to-peer (P2P) network exhibit temporal correlation. Unfortunately, there is no analytical result which clearly characterizes the expected download time from a source peer with stochastic and time-varying service capacity. The main contribution of this paper is to analyze the expected file download time in P2P networks with stochastic and time-varying service capacities. The service capacity of a source peer is treated as a stochastic process. Analytical results which characterize the expected download time from a source peer with stochastic and time-varying service capacity are derived for the autoregressive model of order 1. Simulation results are presented to validate our analytical results. Numerical data are given to show the impact of the degree of correlation and the strength of noise on the expected file download time. For any chunk allocation method, an analytical result of the expected parallel download time from a source peer with stochastic and time-varying service capacity is derived. It is shown that the algorithm which chooses chunk sizes proportional to the expected service capacities has better performance than the algorithm which chooses equal chunk sizes. It is also shown that multiple source peers do reduce the parallel download time significantly.
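
The comparison between equal and capacity-proportional chunk sizes can be illustrated with a deliberately simplified calculation. The sketch below treats each peer's service capacity as a constant rate, whereas the paper models it as a stochastic, time-varying process, so this shows only the intuition and not the paper's analysis. All function names are hypothetical.

```python
def parallel_download_time(chunks, capacities):
    """Time until the slowest peer finishes its chunk (constant rates)."""
    return max(chunk / rate for chunk, rate in zip(chunks, capacities))


def equal_chunks(file_size, n_peers):
    """Give every peer the same share of the file."""
    return [file_size / n_peers] * n_peers


def proportional_chunks(file_size, capacities):
    """Give each peer a share proportional to its (expected) capacity."""
    total = sum(capacities)
    return [file_size * rate / total for rate in capacities]
```

For a file of 120 units and capacities of 1, 2, and 3 units per second, equal chunks finish after 40 seconds because the slowest peer dominates, while proportional chunks let all peers finish together after 20 seconds.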

20.
《Ergonomics》2012,55(1-3):188-196
Hand signs are considered one of the important ways to enter information into computers for certain tasks. Computers receive sensor data of hand signs for recognition. When using hand signs as computer inputs, we need to (1) train computer users in the sign language so that their hand signs can be easily recognized by computers, and (2) design the computer interface to avoid the use of confusing signs, in order to improve user input performance and user satisfaction. For user training and computer interface design, it is important to know which signs can be easily recognized by computers and which signs computers cannot distinguish. This paper presents a data mining technique to discover distinct patterns of hand signs from sensor data. Based on these patterns, we derive a group of signs that computers cannot distinguish. Such information can in turn assist in user training and computer interface design.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号