Found 20 similar documents (search time: 31 ms)
Shortest distance and reliability of probabilistic networks   Total citations: 1, self-citations: 0, citations by others: 1
When the “length” of a link is not deterministic but is governed by a stochastic process, the “shortest” path between two points in the network is not always composed of the same links and depends on the state of the network. For example, in communication and transportation networks, the travel time on a link is not deterministic and the fastest path between two points is not fixed. This paper presents an algorithm to compute the expected shortest travel time between two nodes in the network when the travel time on each link has a given independent discrete probability distribution. The algorithm assumes knowledge of all the paths between the two nodes, and methods to determine these paths are referenced. In reliability computations (i.e. computing the probability that two given points are connected by a path), each link has an associated probability of “failure” and probability of “success”. Since “failure” implies infinite travel time, the algorithm simultaneously computes reliability. The paper also discusses the algorithm's capability to simultaneously compute some other performance measures that are useful in the analysis of emergency services operating on a network.
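The quantities described in this abstract can be illustrated with a brute-force sketch. It is not the paper's algorithm: it simply enumerates every joint state of the links, which is exponential in the number of links. The function name `expected_shortest_time`, the link names, and the input layout are assumptions made for illustration. A failed link is modeled as a travel time of `math.inf`, so reliability falls out of the same enumeration.

```python
import itertools
import math


def expected_shortest_time(links, paths):
    """Brute-force illustration (hypothetical helper, not the paper's method).

    links: dict mapping link name -> list of (travel_time, probability);
           a travel_time of math.inf models link failure.
    paths: list of paths, each a list of link names.
    Returns (reliability, expected shortest time given the nodes are connected).
    """
    names = sorted(links)
    reliability = 0.0
    weighted_time = 0.0
    # Enumerate every joint state of the independent links.
    for state in itertools.product(*(links[n] for n in names)):
        prob = math.prod(p for _, p in state)
        time_of = {n: t for n, (t, _) in zip(names, state)}
        # Shortest path in this state; inf if every path has a failed link.
        best = min(sum(time_of[n] for n in path) for path in paths)
        if math.isfinite(best):
            reliability += prob
            weighted_time += prob * best
    expected = weighted_time / reliability if reliability > 0 else math.inf
    return reliability, expected
```

With two parallel single-link paths, the result averages the faster link in each state; with a single two-link path, a failure of either link only reduces the reliability.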

This paper presents a performance model of a two-dimensional disk array (TIDA) system, which is composed of several major subsystems including a disk cache, an intelligent disk array controller, a SCSI-like I/O bus, and a two-dimensional array of disk devices. Access conflicts in these subsystems and fork/join synchronization of physical disk requests are considered in the model. Its representation of the complex behavior of a whole disk array system, including the interactions among subsystems, distinguishes the model from others that model only individual subsystems. To assist in evaluating the architectural alternatives of TIDA, we employ a subsystem access time modeling methodology, in which we model for each subsystem the mean subsystem access time per request (SATPR). Fed with a given set of representative workload parameters, the performance model is used to conduct performance evaluation, and the SATPRs of the subsystems are used to identify the bottleneck subsystem for performance improvement. The results show that (1) the values of some key design parameters, such as data block size and I/O bus bandwidth, that yield the best system throughput depend not only on the subsystem performance but also on the interaction among subsystems; (2) an I/O bus bandwidth of 5 Mbytes/s per disk device is large enough for data transfers from/to disk devices equipped with a cache of 1 Mbyte; and (3) fork/join synchronization of physical disk requests may cause performance degradation, which can be reduced by using a larger I/O bus bandwidth and/or placing a cache in each disk device.

Exponential fork/join queueing networks (FJQNs) with finite buffers have been used as a major tool for evaluating the performance of manufacturing systems. In this study, we first propose upper and lower bounds on throughput. Our upper-bounding method applies to a general (acyclic) network configuration, while our lower bounds can be obtained only for networks with a more specialized configuration. Next, we develop a simple approximation method for throughput, based on decomposition/aggregation principles and structurally equivalent relations between different configurations.

Fast joins using join indices   Total citations: 1, self-citations: 0, citations by others: 1
Two new algorithms, “Jive join” and “Slam join,” are proposed for computing the join of two relations using a join index. The algorithms are duals: Jive join range-partitions input relation tuple ids and then processes each partition, while Slam join forms ordered runs of input relation tuple ids and then merges the results. Both algorithms make a single sequential pass through each input relation, in addition to one pass through the join index and two passes through a temporary file, whose size is half that of the join index. Both algorithms require only that the number of blocks in main memory is of the order of the square root of the number of blocks in the smaller relation. By storing intermediate and final join results in a vertically partitioned fashion, our algorithms need to manipulate less data in memory at a given time than other algorithms. The algorithms are resistant to data skew and adaptive to memory fluctuations. Selection conditions can be incorporated into the algorithms. Using a detailed cost model, the algorithms are analyzed and compared with competing algorithms. For large input relations, our algorithms perform significantly better than Valduriez's algorithm, the TID join algorithm, and hash join algorithms. An experimental study is also conducted to validate the analytical results and to demonstrate the performance characteristics of each algorithm in practice. Received July 21, 1997 / Accepted June 8, 1998

Much of the research work in artificial intelligence (AI) has focused on exploring potential applications of intelligent systems, with successful results in most cases. In our attempts to model human intelligence by mimicking the brain's structure and function, we overlook an important aspect of human learning and decision making: the emotional factor. While it currently sounds impossible to have “machines with emotions,” it is quite conceivable to artificially simulate some emotions in machine learning. This paper presents a modified backpropagation (BP) learning algorithm, namely, the emotional backpropagation (EmBP) learning algorithm. The new algorithm has additional emotional weights that are updated using two additional emotional parameters: anxiety and confidence. The proposed “emotional” neural network is applied to a facial recognition problem, and the results are compared to a similar application using a conventional neural network. Experimental results show that the addition of the two novel emotional parameters improves the performance of the neural network, yielding higher recognition rates and faster recognition time.

Queueing network models have been used extensively to analyze the performance of computer systems. However, queueing network models with product form solutions are not directly applicable to systems that process programs with internal concurrency/synchronization. An exact solution of such systems is often not feasible because of the large state space. Approximation techniques, based on queueing network theory, are presented which analyze the performance of closed systems with a specific scheme of concurrency/synchronization. The techniques are applicable to multitasking systems, distributed database systems, packet routing environments, and fork/join situations. This research was partially supported by CNPq/Brazil, Hospital Corporation of America (HCA) and Northern Telecom.

We present a flowchart language for parallel processing: in addition to the “standard” components, our flowcharts contain fork, join and synchronizing nodes. Extending the work of Mills, we suggest restrictions on controlling computation flow and show that any proper program can be algorithmically transformed to an equivalent structured program.

The ratio of disk capacity to disk transfer rate typically increases by 10× per decade. As a result, disk is becoming slower from the point of view of applications, because of the much larger data volume that they need to store and process. In database systems, the smaller the data volume involved in query processing, the better the performance achieved. The disk-based join is a common but time-consuming database operation, especially in an environment of massive data in which I/O cost dominates the execution time. However, current join algorithms are only suitable for moderate or small data volumes. They incur high I/O cost on massive data because of multi-pass I/O operations on the joined tables and their insensitivity to join selectivity. This paper proposes PI-Join, a novel disk-based join algorithm that can efficiently process join queries involving massive data. PI-Join consists of two stages: the JPIPT construction stage (JCS) and the result output stage (ROS). JCS runs a cache-conscious construction algorithm on the join attributes, which are kept in a column-oriented model, to obtain the join positional index pair table (JPIPT) of the join results faster. The obtained JPIPT is used in ROS to retrieve results in a one-pass sequential selective scan on each table. We provide the correctness proof and cost analysis of PI-Join. Our experimental results indicate that PI-Join has a significant advantage over the existing join algorithms.
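The two-stage idea in this abstract (first compute pairs of row positions from the join columns alone, then fetch only the matching rows in one sequential pass per table) can be sketched in memory as follows. This is a simplified illustration under stated assumptions, not the paper's cache-conscious JPIPT construction: the helper names `build_position_pairs`, `selective_scan`, and `positional_join` are hypothetical, tables are lists of tuples, and a hash table stands in for the construction stage.

```python
from collections import defaultdict


def build_position_pairs(left_keys, right_keys):
    """Stage 1 (illustrative): pair up row positions whose join keys match."""
    positions = defaultdict(list)
    for pos, key in enumerate(right_keys):
        positions[key].append(pos)
    return [(lpos, rpos)
            for lpos, key in enumerate(left_keys)
            for rpos in positions.get(key, [])]


def selective_scan(table, wanted):
    """One sequential pass that keeps only the rows at the wanted positions."""
    wanted = set(wanted)
    return {pos: row for pos, row in enumerate(table) if pos in wanted}


def positional_join(left_table, right_table, left_keys, right_keys):
    """Stage 2 (illustrative): materialize results from the position pairs."""
    pairs = build_position_pairs(left_keys, right_keys)
    left_rows = selective_scan(left_table, (l for l, _ in pairs))
    right_rows = selective_scan(right_table, (r for _, r in pairs))
    return [left_rows[l] + right_rows[r] for l, r in pairs]
```

Because the first stage touches only the join columns, rows that do not participate in the join are never read in full, which is where the sensitivity to join selectivity comes from in this sketch.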

Join Operation Algorithms in a Data Grid Environment   Total citations: 5, self-citations: 1, citations by others: 5
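A data grid is a distributed data management architecture that provides coordinated management mechanisms for resources distributed across the grid. Database management systems play an important role in data grids. Among database operations, the join is one of the most frequently used and most time-consuming, yet to date no published work has proposed a join algorithm for the data grid environment. This paper studies join algorithms for massive data in a data grid environment. To cope with the heterogeneous network bandwidth between grid nodes, it adopts a relation reduction algorithm, a row-blocking transfer technique, and a pipelined parallel mechanism to reduce query response time. Theoretical analysis and experimental results show that the algorithm performs well in reducing network communication overhead, increasing I/O and CPU parallelism, and lowering response time.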

This paper explores several variants of the Chandy-Misra Null Message algorithm for distributed simulation. The Chandy-Misra algorithm is one of a class of “conservative” algorithms that maintain the correct order of simulation throughout the execution of the model by means of constraints on simulation time advance. The algorithms developed in this paper incorporate an “event-oriented” view of the physical process and message-passing. The computational workload required for each event is related to the speedup attained over an equivalent sequential simulation. The effects of network topology are investigated, and performance is evaluated for the variants on transmission of null messages. The performance analysis is supported with empirical results based on an implementation of the algorithm on an Intel iPSC 32-node hypercube multiprocessor. Results show that speedups over sequential simulation of greater than N, using N processors, can be achieved in some circumstances.

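To address the curse of dimensionality and the high computational cost of similarity join queries over high-dimensional data, high-dimensional data are mapped to a low-dimensional space based on the p-stable distribution. Using the properties of the chi-square distribution, it is proved that if the distance in the low-dimensional space is greater than a given threshold, then the probability that the distance in the original space is greater than ε has a certain lower bound, so effective filtering can be carried out in the low-dimensional space at low computational cost. On this basis, a similarity join query algorithm for high-dimensional data based on the chi-square distribution is proposed. To further improve query efficiency, a similarity join query algorithm based on double filtering is also proposed. Experiments on real datasets show that the proposed methods perform well. The recall of the chi-square-based algorithm can exceed 90%. The double-filtering algorithm further improves performance, at the cost of some recall. Query tasks with strict time requirements and looser recall requirements can use the double-filtering algorithm; otherwise, the chi-square-based algorithm can be used.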
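A minimal sketch of this kind of filter, assuming the 2-stable (Gaussian) case: for a k×d matrix A with independent N(0, 1) entries, the ratio ‖A(x−y)‖² / ‖x−y‖² follows a chi-square distribution with k degrees of freedom. A pair with original distance at most ε therefore rarely has a projected distance above ε·√q when q is a high quantile of that distribution. The function names and the `quantile` parameter are assumptions for illustration, the quantile is supplied by the caller, and this is not the paper's algorithm or its double-filtering variant.

```python
import math
import random


def make_projection(k, d, rng):
    """k x d matrix with i.i.d. N(0, 1) entries (the 2-stable case)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(k)]


def project(matrix, x):
    """Map a d-dimensional point to k dimensions."""
    return [sum(a * v for a, v in zip(row, x)) for row in matrix]


def dist(x, y):
    """Euclidean distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def similarity_join(points, eps, matrix, quantile):
    """Return index pairs (i, j), i < j, with original distance <= eps.

    Pairs whose projected distance exceeds eps * sqrt(quantile) are pruned
    without computing the original distance, so a small fraction of true
    pairs may be lost (approximate recall). `quantile` is a chi-square
    quantile for k degrees of freedom chosen by the caller.
    """
    low = [project(matrix, p) for p in points]
    cutoff = eps * math.sqrt(quantile)
    result = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if dist(low[i], low[j]) > cutoff:
                continue  # cheap low-dimensional filter
            if dist(points[i], points[j]) <= eps:
                result.append((i, j))
    return result
```

For example, with k = 8 the 0.95 quantile of the chi-square distribution is about 15.507, which would prune roughly 5% of true pairs in expectation. Passing `math.inf` disables the filter and gives the exact join.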

宋杰, 李甜甜, 朱志良, 鲍玉斌, 于戈. Journal of Software (《软件学报》), 2015, 26(6): 1438-1456
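The exponential growth of data poses severe challenges for data management and analysis. The join query is a common operation in data analysis, and MapReduce is a programming model for parallel processing of large-scale datasets, so research on cost estimation and query optimization for MapReduce-based join queries has both academic significance and practical value. The performance of MapReduce join algorithms depends mainly on I/O cost (both local and network I/O), which in turn depends on characteristic parameters of the datasets and the join operation; estimating the I/O cost of two-way joins makes it possible to optimize execution plans for multi-way joins. On this basis, the paper first proposes an I/O cost model for two-way join queries. It then formally defines and slightly extends existing two-way join algorithms, summarizes six MapReduce-based join query algorithms, and defines their I/O cost functions through white-box analysis of the algorithms. Finally, it proposes an algorithm for selecting the optimal execution plan for multi-way joins. Experiments show that the I/O cost model is correct and accurately reflects the relative performance of the algorithms.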

A general method for the identification of the independent subsets in loops with constant dependence vectors is presented. It is shown that the dependence relation remains invariant under a unimodular transformation. Then a unimodular transformation is used to bring the dependence matrix into a form where the independent subsets are obtained by a direct and inexpensive partitioning algorithm. This leads to a procedure for the automatic conversion of a serial loop into a nest of parallel DO-ALL loops. Another unimodular transformation results in an algorithm to label the dependent iterations of an n-fold nested loop in O(n^2) time. This provides a multithreaded dynamic scheduling scheme requiring only one fork and one join primitive.

Vector similarity join, which finds similar pairs of vector objects, is a computationally expensive process. As the number of vectors increases, the time needed for the join operation increases in proportion to the square of the number of vectors. Various filtering techniques have been proposed to reduce this computational load. Separately, MapReduce algorithms have been studied to manage large datasets. The recent improvements, however, still suffer from long computation times and limited scalability. In this paper, we propose a MapReduce algorithm, FACET (FAst and sCalable maprEduce similariTy join), to efficiently solve the vector similarity join problem on large datasets. FACET is an all-pair exact join algorithm composed of two stages. In the first stage, we use our own novel filtering techniques to eliminate dissimilar pairs and generate non-redundant candidate pairs. The second stage matches candidate pairs with the vector data so that similar pairs are produced as the output. Both stages employ the parallelism offered by MapReduce. The algorithm is currently designed for cosine similarity and the Self Join case. Extensions to other similarity measures and the R-S Join case are also discussed. We provide the I/O analysis of the algorithm. We evaluate the performance of the algorithm on multiple real-world datasets. The experimental results show that our algorithm performs, on average, 40% up to 800% better than the previous state-of-the-art MapReduce algorithms.
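For reference, the problem this abstract addresses (an exact cosine-similarity self join) can be written as a quadratic baseline. This is the naive all-pairs computation whose cost grows with the square of the number of vectors, not FACET or its filtering; the sparse-dict representation and the names `cosine` and `self_join` are assumptions for illustration.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as {dimension: weight}."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the shorter vector
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def self_join(vectors, threshold):
    """Return all id pairs (a, b), a < b, with cosine similarity >= threshold.

    Naive baseline: compares every pair, so the work grows quadratically
    with the number of vectors.
    """
    ids = sorted(vectors)
    return [(a, b) for a, b in combinations(ids, 2)
            if cosine(vectors[a], vectors[b]) >= threshold]
```

Filtering techniques of the kind the abstract mentions aim to avoid evaluating `cosine` for most of these pairs while returning the same result.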

Aiming at the problem of top-k spatial join query processing in cloud computing systems, a Spark-based top-k spatial join (STKSJ) query processing algorithm is proposed. In this algorithm, the whole data space is divided into grid cells of the same size by a grid partitioning method, and each spatial object in one data set is projected into a grid cell. The Minimum Bounding Rectangle (MBR) of all spatial objects in each grid cell is computed. The spatial objects overlapping with these MBRs in another spatial data set are replicated to the corresponding grid cells, thereby filtering out spatial objects for which there are no join results, thus reducing the cost of subsequent spatial join processing. An improved plane sweeping algorithm is also proposed that speeds up the scanning mode and applies threshold filtering, thus greatly reducing the communication and computation costs of intermediate join results in subsequent top-k aggregation operations. Experimental results on synthetic and real data sets show that the proposed algorithm has clear advantages, and better performance than existing top-k spatial join query processing algorithms.
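The partitioning and filtering step described above can be sketched on a single machine as follows. This is an illustration under stated assumptions rather than the STKSJ implementation: objects are axis-aligned rectangles `(xmin, ymin, xmax, ymax)`, each object of the first data set is assigned to a cell by its center point (the abstract does not say how objects are projected), and the names `cell_of`, `overlaps`, and `partition` are hypothetical. Spark, the plane sweep, and the top-k aggregation are omitted.

```python
from collections import defaultdict


def cell_of(rect, cell_size):
    """Grid cell containing the rectangle's center (assumed assignment rule)."""
    cx = (rect[0] + rect[2]) / 2
    cy = (rect[1] + rect[3]) / 2
    return (int(cx // cell_size), int(cy // cell_size))


def overlaps(a, b):
    """True if two axis-aligned rectangles intersect (touching counts)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def partition(r_objects, s_objects, cell_size):
    """Assign R to grid cells, compute per-cell MBRs, replicate overlapping S."""
    cells = defaultdict(list)
    for rect in r_objects:
        cells[cell_of(rect, cell_size)].append(rect)
    # Minimum bounding rectangle of the R objects in each non-empty cell.
    mbrs = {
        cell: (min(r[0] for r in rects), min(r[1] for r in rects),
               max(r[2] for r in rects), max(r[3] for r in rects))
        for cell, rects in cells.items()
    }
    # S objects that overlap no MBR are never replicated, i.e. filtered out.
    replicated = {cell: [s for s in s_objects if overlaps(s, mbr)]
                  for cell, mbr in mbrs.items()}
    return dict(cells), mbrs, replicated
```

Each cell then holds only the pairs of objects that could possibly join, so the per-cell join can run independently.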

The authors compare the performance of two join algorithms on both cube and ring interconnections for message-based multicomputers, and investigate the effects that the number of processors and the type of interconnection scheme have on the performance. First, the parallel hybrid-hash join algorithm and the parallel join-index join algorithm for both the cube and ring connected multicomputers are presented. The performance of these algorithms is then compared through analytical cost modeling. The results show that the join-index join algorithm gives good performance only when the join selectivity is very small, while the hybrid-hash join algorithm performs consistently well under most situations. It is shown that each algorithm achieves a better execution time on the cube topology than on the ring topology. Furthermore, increasing the number of processors improves the execution time more significantly for the cube than for the ring configuration. The applicability of join indexes to parallel database algorithms is also discussed.

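In the Chord algorithm for P2P networks, nodes differ widely in capability, and frequent node departures and joins limit system performance. To address these problems, an improved grouping algorithm based on information relevance is proposed. By introducing the concept of node information relevance, the algorithm regroups the original Chord ring according to information relevance. Two super nodes are selected from each group to form a super group, and counterclockwise routing is added to each node, so that a lookup chooses the shortest path over the clockwise and counterclockwise directions of the two super nodes. Experiments show that the improved algorithm strengthens both the performance and the adaptability of the system and improves the lookup efficiency of Chord in peer-to-peer networks.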

In this paper, we re-examine the results of prior work on methods for computing ad hoc joins. We develop a detailed cost model for predicting join algorithm performance, and we use the model to develop cost formulas for the major ad hoc join methods found in the relational database literature. We show that various pieces of “common wisdom” about join algorithm performance fail to hold up when analyzed carefully, and we use our detailed cost model to derive optimal buffer allocation schemes for each of the join methods examined here. We show that optimizing their buffer allocations can lead to large performance improvements, e.g., as much as a 400% improvement in some cases. We also validate our cost model's predictions by measuring an actual implementation of each join algorithm considered. The results of this work should be directly useful to implementors of relational query optimizers and query processing systems. Edited by M. Adiba. Received May 1993 / Accepted April 1996

DSC: scheduling parallel tasks on an unbounded number of processors   Total citations: 1, self-citations: 0, citations by others: 1
We present a low-complexity heuristic, named the dominant sequence clustering algorithm (DSC), for scheduling parallel tasks on an unbounded number of completely connected processors. The performance of DSC is, on average, comparable to, or even better than, that of other higher-complexity algorithms. We assume no task duplication and nonzero communication overhead between processors. Finding the optimum solution for arbitrary directed acyclic task graphs (DAGs) is NP-complete. DSC finds optimal schedules for special classes of DAGs, such as fork, join, coarse-grain trees, and some fine-grain trees. It guarantees a performance within a factor of 2 of the optimum for general coarse-grain DAGs. We compare DSC with three higher-complexity general scheduling algorithms: the ETF by J.J. Hwang, Y.C. Chow, F.D. Anger, and C.Y. Lee (1989); V. Sarkar's (1989) clustering algorithm; and the MD by M.Y. Wu and D. Gajski (1990). We also give a sample of important practical applications where DSC has been found useful.
