Similar Documents (20 results)
1.
Burdened by their popularity, recommender systems increasingly take on larger datasets while they are expected to deliver high-quality results within reasonable time. To meet these ever-growing requirements, industrial recommender systems often turn to parallel hardware and distributed computing. While the MapReduce paradigm is generally accepted for massive parallel data processing, it often entails complex algorithm reorganization and suboptimal efficiency because mid-computation values are typically read from and written to hard disk. This work implements an in-memory, content-based recommendation algorithm and shows how it can be parallelized and efficiently distributed across many homogeneous machines in a distributed-memory environment. By focusing on data parallelism and carefully constructing the definition of work in the context of recommender systems, we are able to partition the complete calculation process into any number of independent and equally sized jobs. An empirically validated performance model is developed to predict parallel speedup and promises high efficiencies for realistic hardware configurations. For the MovieLens 10M dataset we note efficiency values up to 71% for a configuration of 200 computing nodes (eight cores per node).
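The key idea of partitioning the whole similarity computation into independent, equally sized jobs can be sketched as follows. This is a hypothetical illustration of the general technique, not the authors' implementation: the unit of work is taken to be one item-item pair, and the set of all pairs is cut into jobs whose sizes differ by at most one.

```python
from itertools import combinations

def partition_pairs(num_items, num_jobs):
    """Split all item-item similarity pairs into `num_jobs`
    independent jobs whose sizes differ by at most one, so each
    worker receives an equal share of the computation."""
    pairs = list(combinations(range(num_items), 2))
    base, extra = divmod(len(pairs), num_jobs)
    jobs, start = [], 0
    for j in range(num_jobs):
        size = base + (1 if j < extra else 0)  # first `extra` jobs get one more pair
        jobs.append(pairs[start:start + size])
        start += size
    return jobs

jobs = partition_pairs(num_items=100, num_jobs=8)
sizes = [len(j) for j in jobs]
# 100 items -> 4950 pairs split across 8 near-equal jobs
```

Because the jobs share no state, they can be handed to any number of workers without coordination, which is what makes near-linear speedup plausible on homogeneous nodes.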

2.
Object Composition Petri Nets, Priority Petri Nets, Dynamic OCPN, and Enhanced P-Nets have extended the original Petri Net to achieve the modeling of media synchronization and asynchronous user interactions during multimedia playback. The dynamic Petri Net (DPN) has been conceptualized to tackle existing problems in these two areas of modeling distributed multimedia systems. DPN features dynamic modeling elements that allow iteration, and it is hence able to reduce the graph sizes of synchronous playback models while allowing greater detail to be shown. DPN also introduces asynchronous event-handling techniques that are powerful and effective. DPN was used in the design and modeling of a multimedia orchestration tool, a typical example of an application that works in a distributed multimedia system.

3.
Research on a Distributed Computing Environment Based on Java Technology (cited by 7: 2 self-citations, 5 by others)
李治, 任波, 王乘. 《计算机工程与设计》 (Computer Engineering and Design), 2004, 25(6): 912-914, 920
J2EE offers strong distributed-processing capabilities; its RMI, Jini, and JavaSpaces technologies provide a solid technical foundation for building heterogeneous distributed computing environments. JavaSpaces is built on top of Jini and can serve as a mechanism for shared distributed communication and object storage. Addressing the characteristics of distributed computing environments, such as task decomposition, parallel synchronization, and communication control, this paper proposes a Java-based distributed computing environment. Application examples show that the environment can carry out distributed computation effectively in heterogeneous, complex systems.
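JavaSpaces descends from the Linda tuple-space model: workers coordinate by writing entries into a shared space and taking matching entries out. As a rough, language-agnostic sketch of that coordination style (illustrative only, not the paper's environment and not the JavaSpaces API), a minimal in-process tuple space might look like:

```python
import threading

class TupleSpace:
    """Minimal Linda-style tuple space: `write` publishes an entry,
    `take` removes and returns a matching entry, blocking until one
    exists. Loosely analogous to the shared object store JavaSpaces
    provides; a single-process illustration only."""
    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def write(self, entry):
        with self._cond:
            self._tuples.append(entry)
            self._cond.notify_all()   # wake any blocked takers

    def take(self, match):
        with self._cond:
            while True:
                for t in self._tuples:
                    if match(t):
                        self._tuples.remove(t)
                        return t
                self._cond.wait()     # block until a write arrives

space = TupleSpace()
space.write(("task", 1))
space.write(("result", 42))
task = space.take(lambda t: t[0] == "task")   # -> ("task", 1)
```

Task decomposition then reduces to writing `("task", ...)` entries that idle workers take, compute, and answer with `("result", ...)` entries, which is the decoupled master-worker pattern the abstract alludes to.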

4.
The use of modularity in the design and implementation of complex software simplifies the development process, as well as facilitating the construction of customized configurations. This paper describes our experience using modularity in Consul, a communication substrate used for constructing fault-tolerant distributed programs. First, Consul is presented as a case study of how modularity is feasible in both the design and the implementation of such systems. Second, general lessons about modularity in fault-tolerant systems based on our experience with Consul are given. Issues that are addressed include deciding how the system is divided into various modules, dealing with problems that result when protocols are combined, and ensuring that the underlying object infrastructure provides adequate support. The key observation is that the modularization process is most affected by dependencies between modules, both direct dependencies caused by one module explicitly using another's operation and indirect dependencies where one module is affected by another without direct interaction. Although our observations are based on designing and implementing Consul, the lessons are applicable to any fault-tolerant distributed system.

5.
The performance of distributed text document retrieval systems is strongly influenced by the organization of the inverted text. This article compares the performance impact on query processing of various physical organizations for inverted lists. We present a new probabilistic model of the database and queries. Simulation experiments determine those variables that most strongly influence response time and throughput. This leads to a set of design trade-offs over a wide range of hardware configurations and new parallel query processing strategies.
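For readers unfamiliar with the data structure under discussion, an inverted list (posting list) maps each term to the documents containing it, and conjunctive queries intersect those lists. A minimal sketch, independent of any particular physical organization studied in the article:

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to a sorted posting list of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

def conjunctive_query(index, terms):
    """Documents containing all query terms (intersect postings)."""
    postings = [set(index.get(t, ())) for t in terms]
    return sorted(set.intersection(*postings)) if postings else []

docs = {1: "distributed text retrieval",
        2: "text document systems",
        3: "distributed document retrieval systems"}
index = build_inverted_index(docs)
hits = conjunctive_query(index, ["distributed", "retrieval"])  # -> [1, 3]
```

The physical questions the article studies, such as how these posting lists are partitioned across machines (by term or by document) and stored on disk, determine how much of this intersection work can proceed in parallel.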

6.
State-switched systems are a large class of hybrid systems with numerous applications. In this work, the representation of state-switched systems in a Differential Petri Net (DPN) formalism becomes possible through a novel transformation of the fundamental equation of the DPN model into a form compatible with state-switching. Furthermore, we use the approach of the switching hyperplanes to provide means for the stability analysis of state-switched systems in a DPN framework. We also investigate stability issues of the resulting DPNs and synthesize an algorithm that can determine the stability of the state-switched DPN. Stability conditions are also expressed as Linear Matrix Inequalities (LMI) so that they can be easily determined using commercial software.
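The "fundamental equation" mentioned here is the standard Petri net state equation M' = M + C·σ, where C is the incidence matrix and σ counts transition firings. As general background (a toy discrete example, not the differential formalism or the transformation of this paper):

```python
# Incidence matrix C (places x transitions) for a tiny net:
# transition t0 moves a token from place p0 to p1,
# transition t1 moves it from p1 to p2.
C = [[-1,  0],
     [ 1, -1],
     [ 0,  1]]

def fire(marking, sigma):
    """Fundamental equation M' = M + C*sigma, where sigma[t] counts
    how often transition t fires (illustration only)."""
    return [m + sum(C[p][t] * sigma[t] for t in range(len(sigma)))
            for p, m in enumerate(marking)]

m0 = [1, 0, 0]           # initial marking: one token in p0
m1 = fire(m0, [1, 0])    # fire t0 once -> token moves to p1
m2 = fire(m1, [0, 1])    # fire t1 once -> token moves to p2
```

The paper's contribution is recasting this equation so that the incidence structure itself can change when the system crosses a switching hyperplane, which plain Petri net semantics do not allow.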

7.
Fluctuations in demand patterns and product mixes, driven by continuous changes in customer requirements, are inducing significant changes on the operations of manufacturing organisations. How to respond to such changes rapidly and at minimum cost constitutes a major challenge for manufacturers. The DIMS project (Dynamically Integrated Manufacturing Systems) has developed an agent-based approach that enables manufacturing systems to be modelled using multi-agent systems such that optimal and timely responses to changes are generated from the interactions taking place within the multi-agent systems. This approach also incorporates a distributed discrete event simulation mechanism that enables "what-if" system configurations that have been generated through agent interactions to be evaluated dynamically for system restructure. This paper presents the approach with particular focus on the distributed simulation mechanism.

8.
Task parallelism is an attractive approach to automatically load balance the computation in a parallel system and adapt to dynamism exhibited by parallel systems. Exploiting task parallelism through work stealing has been extensively studied in shared and distributed-memory contexts. In this paper, we study the design of a system that uses work stealing for dynamic load balancing of task-parallel programs executed on hybrid distributed-memory CPU-graphics processing unit (GPU) systems in a global-address space framework. We take into account the unique nature of the accelerator model employed by GPUs, the significant performance difference between GPU and CPU execution as a function of problem size, and the distinct CPU and GPU memory domains. We consider various alternatives in designing a distributed work stealing algorithm for CPU-GPU systems, while taking into account the impact of task distribution and data movement overheads. These strategies are evaluated using microbenchmarks that capture various execution configurations as well as the state-of-the-art CCSD(T) application module from the computational chemistry domain. Copyright © 2016 John Wiley & Sons, Ltd.
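The core data structure behind work stealing is a per-worker double-ended queue: the owner pushes and pops tasks at one end, while idle workers steal from the other end. A simplified, single-threaded sketch of that protocol (for intuition only; real implementations such as the one studied here are concurrent and often lock-free):

```python
from collections import deque

class WorkStealingDeque:
    """Owner pushes/pops tasks at the bottom (LIFO, cache-friendly);
    idle workers steal from the top (FIFO, oldest task first).
    Single-threaded illustration of the protocol, not a lock-free
    concurrent implementation."""
    def __init__(self):
        self._tasks = deque()

    def push(self, task):          # owner side
        self._tasks.append(task)

    def pop(self):                 # owner side: most recent task
        return self._tasks.pop() if self._tasks else None

    def steal(self):               # thief side: oldest task
        return self._tasks.popleft() if self._tasks else None

dq = WorkStealingDeque()
for t in ["t1", "t2", "t3"]:
    dq.push(t)
owner_task = dq.pop()      # owner takes the newest task, "t3"
stolen_task = dq.steal()   # a thief takes the oldest task, "t1"
```

Stealing the oldest task tends to move large, coarse-grained subcomputations between workers, which matters doubly on CPU-GPU systems where (as the abstract notes) the cost of moving a task includes moving its data across memory domains.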

9.
There is an ever-increasing demand for more complex transactions and higher throughputs in transaction processing systems leading to higher degrees of transaction concurrency and, hence, higher data contention. The conventional two-phase locking (2PL) Concurrency Control (CC) method may, therefore, restrict system throughput to levels inconsistent with the available processing capacity. This is especially a concern in shared-nothing or data-partitioned systems due to the extra latencies for internode communication and a reliable commit protocol. The optimistic CC (OCC) is a possible solution, but currently proposed methods have the disadvantage of repeated transaction restarts. We present a distributed OCC method followed by locking, such that locking is an integral part of distributed validation and two-phase commit. This method ensures at most one re-execution, if the validation for the optimistic phase fails. Deadlocks, which are possible with 2PL, are prevented by preclaiming locks for the second execution phase. This is done in the same order at all nodes. We outline implementation details and compare the performance of the new OCC method with distributed 2PL through a detailed simulation that incorporates queueing effects at the devices of the computer systems, buffer management, concurrency control, and commit processing. It is shown that for higher data contention levels, the hybrid OCC method allows a much higher maximum transaction throughput than distributed 2PL in systems with high processing capacities. In addition to the comparison of CC methods, the simulation study is used to study the effect of varying the number of computer systems with a fixed total processing capacity and the effect of locality of access in each case. We also describe several interesting variants of the proposed OCC method, including methods for handling access variance, i.e., when rerunning a transaction results in accesses to a different set of objects.
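The validation step at the heart of any OCC scheme checks, at commit time, that nothing the transaction read has changed since it was read. A minimal single-node sketch of version-based validation (for intuition only; the paper's protocol additionally integrates locking, distributed validation, and two-phase commit):

```python
class ValidationConflict(Exception):
    """Raised when a read item changed since the transaction read it."""

def occ_commit(db, versions, read_set, write_set):
    """Validate-then-write sketch: commit succeeds only if every item
    in `read_set` still has the version the transaction observed; on
    success, writes are installed and their versions bumped.
    Illustrative single-node model, not the paper's hybrid method."""
    for item, version_seen in read_set.items():
        if versions[item] != version_seen:
            raise ValidationConflict(item)   # caller re-executes
    for item, value in write_set.items():
        db[item] = value
        versions[item] = versions.get(item, 0) + 1

db = {"x": 10, "y": 20}
versions = {"x": 1, "y": 1}

# A transaction that read x at version 1 and wants to update y:
occ_commit(db, versions, read_set={"x": 1}, write_set={"y": 25})
```

In a plain OCC scheme a conflicting transaction restarts and may fail validation again indefinitely; the paper's hybrid avoids that by having the (at most one) re-execution preclaim its locks in a globally fixed order, so the second run is guaranteed to commit.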

10.
The execution context in which pervasive systems or mobile computing run changes continually. Hence, applications for these systems require support for self-adaptation to the continual context changes. Most of the approaches for self-adaptive systems implement a reconfiguration service that receives as input the list of all possible configurations and the plans to switch between them. In this paper we present an alternative approach for the automatic generation of application configurations and the reconfiguration plans at runtime. With our approach, the generated configurations are optimal with respect to different criteria, such as functionality or resource consumption (e.g. battery or memory). This is achieved by: (1) modelling architectural variability at design-time using the Common Variability Language (CVL), and (2) using a genetic algorithm that finds nearly-optimal configurations at run-time using the information provided by the variability model. We also present a case study and use it to evaluate our approach, showing that it is efficient and suitable for devices with scarce resources.
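As an illustration of using a genetic algorithm to search a configuration space, the sketch below evolves a bitstring of optional features to maximize utility under a memory budget. The feature list, utilities, costs, and GA parameters are all hypothetical, chosen for the example; the paper's algorithm operates on CVL variability models instead:

```python
import random

random.seed(0)  # deterministic run for the example

# Hypothetical feature model: each optional feature has a utility
# and a memory cost; find a configuration maximizing utility
# within the budget.
UTILITY = [5, 3, 8, 2, 7, 4]
COST    = [4, 2, 6, 1, 5, 3]
BUDGET  = 10

def fitness(config):
    cost = sum(c for c, bit in zip(COST, config) if bit)
    if cost > BUDGET:
        return -1                    # infeasible configuration
    return sum(u for u, bit in zip(UTILITY, config) if bit)

def evolve(pop_size=30, generations=60):
    pop = [[random.randint(0, 1) for _ in UTILITY] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[:pop_size // 2]          # elitist selection
        children = []
        while len(children) < pop_size - len(survivors):
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, len(UTILITY))   # one-point crossover
            child = a[:cut] + b[cut:]
            if random.random() < 0.2:                 # bit-flip mutation
                i = random.randrange(len(UTILITY))
                child[i] ^= 1
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

best = evolve()
```

The appeal for resource-scarce devices is that a GA needs no enumeration of all configurations: it keeps a small population and improves it incrementally, trading guaranteed optimality for bounded memory and time.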

11.
Although a large number of formal methods have been reported in the literature, most of them are applicable only at the initial stages of software development. A major reason for this situation is that those formalisms lack expressiveness to describe the behavior of systems with respect to their underlying configurations. On the other hand, recent experience has shown that the complex nature of distributed systems is conveniently described, constructed and managed in terms of their configuration. In this context, with the twin objectives of accurately modelling the real-time behavior of distributed systems and supporting the analysis of timing behavior with respect to their underlying configurations, we formulate a logic language called distributed logic (DL). DL is a first-order logic augmented with temporal and spatial modalities. The semantics of DL are based on ideas drawn from both the interleaving and partial order models. In addition to the syntax and semantics of the logic, a formal proof scheme for a distributed programming model is also presented. Finally, use of the proof method is illustrated through the analysis of the real-time properties of a sample problem.

12.
The antipodes of the class of sequential computers, executing tasks with a single CPU, are the parallel computers containing large numbers of computing nodes. In the shared-memory category, each node has direct access through a switching network to a memory bank that can be composed of a single large or multiple medium-sized memory configurations. Opposite to the first category are the distributed-memory systems, where each node is given direct access to its own local memory section. Running a program in especially the latter category requires a mechanism that gives access to multiple address spaces, that is, one for each local memory. Transfer of data can only be done from one address space to another. Alongside the two categories are the physically distributed, shared-memory systems, which allow the nodes to explore a single globally shared address space. All categories, the performances of which are subject to the way the computing nodes are linked, need either a direct or a switched interconnection network for inter-node communication purposes. Linking nodes without taking into account the prerequisite of scalability when exploiting large numbers of them is not realistic, especially when the applied connection scheme must provide for fast and flexible communications at a reasonable cost. Different network topologies, varying from a single shared bus to a more complex elaboration of a fully connected scheme, and with them the corresponding intricate switching protocols, have been extensively explored. A different vision is introduced concerning future prospects of an optically coupled distributed, shared-memory organized multiple-instruction, multiple-data system. In each cluster, an electrical crossbar handles the interconnections between the nodes, the various memory modules and external I/O channels. The clusters themselves are optically coupled through a free-space-oriented data-distributing system.
Analogies found in the design of the Convex SPP1000 substantiate the closeness to reality of such an architecture. Following this introduction, an idealized picture of the fundamental properties of an optically based, fully connected, distributed (virtual) shared-memory architecture is also outlined.

13.
Foreign functions have been considered in advanced database systems to support complex applications. We consider optimizing queries with foreign functions in a distributed environment. In traditional distributed query processing, selection operations are locally processed before joins as much as possible so that the size of relations being transmitted and joined can be reduced. However, if selection predicates involve foreign functions, the cost of evaluating selections cannot be ignored. As a result, the execution order of selections and joins becomes significant, and the trade-off for reducing the costs of data transmission, join processing, and selection predicate evaluation needs to be carefully considered in query optimization. A response time model is developed for estimating the cost of distributed query processing involving foreign functions. We explore the properties of the problem and find an optimal algorithm with polynomial complexity for a special case of it. However, finding the optimal execution plan for the general case is NP-hard. We propose an efficient heuristic algorithm for solving the problem, and simulation results show its good quality. The research results can also be applied to advanced database systems and multidatabase systems, where the conversion functions defined for the needs of schema integration can be considered a type of foreign function.
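Why expensive selection predicates change the plan can be seen from a classic single-relation result (general background, not this paper's distributed algorithm): independent predicates should be evaluated in ascending order of cost / (1 - selectivity), so cheap, selective filters run before costly foreign-function calls.

```python
def order_predicates(predicates):
    """Order independent selection predicates by the rank metric
    cost / (1 - selectivity), ascending. Classic rank ordering for
    expensive predicates, shown here for intuition only."""
    return sorted(predicates,
                  key=lambda p: p["cost"] / (1.0 - p["selectivity"]))

def expected_cost(predicates, tuples=1.0):
    """Expected per-tuple evaluation cost for a given order: each
    predicate only sees the tuples surviving the previous ones."""
    total, surviving = 0.0, tuples
    for p in predicates:
        total += surviving * p["cost"]
        surviving *= p["selectivity"]
    return total

# Hypothetical predicates: an expensive foreign-function call and
# a cheap built-in comparison.
preds = [
    {"name": "foreign_fn", "cost": 50.0, "selectivity": 0.10},
    {"name": "cheap_cmp",  "cost": 1.0,  "selectivity": 0.50},
]
ordered = order_predicates(preds)   # cheap_cmp first, then foreign_fn
```

Here running the cheap comparison first costs 1 + 0.5·50 = 26 per tuple versus 50 + 0.1·1 = 50.1 the other way round; the paper's harder problem is making this trade-off jointly with join ordering and network transmission in a distributed plan.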

14.
In this paper, we propose a challenging research direction for Constraint Programming and optimization techniques in general. We address problems where decisions to be taken affect and are affected by complex systems, which exhibit phenomena emerging from a collection of interacting objects, capable of self-organizing and of adapting their behaviour according to their history and feedback. Such systems are unfortunately impervious to modeling efforts via state-of-the-art combinatorial optimization techniques. We provide some hints on approaches to connect and integrate decision making and optimization technology with complex systems via machine learning, game theory and mechanism design. In the first case, the aim is to extract modeling components to express the relation between global decisions and observables emerging from the real system, or from an accurate predictive model (e.g. a simulator). In the second case, the idea is to exploit game theory, mechanism design and distributed decision making to drive the process toward realistic equilibrium points, avoiding globally optimal but unrealistic configurations. We conclude by observing that dealing with the complexity of the considered problems will require greatly extending the capabilities of state-of-the-art solvers; in this context, we identify some key issues and highlight future research directions.

15.
This paper presents an analytical model for evaluating the availability of distributed processing systems. The model, which is based on a well-known probability formula, has been implemented in FORTRAN and used for the design of a real-time distributed processing system that requires high availability. It is applied to a sample distributed processing system to illustrate its usefulness and to provide good insight into the design of highly available distributed processing systems. Effects of homogeneous/heterogeneous clustering and standby spare are illustrated. The concept of availability threshold where the distributed system may gain or lose availability over the individual processor availability is introduced. It has proven to be a valuable tool for the tradeoffs in the allocation of system requirements. The concluding remarks summarize the lessons learned and suggest cost-effective ways to achieve higher availability for the design of distributed processing systems.
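The "well-known probability formula" behind such models is the standard series/parallel composition of component availabilities: a series system needs every component up, a redundant group needs at least one. A small sketch of this general background (the example numbers are hypothetical, and this is not the paper's specific FORTRAN model):

```python
def series_availability(avails):
    """System is up only if every component is up: A = product(a_i)."""
    a = 1.0
    for x in avails:
        a *= x
    return a

def parallel_availability(avails):
    """System is up if at least one redundant component is up:
    A = 1 - product(1 - a_i)."""
    u = 1.0
    for x in avails:
        u *= (1.0 - x)
    return 1.0 - u

# Hypothetical configuration: three processors at 0.95 each, any one
# sufficing (a redundant cluster), in series with a single network
# link at 0.99.
cluster = parallel_availability([0.95, 0.95, 0.95])  # 1 - 0.05**3
system = series_availability([cluster, 0.99])
```

Note how redundancy lifts the processor group to 0.999875 while the unreplicated link caps the whole system near 0.99, which is exactly the kind of threshold effect and allocation trade-off the abstract describes.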

16.
In this article, the authors examine how the internal audit function maintains its legitimacy when enterprise resource planning systems are introduced. This work centers on an in-depth case study of a multinational bank and finds that enterprise resource planning systems impose an institutional logic of control based on interlinked assumptions. These assumptions motivate changes in the practice and structure of the internal audit function to become an integrated and comprehensive function to maintain its legitimacy.

17.
We develop an approximate analytical method to estimate the customer service levels in automated multiple part-type production lines. The production line consists of several processing stations in series with finite intermediate buffers, one for each part-type. The main contributions include the analysis of multiple part-type systems with machine setups, bypass routings and stations having combinations of shared and dedicated machines. This research is motivated by observations of real production lines. We use the continuous material approximation in modeling the system behaviour and develop a new approximate decomposition method to analyze the performance of the system. Validation experiments conducted on production lines with different configurations show good accuracy in the estimation of customer service levels compared to simulation. We use an example case study to demonstrate the application of the model in the performance improvement of a system that is based on a real production line. The analytical model is proposed as a reliable and fast performance analysis tool for the optimization of automated multiple part-type production lines with complex configurations.

18.
Cooperative defensive systems communicate and cooperate in their response to worm attacks, but determine the presence of a worm attack solely on local information. Distributed worm detection and immunization systems track suspicious behavior at multiple cooperating nodes to determine whether a worm attack is in progress. Earlier work has shown that cooperative systems can respond quickly to day-zero worms, while distributed detection systems allow detectors to be more conservative (i.e., paranoid) about potential attacks because they manage false alarms efficiently. In this paper we present our investigation into the complex tradeoffs in such systems between communication costs, computation overhead, accuracy of the local tests, estimation of viral virulence, and the fraction of the network infected before the attack crests. We evaluate the effectiveness of different system configurations in various simulations. Our experiments show that distributed algorithms are better able to balance effectiveness against worms and viruses with reduced cost in computation and communication when faced with false alarms. Furthermore, cooperative, distributed systems seem more robust against malicious participants in the immunization system than earlier cooperative but non-distributed approaches.

19.
Millions of handwritten bank cheques are processed manually every day in banks and other financial institutions all over the world. Substituting manual cheque processing with an automatic cheque-reader system saves both the time and the cost of processing. In recent years, systems such as A2iA have been developed to automate the processing of Latin-script cheques. Normally, these systems are based on standard cheque structures such as Check 21 in the USA or Check 006 in Canada. Traditional (currently used) Persian bank cheques have major structural problems, which lead to low accuracy and high computational cost in their automatic processing. In this paper, in order to solve these problems, a novel structure for Persian handwritten bank cheques is presented. The importance and advantages of this new structure are shown by conducting several experiments on a database of cheques we created based on it. The created database includes 500 handwritten bank cheques based on the presented structure. Experimental results verify the usefulness and importance of the new structure in the automatic processing of Persian handwritten bank cheques, providing a standard guideline for their automatic processing comparable to Check 21 or Check 006.

20.
A dynamic pushdown network (DPN) is a set of pushdown systems (PDSs) where each process can dynamically create new instances of PDSs. DPNs are a natural model of multi-threaded programs with (possibly recursive) procedure calls and thread creation. Thus, it is important to have model checking algorithms for DPNs. We consider in this work model checking DPNs against single-indexed LTL and CTL properties of the form f_1 ∧ … ∧ f_n, where each f_i is an LTL/CTL formula over the i-th PDS. We consider the model checking problems w.r.t. simple valuations (i.e., whether a configuration satisfies an atomic proposition depends only on its control location) and w.r.t. regular valuations (i.e., the set of configurations satisfying an atomic proposition is a regular set of configurations). We show that these model checking problems are decidable. We propose automata-based approaches for computing the set of configurations of a DPN that satisfy the corresponding single-indexed LTL/CTL formula.
