首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
First-in-first-out (FIFO) queues are among the most fundamental and highly studied concurrent data structures. The most effective and practical dynamic-memory concurrent queue implementation in the literature is the lock-free FIFO queue algorithm of Michael and Scott, included in the standard Java TM Concurrency Package. This work presents a new dynamic-memory concurrent lock-free FIFO queue algorithm that in a variety of circumstances performs better than the Michael and Scott queue. The key idea behind our new algorithm is a novel way of replacing the singly-linked list of Michael and Scott, whose pointers are inserted using a costly compare-and-swap (CAS) operation, by an “optimistic” doubly - linked list whose pointers are updated using a simple store, yet can be “fixed” if a bad ordering of events causes them to be inconsistent. We believe it is the first example of such an “optimistic” approach being applied to a real world data structure. A preliminary version of this paper appeared in the proceedings of the 18th International Conference on Distributed Computing (DISC) 2004, pages 117–131.  相似文献   

2.
The problem of resource management for many processor architectures can be viewed as the problem of simultaneously updating data structures that hold system state. An approach in which the possibility of using structures with weakened specifications is examined, is presented. Specifically, data structures that weaken the specification of a priority queue, permitting it to be updated simultaneously by multiple processes are introduced. Two structures, the concurrent heap and the software banyan are proposed, along with their associated algorithms for update. The algorithms are shown to possess attractive properties of simultaneous update and throughput. The results of simulation and actual implementations show that such data structures can improve the execution times of parallel algorithms quite significantly. These structures are proposed as possible basic building blocks for implementation of resource allocation in operating systems  相似文献   

3.
In asynchronous event systems, the production of an event is decoupled from its consumption via an event queue. The loose coupling of such systems allows great flexibility as to where and when within the system the event handler of a given event is actually executed, allowing them to scale with increasing number of processors on a given node or across multiple nodes in a distributed system. Although this flexibility is useful in heterogeneous distributed systems, such as SOA, it may appear not to be well suited for real-time systems, which require an upper bound on how long an event can remain unhandled in a queue. However, we show how the weighted fair scheduling algorithms developed for QoS (quality of service) routing can be adapted to event queues to bound the delay between production and consumption. We achieve this, despite the fact that the underlying execution environment is only weakly controlled, through the use of predictive techniques on a calibrated model of the event system. Our choice of algorithms and data structures is driven in part by the highly concurrent nature of the system. We show how a weighted fair scheduler can be built on a lock-free concurrent priority queue. We analyze the performance of various such queues proposed in the literature and show that a very simple linear quantizing queue yields the best performance.  相似文献   

4.
The authors examine the design, implementation, and experimental analysis of parallel priority queues for device and network simulation. They consider: 1) distributed splay trees using MPI; 2) concurrent heaps using shared memory atomic locks; and 3) a new, more general concurrent data structure based on distributed sorted lists, designed to provide dynamically balanced work allocation and efficient use of shared memory resources. We evaluate performance for all three data structures on a Cray-TSESOO system at KFA-Julich. Our comparisons are based on simulations of single buffers and a 64×64 packet switch which supports multicasting. In all implementations, PEs monitor traffic at their preassigned input/output ports, while priority queue elements are distributed across the Cray-TBE virtual shared memory. Our experiments with up to 60000 packets and two to 64 PEs indicate that concurrent priority queues perform much better than distributed ones. Both concurrent implementations have comparable performance, while our new data structure uses less memory and has been further optimized. We also consider parallel simulation for symmetric networks by sorting integer conflict functions and implementing a packet indexing scheme. The optimized message passing network simulator can process ~500 K packet moves in one second, with an efficiency that exceeds ~50 percent for a few thousand packets on the Cray-T3E with 32 PEs. All developed data structures form a parallel library. Although our concurrent implementations use the Cray-TSE ShMem library, portability can be derived from Open-MP or MP1-2 standard libraries, which will provide support for one-way communication and shared memory lock mechanisms  相似文献   

5.
Queue processors are a viable alternative for high performance embedded computing and parallel processing. We present the design and implementation of a compiler for a queue-based processor. Instructions of a queue processor implicitly reference their operands making the programs free of false dependencies. Compiling for a queue machine differs from traditional compilation methods for register machines. The queue compiler is responsible for scheduling the program in level-order manner to expose natural parallelism and calculating instructions relative offset values to access their operands. This paper describes the phases and data structures used in the queue compiler to compile C programs into assembly code for the QueueCore, an embedded queue processor. Experimental results demonstrate that our compiler produces good code in terms of parallelism and code size when compared to code produced by a traditional compiler for a RISC processor.  相似文献   

6.
Most multiprocessors are multiprogrammed to achieve acceptable response time and to increase their utilization. Unfortunately, inopportune preemption may significantly degrade the performance of synchronized parallel applications. To address this problem, researchers have developed two principal strategies for a concurrent, atomic update of shared data structures: (1)preemption-safe lockingand (2)nonblocking(lock-free)algorithms. Preemption-safe locking requires kernel support. Nonblocking algorithms generally require a universal atomic primitive such ascompare-and-swaporload-linked/store-conditionaland are widely regarded as inefficient.  We evaluate the performance of preemption-safe lock-based and nonblocking implementations of important data structures—queues, stacks, heaps, and counters—including nonblocking and lock-based queue algorithms of our own, in microbenchmarks and real applications on a 12-processor SGI Challenge multiprocessor. Our results indicate that our nonblocking queue consistently outperforms the best known alternatives and that data-structure-specific nonblocking algorithms, which exist for queues, stacks, and counters, can work extremely well. Not only do they outperform preemption-safe lock-based algorithms on multiprogrammed machines, they also outperform ordinary locks on dedicated machines. At the same time, since general-purpose nonblocking techniques do not yet appear to be practical, preemption-safe locks remain the preferred alternative for complex data structures: they outperform conventional locks by significant margins on multiprogrammed systems.  相似文献   

7.
Concurrent FIFO queues are a common component of concurrent systems. Using a single shared lock to prevent concurrent manipulations of queue contents reduces system concurrency. Therefore, many algorithms were suggested to increase concurrency while maintaining the correctness of queue manipulations. This paper shows how to automatically verify partial correctness of concurrent FIFO queue algorithms using existing abstract interpretation techniques. In particular, we verify all the safety properties originally specified for two concurrent queue algorithms without imposing an a priori bound on the number of allocated objects and threads.  相似文献   

8.
We present a highly concurrent priority queue algorithm based on the B-link tree, which is a B+-tree in which every node has a pointer to its right sibling. The algorithm is built on the concurrent B-link tree algorithms. Since the priority queue is based on highly concurrent search structure algorithms, a large number of insert operations can execute concurrently with little or no interference. We first present an algorithm that executes deletemin operations serially. We extend the serialized-deletemin algorithm to allow both parallel and concurrent deletemin operations. We discuss a decisive operation serializable algorithm that permits concurrent deletemin operations, and an algorithm in which p processors cooperate to perform p deletemin operations in O(log p) time.  相似文献   

9.
We analyze the performance of queues that serve readers and writers. Readers are served concurrently, while writers require exclusive service. We approximately analyze a first-come-first-serve (FCFS) reader/writer queue, and derive simple formulae for computing waiting times and capacity under the assumption of Poisson arrivals and exponential service. We extend the analysis to handle a one writer queue, and a queue that includes write intention locks. The simple analyses that we present can be used as rules of thumb for designing concurrent systems  相似文献   

10.
基于队列结构的嵌入式系统多进程应用   总被引:2,自引:1,他引:2  
在基于uClinux的嵌入式图像采集与传输系统中,为解决嵌入式中多进程对文件资源的竞争,加快嵌入式系统的工作速度,可采用循环队列数据结构描述图像资源。文中描述了嵌入式中文件队列资源的数据结构以及嵌入式中文件节点的数据结构。实现了图像采集进程和文件传输进程的核心算法,建立了多进程。使用访问状态标志控制进程同步,解决了资源的竞争和共享。采用基于队列结构的多进程在MCF5272处理器平台上获得了成功。应用实验结果表明,基于队列结构的多进程并行算法,缩短了图像信息的获取时间,提高了嵌入式系统信息采集速度。  相似文献   

11.
We present a modular method for schedulability analysis of real time distributed systems. We extend the actor model, as the asynchronous model for concurrent objects, with real time using timed automata, and show how actors can be analyzed individually to make sure that no task misses its deadline. We introduce drivers to specify how an actor can be safely used. Using these drivers we can verify schedulability, for a given scheduler, by doing a reachability check with the Uppaal model checker. Our method makes it possible to put a finite bound on the process queue and still obtain schedulability results that hold for any queue length.  相似文献   

12.
优先队列广泛地使用在许多并行算法中(例如,多处理机调度和某些组合优化算法)。在这些算法中,共享优先队列的存取冲突限制了加速比的提高。本文提出一种链表优先队列的并行插入和删除方法,具有较小并行开销和较大的并行度,并且保证和串行存取算法的优先顺序完全一致,即删除操作返回已经插入和正在插入的所有元素中的最佳元素。同时,我们还介绍了目前性能最好的堆的并行插入和删除算法,并对准和链表结构并行插入和删除算法的性能和适用范围进行了比较,进一步提出了散列结构的优先队列。在ENCORE Multimax520多处理机上的实验结果验证了我们的理论分析结果:使用链表结构的并行分枝限界算法性能上可获得很大提高。  相似文献   

13.
DNA计算机中队列数据结构的设计及实现   总被引:9,自引:0,他引:9  
提出了DNA计算机中队列数据结构的设计方法,该方法利用两种不同的限制性内切酶完成队列的入队和出队操作,并给出了队列的DNA编码和仿真实例.首先给出了DNA计算机中队列存储结构的形式描述;然后详细给出了DNA计算机中队列初始化、入队和出队等操作的生物实现方法;最后给出了一个具体算法的实例,仿真了DNA计算机上该算法的运行机制.仿真结果表明文中提出的队列的设计方法在DNA计算机上切实可行.这种方法可推广到DNA计算机上其他类型的数据结构,帮助DNA计算机合理、有效地组织需要处理的信息,从而使DNA计算机走向实际应用.  相似文献   

14.
High reliability is the primary requirement for messaging system. Messaging system always utilizes disk queue to temporarily store message to be delivered. Experiments show that Disk queue I/O is the primary performance bottleneck in the messaging system. In this paper we present a high performance disk queue storage management mechanism-FastQueue. The FastQueue utilizes a large file to serve as disk queue to reduce file manage overhead, in which adjacent messages are stored in adjacent disk block. Several messages are written to disk queue in a one large write by Lazy Gathering Write. Several adjacent messages are read into buffer in a one read by Sequential Grouping Prcfetch. The Lazy Gathering Write and Sequential Grouping Prefetch policies take full advantage of the disk bandwidth. Experiment shows that performance of the FastQueue is more than an order of magnitude higher than that of traditional disk queue.  相似文献   

15.
宋玲玲  蒋泽军  王丽芳 《微处理机》2012,33(1):78-80,84
针对用户对磁盘阵列多样性配置方式的需求,设计了一种可支持多终端并发操作的磁盘阵列管理系统。该系统利用分层架构思想实现,并进行了三点改进:第一,将队列管理加入到分层设计中,避免了系统升级带来的级联变更;第二,利用XML文档组织队列消息,增强了消息管理的灵活性,提高了系统开发的并行度;第三,利用XML为用户界面建模,实现了界面的"零编码"开发。因此,利用改进后的分层架构,使得系统具有较好的可扩展性、可维护性以及开发简单的特点。  相似文献   

16.
随着计算机硬件技术的发展,如今我们已经迈入了多核CPU时代.然而,作为软件核心的数据结构仍然是按照单核CPU和顺序型准则来设计的.在基于共享内存的多核时代,大量并发运行的线程会交替地修改数据,产生不可预期的结果,因而我们面临着严峻挑战.针对基于共享内存多核时代数据结构的相关研究进行综述.首先,对比了并发与并行的区别,归纳了基于演进条件(progress condition)的多核数据结构分类,对近年来学术界对各种类型并发数据结构的研究进行综述.在此基础上,剖析了并发数据结构设计和实现的关键技术,并从并发数据结构的开发流程、正确性验证等方面进行了归纳阐述.最后,基于这些讨论,对多核架构下并发数据结构未来的研究趋势和应用前景进行了展望.  相似文献   

17.
刘恒  杨小帆 《计算机应用研究》2012,29(10):3772-3775
动态内存管理的问题对无锁动态数据结构的性能尤为关键,因为多线程环境下的动态内存管理涉及开销较高的同步操作。提出一种构建用于动态无锁数据结构的内存池的方法来减少动态内存使用和与之相伴的动态内存管理开销。该方法通过平衡线程的动态内存消耗来减小内存开销,利用本方法构建的内存池基于线程私有的支持节点窃取的无锁循环队列。本方法具有以下优点:a)用本方法构建的内存池是无锁的;b)能够平衡线程的堆内存消耗;c)可以方便地与动态无锁数据结构集成。实验结果显示,用该方法构造的资源窃取型内存池扩展性较强,且能够在高负载下有效降低无锁数据结构的堆内存消耗和操作执行时间;平衡算法在很大程度上决定内存消耗量,内存池在高负载下的扩展性也受到它所用的数据结构自身多线程访问性能的影响。  相似文献   

18.
Dandamudi  S.P. 《Computer》1997,30(3):82-89
The performance of parallel processing systems, especially large systems, is sensitive to various types of overhead and contention. Performance consequences may be serious when contention occurs for hardware resources such as memory or the interconnection network. Contention can also occur for software resources such as critical data structures maintained by either system or application software. A run queue is one such critical data structure that can affect overall system performance. There are two basic types of run queues, centralized and distributed. Both present performance problems. There are also several techniques to mitigate their drawbacks, but none is completely satisfactory. Instead, the author proposes a different run queue organization, a hierarchical organization that inherits the best features of the centralized and the distributed queue organizations while avoiding their pitfalls. Thus, the hierarchical organization is suitable for building large-scale multiprocessor systems  相似文献   

19.
讨论了数据结构中循环队列入队算法的设计思想并提出了解决队列满时的可行方法,解决了数据结构教材中没有解决的问题。  相似文献   

20.
“V+V”是现代汉语中的常见结构,能够形成兼语、连动等多种完全不同的句法结构,给句法和语义解析造成困难。针对“V+V”形成的句法结构类型和序列关系识别问题,设计并制定了一套语料库标注规范,以解决语料库中存在的“V+V”结构的嵌套标注问题,并据此构建起一个包含5 381个兼语句子、7 987个连动句子,以及1 212个兼语连动嵌套句子的“V+V”语料库。提出一个基于BiLSTM-CRF和多头注意力机制的模型,能够同时识别结构中的多个动词和名词的句法、语义角色。相比于以往只研究单项识别兼语或者连动结构,该模型不仅可以同时识别兼语结构、连动结构,还可以解决兼语连动嵌套结构的识别问题。实验结果表明:该方法能够很好地解决“V+V”序列关系的识别问题,在测试集语料上达到92.12%的F1值。  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号