1.

陈天麒曾庆华孙世新《计算机科学》2003,30(8):176-177

LogP is becoming a practical parallel computation model that meets the demands of parallel computers and parallel algorithms, so it is important to redesign parallel algorithms for the LogP model. This paper studies a parallel algorithm for computing the inverse matrix on the simplified LogP model and presents simulation results.

2.

Performance evaluation of a parallel cascade semijoin algorithm for computing path expressions in object database systems

王国仁于戈《计算机科学技术学报》2002,17(2):0-0

With the emergence of new applications, especially on the Web, such as e-commerce, digital libraries, and DNA banks, object database systems offer stronger functionality than other kinds of database systems because of their powerful ability to represent complex semantics and relationships. One distinguishing feature of object database systems is the path expression, and most queries on an object database are based on path expressions because they are the most natural and convenient way to access the object database, for example, to navigate the hyperlinks in a Web-based database. The execution of a path expression is usually extremely expensive on a very large database. Therefore, improving the execution efficiency of path expressions is critical for the performance of object databases. As an important approach to high-performance query processing, the parallel processing of path expressions on distributed object databases is explored in this paper. Up to now, some algorithms for computing path expressions and for optimizing path expression processing have been proposed for centralized environments, but few approaches have been presented for computing path expressions in parallel. In this paper, a new parallel algorithm for computing path expressions, named Parallel Cascade Semijoin (PCSJ), is proposed. Moreover, a new scheduling strategy called the right-deep zigzag tree is designed to further improve the performance of the PCSJ algorithm. The experiments were carried out in a NOW distributed and parallel environment. The results show that the PCSJ algorithm outperforms the other two parallel algorithms, the parallel version of the forward pointer chasing algorithm (PFPC) and the index splitting parallel algorithm (IndexSplit), when computing path expressions with restrictive predicates, and that the right-deep zigzag tree scheduling strategy performs better than the right-deep tree scheduling strategy.
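As a rough illustration of why semijoins suit path expressions with restrictive predicates, the sketch below evaluates a path backwards: it filters the last class by the predicate, then keeps only those objects in each earlier class whose reference points at a surviving object. This is a generic, sequential semijoin reduction and not the PCSJ algorithm itself; the function names, the dictionary-based object store, and the sample data are all assumptions made for the example.

```python
def semijoin(objects, ref_attr, surviving_ids):
    """Keep the ids of objects whose reference attribute points at a surviving id."""
    return {oid for oid, obj in objects.items() if obj[ref_attr] in surviving_ids}


def evaluate_path(classes, ref_attrs, predicate):
    """Evaluate a path expression C1.r1.r2...rk with a predicate on the last class.

    classes   -- list of dicts (object id -> object), in path order (hypothetical layout)
    ref_attrs -- ref_attrs[i] is the attribute of classes[i] that references classes[i + 1]
    predicate -- function applied to objects of the last class
    """
    # Start from the restrictive end of the path.
    surviving = {oid for oid, obj in classes[-1].items() if predicate(obj)}
    # Cascade the semijoin backwards towards the first class.
    for objects, attr in zip(reversed(classes[:-1]), reversed(ref_attrs)):
        surviving = semijoin(objects, attr, surviving)
    return surviving


# Hypothetical data for the path student.dept.city with a predicate on city.name.
students = {1: {"dept": 10}, 2: {"dept": 11}, 3: {"dept": 10}}
depts = {10: {"city": 100}, 11: {"city": 101}}
cities = {100: {"name": "Oslo"}, 101: {"name": "Rome"}}

matches = evaluate_path(
    [students, depts, cities], ["dept", "city"], lambda c: c["name"] == "Oslo"
)
```

Because each step only passes a set of surviving identifiers to the previous class, a selective predicate shrinks the intermediate data early, which is the usual argument for semijoin-style evaluation.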

3.

Parallel Data Cube Storage Structure for Range Sum Queries and Dynamic Updates

Hong Gao Jian-Zhong Li 《计算机科学技术学报》2005,20(3):345-356

I/O parallelism is considered a promising approach to achieving high performance in parallel data warehousing systems, where huge amounts of data and complex analytical queries have to be processed. This paper proposes a parallel secondary data cube storage structure (PHC for short) to efficiently support the processing of range sum queries and dynamic updates on a data cube using parallel computing systems. Based on PHC, two parallel algorithms for processing range sum queries and updates are also proposed. Both algorithms have the same time complexity, O(log^d n/P). The analytical and experimental results show that PHC and the parallel algorithms have high performance and achieve optimum speedup.
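For readers unfamiliar with range sum queries, the sketch below shows the classic prefix-sum array on a two-dimensional cube. It is not the PHC structure from the paper; it is a baseline that answers a range sum in constant time but must rebuild many prefix cells when a single value changes, which is the trade-off that structures supporting dynamic updates aim to improve. The function names and the sample grid are assumptions made for the example.

```python
def build_prefix(grid):
    """Return prefix where prefix[i][j] is the sum of grid[0..i-1][0..j-1]."""
    rows, cols = len(grid), len(grid[0])
    prefix = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            # Inclusion-exclusion over the three neighbouring prefix cells.
            prefix[i + 1][j + 1] = (
                grid[i][j] + prefix[i][j + 1] + prefix[i + 1][j] - prefix[i][j]
            )
    return prefix


def range_sum(prefix, r1, c1, r2, c2):
    """Sum of the cells in the inclusive rectangle (r1, c1)..(r2, c2)."""
    return (
        prefix[r2 + 1][c2 + 1]
        - prefix[r1][c2 + 1]
        - prefix[r2 + 1][c1]
        + prefix[r1][c1]
    )


# Hypothetical 3 x 3 data cube slice.
grid = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
prefix = build_prefix(grid)
lower_right = range_sum(prefix, 1, 1, 2, 2)  # cells 5, 6, 8, 9
```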

4.

Fast parallel algorithms for discrete Gabor expansion and transform based on multirate filtering

TAO Liang & GU JuanJuan 《中国科学:信息科学(英文版)》2012,(2):293-300

The Gabor transform has long been recognized as a very useful tool for joint time and frequency analysis in signal processing. Its real-time applications, however, were limited by the high computational complexity of Gabor transform algorithms. In this paper, some novel and fast parallel algorithms for the finite discrete Gabor expansion and transform are presented based on multirate filtering. An analysis filter bank is designed for the finite discrete Gabor transform (DGT) and a synthesis filter bank is designed for the finite discrete Gabor expansion (DGE). Each of the parallel channels in the two filter banks has a unified structure and can apply the FFT and the IFFT to reduce its computational load. The computational complexity of each parallel channel does not change as the oversampling rate increases. In fact, it is very low and depends only on the length of the input discrete signal and the number of Gabor frequency sampling points. The computational complexity of the proposed parallel algorithms is analyzed and compared with that of the major existing parallel algorithms for the finite DGT and DGE. The results indicate that the proposed parallel algorithms for the finite DGT and DGE based on multirate filtering are very attractive for real-time signal processing.

5.

DPGL: The Direct3D9-based Parallel Graphics Library for Multi-display Environment

Zhen Liu Jiao-Ying Shi 《国际自动化与计算杂志》2007,4(1):30-37

The emergence of high-performance 3D graphics cards has opened the way to PC clusters for high-performance multi-display environments. To exploit the rendering ability of PC clusters, appropriate parallel rendering algorithms and parallel graphics library interfaces must be designed. Given the rapid development of Direct3D, we put forward DPGL, the Direct3D9-based parallel graphics library in the D3DPR parallel rendering system, which implements the Direct3D9 interfaces so that existing Direct3D9 applications can be parallelized without modification. Based on a parallelism analysis of the Direct3D9 rendering pipeline, we briefly introduce the D3DPR parallel rendering system. DPGL is the fundamental component of D3DPR. After presenting the three-layer architecture of DPGL, we discuss rendering resource interception and management. Finally, we describe the design and implementation of DPGL in detail, including the rendering command interception layer, the rendering command interpretation layer, and the rendering resource parallelization layer.

6.

A "Double-Stream" Analysis of Parallel Computing Performance Total citations: 1, self-citations: 1, citations by others: 0

乔香珍《计算机科学》2001,28(10):7-12

The generalized speed-up is estimated according to the "double-stream" analyses. The term "decreasing ratio" is used to describe the influence of the hierarchical memory and of the characteristics of the parallel application on performance. Optimization principles for parallel computation are also given.

7.

Task scheduling of parallel programs to optimize communications for cluster of SMPs

郑纬民杨博林伟坚李志光《中国科学F辑(英文版)》2001,44(3):213-225

This paper discusses the compile-time task scheduling of parallel programs running on a cluster of SMP workstations. First, the problem is stated formally, transformed into a graph partition problem, and proved to be NP-complete. A heuristic algorithm, MMP-Solver, is then proposed to solve the problem. Experimental results show that the task scheduling can greatly reduce the communication overhead of parallel applications and that MMP-Solver outperforms the existing algorithms.

8.

A Performance Analysis Tool for PVM Parallel Programs Total citations: 1, self-citations: 0, citations by others: 1

Chen Wang Yin Liu Changjun Jiang Zhaoqing Zhang 《计算机工程与应用》2004,40(29):103-105,112

In this paper, we introduce the design and implementation of ParaVT, a visual performance analysis and parallel debugging tool. In ParaVT, we propose an automated instrumentation mechanism. Based on this mechanism, ParaVT automatically analyzes the performance bottlenecks of parallel applications and provides a visual user interface to monitor and analyze the performance of parallel programs. In addition, it supports certain extensions.

9.

Implementation and evaluation of parallel FFT on Engineering and Scientific Computation Accelerator (ESCA) architecture

Dan WU Xue-cheng ZOU Kui DAI Jin-li RAO Pan CHEN Zhao-xia ZHENG 《浙江大学学报:C卷英文版》2011,(12):976-989

The fast Fourier transform (FFT) is a fundamental kernel of many computation-intensive scientific applications. This paper deals with an implementation of the FFT on the accelerator system, a heterogeneous multi-core architecture that accelerates computation-intensive parallel computing in scientific and engineering applications. The Engineering and Scientific Computation Accelerator (ESCA) consists of a control unit and a single instruction multiple data (SIMD) processing element (PE) array, in which PEs communicate with each other via a hierarchical two-level network-on-chip (NoC) with high bandwidth and low latency. We exploit the architectural features of ESCA to implement a parallel FFT algorithm efficiently. Experimental results show that both the proposed parallel FFT algorithm and the ESCA architecture are scalable. The 16-bit fixed-point parallel FFT performance of ESCA is compared with a published work to show the superiority of the mapping algorithm and the hardware architecture. The floating-point parallel FFT performance of ESCA is evaluated and compared with that of the IBM Cell processor and a GPU to demonstrate the computing power of the ESCA system for high-performance applications.

10.

Optimal Partitioning and Granularity of Uniform Task Graphs

Zhang Zhongyun Li Guojie 《计算机科学技术学报》1991,6(2):185-194

Task partitioning is an important technique in parallel processing. In this paper, we investigate the optimal partitioning strategies and granularities of tasks with communications based on several models of parallel computer systems. Unlike the usual approach, we study the optimal partitioning strategies and granularities from the viewpoint of minimizing T as well as minimizing NT^2, where N is the number of processors used and T is the program execution time using N processors. Our results show that the optimal partitioning strategies for all cases discussed in this paper are the same: either assign all tasks to one processor or distribute them among the processors as equally as possible, depending only on functions of the ratio R/C of running time to communication time.

11.

Identification- and Elimination-Based Parallel Query Processing Techniques for Object-Oriented Databases

Chen Y. H. Su S. Y. W. 《Journal of Parallel and Distributed Computing》1995,28(2)

The newly developed object-oriented database management systems provide rich facilities for the modeling and processing of structural as well as behavioral properties of complex application objects. However, due to their inherent generality, new functionalities to be added to these systems as they continue to evolve, and high performance demand in many application domains, efficient parallel algorithms and architectures would be needed to meet the performance requirement for processing large OODBs. In our previous work, we have shown that processing OODBs can be viewed as the manipulation of patterns of object associations. In this paper, we present several parallel, multiwavefront algorithms based on two approaches, i.e., identification and elimination approaches, to verify association patterns specified in queries. Both approaches allow more processors to operate concurrently on a query than the traditional tree-structured query processing approach, thus introducing a higher degree of parallelism in query processing. We present a graph model to transform the query processing problem into a graph problem. Based on the graph model, proofs of correctness of both approaches for tree-structured queries are given, and a combined approach for solving cyclic queries is also provided. We present a new data structure to represent associations between objects, parallel algorithms based on these approaches, and some evaluation results obtained from an actual implementation of these algorithms on an nCUBE 2 parallel computer.

12.

Parallel manipulator kinematics learning using holographic neural network models

Roger Boudreau Glen Levesque Salah Darenfed 《Robotics and Computer》1998,14(1):37-44

The forward kinematic problem of parallel manipulators is resolved using a holographic neural paradigm. In a holographic neural model, stimulus–response (input–output) associations are transformed from the domain of real numbers to the domain of complex vectors. An element of information within the holographic neural paradigm has a semantic content represented by phase information and a confidence level assigned in the magnitude of the complex scalar. Networks are trained on a database generated from the closed-form inverse kinematic solutions. After the learning phase, the networks are tested on trajectories which were not part of the training data. The simulation results, given for a planar three-degree-of-freedom parallel manipulator with revolute joints and for a spherical three-degree-of-freedom parallel manipulator, show that holographic neural network models are feasible to solve the forward kinematic problem of parallel manipulators.

13.

Hybrid ensemble approach for classification Total citations: 1, self-citations: 1, citations by others: 0

Brijesh Verma Syed Zahid Hassan 《Applied Intelligence》2011,34(2):258-278

This paper presents a novel hybrid ensemble approach for classification in medical databases. The proposed approach is formulated to cluster extracted features from medical databases into soft clusters using unsupervised learning strategies and fuse the decisions using parallel data fusion techniques. The idea is to observe associations in the features and fuse the decisions made by learning algorithms to find the strong clusters which can make impact on overall classification accuracy. The novel techniques such as parallel neural-based strong clusters fusion and parallel neural network based data fusion are proposed that allow integration of various clustering algorithms for hybrid ensemble approach. The proposed approach has been implemented and evaluated on the benchmark databases such as Digital Database for Screening Mammograms, Wisconsin Breast Cancer, and Pima Indian Diabetics. A comparative performance analysis of the proposed approach with other existing approaches for knowledge extraction and classification is presented. The experimental results demonstrate the effectiveness of the proposed approach in terms of improved classification accuracy on benchmark medical databases.

14.

Parallel Algorithms for Discovery of Association Rules Total citations: 2, self-citations: 0, citations by others: 2

Mohammed J. Zaki Srinivasan Parthasarathy Mitsunori Ogihara Wei Li 《Data mining and knowledge discovery》1997,1(4):343-373

Discovery of association rules is an important data mining task. Several parallel and sequential algorithms have been proposed in the literature to solve this problem. Almost all of these algorithms make repeated passes over the database to determine the set of frequent itemsets (a subset of database items), thus incurring high I/O overhead. In the parallel case, most algorithms perform a sum-reduction at the end of each pass to construct the global counts, also incurring high synchronization cost. In this paper we describe new parallel association mining algorithms. The algorithms use novel itemset clustering techniques to approximate the set of potentially maximal frequent itemsets. Once this set has been identified, the algorithms make use of efficient traversal techniques to generate the frequent itemsets contained in each cluster. We propose two clustering schemes based on equivalence classes and maximal hypergraph cliques, and study two lattice traversal techniques based on bottom-up and hybrid search. We use a vertical database layout to cluster related transactions together. The database is also selectively replicated so that the portion of the database needed for the computation of associations is local to each processor. After the initial set-up phase, the algorithms do not need any further communication or synchronization. The algorithms minimize I/O overheads by scanning the local database portion only twice: once in the set-up phase and once when processing the itemset clusters. Unlike previous parallel approaches, the algorithms use simple intersection operations to compute frequent itemsets and do not have to maintain or search complex hash structures. Our experimental testbed is a 32-processor DEC Alpha cluster interconnected by the Memory Channel network. We present results on the performance of our algorithms on various databases, and compare it against a well-known parallel algorithm. The best new algorithm outperforms it by an order of magnitude.
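The vertical layout and intersection-based counting mentioned in the abstract can be pictured with a small sequential sketch. It stores, for each item, the set of transaction identifiers that contain it, so the support of an itemset is the size of the intersection of those sets and no hash structure is needed. This shows only the counting idea; the clustering, lattice traversal, and parallel distribution described in the paper are not modelled, and the function names and sample transactions are assumptions.

```python
from itertools import combinations


def to_vertical(transactions):
    """Map each item to the set of transaction ids (tids) that contain it."""
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)
    return tidsets


def support(itemset, tidsets):
    """Number of transactions containing every item of a non-empty itemset."""
    sets = [tidsets.get(item, set()) for item in itemset]
    return len(set.intersection(*sets)) if sets else 0


def frequent_pairs(transactions, min_support):
    """Return {(a, b): support} for all item pairs meeting min_support."""
    tidsets = to_vertical(transactions)
    # Only frequent single items can form frequent pairs.
    frequent_items = sorted(i for i, t in tidsets.items() if len(t) >= min_support)
    result = {}
    for a, b in combinations(frequent_items, 2):
        count = len(tidsets[a] & tidsets[b])  # one set intersection per candidate
        if count >= min_support:
            result[(a, b)] = count
    return result


# Hypothetical transaction database.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
pairs = frequent_pairs(transactions, min_support=3)
```

Once the tid sets are built, no further pass over the original transactions is needed, which is in the spirit of the limited database scanning the abstract describes.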

15.

Algorithms for asynchronous parallel processing of object-oriented databases

Thakore A.K. Su S.Y.W. Lam H.X. 《Knowledge and Data Engineering, IEEE Transactions on》1995,7(3):487-504

Management of large quantities of complex data is essential in many advanced application areas. Object-oriented (OO) database management systems have been developed to effectively model and process the complex domain knowledge. They have been shown to outperform some existing relational systems. The existing implementations of OO database management systems attempt to improve the efficiency of OO queries by explicitly capturing the relationships among objects. However, the execution of complex queries involving the retrieval of objects from many classes and relationships among them causes the existing systems to operate inefficiently. In this paper, we present parallel algorithms for the processing of queries against a large OO database. The algorithms are based on a closed model of query processing pattern-based access instead of the conventional value-based access. During processing, the algorithms avoid the execution of time-consuming join operations by making use of the explicitly stored object associations. Generation of large quantities of temporary data is avoided by marking objects using their identifiers and by employing a two-phase query processing strategy. A query is processed by concurrent multiple waves, thereby improving parallelism and avoiding the complexities introduced in their sequential implementation. The correctness and the performance of the parallel algorithms have been tested and analyzed by running parallel programs on a 32-node transputer-based parallel machine designed and developed at the IBM Research Center at Yorktown Heights, New York. Benchmark queries of different semantic complexities are generated, and their performance is analyzed for various data and query parameters.

16.

EB³: an entity-based black-box specification method for information systems

M. Frappier R. St-Denis 《Software and Systems Modeling》2003,2(2):134-149

17.

A Scheme for Breaking the NTRU Cryptosystem Based on Self-Assembly DNA Computing (in English)

张勋才牛莹崔光照王延峰《计算机学报》2008,31(12)

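Self-assembly DNA computing has advantages that conventional computers cannot match in solving NP problems, especially in breaking cryptosystems. This paper proposes a method for breaking the NTRU public-key cryptosystem using self-assembly DNA computing. Tailored to the characteristics of the NTRU cryptosystem, the method encodes information in DNA tiles, which self-assemble by means of the sticky ends between tiles, and gives an implementation scheme for polynomial convolution. On this basis, by introducing nondeterministic assignment tiles, a nondeterministic algorithm for breaking the NTRU system is proposed. By creating hundreds of millions of DNA tiles that take part in the computation, the algorithm can test every possible key in parallel and output the correct key with high probability. The greatest advantage of the method is that it fully exploits the massive storage capacity of DNA tiles, the enormous parallelism of biochemical reactions, and the spontaneous ordering of assembly. Theoretical analysis shows that the method is feasible to a certain extent.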
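The polynomial convolution at the heart of NTRU is multiplication in the ring Z_q[x]/(x^N - 1), where exponents wrap around modulo N and coefficients are reduced modulo q. The sketch below computes it on an ordinary computer, as a reference for the operation only; it says nothing about how the DNA tile encoding realizes it. The function name and the tiny parameters (N = 3, q = 7) are assumptions chosen for readability and are far too small for real use.

```python
def cyclic_convolve(f, g, q):
    """Multiply polynomials f and g in Z_q[x]/(x^N - 1).

    f and g are coefficient lists of equal length N, lowest degree first.
    """
    n = len(f)
    if len(g) != n:
        raise ValueError("both polynomials must have N coefficients")
    h = [0] * n
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            # x^(i + j) wraps around to x^((i + j) mod N) because x^N = 1.
            k = (i + j) % n
            h[k] = (h[k] + fi * gj) % q
    return h


# (1 + 2x) * (x + x^2) = x + 3x^2 + 2x^3, and x^3 wraps to 1.
product = cyclic_convolve([1, 2, 0], [0, 1, 1], q=7)
```

Multiplying by x alone rotates the coefficient list by one position, which is a quick sanity check on the wrap-around rule.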

18.

Audit: A new synchronization API for the GET/PUT protocol

Atsushi Hori Jinpil Lee Mitsuhisa Sato 《Journal of Parallel and Distributed Computing》2012

The GET/PUT protocol is considered an effective communication API for parallel computing. However, the one-sided nature of the GET/PUT protocol lacks synchronization functionality for the target process. To date, several techniques have been proposed to tackle this problem. The APIs suggested thus far have failed to hide implementation details of the synchronization functionality. In this paper, a new synchronization API for the GET/PUT protocol is proposed. The central idea here is to associate synchronization flags with the GET/PUT memory regions. Using this technique, synchronization flags are hidden from users, and they are freed from managing the associations between the memory regions and the synchronization flags. The proposed API, named Audit, does not incur additional programming and thus enables natural parallel programming. The evaluations show that Audit exhibits better performance compared to the Notify API proposed in ARMCI.

19.

Pharmacy robotic dispensing and planogram analysis using association rule mining with prescription data

《Expert systems with applications》2016

20.

Exploring time‐dependent symptom outcomes in office staff

Xiaoshu Lu Risto Toivonen Esa‐Pekka Takala 《人机工程学与制造业中的人性因素》2009,19(3):241-253

This article illustrates the application of a new mathematical model developed for the study of time‐dependent health outcomes for office staff during computer work. The model describes the time‐dependent associations of computer usage with outcomes expressed as discomfort in multiple body regions. The association is explicitly presented with a functional relationship that is parameterized by body regions. The validation of the model demonstrated accuracy in reproducing the observed quantities for the study population. Therefore, we used this model to assess the impact of computer‐related work exposure on discomfort in different body regions among office staff to better understand the behavior of musculoskeletal and other symptoms. The exposures and outcomes were recorded in parallel over time, as usage of keyboard and mouse and as diaries of discomfort. The body regions of neck/shoulders, eyes, head, shoulder joint/upper arm, and upper back were identified as having the highest discomfort levels and rates of development of discomfort in parallel with exposures. Most of our findings are consistent with the literature. The developed mathematical methodology may be used to understand how the human body reacts to computer work to further prevent potential musculoskeletal and other disorders. © 2009 Wiley Periodicals, Inc.