Similar documents
 Found 20 similar documents (search time: 531 ms)
1.
Given $N$ matrices $A_1, A_2, \ldots, A_N$ of size $N \times N$, the matrix chain product problem is to compute $A_1 \times A_2 \times \cdots \times A_N$. Given an $N \times N$ matrix $A$, the matrix powers problem is to calculate the first $N$ powers of $A$, that is, $A, A^2, A^3, \ldots, A^N$. We solve the two problems on distributed memory systems (DMSs) with $p$ processors that can support one-to-one communications in $T(p)$ time. Assume that the fastest sequential matrix multiplication algorithm has time complexity $O(N^{\alpha})$, where the currently best value of $\alpha$ is less than 2.3755. Let $p$ be arbitrarily chosen in the range $1 \le p \le N^{\alpha+1}/(\log N)^2$. We show that the two problems can be solved by a DMS with $p$ processors in $T_{\mathrm{chain}}(N,p) = O\big(N^{\alpha+1}/p + T(p)\big((N^{2(2+1/\alpha)}/p^{2/\alpha})(\log^+ (p/N))^{1-2/\alpha} + \log^+((p \log N)/N^{\alpha})\big)\big)$ and $T_{\mathrm{power}}(N,p) = O\big(N^{\alpha+1}/p + T(p)\big((N^{2(1+1/\alpha)}/p^{2/\alpha})(\log^+ (p/(2 \log N)))^{1-2/\alpha} + (\log N)^2\big)\big)$ time, respectively, where the function $\log^+$ is defined as follows: $\log^+ x = \log x$ if $x \ge 1$ and $\log^+ x = 1$ if $0 < x < 1$.
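
As a purely sequential illustration of the two problems defined above (not the distributed-memory algorithms the abstract analyzes), the following minimal sketch computes a matrix chain product by pairing neighbouring matrices level by level, and the first N powers by repeated multiplication. The names `matmul`, `chain_product`, and `matrix_powers` are hypothetical, and the naive cubic-time multiplication is only a stand-in for a fast $O(N^{\alpha})$ routine.

```python
def matmul(a, b):
    """Naive product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def chain_product(mats):
    """Multiply A1 x A2 x ... x AN by pairing neighbours level by level.

    The products within one level are independent of each other, which is
    what a parallel machine could exploit; here they run one after another.
    """
    level = list(mats)
    while len(level) > 1:
        nxt = [matmul(level[i], level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            nxt.append(level[-1])  # the unpaired last matrix is carried up unchanged
        level = nxt
    return level[0]


def matrix_powers(a, count):
    """Return [A, A^2, ..., A^count] by repeated multiplication."""
    powers = [a]
    for _ in range(count - 1):
        powers.append(matmul(powers[-1], a))
    return powers
```

Because only neighbouring matrices are paired, the left-to-right order of the factors is preserved, which matters since matrix multiplication is not commutative.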

2.
We present two fast algorithms for sorting on a linear array with a reconfigurable pipelined bus system (LARPBS), one of the recently proposed parallel architectures based on optical buses. In our first algorithm, we sort $N$ numbers in $O(\log N \log\log N)$ worst-case time using $N$ processors. In our second algorithm, we sort $N$ numbers in $O((\log\log N)^2)$ worst-case time using $N^{1+\varepsilon}$ processors, for any fixed $\varepsilon$ such that $0 < \varepsilon < 1$. Our algorithms are based on a novel deterministic sampling scheme for merging two sorted arrays of length $N$ each in $O(\log\log N)$ time on an LARPBS with $N$ processors. To our knowledge, the previous best sorting algorithm on this architecture has a running time of $O((\log N)^2)$ using $N$ processors.

3.
We present an $O((\log\log N)^2)$-time algorithm for computing the distance transform of an $N \times N$ binary image. Our algorithm is designed for the common concurrent read concurrent write parallel random access machine (CRCW PRAM) and requires $O(N^{2+\varepsilon}/\log\log N)$ processors, for any $\varepsilon$ such that $0 < \varepsilon < 1$. Our algorithm is based on a novel deterministic sampling scheme and can be used for computing distance transforms for a very general class of distance functions. We also present a scalable version of our algorithm for the case where the number of available processors is $p^{2+\varepsilon}/\log\log p$ for some $p < N$. In this case, our algorithm runs in $O((N^2/p^2) + (N/p)\log\log p + (\log\log p)^2)$ time. This scalable algorithm is more practical, since the number of available processors is usually much smaller than the size of the image.
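
To make the problem statement concrete, here is a minimal sequential sketch of a distance transform for one specific distance function, the city-block (L1) distance, using the classic two-pass sweep. It is not the parallel sampling-based algorithm of the abstract. The function name `city_block_distance_transform` is hypothetical, and the convention that nonzero pixels are the feature pixels is an assumption.

```python
INF = float("inf")


def city_block_distance_transform(image):
    """Distance from every pixel to the nearest nonzero pixel (L1 metric).

    Assumes a rectangular, non-empty image given as a list of rows.
    Pixels stay at infinity if the image has no nonzero pixel at all.
    """
    rows, cols = len(image), len(image[0])
    dist = [[0 if image[r][c] else INF for c in range(cols)]
            for r in range(rows)]

    # Forward sweep: propagate distances from the top and left neighbours.
    for r in range(rows):
        for c in range(cols):
            if r > 0:
                dist[r][c] = min(dist[r][c], dist[r - 1][c] + 1)
            if c > 0:
                dist[r][c] = min(dist[r][c], dist[r][c - 1] + 1)

    # Backward sweep: propagate distances from the bottom and right neighbours.
    for r in range(rows - 1, -1, -1):
        for c in range(cols - 1, -1, -1):
            if r < rows - 1:
                dist[r][c] = min(dist[r][c], dist[r + 1][c] + 1)
            if c < cols - 1:
                dist[r][c] = min(dist[r][c], dist[r][c + 1] + 1)
    return dist
```

The two sweeps touch each pixel a constant number of times, so this sequential baseline runs in time proportional to the number of pixels.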

4.
Rule chaining in fuzzy expert systems   Cited by: 1 (self-citations: 0, citations by others: 1)
A fuzzy expert system must do rule chaining differently than a nonfuzzy expert system. In particular, any rule that can fire with a particular linguistic variable in its consequent must fire before any rule whose antecedent conditions depend upon the resultant fuzzy set value of that consequent linguistic variable is allowed to fire. The dependent rules would be considered in a chain with the fuzzy rules which generate or assert the needed fuzzy linguistic variable. A recent paper by J. Pan et al. (1998) points out that a version of the FuzzyCLIPS expert system shell does not operate with chained fuzzy rules as one would expect. They introduce FuzzyShell, which is described as the only known shell to have the expected fuzzy rule chaining performance. We show several approaches to obtaining the desired behavior in FuzzyCLIPS. Further, a potential pitfall with the FuzzyShell approach to dealing with chaining is pointed out.
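
The ordering constraint described above (every rule that asserts a linguistic variable fires before any rule that reads it) can be sketched as a topological sort over rules. This is only an illustration of the constraint, not the mechanism used by FuzzyCLIPS or FuzzyShell. The function name `firing_order` and the rule representation (a mapping from rule name to a pair of antecedent variables and one consequent variable) are assumptions made for the example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def firing_order(rules):
    """Order rules so producers of a variable precede its consumers.

    rules maps a rule name to (set of antecedent variables, consequent variable).
    Raises graphlib.CycleError if the rules depend on each other cyclically.
    """
    # Which rules assert each variable in their consequent?
    producers = {}
    for name, (_, consequent) in rules.items():
        producers.setdefault(consequent, set()).add(name)

    # A rule depends on every other rule that produces one of its antecedents.
    depends_on = {name: set() for name in rules}
    for name, (antecedents, _) in rules.items():
        for var in antecedents:
            depends_on[name] |= producers.get(var, set()) - {name}

    return list(TopologicalSorter(depends_on).static_order())
```

For example, with two rules that both assert a variable `risk` and a third rule that reads `risk`, the third rule is placed after both producers, so it sees the combined fuzzy set rather than a partial one.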

5.
Kai  Yong-Jin 《Computer aided design》2003,35(14):1269-1285
Many geometric optimization problems in CAD/CAM can be reduced to a maximal intersection problem on the sphere: given a set of $N$ simple spherical polygons on the unit sphere and a real number constant $L \le 2\pi$, find an arc of length $L$ on the unit sphere that intersects as many spherical polygons as possible. Past results can only solve this maximization problem for two very restricted special cases: the arc must be either a great circle or a semi-great circle. In this paper, a simple and deterministic algorithm based on domain partitioning is presented for solving this maximal arc intersection problem in the general case when the number $L$ is arbitrary. The algorithm is made possible by reducing the domain of the arcs to a continuous sub-space in $R^2$ and then establishing a quotient space partitioning in this sub-space based on a congruence relation. The number of the constituting congruent sub-regions in this quotient space partitioning is shown to have an upper bound of $O(E^3)$, where $E$ is the total number of edges on the polygons. The proposed algorithm has a worst-case upper bound of $O(ME)$ on its running time, where $M$ is an output-sensitive number and is bounded by $O(E^3)$. Examples including two realistic tests for 4-axis NC machining are presented.

6.
7.
In this paper, a distributed selectsort algorithm and a parameterized selectsort algorithm are presented for distributed systems in cases where $N \gg P$, where $N$ is the number of elements to be sorted and $P$ is the number of processors in the system. The distributed system considered in this paper uses a broadcasting channel for communication between processors. We show that the number of messages required by the parameterized selectsort algorithm is independent of $N$ and is $O(P)$, which is optimal in a distributed system with $P$ processors. Furthermore, the amount of communication required in terms of elements is $N + O(P^3)$ and the computation time complexity is $O((N/P)\lg N + P^2 \lg(N/P))$. Hence, when $N \gg P^3$, the computation time complexity is $O((N/P)\lg N)$, which is optimal using $P$ processors. In addition, the parameterized algorithm provides a parameter $K$ whose value lets us trade off among processing, memory, and communication requirements. It is shown that this parameterized algorithm can reduce the communication requirements significantly while only slightly increasing the computation requirements.

8.
Extraction of text classification rules based on fuzzy decision trees   Cited by: 8 (self-citations: 0, citations by others: 8)
王煜  王正欧 《计算机应用》2005,25(7):1634-1637
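A fuzzy decision tree text classification method with branch merging is proposed for classifying similar text classes; it can also extract fuzzy classification rules with high classification accuracy. First, an improved $\chi^2$ statistic is studied, and the feature terms of the texts are aggregated according to this improved $\chi^2$ statistic, which effectively reduces the dimensionality of the text vector space. Then a fuzzy decision tree with branch merging is used for classification, which greatly reduces the number of extracted rules. The method thus preserves the accuracy and speed of decision tree classification while extracting comprehensible fuzzy classification rules.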

9.
The work performed by a parallel algorithm is the product of its running time and the number of processors it requires. This paper presents work-efficient (or cost-optimal) routing algorithms to determine the switch settings for realizing permutations on rearrangeable symmetrical networks such as Benes and the reduced $\Omega_N \Omega_N^{-1}$. These networks have $2n-1$ stages with $N = 2^n$ inputs/outputs, each stage consisting of $N/2$ crossbar switches of size $2 \times 2$. Previously known parallel routing algorithms for a rearrangeable network with $N$ inputs determine the states of all switches recursively in $O(n)$ iterations using $N$ processors. Each iteration determines the switch settings of at most two stages of the network and requires at least $O(n)$ time on a computer of $N$ processors, regardless of the type of its interconnection network. Hence, the work of any previously known parallel routing algorithm equals at least $O(Nn^2)$ for setting up all the switches of a rearrangeable network. The new routing algorithms run on a computer of $p$ processors, $1 \le p \le N/n$, and perform work $O(Nn)$. Moreover, because the range of $p$ is large, the new routing algorithms do not have to be changed in case some processors become faulty.
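
The work comparison above is simple arithmetic, and the following minimal sketch spells it out with every hidden constant set to 1 (an assumption; the abstract gives only asymptotic bounds). The function names `work` and `compare_work` are hypothetical.

```python
def work(running_time, processors):
    """Work of a parallel algorithm: running time times processor count."""
    return running_time * processors


def compare_work(n):
    """Return (earlier_work, new_work) for a network with N = 2**n inputs.

    Earlier algorithms: n iterations of n steps each on N processors.
    New algorithms: total work N * n spread over p = N // n processors
    (the top of the range 1 <= p <= N/n), with the time rounded up.
    """
    N = 2 ** n
    earlier = work(n * n, N)
    p = max(1, N // n)
    time_new = -(-(N * n) // p)  # ceiling of N * n / p
    new = work(time_new, p)
    return earlier, new
```

Under these assumptions the earlier bound exceeds the new one by a factor of about n, which is the gap between $O(Nn^2)$ and $O(Nn)$ stated in the abstract.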

10.
CAN is a heuristic algorithm that employs an information theoretic measure to learn rules. The CAN approach distinguishes itself from other approaches by being direct: there are no intermediate representations, an induced rule is never altered in later stages, and only tests that appear in the final solution are generated. In the selection of rule conditions (tests), existing rule induction algorithms do not provide a satisfactory answer to the partitioning of the feature space of discrete feature variables with nonordered qualitative values (i.e., categorical attributes) for multiclass problems. Existing algorithms have exponential complexity in $N$, where $N$ is the number of feature values. Therefore, heuristic algorithms are employed at this step. An important contribution of this paper is to show that, in test selection within the CAN framework, optimal partitions are achieved in linear time in $N$ for the multiclass case.

11.
Real-time remote robotics-toward networked telexistence   Cited by: 2 (self-citations: 0, citations by others: 2)
Many people have longed to project themselves to a remote environment, one where they have the sensation of existing in a different place, while actually remaining where they are. Another dream involves amplifying human muscle power and sensing capabilities with machines while preserving human dexterity through a sensation of direct operation. Virtual reality must have the computer-generated environment or transmitted remote environment's essence of reality to effectively become reality for the user. One of the most promising technologies today is the integration of virtual reality and robotics on the network. The general concept is called networked robotics; in particular, we call it R³. This Japanese national R&D scheme is moving toward the realization of mutual telexistence through various kinds of networks, including the Internet. The launch of the five-year MITI "Humanoid and Human Friendly Robotics" project in April 1998 takes the first step toward the realization of R³.

12.
郝彦彬  郭晓  杨乃定 《计算机应用》2015,35(4):1030-1034
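Based on the functional dependencies among attributes, the concept of attribute information granules of an information system is proposed, and a method for computing the granular structure of a separable information system is given. First, separability of an information system is defined, and it is proved that if an information system is separable, its granular structure can be decomposed into the Cartesian product of the granular structures of its subsystems. Second, a method for determining whether an information system is separable and an algorithm for decomposing an information system are given. Finally, the complexity of the method is analyzed. The analysis shows that, compared with computing the granular structure of the information system directly, the method reduces the computational complexity from $O(2^n)$ to $O(2^{n_1} + 2^{n_2} + \cdots + 2^{n_k})$, where $n = n_1 + n_2 + \cdots + n_k$. Theoretical analysis and worked examples show that the method is feasible.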

13.
郝彦彬  郭晓  杨乃定 《计算机应用》2015,35(7):1915-1920
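To compute the attribute granular structure of inseparable information systems, a method combining divide-and-conquer with incremental computation is proposed. First, the way the attribute granular structure of an information system changes when a new functional dependency (FD) is added to its set of functional dependencies is studied, and an increment theorem for information system structure is proved. Second, some functional dependencies are removed so that the inseparable information system becomes separable, and the decomposition theorem is used to compute the structure of the separable system. Then the removed functional dependencies are added back to the separable system, and the increment theorem is used to compute the structure of the original information system. Finally, an algorithm for computing the attribute granular structure of inseparable information systems is given and its complexity is analyzed. Compared with computing the granular structure of an inseparable information system directly, the method reduces the computational complexity from $O(n \times m \times 2^n)$ to less than O(n×k×2n)(k1×m1×2n1)+O(n2×m2×2n2) (n=n1+n2, m=m1+m2). Theoretical analysis and worked examples show that the proposed method effectively reduces the computational complexity of computing the attribute granular structure of inseparable information systems.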

14.
This paper presents a new parallel algorithm for routing unicast (one-to-one) assignments in Benes networks. Parallel routing algorithms for such networks were reported earlier, but these algorithms were designed primarily to route permutation assignments. The routing algorithm presented in this paper removes this restriction without an increase in the order of routing cost or routing time. We realize this new routing algorithm on two different topologies. The algorithm routes a unicast assignment involving $O(k)$ pairs of inputs and outputs in $O(\lg^2 k + \lg n)$ time on a completely connected network of $n$ processors and in $O(\lg^4 k + \lg^2 k \lg n)$ time on an extended shuffle-exchange network of $n$ processors. Using $O(n \lg n)$ processors, the same algorithm can be pipelined to route $\alpha$ unicast assignments, each involving $O(k)$ pairs of inputs and outputs, in $O(\lg^2 k + \lg n + (\alpha - 1)\lg k)$ time on a completely connected network and in $O(\lg^4 k + \lg^2 k \lg n + (\alpha - 1)(\lg^3 k + \lg k \lg n))$ time on the extended shuffle-exchange network. These yield an average routing time of $O(\lg k)$ in the first case, and $O(\lg^3 k + \lg k \lg n)$ in the second case, for all $\alpha \ge \lg n$. These complexities indicate that the algorithm given in this paper is as fast as Nassimi and Sahni's algorithm for unicast assignments and that, with pipelining, it is faster than the same algorithm by at least a factor of $O(\lg n)$ on both topologies. Furthermore, for sparse assignments, i.e., when $k = O(1)$, it is the first algorithm which has an average routing time of $O(\lg n)$ on a topology with $O(n)$ links.

15.
We efficiently map a priority queue on the hypercube architecture in a load balanced manner, with no additional communication overhead, and present optimal parallel algorithms for performing insert and deletemin operations. Two implementations for such operations are proposed on the single port hypercube model. In a $b$-bandwidth, $n$-item priority queue in which every node contains $b$ items in sorted order, the first implementation achieves optimal speedup of $O(\min\{\log n, b \log n/\log b + \log\log n\})$ for inserting $b$ presorted items or deleting the $b$ smallest items, where $b = O(n^{1/c})$ with $c > 1$. In particular, single insertion and deletion operations are cost optimal and require $O(\log n/p + \log p)$ time using $O(\log n/\log\log n)$ processors. The second implementation is more scalable since it uses a larger number of processors, and attains a "nearly" optimal speedup on the single hypercube. Namely, the insertion of $\log n$ presorted items or the deletion of the $\log n$ smallest items is accomplished in $O(\log\log n^2)$ time using $O(\log^2 n/\log\log n)$ processors. Finally, on the slightly more powerful pipelined hypercube model, the second implementation performs $\log n$ operations in $O(\log\log n)$ time using $O(\log^2 n/\log\log n)$ processors, thus achieving an optimal speedup. To the best of our knowledge, our algorithms are the first implementations of $b$-bandwidth distributed priority queues which are load balanced and yet guarantee optimal speedups.

16.
We consider the problems of routing and sorting on a de Bruijn network. First, we show that any deterministic oblivious routing scheme for permutation routing on a $d$-ary de Bruijn network with $N = d^n$ nodes will, in the worst case, take $\Omega(\sqrt{N})$ steps under the single-port model. This improves the existing lower bounds provided $d$ is not a constant. We also show that the lower bound is indeed a tight one. Second, we present a deterministic nonoblivious permutation routing algorithm which runs in $O(d \cdot n^2)$ time on a $d$-ary de Bruijn network with $N = d^n$ nodes. This algorithm is currently the fastest known nonoblivious deterministic routing algorithm for de Bruijn networks of arbitrary degree. Finally, we present an efficient general sorting algorithm for de Bruijn networks of arbitrary degree. This algorithm is the best sorting algorithm known so far. It runs in $O((\log d) \cdot d \cdot n^2)$ time for a directed de Bruijn network with $d^n$ nodes, degree $d$, and diameter $n$. As a corollary, we show that on a binary de Bruijn network of $N$ nodes, our sorting scheme requires at most $2 \log^2 N$ steps.

17.
Parallel clustering algorithms   Cited by: 3 (self-citations: 0, citations by others: 3)
Clustering techniques play an important role in exploratory pattern analysis, unsupervised learning and image segmentation applications. Many clustering algorithms, both partitional clustering and hierarchical clustering, require intensive computation, even for a modest number of patterns. This paper presents two parallel clustering algorithms. For a clustering problem with $N = 2^n$ patterns and $M = 2^m$ features, the time complexity of the traditional partitional clustering algorithm on a single processor computer is $O(MNK)$, where $K$ is the number of clusters. The proposed algorithm on an SIMD computer with $MN$ processors has a time complexity of $O(K(n + m))$. The time complexity of the proposed single-link hierarchical clustering algorithm is reduced from $O(MN^2)$ for the uniprocessor algorithm to $O(nN)$ with $MN$ processors.
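
To show where the $O(MNK)$ uniprocessor cost comes from, here is a minimal sketch of one pass of a k-means-style partitional clustering. The abstract does not name the specific partitional algorithm, so the use of k-means with squared Euclidean distance is an assumption, and the function name `kmeans_pass` is hypothetical.

```python
def kmeans_pass(patterns, centers):
    """One assignment-and-update pass over N patterns, K centers, M features.

    Returns (labels, new_centers). The nested loops over patterns, centers,
    and features are what give the O(M * N * K) cost per pass.
    """
    labels = []
    for x in patterns:                                        # N patterns
        dists = [sum((xi - ci) ** 2 for xi, ci in zip(x, c))  # M features
                 for c in centers]                            # K centers
        labels.append(dists.index(min(dists)))

    new_centers = []
    for k, c in enumerate(centers):
        members = [x for x, lab in zip(patterns, labels) if lab == k]
        if members:
            # New center is the per-feature mean of the cluster's members.
            new_centers.append([sum(col) / len(members) for col in zip(*members)])
        else:
            new_centers.append(list(c))  # an empty cluster keeps its old center
    return labels, new_centers
```

With one processor per (pattern, feature) pair, the per-center distance sums and the nearest-center comparisons become reductions over M and N values, which is consistent with the $O(K(n + m))$ bound quoted for $MN$ processors.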

18.
This paper presents the design of a VLSI fuzzy processor which is capable of dealing with complex fuzzy inference systems, i.e., fuzzy inferences that include rule chaining. The architecture of the processor is based on a computational model whose main features are: the capability to cope effectively with complex fuzzy inference systems; a phase that detects the rules with a positive degree of activation, to reduce the number of rules to be processed per inference; parallel computation of the degree of activation of active rules; and representation of membership functions based on α-level sets. As the fuzzy inference can be divided into different processing phases, the processor is made up of a number of pipelined stages. In each stage, several inference processing phases are performed in parallel. Its performance is on the order of 2 MFLIPS with 256 rules, eight inputs, two chained variables, and four outputs, and 5.2 MFLIPS with 32 rules, three inputs, and one output, at a clock frequency of 66 MHz.

19.
20.
In Georgiou and Smith (1992), the following question was raised: Consider a linear, shift-invariant system on $L^2[0, \infty)$. Let the graph of the system have Fourier transform $\binom{M}{N} H^2$ (i.e., the system has a transfer function $P = N/M$), where $M$, $N$ are elements of $C_A = \{f \in H^\infty : f \text{ is continuous on the compactified right-half plane}\}$. Is it possible to normalize $M$ and $N$ (i.e., to ensure $|M|^2 + |N|^2 = 1$) in $C_A$? The author shows by example that this is not always possible.
