1.

Space-Time Equations for Non-Unimodular Mappings

《International Journal of Computer Mathematics》2012,89(5):555-572

The class of systems of uniform recurrence equations (UREs) is closed under unimodular transformations. As a result, every systolic array described by a unimodular mapping can be specified by a system of space-time UREs, in which the time and space coordinates are made explicit. As non-unimodular mappings are frequently used in systolic designs, this paper presents a method that derives space-time equations for systolic arrays described by non-unimodular mappings. The space-time equations for non-unimodular mappings are known elsewhere as sparse UREs (SUREs) because the domains of their variables are sparse and their data dependences are uniform. Our method is compositional in that space-time SUREs can be further transformed by unimodular and non-unimodular mappings, allowing a straightforward implementation in systems like ALPHA. Specifying a systolic design by space-time equations has two advantages. First, the space-time equations exhibit all useful properties about the design, allowing the design to be formally verified. Second, depending on the application area and performance requirement, the space-time equations can be realised as custom VLSI systems, FPGAs, or programs to be run on a parallel computer.
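
A minimal sketch of the unimodular versus non-unimodular distinction that this abstract relies on, assuming the standard definition (an integer matrix whose determinant is +1 or -1). The helper names `det`, `is_unimodular` and `image` are illustrative and do not come from the paper.

```python
from itertools import product


def det(m):
    """Exact integer determinant by Laplace expansion (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total


def is_unimodular(m):
    """An integer matrix is unimodular when its determinant is +1 or -1."""
    return abs(det(m)) == 1


def image(m, bounds):
    """Image of the integer box [0, b_0) x [0, b_1) x ... under the matrix m."""
    points = product(*(range(b) for b in bounds))
    return sorted(tuple(sum(a * x for a, x in zip(row, p)) for row in m) for p in points)


skew = [[1, 1], [0, 1]]    # determinant 1: unimodular
scale = [[2, 0], [0, 1]]   # determinant 2: non-unimodular

print(is_unimodular(skew), is_unimodular(scale))
print(image(scale, (3, 2)))  # only even first coordinates occur: a sparse domain
```

Under the `scale` mapping, every second lattice point in the first coordinate is never hit, which is the sparsity that the term SURE refers to.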

2.

Generation of injective and reversible modular mappings

Hyuk Jae Lee Fortes J.A.B. 《Parallel and Distributed Systems, IEEE Transactions on》2003,14(1):1-12

A modular mapping consists of a linear transformation followed by modulo operations. It is characterized by a transformation matrix and a vector of moduli, called the modulus vector. Modular mappings are useful to derive parallel versions of algorithms with commutative operations and algorithms intended for execution on processor arrays with toroidal networks. In order to preserve algorithm correctness, modular mappings must be injective. Results of previous work characterize injective modular mappings of rectangular index sets. This paper provides a technique to generate modular mappings that satisfy these injective conditions and extends the results to general index sets. For an n-dimensional rectangular index set, the technique has O(n^2 n!) complexity. To facilitate generation of efficient code, modular mappings must also be reversible (i.e., have easily described inverses). An O(n^2) method is provided to generate reversible modular mappings. This method reduces the search space by fixing entries of the modulus vector while attempting to minimize the number of entries to exclude few solutions. For general index sets defined by linear inequalities, injectivity can be checked by formulating and solving a set of linear inequalities. A modified Fourier-Motzkin elimination is proposed to solve these inequalities. To generate an injective modular mapping of an index set defined by linear inequalities, this paper proposes a technique that attempts to minimize the values of the entries of the modulus vector. Several examples are provided to illustrate the application of the above-mentioned methods, including the case of BLAS routines.
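
The definition in this abstract (a linear transformation followed by componentwise modulo operations) can be illustrated with a brute-force injectivity check on a small rectangular index set. This is only a sketch: it enumerates every index point, whereas the paper describes analytical conditions and generation techniques. The function names and example matrices are assumptions made for illustration.

```python
from itertools import product


def modular_map(T, moduli, point):
    """Apply the rows of T to the point, then reduce each result by its modulus."""
    return tuple(
        sum(t * x for t, x in zip(row, point)) % m
        for row, m in zip(T, moduli)
    )


def is_injective(T, moduli, bounds):
    """Brute-force check over the rectangular index set [0, b_0) x [0, b_1) x ..."""
    seen = set()
    for p in product(*(range(b) for b in bounds)):
        img = modular_map(T, moduli, p)
        if img in seen:
            return False  # two index points collide on the same image
        seen.add(img)
    return True


# (i, j) -> ((i + j) mod 4, j mod 4): a skewed, torus-style placement
print(is_injective([[1, 1], [0, 1]], (4, 4), (4, 4)))
# (i, j) -> ((i + j) mod 4, (i + j) mod 4): (0, 1) and (1, 0) collide
print(is_injective([[1, 1], [1, 1]], (4, 4), (4, 4)))
```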

3.

Index Set Splitting

Martin Griebl Paul Feautrier Christian Lengauer 《International journal of parallel programming》2000,28(6):607-631

There are many algorithms for the space-time mapping of nested loops. Some of them even make the optimal choices within their framework. We propose a preprocessing phase for algorithms in the polytope model, which extends the model and yields space-time mappings whose schedule is, in some cases, orders of magnitude faster. These are cases in which the dependence graph has small irregularities. The basic idea is to split the index set of the loop nests into parts with a regular dependence structure and apply the existing space-time mapping algorithms to these parts individually. This work is based on a seminal idea in the more limited context of loop parallelization at the code level. We elevate the idea to the model level (our model is the polytope model), which increases its applicability by providing a clearer and wider range of choices at an acceptable analysis cost. Index set splitting is one facet in the effort to extend the power of the polytope model and to enable the generation of competitive target code.

4.

Algorithm transformations for the data broadcast elimination method

《International Journal of Computer Mathematics》2012,89(4):433-461

VLSI technology has had tremendous success in revolutionizing computer design with processor arrays. Local communication and interconnection is a constraint that dictates the design of processor arrays. The shared bus and global access to memory are now no longer used, since they lower the speed. Consequently, parallel algorithms must be designed according to these constraints.

One of the problems that must be resolved for the above mentioned constraints is data broadcast elimination. Algorithms must be transformed into a form that uses data propagation instead of data broadcast.

Here systems of affine recurrence equations are analyzed, and data broadcast is defined in the context of the definition of data dependence and affine recurrence equations. The method for data broadcast elimination introduced in [1] expands the system of affine recurrence equations into new recurrence equations that define data propagation, and eliminates the data dependences where data broadcast occurs.

Parallel algorithms are usually given as a set of similar tasks repetitively performed on different data. The iteration form of presenting the algorithms is most common. Several techniques are introduced to transform the algorithm to a single assignment form of recurrence equations.

Some improvements of these techniques are presented to make the application of the data broadcast elimination method easier and more straightforward. The presented techniques are classified as the transformation of iterative algorithms to a recurrence form, the transformation of recurrence form to a single assignment form, and fulfilling the index forms of the algorithms.

A system of affine recurrence equations with the data broadcast property is always obtained by applying these procedures. The method of data broadcast elimination successfully transforms this system of affine recurrence equations into a system of uniform recurrence equations which can be used for parallel implementation on VLSI processor arrays.

5.

DNA and quantum based algorithms for VLSI circuits testing

Amardeep Singh Lalit M. Bharadwaj Singh Harpreet 《Natural computing》2005,4(1):53-72

Testing of VLSI circuits is still an NP-hard problem. Existing conventional methods are unable to achieve the required breakthrough in terms of complexity, time and cost. This paper deals with testing VLSI circuits using natural computing methods. Two prototypical algorithms, named DATPG and QATPG, are developed utilizing the properties of DNA computing and quantum computing, respectively. The effectiveness of these algorithms in terms of result quality, CPU requirements, fault detection and number of iterations is experimentally compared with some existing classical approaches such as exhaustive search and genetic algorithms. The algorithms developed are so efficient that they require only N (where N is the total number of vectors) iterations to find the desired test vector whereas in classical computing, it takes N/2 iterations. The extensibility of the new approach enables users to easily find the test vector for VLSI circuits, and it can be adapted for testing VLSI chips.

6.

Predicate Matrix: automatically extending the semantic interoperability between predicate resources

Maddalen Lopez de Lacalle Egoitz Laparra Itziar Aldabe German Rigau 《Language Resources and Evaluation》2016,50(2):263-289

This paper presents a novel approach to improve the interoperability between four semantic resources that incorporate predicate information. Our proposal defines a set of automatic methods for mapping the semantic knowledge included in WordNet, VerbNet, PropBank and FrameNet. We use advanced graph-based word sense disambiguation algorithms and corpus alignment methods to automatically establish the appropriate mappings among their lexical entries and roles. We study different settings for each method using SemLink as a gold-standard for evaluation. The results show that the new approach provides productive and reliable mappings. In fact, the mappings obtained automatically outnumber the set of original mappings in SemLink. Finally, we also present a new version of the Predicate Matrix, a lexical-semantic resource resulting from the integration of the mappings obtained by our automatic methods and SemLink.

7.

Mapping algorithms in ispahan

E.S. Gelsema G. Eden 《Pattern recognition》1980,12(3):127-136

A number of mapping operations in pattern recognition are reviewed. Two families of new mapping algorithms are defined. The performance of all algorithms is illustrated using a collection of 100 feature vectors obtained from images of two classes of white blood cells. It is emphasized that the interactive pattern recognition system ISPAHAN is well suited to find optimal decision functions based on such mappings.

8.

A Maple package for improved global mapping forecast

H. Carli L.G.S. Duarte L.A.C.P. da Mota 《Computer Physics Communications》2014

We present a Maple implementation of the well known global approach to time series analysis and some further developments designed to improve the computational efficiency of the forecasting capabilities of the approach. This global approach can be summarized as being a reconstruction of the phase space, based on a time ordered series of data obtained from the system. After that, using the reconstructed vectors, a portion of this space is used to produce a mapping, a polynomial fitting, through a minimization procedure, that represents the system and can be employed to forecast further entries for the series. In the present implementation, we introduce a set of commands, tools, in order to perform all these tasks. For example, the command VecTS deals mainly with the reconstruction of the vector in the phase space. The command GfiTS deals with producing the minimization and the fitting. ForecasTS uses all these and produces the prediction of the next entries. For the non-standard algorithms, we here present two commands: IforecasTS and NiforecasTS that, respectively deal with the one-step and the N-step forecasting. Finally, we introduce two further tools to aid the forecasting. The commands GfiTS and AnalysTS, basically, perform an analysis of the behavior of each portion of a series regarding the settings used on the commands just mentioned above.
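
The first step this abstract describes, reconstructing phase-space vectors from a time-ordered series, is commonly done by delay embedding. The sketch below shows that step only, in Python rather than Maple. The function `delay_vectors` and its parameters `dim` and `lag` are assumptions made for illustration and are not the package's VecTS command.

```python
def delay_vectors(series, dim, lag=1):
    """Build delay vectors (x[i], x[i + lag], ..., x[i + (dim - 1) * lag])."""
    span = (dim - 1) * lag  # distance between the first and last component
    return [
        tuple(series[i + k * lag] for k in range(dim))
        for i in range(len(series) - span)
    ]


series = [0, 1, 2, 3, 4]
print(delay_vectors(series, dim=3))          # consecutive triples
print(delay_vectors(series, dim=2, lag=2))   # pairs two steps apart
```

A global polynomial map fitted to such vectors by least squares would then supply the forecasting step; that part is omitted here.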

9.

On time mapping of uniform dependence algorithms into lower dimensional processor arrays

Shang W. Fortes J.A.B. 《Parallel and Distributed Systems, IEEE Transactions on》1992,3(3):350-363

Most existing methods of mapping algorithms into processor arrays are restricted to the case where n-dimensional algorithms, or algorithms with n nested loops, are mapped into (n-1)-dimensional arrays. However, in practice, it is interesting to map n-dimensional algorithms into (k-1)-dimensional arrays where k<n. A computational conflict occurs if two or more computations of an algorithm are mapped into the same execution time. Based on the Hermite normal form of the mapping matrix, necessary and sufficient conditions are derived to identify mapping without computational conflicts. These conditions are used to find time mappings of n-dimensional algorithms into (k-1)-dimensional arrays, k<n, without computational conflicts. For some applications, the mapping is time-optimal.

10.

Parallelism detection and transformation techniques useful for VLSI algorithms

J.A.B. Fortes D.I. Moldovan 《Journal of Parallel and Distributed Computing》1985,2(3):277-301

11.

Algorithms for a class of infinite permutation groups

Stefan Kohl 《Journal of Symbolic Computation》2008

Motivated by the famous \(3n+1\) conjecture, we call a mapping from \(\mathbb{Z}\) to \(\mathbb{Z}\) residue-class-wise affine if there is a positive integer \(m\) such that it is affine on residue classes (mod \(m\)). This article describes a collection of algorithms and methods for computation in permutation groups and monoids formed by residue-class-wise affine mappings.
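
As an illustration of the definition, the mapping behind the 3n+1 conjecture is residue-class-wise affine with modulus 2: it sends n to n/2 on even n and to (3n+1)/2 on odd n. The sketch below stores one affine piece per residue class. The representation (a modulus plus a list of `(a, b, c)` triples meaning n -> (a*n + b)/c) and the names `rcwa_apply`, `COLLATZ` and `orbit` are assumptions chosen for this example, not the article's data structures.

```python
def rcwa_apply(mapping, n):
    """Apply a residue-class-wise affine mapping, given as (modulus, pieces)."""
    modulus, pieces = mapping
    a, b, c = pieces[n % modulus]  # affine piece for the residue class of n
    value = a * n + b
    if value % c != 0:
        raise ValueError("mapping does not send this integer to an integer")
    return value // c


# n -> n / 2 for even n, n -> (3n + 1) / 2 for odd n
COLLATZ = (2, [(1, 0, 2), (3, 1, 2)])


def orbit(mapping, start, stop=1, limit=1000):
    """Iterate the mapping from start until stop is reached (or limit steps)."""
    points = [start]
    while points[-1] != stop and len(points) <= limit:
        points.append(rcwa_apply(mapping, points[-1]))
    return points


print(orbit(COLLATZ, 7))
```

This particular mapping is not injective (both 1 and 4 are sent to 2), so it belongs to a monoid of such mappings rather than to a permutation group.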

12.

Set oriented mappings on neural networks

R. K. Brouwer W. Pedrycz 《Soft Computing - A Fusion of Foundations, Methodologies and Applications》2003,8(1):28-37

Multi-layer perceptrons (MLP) or feed-forward neural networks (FFNN) are generally used to represent many-to-one (m-o) mappings from \(\Re^{n}\) to co-domain \(\Re^{m}\). Input units distribute real values to hidden layer units and individual output units produce values in \(\Re\). Thus MLPs represent or simulate the mappings of functions where the range consists of vectors and ordered lists. However it is also useful to represent mappings where the range consists of elements that are sets or collections (bags) of vectors of real values. The question answered in this paper is “Can an MLP be trained and used to represent a mapping from vectors of real values to collections of vectors of real values?”. Representing mappings from vectors to sets of real numbers or vectors of real numbers has a useful application that is of interest since a one-to-many (o-m) mapping from \(\Re^{n}\) to co-domain \(\Re^{m}\) is equivalent to an m-o mapping from \(\Re^{n}\) to co-domain P(\(\Re^{m}\)) where P(\(\Re^{m}\)) is the power set of \(\Re^{m}\). The paper describes a gradient descent training algorithm that successfully stores a mapping from vectors to sets, and thereby a one-to-many mapping, on a feed-forward network requiring a relatively small number of training epochs. The method is tried on two one-to-many relationships. One is obtained from the inverse of a function and the other is a relationship that maps ages of parents to ages of their children. The method is readily extended to representing mappings to fuzzy sets.

13.

Time-optimal visibility-related algorithms on meshes with multiple broadcasting

Bhagavathi D. Bokka V.V. Gurla H. Olariu S. Schwing J.L. Stojmenovic I. Zhang J. 《Parallel and Distributed Systems, IEEE Transactions on》1995,6(7):687-703

Given a collection of objects in the plane along with a viewpoint ω, the visibility problem involves determining the portion of each object that is visible to an observer positioned at ω. The visibility problem is central to various application areas including computer graphics, image processing, VLSI design, and robot navigation, among many others. The main contribution of this work is to provide time-optimal solutions to this problem for several classes of objects, namely ordered line segments, disks, and iso-oriented rectangles in the plane. In addition, our visibility algorithm for line segments is at the heart of time-optimal solutions for determining, for each element in a given sequence of real numbers, the position of the nearest larger element within that sequence, triangulating a set of points in the plane, determining the visibility pairs among a set of vertical line segments, and constructing the dominance and visibility graphs of a set of iso-oriented rectangles in the plane. All the algorithms in this paper involve an input of size n and run in O(log n) time on a mesh with multiple broadcasting of size n×n. This is the first instance of time-optimal solutions for these problems on this architecture.

14.

Analogue VLSI primitives for perceptual tasks in machine vision

G. M. Bisio L. Raffo S. P. Sabatini 《Neural computing & applications》1998,7(3):216-228

A variety of computational tasks in early vision can be formulated through lattice networks. The cooperative action of these networks depends upon the topology of interconnections, both feedforward and recurrent ones. The Gabor-like impulse response of a 2nd-order lattice network (i.e. with nearest and next-to-nearest interconnections) is analysed in detail, pointing out how a near-optimal filtering behaviour in space and frequency domains can be achieved through excitatory/inhibitory interactions without impairing the stability of the system. These architectures can be mapped, very efficiently at transistor level, on VLSI structures operating as analogue perceptual engines. The hardware implementation of early vision tasks can, indeed, be tackled by combining these perceptual agents through suitable weighted sums. Various implementation strategies have been pursued with reference to: (i) the algorithm-circuit mapping (current-mode and transconductor approaches); (ii) the degree of programmability (fixed, selectable and tunable); and (iii) the implementation technology (2 and 0.8 gate lengths). Applications of the perceptual engine to machine vision algorithms are discussed.

15.

How Fast Is the k-Means Method?

Sariel Har-Peled Bardia Sadri 《Algorithmica》2005,41(3):185-202

We present polynomial upper and lower bounds on the number of iterations performed by the k-means method (a.k.a. Lloyd's method) for k-means clustering. Our upper bounds are polynomial in the number of points, number of clusters, and the spread of the point set. We also present a lower bound, showing that in the worst case the k-means heuristic needs to perform Ω(n) iterations, for n points on the real line and two centers. Surprisingly, the spread of the point set in this construction is polynomial. This is the first construction showing that the k-means heuristic requires more than a polylogarithmic number of iterations. Furthermore, we present two alternative algorithms, with guaranteed performance, which are simple variants of the k-means method. Results of our experimental studies on these algorithms are also presented.
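
The quantity bounded in this abstract is the number of rounds the k-means method runs before the centers stop moving. A minimal sketch for points on the real line follows, assuming the usual formulation (assign each point to its nearest center, then move each center to the mean of its cluster). The function name `lloyd_1d`, the tie-breaking rule and the sample data are illustrative and are not the paper's lower-bound construction.

```python
def lloyd_1d(points, centers, max_iter=1000):
    """Run the k-means method on the real line; return (centers, iterations)."""
    centers = list(centers)
    for iteration in range(1, max_iter + 1):
        # Assignment step: each point joins its nearest center (ties go to the lower index).
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster (empty clusters stay put).
        new_centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            return centers, iteration  # no center moved in this round
        centers = new_centers
    return centers, max_iter


final_centers, rounds = lloyd_1d([0, 1, 2, 10, 11, 12], [0, 1])
print(final_centers, rounds)
```

The returned count includes the final round in which nothing changes, so counting conventions may differ by one from those used in the paper.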

16.

On the existence of affine invariant descent directions

Yu-Hong Dai Felix Lieder 《Optimization methods & software》2020,35(5):938-954

This paper begins with a brief review of affine invariance and its significance for iterative algorithms. It then explores the existence of affine invariant descent directions for unconstrained minimization. While there may exist several affine invariant descent directions for smooth functions at a given point, it is shown that for quadratic functions, there exists exactly one invariant descent direction in the strictly convex case and generally none in the case where the Hessian is singular or indefinite. These results can be generalized to smooth nonlinear functions and have implications regarding the initialization of minimization algorithms. They stand in contrast to recent works on constrained convex and nonconvex optimization for which there may exist an affine invariant ‘frame’ that depends on the feasible set and that can be used to define an affine invariant descent direction.

17.

Motion mapping and mode decision for MPEG-2 to H.264/AVC transcoding

Jun Xin Jianjun Li Anthony Vetro Shun-ichi Sekiguchi 《Multimedia Tools and Applications》2007,35(2):203-223

This paper describes novel transcoding techniques aimed for low-complexity MPEG-2 to H.264/AVC transcoding. An important application for this type of conversion is efficient storage of broadcast video in consumer devices. The architecture for such a system is presented, which includes novel motion mapping and mode decision algorithms. For the motion mapping, two algorithms are presented. Both efficiently map incoming MPEG-2 motion vectors to outgoing H.264/AVC motion vectors regardless of the block sizes that the motion vectors correspond to. In addition, the algorithm maps motion vectors to different reference pictures, which is useful for picture type conversion and prediction from multiple reference pictures. We also propose an efficient rate-distortion optimised macroblock coding mode decision algorithm, which first evaluates candidate modes based on a simple cost function so that a reduced set of candidate modes is formed, then based on this reduced set, we evaluate the more complex Lagrangian cost calculation to determine the coding mode. Extensive simulation results show that our proposed transcoder incorporating the proposed algorithms achieves very good rate-distortion performance with low complexity. Compared with the cascaded decoder-encoder solution, the coding efficiency is maintained while the complexity is significantly reduced.

18.

A Subdivision Scheme for Continuous-Scale B-Splines and Affine-Invariant Progressive Smoothing

Guillermo Sapiro Albert Cohen Alfred M. Bruckstein 《Journal of Mathematical Imaging and Vision》1997,7(1):23-40

Multiscale representations and progressive smoothing constitute an important topic in different fields as computer vision, CAGD, and image processing. In this work, a multiscale representation of planar shapes is first described. The approach is based on computing classical B-splines of increasing orders, and therefore is automatically affine invariant. The resulting representation satisfies basic scale-space properties at least in a qualitative form, and is simple to implement. The representation obtained in this way is discrete in scale, since classical B-splines are functions in , where k is an integer bigger or equal than two. We present a subdivision scheme for the computation of B-splines of finite support at continuous scales. With this scheme, B-splines representations in are obtained for any real r in [0, ), and the multiscale representation is extended to continuous scale. The proposed progressive smoothing receives a discrete set of points as initial shape, while the smoothed curves are represented by continuous (analytical) functions, allowing a straightforward computation of geometric characteristics of the shape.

19.

Conjugate conflict continuation graphs for multi-layer constrained via minimization

Rung-Bin Lin Shu-Yu Chen 《Information Sciences》2007,177(12):2436-2447

A graph model for describing the relationships among wire segments is crucial to constrained via minimization (CVM) in a VLSI design. In this paper we present a new graph model, called the conjugate conflict continuation graph, for multi-layer CVM with stacked vias. This graph model eases the handling of stacked via problems. An integer linear programming (ILP) formulation and a simulated annealing (SA) algorithm based on this graph model are developed to solve multi-layer CVM. The ILP model is too complicated to solve efficiently. The SA algorithm on average achieves 6.4% via reduction for layouts obtained using a commercial tool under a set of practical constraints in which the metal wires (including pins) used in cell layouts, power rails and rings, and clock routing are treated as obstacles or fixed-layer objects to a multi-layer CVM.

20.

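An improved eugenic genetic algorithm for effectively generating parameters of frieze group chaotic attractors

Chen Ning Jin Hua Wang Fengying 《小型微型计算机系统》 (Journal of Chinese Computer Systems) 2006,27(1):93-96

This paper applies the Monte Carlo search method and the eugenic genetic algorithm to construct chaotic attractors for the frieze group equivalent mapping models p112 and p1a1, and proposes an improved eugenic genetic algorithm to address the "genetic drift" phenomenon. The study shows that, by introducing a spatial distance constraint in the parameter space, a set of offspring parameter vectors free of duplicates can be found starting from the parameter vectors of the initial population. The evolving population likewise contains no duplicate chaotic attractor parameter vectors, which avoids the "genetic drift" that appears in the population under the original eugenic genetic algorithm. The new algorithm keeps updating the parameters in the population without repetition. Using the updated population, it can continually search the parameter space for chaotic attractor parameter vectors whose graphical structures do not repeat, which solves the original algorithm's inability to keep generating new chaotic attractor parameter vectors effectively.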