Similar Articles
 20 similar articles found (search time: 15 ms)
1.
This paper describes a transformation technique for out-of-core programs that combines two compiler optimizations, loop transformation and data transformation, to improve program locality and data-access efficiency. The method can optimize not only a single loop nest but also several loop nests at once. Experimental results show that the technique significantly improves the performance of out-of-core computations.
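The abstract above gives no code; as a purely illustrative sketch (array size, tile size, and file path are made up, and numpy's memmap stands in for a real out-of-core runtime), the following Python fragment shows the general idea of pairing a loop transformation (tiling) with a data transformation (bringing one contiguous block in-core at a time) to improve locality.

```python
import numpy as np
import tempfile, os

# Hypothetical out-of-core matrix stored on disk (sizes are illustrative).
N = 2048
path = os.path.join(tempfile.gettempdir(), "ooc_matrix.dat")
a = np.memmap(path, dtype=np.float64, mode="w+", shape=(N, N))
a[:] = 1.0

# A naive column-major traversal would touch the file with large strides
# (poor locality for a row-major on-disk layout):
#   for j in range(N):
#       for i in range(N):
#           total += a[i, j]

# Tiled (loop-transformed) traversal reads one contiguous block of rows
# at a time, so each disk page is fetched once and then reused.
TILE = 256
total = 0.0
for i0 in range(0, N, TILE):
    block = np.asarray(a[i0:i0 + TILE, :])  # data transformation: bring a tile in-core
    total += block.sum()                    # all further accesses hit in-core data

print(total)
```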

2.
3.
Loop fusion improves data locality and reduces synchronization in data-parallel applications. However, loop fusion is not always legal. Even when legal, fusion may introduce loop-carried dependences which prevent parallelism. In addition, performance losses result from cache conflicts in fused loops. In this paper, we present new techniques to: (1) allow fusion of loop nests in the presence of fusion-preventing dependences, (2) maintain parallelism and allow the parallel execution of fused loops with minimal synchronization, and (3) eliminate cache conflicts in fused loops. We describe algorithms for implementing these techniques in compilers. The techniques are evaluated on a 56-processor KSR2 multiprocessor and on an 18-processor Convex SPP-1000 multiprocessor. The results demonstrate performance improvements for both kernels and complete applications. The results also indicate that careful evaluation of the profitability of fusion is necessary as more processors are used.
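As a minimal sketch of the basic transformation (not the paper's fusion algorithm), the fragment below fuses two loops into one, halving the number of passes over memory; in a data-parallel setting the fused loop also removes the barrier between the two original loops. Fusion is legal here because no loop-carried dependence is introduced.

```python
import numpy as np

n = 100_000
a = np.random.rand(n)

# Unfused version: two separate loops (two passes over memory and, in a
# data-parallel setting, a synchronization barrier between them).
b = np.empty(n)
c = np.empty(n)
for i in range(n):
    b[i] = 2.0 * a[i]
for i in range(n):
    c[i] = b[i] + a[i]

# Fused version: one loop, one pass over memory, no intermediate barrier.
# c[i] depends only on b[i] from the same iteration, so fusion is legal.
b2 = np.empty(n)
c2 = np.empty(n)
for i in range(n):
    b2[i] = 2.0 * a[i]
    c2[i] = b2[i] + a[i]

assert np.allclose(c, c2)
```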

4.
5.
Prof. D. Heller, Computing, 1979, 22(2): 101–118
The parallel evaluation of A_N = a_1 ∘ a_2 ∘ ⋯ ∘ a_N, where ∘ is a binary associative operation, is studied. Under an idealized model of parallel computation, the minimal number of parallel processors required to compute A_N in at most t steps is determined for ⌈log₂ N⌉ ≤ t ≤ N − 1. This indicates that it is not always desirable to reduce the running time to an absolute minimum, and provides a lower bound on the processing power required for time-constrained evaluation of general arithmetic expressions. Results for two-input processors are generalized to b-input processors, and then to non-homogeneous collections of processors. The latter case does not have a closed-form solution, so approximations are analyzed.
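To make the time/processor trade-off concrete, here is a small Python sketch that simulates a naive greedy tree reduction with a limited number of two-input processors and counts parallel steps. The scheduling rule is only an illustration of the trade-off, not Heller's construction.

```python
import math

def parallel_steps(n, p):
    """Parallel steps to combine n operands with an associative binary op,
    using at most p two-input processors and a naive greedy schedule
    (each step performs up to p pairwise combinations)."""
    steps = 0
    while n > 1:
        pairs = min(n // 2, p)  # combinations performed in this step
        n -= pairs              # each combination replaces two operands by one
        steps += 1
    return steps

N = 100
print(parallel_steps(N, 1))        # 99 steps: sequential, t = N - 1
print(parallel_steps(N, N // 2))   # 7 steps: matches the ceil(log2 N) bound here
print(parallel_steps(N, 8))        # an intermediate point on the trade-off curve
print(math.ceil(math.log2(N)))     # lower bound on t for any number of processors
```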

6.
The Constraint Satisfaction Problem (CSP) involves finding values for variables that satisfy a set of constraints. Consistency checking is the key technique in solving this class of problems. Past research has developed many algorithms for this purpose, e.g., node consistency, arc consistency, generalized node and arc consistency, and specific methods for checking specific constraints. In this article, an attempt is made to unify these algorithms into a common framework. This framework consists of two parts. The first part is a generic consistency check algorithm, which allows and encourages each individual constraint to be checked by its own specific consistency methods. Such an approach provides a direct way to implement the CSP model for practical problem-solving. The second part is a general schema for describing the handling of each type of constraint. The schema characterizes various issues of constraint handling in constraint satisfaction, and provides a common language for expressing, discussing, and exchanging constraint-handling techniques.
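In the spirit of such a framework (a generic loop that delegates pruning to constraint-specific checkers), here is a hedged Python sketch; the worklist policy and the example x &lt; y rule are illustrative, not the article's schema.

```python
def propagate(domains, constraints):
    """Generic consistency loop. 'constraints' maps a name to (scope, revise),
    where revise(domains, scope) prunes domains in place using a
    constraint-specific method and returns the set of changed variables."""
    queue = list(constraints)
    while queue:
        name = queue.pop()
        scope, revise = constraints[name]
        changed = revise(domains, scope)
        if any(len(domains[v]) == 0 for v in changed):
            return False  # domain wipe-out: inconsistent
        # Re-check every other constraint that mentions a changed variable.
        for other, (sc, _) in constraints.items():
            if other != name and set(sc) & changed and other not in queue:
                queue.append(other)
    return True

# Constraint-specific revise for x < y (a simple arc-consistency rule).
def revise_less(domains, scope):
    x, y = scope
    changed = set()
    new_x = {a for a in domains[x] if any(a < b for b in domains[y])}
    new_y = {b for b in domains[y] if any(a < b for a in domains[x])}
    if new_x != domains[x]:
        domains[x] = new_x
        changed.add(x)
    if new_y != domains[y]:
        domains[y] = new_y
        changed.add(y)
    return changed

doms = {"x": {1, 2, 3}, "y": {1, 2, 3}}
print(propagate(doms, {"x<y": (("x", "y"), revise_less)}), doms)
# Expected: True {'x': {1, 2}, 'y': {2, 3}}
```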

7.
In this paper, a unified framework for multimodal content retrieval is presented. The proposed framework supports retrieval of rich media objects as unified sets of different modalities (image, audio, 3D, video and text) by efficiently combining all monomodal heterogeneous similarities into a single global similarity according to an automatic weighting scheme. A multimodal space is then constructed to capture the semantic correlations among the modalities. In contrast to existing techniques, the proposed method can also handle external multimodal queries by embedding them into the already constructed multimodal space through a submanifold-analysis space-mapping procedure. In experiments on five real multimodal datasets, we show the superiority of the proposed approach over competitive methods.
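The step of fusing monomodal similarities into one global score can be sketched as below; the min-max normalization and the fixed example weights are illustrative stand-ins for the paper's automatic weighting scheme.

```python
import numpy as np

def combine_similarities(sims, weights=None):
    """sims: dict modality -> (n_queries x n_objects) similarity matrix.
    Each matrix is min-max normalized, then combined with per-modality
    weights (uniform by default) into one global similarity matrix."""
    mats = {}
    for m, s in sims.items():
        s = np.asarray(s, dtype=float)
        rng = s.max() - s.min()
        mats[m] = (s - s.min()) / rng if rng > 0 else np.zeros_like(s)
    if weights is None:
        weights = {m: 1.0 / len(mats) for m in mats}
    total = sum(weights.values())
    return sum(weights[m] * mats[m] for m in mats) / total

sims = {
    "image": [[0.9, 0.2, 0.4]],
    "text":  [[10.0, 30.0, 20.0]],  # different scale: normalization matters
    "audio": [[0.1, 0.5, 0.3]],
}
global_sim = combine_similarities(sims, weights={"image": 0.5, "text": 0.3, "audio": 0.2})
print(global_sim, global_sim.argmax())  # ranking over the unified score
```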

8.
Knowledge patterns, such as association rules, clusters or decision trees, can be defined as concise and relevant information that can be extracted, stored, analyzed, and manipulated by knowledge workers in order to drive and specialize business decision processes. In this paper we deal with data mining patterns. The ability to manipulate different types of patterns under a unified environment is becoming a fundamental issue for any 'intelligent' and data-intensive application. However, approaches proposed so far for pattern management usually deal with specific and predefined types of patterns and mainly concern pattern extraction and exchange issues. Issues concerning the integrated, advanced management of heterogeneous patterns are in general not (or only marginally) taken into account.

9.
We propose a general framework for structure identification, as defined by Dechter and Pearl. It is based on the notion of prime implicates and handles Horn, bijunctive, affine, and Horn-renamable formulas, for which, to our knowledge, no polynomial algorithm had been proposed before. Although quite general, this framework yields good complexity results; in particular, for Horn formulas we obtain the same running time and a smaller output size than previously known algorithms.

10.
In this paper we present a framework for partitioning data-parallel computations across a heterogeneous metasystem at runtime. The framework is guided by program and resource information made available to the system. Three difficult problems are handled by the framework: processor selection, task placement, and heterogeneous data domain decomposition. Solving each of these problems contributes to reduced elapsed time. In particular, processor selection determines the best grain size at which to run the computation, task placement reduces communication cost, and data domain decomposition achieves processor load balance. We present results indicating that excellent performance is achievable using the framework. The paper extends our earlier work on partitioning data-parallel computations across a single-level network of heterogeneous workstations.
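One simple way to realize the load-balancing part of such a decomposition is to split the data domain in proportion to relative processor speeds; the sketch below does this for a 1-D domain with made-up speed figures, and is not the paper's decomposition algorithm.

```python
def decompose(n_elems, speeds):
    """Split range(n_elems) into contiguous chunks proportional to each
    processor's relative speed, so all processors finish at roughly the
    same time (illustrative heterogeneous domain decomposition)."""
    total = sum(speeds)
    bounds, start = [], 0
    for i, s in enumerate(speeds):
        # The last processor takes whatever remains, avoiding rounding gaps.
        size = n_elems - start if i == len(speeds) - 1 else round(n_elems * s / total)
        bounds.append((start, start + size))
        start += size
    return bounds

# Hypothetical metasystem: one fast node and three slower workstations.
print(decompose(1000, speeds=[4.0, 2.0, 1.0, 1.0]))
# Expected: [(0, 500), (500, 750), (750, 875), (875, 1000)]
```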

11.
The increasing importance placed on software measurement has led to a growing amount of research developing new software measures. Given the importance of object-oriented development techniques, one specific area where this has occurred is coupling measurement in object-oriented systems. However, despite a very interesting and rich body of work, there is little understanding of the motivation and empirical hypotheses behind many of these new measures. It is often difficult to determine how such measures relate to one another and for which applications they can be used. As a consequence, it is very difficult for practitioners and researchers to obtain a clear picture of the state of the art in order to select or define measures for object-oriented systems. This situation is addressed and clarified through several different activities. First, a standardized terminology and formalism for expressing measures is provided, which ensures that all measures using it are expressed in a fully consistent and operational manner. Second, to provide a structured synthesis, the existing frameworks and measures for coupling measurement in object-oriented systems are reviewed. Third, a unified framework, based on the issues discovered in the review, is provided, and all existing measures are then classified according to this framework. This paper contributes to an increased understanding of the state of the art.

12.
13.
The emergence of Web technologies enables a variety of Web-based service applications, which can be examined from business process integration, supply chain management, and knowledge management perspectives. To categorize existing Web-based services while foreseeing potential new types, a unified view is needed to represent the structures and processes of Web-based services. This paper proposes a general framework to identify essential structures and operations of Web-based services, and then models these components. We articulate the framework with Web technologies, such as Web services and the semantic Web, multi-agent and peer-to-peer systems, and Web information retrieval and mining. Two comprehensive examples in insurance and knowledge services are used to elaborate the use of the Web-based service framework in fulfilling business processes. This study synthesizes essential structures and processes of Web-based services to build a framework for researchers and practitioners to develop Web-based services and techniques.

14.
This paper presents a review in the form of a unified framework for tackling estimation problems in Digital Signal Processing (DSP) using Support Vector Machines (SVMs). The paper formalizes our developments in the area of DSP with SVM principles. The use of SVMs for DSP is already mature, and has gained popularity in recent years due to their advantages over other methods: SVMs are flexible non-linear methods that are intrinsically regularized and work well in low-sample-sized and high-dimensional problems. SVMs can be designed to take into account different noise sources in the formulation and to fuse heterogeneous information sources. Nevertheless, the use of SVMs in estimation problems has been traditionally limited to its mere use as a black-box model. Noting such limitations in the literature, we take advantage of several properties of Mercer's kernels and functional analysis to develop a family of SVM methods for estimation in DSP. Three types of signal model equations are analyzed. First, when a specific time-signal structure is assumed to model the underlying system that generated the data, the linear signal model (so-called Primal Signal Model formulation) is first stated and analyzed. Then, non-linear versions of the signal structure can be readily developed by following two different approaches. On the one hand, the signal model equation is written in Reproducing Kernel Hilbert Spaces (RKHS) using the well-known RKHS Signal Model formulation, and Mercer's kernels are readily used in SVM non-linear algorithms. On the other hand, in the alternative and not so common Dual Signal Model formulation, a signal expansion is made by using an auxiliary signal model equation given by a non-linear regression of each time instant in the observed time series. These building blocks can be used to generate different novel SVM-based methods for problems of signal estimation, and we deal with several of the most important ones in DSP. We illustrate the usefulness of this methodology by defining SVM algorithms for linear and non-linear system identification, spectral analysis, non-uniform interpolation, sparse deconvolution, and array processing. The performance of the developed SVM methods is compared to standard approaches in all these settings. The experimental results illustrate the generality, simplicity, and capabilities of the proposed SVM framework for DSP.
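The framework above goes beyond black-box SVM use; purely as a baseline sketch of the kind of DSP task it addresses, the fragment below applies scikit-learn's SVR with an RBF (Mercer) kernel to non-uniform interpolation of a noisy sinusoid. All parameter values and the signal itself are illustrative.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Non-uniformly sampled noisy sinusoid (illustrative signal).
t = np.sort(rng.uniform(0.0, 1.0, 40))
y = np.sin(2 * np.pi * 3 * t) + 0.05 * rng.standard_normal(t.size)

# Black-box SVR with an RBF kernel; the reviewed framework would instead
# embed an explicit signal model, but the fit/predict usage is the same.
model = SVR(kernel="rbf", C=10.0, epsilon=0.01, gamma=50.0)
model.fit(t.reshape(-1, 1), y)

# Reconstruct the signal on a uniform grid (non-uniform interpolation).
t_grid = np.linspace(0.0, 1.0, 200).reshape(-1, 1)
y_hat = model.predict(t_grid)
print(y_hat.shape, float(np.abs(y_hat).max()))
```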

15.
In this paper we propose a new optimization framework that unites some of the existing tensor based methods for face recognition on a common mathematical basis. Tensor based approaches rely on the ability to decompose an image into its constituent factors (i.e. person, lighting, viewpoint, etc.) and then to utilize these factor spaces for recognition. We first develop a multilinear optimization problem relating an image to its constituent factors and then develop our framework by formulating a set of strategies that can be followed to solve this optimization problem. The novelty of our research is that the proposed framework offers an effective methodology for explicit non-empirical comparison of the different tensor methods as well as providing a way to determine the applicability of these methods with respect to different recognition scenarios. Importantly, the framework allows comparative analysis on the basis of the quality of solutions offered by these methods. Our theoretical contribution has been validated by extensive experimental results using four benchmark datasets, which we present along with a detailed discussion.

16.
Traditional supervised classifiers use only labeled data (feature/label pairs) as the training set, while the unlabeled data is used as the testing set. In practice, labeled data is often hard to obtain, and the unlabeled data may contain instances that belong to the predefined class but are not covered by the labeled categories. This problem has been widely studied in recent years, and semi-supervised PU learning is an efficient way to learn from positive and unlabeled examples. Among all the semi-supervised PU learning methods, it is hard to choose a single approach that fits every unlabeled data distribution. In this paper, a new framework is designed to integrate different semi-supervised PU learning algorithms in order to take advantage of existing methods. In essence, we propose an automatic KL-divergence learning method that exploits knowledge of the unlabeled data distribution. The experimental results show that (1) information about the data distribution is very helpful for semi-supervised PU learning, and (2) the proposed framework achieves higher precision than the state-of-the-art method.
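One way a KL-divergence signal about the unlabeled distribution could be obtained is sketched below: a rough histogram-based estimate of KL(P || U) between the labeled positives and the unlabeled pool. The binning, smoothing, and synthetic data are illustrative and not the paper's method.

```python
import numpy as np

def kl_divergence(p_samples, u_samples, bins=20):
    """Histogram-based KL(P || U) over one feature, with smoothing to
    avoid division by zero (a rough, illustrative estimator)."""
    lo = min(p_samples.min(), u_samples.min())
    hi = max(p_samples.max(), u_samples.max())
    p_hist, _ = np.histogram(p_samples, bins=bins, range=(lo, hi))
    u_hist, _ = np.histogram(u_samples, bins=bins, range=(lo, hi))
    p = (p_hist + 1e-3) / (p_hist + 1e-3).sum()
    u = (u_hist + 1e-3) / (u_hist + 1e-3).sum()
    return float(np.sum(p * np.log(p / u)))

rng = np.random.default_rng(0)
positive = rng.normal(0.0, 1.0, 500)           # labeled positives
unlabeled_similar = rng.normal(0.2, 1.0, 500)  # unlabeled pool close to the positives
unlabeled_shifted = rng.normal(3.0, 1.0, 500)  # unlabeled pool dominated by negatives

# A larger divergence suggests the unlabeled pool lies far from the positive
# class, which could steer the choice (or weighting) of PU algorithms.
print(kl_divergence(positive, unlabeled_similar))
print(kl_divergence(positive, unlabeled_shifted))
```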

17.
A unified framework for subspace face recognition
PCA, LDA, and Bayesian analysis are the three most representative subspace face recognition approaches. In this paper, we show that they can be unified under the same framework. We first model face difference with three components: intrinsic difference, transformation difference, and noise. A unified framework is then constructed by using this face difference model and a detailed subspace analysis on the three components. We explain the inherent relationship among different subspace methods and their unique contributions to the extraction of discriminating information from the face difference. Based on the framework, a unified subspace analysis method is developed using PCA, Bayes, and LDA as three steps. A 3D parameter space is constructed using the three subspace dimensions as axes. Searching through this parameter space, we achieve better recognition performance than standard subspace methods.
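A minimal sketch of the PCA-then-LDA portion of such a pipeline on synthetic data is shown below; the Bayesian intra/extra-personal step is omitted, and all dimensions and the synthetic "face" vectors are illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)

# Synthetic "face" vectors: 3 subjects, 20 samples each, 100 dimensions.
n_classes, n_per, dim = 3, 20, 100
X = np.vstack([rng.normal(loc=c, scale=1.0, size=(n_per, dim)) for c in range(n_classes)])
y = np.repeat(np.arange(n_classes), n_per)

# Step 1: PCA reduces dimensionality (discards noise directions).
pca = PCA(n_components=30)
X_pca = pca.fit_transform(X)

# LDA then extracts discriminant directions in the reduced space; sweeping
# the subspace dimensions would trace out part of the parameter space
# mentioned in the abstract.
lda = LinearDiscriminantAnalysis(n_components=n_classes - 1)
X_lda = lda.fit_transform(X_pca, y)

print(X_pca.shape, X_lda.shape)  # (60, 30) (60, 2)
print(lda.score(X_pca, y))       # training accuracy of the LDA classifier
```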

18.
In practice, many applications require a dimensionality reduction method that can deal with partially labeled data. In this paper, we propose a semi-supervised dimensionality reduction framework which can efficiently handle unlabeled data. Under the framework, several classical methods, such as principal component analysis (PCA), linear discriminant analysis (LDA), maximum margin criterion (MMC), locality preserving projections (LPP) and their corresponding kernel versions, can be seen as special cases. For high-dimensional data, the framework yields a low-dimensional embedding that both discriminates multi-class sub-manifolds and preserves local manifold structure. Experiments show that our algorithms can significantly improve the accuracy of the corresponding supervised and unsupervised approaches.
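Frameworks of this kind typically reduce to a generalized eigenproblem whose scatter matrices encode the criterion; the sketch below illustrates that common core, mixing a labeled between-class scatter with an unlabeled total scatter. The mixing weight, the synthetic data, and the identity constraint matrix are illustrative, not the paper's formulation.

```python
import numpy as np
from scipy.linalg import eigh

def project(A, B, k):
    """Top-k directions of the generalized eigenproblem A w = lambda B w.
    Different (A, B) pairs recover PCA-, LDA-, or LPP-style criteria."""
    vals, vecs = eigh(A, B)
    return vecs[:, np.argsort(vals)[::-1][:k]]

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))                       # 100 samples, 10 features
y = np.r_[np.zeros(20), np.ones(20), -np.ones(60)]   # -1 marks unlabeled samples

labeled = y >= 0
mu = X[labeled].mean(axis=0)
Sb = sum((X[labeled][y[labeled] == c].mean(0) - mu)[:, None]
         @ (X[labeled][y[labeled] == c].mean(0) - mu)[None, :]
         for c in (0, 1))            # between-class scatter from the labeled part
St_unlab = np.cov(X.T)               # total scatter over all data (unlabeled part)

# Semi-supervised criterion: mix labeled discriminant information with
# unlabeled global structure (alpha is an illustrative trade-off weight).
alpha = 0.5
A = alpha * Sb + (1 - alpha) * St_unlab
B = np.eye(X.shape[1])               # with B = I this is an ordinary eigenproblem
W = project(A, B, k=2)
print((X @ W).shape)                 # (100, 2) low-dimensional embedding
```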

19.
We present an adaptive out-of-core technique for rendering massive scalar volumes employing single-pass GPU ray casting. The method is based on the decomposition of a volumetric dataset into small cubical bricks, which are then organized into an octree structure maintained out-of-core. The octree contains the original data at the leaves and a filtered representation of the children at inner nodes. At runtime an adaptive loader, executing on the CPU, updates a view- and transfer-function-dependent working set of bricks maintained in GPU memory by asynchronously fetching data from the out-of-core octree representation. At each frame, a compact indexing structure, which spatially organizes the current working set into an octree hierarchy, is encoded in a small texture. This data structure is then exploited by an efficient stackless ray casting algorithm, which computes the volume rendering integral by visiting non-empty bricks in front-to-back order and adapting sampling density to brick resolution. Brick visibility information is fed back to the loader to avoid refinement and data loading of occluded zones. The resulting method is able to interactively explore multi-gigavoxel datasets on a desktop PC.
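A tiny sketch of the data decomposition step only (splitting a volume into cubical bricks and building an octree whose inner nodes hold box-filtered versions of their subtrees) is given below; the brick size and volume are illustrative, and GPU upload, the adaptive loader, and the ray caster are not modeled.

```python
import numpy as np

BRICK = 8  # brick edge length in voxels (illustrative)

class Node:
    def __init__(self, data, children=None):
        self.data = data          # BRICK^3 array: original data at leaves,
        self.children = children  # filtered (downsampled) data at inner nodes

def filtered_brick(vol):
    """Box-filter a cubic subvolume down to a BRICK^3 inner-node brick."""
    f = vol.shape[0] // BRICK
    return vol.reshape(BRICK, f, BRICK, f, BRICK, f).mean(axis=(1, 3, 5))

def build(vol):
    if vol.shape[0] == BRICK:
        return Node(vol)                    # leaf: original-resolution brick
    h = vol.shape[0] // 2
    kids = [build(vol[x:x + h, y:y + h, z:z + h])
            for x in (0, h) for y in (0, h) for z in (0, h)]
    return Node(filtered_brick(vol), kids)  # inner node: filtered representation

volume = np.random.rand(32, 32, 32).astype(np.float32)
root = build(volume)
print(root.data.shape, len(root.children))  # (8, 8, 8) coarse brick + 8 children
```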

20.
Precise calculation of molecular electronic wavefunctions by methods such as coupled-cluster requires the computation of tensor contractions, the cost of which has polynomial computational scaling with respect to the system and basis set sizes. Each contraction may be executed via matrix multiplication on a properly ordered and structured tensor. However, data transpositions are often needed to reorder the tensors for each contraction. Writing and optimizing distributed-memory kernels for each transposition and contraction is tedious since the number of contractions scales combinatorially with the number of tensor indices. We present a distributed-memory numerical library, the Cyclops Tensor Framework (CTF), that automatically manages tensor blocking and redistribution to perform any user-specified contractions. CTF serves as the distributed-memory contraction engine in Aquarius, a new program designed for high-accuracy and massively-parallel quantum chemical computations. Aquarius implements a range of coupled-cluster and related methods such as CCSD and CCSDT by writing the equations on top of a C++ templated domain-specific language. This DSL calls CTF directly to manage the data and perform the contractions. Our CCSD and CCSDT implementations achieve high parallel scalability on the BlueGene/Q and Cray XC30 supercomputer architectures, showing that accurate electronic structure calculations can be effectively carried out on top of general distributed-memory tensor primitives.
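The basic pattern that such a framework automates (transpose the operands so the contracted indices are adjacent, then call matrix multiplication) can be sketched in a few lines of numpy; this is only an in-memory illustration of the idea, not CTF's distributed, cyclically blocked implementation, and the index labels are made up.

```python
import numpy as np

def contract(A, B, a_idx, b_idx, out_idx):
    """Contract A and B over the indices shared by a_idx and b_idx,
    via data transposition + reshape + matrix multiplication."""
    shared = [i for i in a_idx if i in b_idx]
    a_free = [i for i in a_idx if i not in shared]
    b_free = [i for i in b_idx if i not in shared]
    # Data transposition: reorder so contracted indices are adjacent.
    A_t = np.transpose(A, [a_idx.index(i) for i in a_free + shared])
    B_t = np.transpose(B, [b_idx.index(i) for i in shared + b_free])
    m = int(np.prod([A.shape[a_idx.index(i)] for i in a_free]))
    k = int(np.prod([A.shape[a_idx.index(i)] for i in shared]))
    n = int(np.prod([B.shape[b_idx.index(i)] for i in b_free]))
    C = A_t.reshape(m, k) @ B_t.reshape(k, n)  # the actual matrix multiply
    C = C.reshape([A.shape[a_idx.index(i)] for i in a_free] +
                  [B.shape[b_idx.index(i)] for i in b_free])
    order = [(a_free + b_free).index(i) for i in out_idx]
    return np.transpose(C, order)

# Illustrative contraction: C[i,j,a,b] = sum_e A[i,j,a,e] * B[e,b]
A = np.random.rand(3, 3, 4, 5)
B = np.random.rand(5, 4)
C = contract(A, B, list("ijae"), list("eb"), list("ijab"))
print(C.shape)                                            # (3, 3, 4, 4)
print(np.allclose(C, np.einsum("ijae,eb->ijab", A, B)))   # sanity check: True
```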
