Similar literature
 20 similar documents found (search time: 15 ms)
1.
Extensibility in complex compiler systems goes well beyond modularity of design, and it needs to be considered from the early stages of the design, especially in the design of the intermediate representation (IR). One of the primary barriers to compiler pass extensibility and modularity is interference between passes caused by transformations that invalidate existing analysis information. In this paper, we present a callback system that automatically tracks changes to the compiler's IR, allowing full pass reordering and providing an easy-to-use interface for developing lazy-update incremental analysis passes. We also present a new algorithm for incremental interprocedural data flow analysis and demonstrate the benefits of our design framework and our prototype compiler system. We show that compilation time for multiple data flow analysis algorithms can be cut in half by incrementally updating data flow analysis.
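
The callback idea in this abstract can be illustrated with a minimal sketch. It is not the paper's system: the `IR` and `InstructionCount` classes, the toy "instruction count" analysis, and every method name are hypothetical. It assumes that all IR mutations go through a single method, so a registered callback can mark cached analysis results as dirty, and that results are recomputed lazily on the next query.

```python
from typing import Callable, Dict, List, Set


class IR:
    """Toy IR (hypothetical): maps a function name to its instruction list."""

    def __init__(self) -> None:
        self.functions: Dict[str, List[str]] = {}
        self._listeners: List[Callable[[str], None]] = []

    def subscribe(self, listener: Callable[[str], None]) -> None:
        """Register a callback invoked with the name of each changed function."""
        self._listeners.append(listener)

    def set_body(self, name: str, body: List[str]) -> None:
        # Every mutation goes through here, so listeners see every change.
        self.functions[name] = list(body)
        for listener in self._listeners:
            listener(name)


class InstructionCount:
    """Lazy-update analysis: caches results and recomputes only dirty entries."""

    def __init__(self, ir: IR) -> None:
        self.ir = ir
        self.cache: Dict[str, int] = {}
        self.dirty: Set[str] = set()
        self.recomputations = 0
        # The callback only records the change; no analysis work happens here.
        ir.subscribe(self.dirty.add)

    def query(self, name: str) -> int:
        if name in self.dirty or name not in self.cache:
            self.cache[name] = len(self.ir.functions[name])
            self.dirty.discard(name)
            self.recomputations += 1
        return self.cache[name]
```

Because the analysis learns about changes through the callback rather than from the transformation pass, a pass that rewrites one function does not need to know which analyses exist, and untouched functions keep their cached results.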

2.
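Zhang Junbo, Li Tianrui, Pan Yi, Luo Chuan, Teng Fei. Journal of Software (《软件学报》), 2015, 26(5): 1064-1078
Processing massive data that is increasingly complex and changes dynamically is a widely shared concern, and one of its core problems is how to use existing information to update knowledge quickly. Granular computing, a research field that has emerged in recent years, is a new concept and computing paradigm for information processing; it is mainly used to describe and process uncertain, fuzzy, incomplete and massive information, and it provides a problem-solving method based on granules and the relations between them. As an important part of granular computing theory, rough set theory is an effective mathematical tool for handling uncertainty and imprecision. Based on MapReduce, a parallel model used in cloud computing, algorithms are given for computing in parallel the equivalence classes, the decision classes and the associations between the two in rough sets; a parallel algorithm for computing rough approximations on large-scale data is then designed. To cope with dynamically changing massive data, two parallel algorithms for incrementally updating rough approximations are designed by combining the MapReduce model with incremental updating under different incremental strategies. Experimental results show that the algorithms update knowledge quickly and effectively, and that the larger the data volume, the more pronounced the benefit.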
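
As a small illustration of the quantities named in this abstract, the sketch below computes equivalence classes, decision classes and the standard lower and upper rough approximations. It is a sequential, single-machine sketch and not the paper's MapReduce or incremental algorithms; the function name, the row layout and the toy data in the usage note are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Set, Tuple

# One object of the decision table: (condition attribute values, decision value).
Row = Tuple[Sequence[Hashable], Hashable]


def approximations(rows: List[Row]) -> Dict[Hashable, Tuple[Set[int], Set[int]]]:
    """Return {decision: (lower approximation, upper approximation)} as index sets."""
    # Group objects with identical condition values: the equivalence classes.
    # (In a MapReduce setting this grouping-by-key is the natural "map" key.)
    eq_classes: Dict[tuple, Set[int]] = defaultdict(set)
    # Group objects with the same decision value: the decision classes.
    decision_classes: Dict[Hashable, Set[int]] = defaultdict(set)
    for idx, (cond, dec) in enumerate(rows):
        eq_classes[tuple(cond)].add(idx)
        decision_classes[dec].add(idx)

    result: Dict[Hashable, Tuple[Set[int], Set[int]]] = {}
    for dec, members in decision_classes.items():
        lower: Set[int] = set()
        upper: Set[int] = set()
        for block in eq_classes.values():
            if block <= members:   # block lies entirely inside the decision class
                lower |= block
            if block & members:    # block overlaps the decision class
                upper |= block
        result[dec] = (lower, upper)
    return result
```

For example, with the hypothetical rows `[((1, 0), "yes"), ((1, 0), "no"), ((0, 1), "yes"), ((0, 0), "no")]`, objects 0 and 1 are indiscernible but disagree on the decision, so they fall in the upper approximation of both decisions and in the lower approximation of neither.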

3.
The article discusses a number of fundamental results related to determining the maximum output flow in a network after edge failures. On the basis of four theorems, we propose very efficient augmentation algorithms for restoring the maximum possible output flow in a repairable flow network after an edge failure. In many cases, the running time of the proposed algorithm is independent of the size of the network or varies linearly with it. The high computational speed of the proposed algorithms makes them suitable for optimising the performance of repairable flow networks in real time and for decongesting overloaded branches in networks. We show that the correct algorithm for maximising the flow in a static flow network, with edges fully saturated with flow, is a special case of the proposed reoptimisation algorithm, after transforming the network into a network with balanced nodes. An efficient two-stage augmentation algorithm is also proposed for maximising the output flow in a network with empty edges. The algorithm is faster than the classical flow augmentation algorithms. The article also presents a study of the link between the performance, topology and size of repairable flow networks, carried out with a specially developed software tool. The topology of repairable flow networks has a significant impact on their performance. Two networks built from components of identical type and number can have very different performance levels because of slight differences in their topology.
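
For context, the sketch below shows the classical baseline that the abstract compares against: shortest-augmenting-path maximum flow (Edmonds-Karp), with an edge failure handled by recomputing from scratch. It is not the article's reoptimisation algorithm, which avoids the full recomputation; the function name and the example network in the usage note are assumptions.

```python
from collections import deque
from typing import Dict, Optional, Tuple

Capacities = Dict[Tuple[str, str], int]


def max_flow(capacity: Capacities, source: str, sink: str) -> int:
    """Classical Edmonds-Karp maximum flow on a directed network."""
    if source == sink:
        return 0
    # Residual capacities, including zero-capacity reverse edges.
    residual: Dict[str, Dict[str, int]] = {}
    for (u, v), c in capacity.items():
        residual.setdefault(u, {})
        residual.setdefault(v, {})
        residual[u][v] = residual[u].get(v, 0) + c
        residual[v].setdefault(u, 0)
    if source not in residual or sink not in residual:
        return 0

    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent: Dict[str, Optional[str]] = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no augmenting path left: the flow is maximal

        # Collect the path edges, then push the bottleneck amount along them.
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[a][b] for a, b in path)
        for a, b in path:
            residual[a][b] -= bottleneck
            residual[b][a] += bottleneck
        total += bottleneck
```

With hypothetical capacities `s→a: 3, s→b: 2, a→t: 2, a→b: 1, b→t: 3`, the maximum flow is 5; deleting the edge `a→t` to model a failure and calling `max_flow` again yields 3. The cost of that second full run is what a reoptimisation approach aims to avoid.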

4.
Interprocedural data flow information is useful for many software testing and analysis techniques, including data flow testing, regression testing, program slicing and impact analysis. For programs with aliases, these testing and analysis techniques can yield invalid results unless the data flow information accounts for aliasing effects. Recent research provides algorithms for performing interprocedural data flow analysis in the presence of aliases; however, these algorithms are expensive and achieve precise results only on complete programs. This paper presents an algorithm for performing alias analysis on incomplete programs that lets individual software components, such as library routines, subroutines or subsystems, be analyzed independently. The paper also presents an algorithm for reusing the results of this separate analysis when the individual software components are linked with calling modules. Our algorithms let us analyze frequently used software components, such as library routines or classes, independently, and reuse the results of that analysis when analyzing calling programs, without incurring the expense of completely reanalyzing each calling program. Our algorithms also provide a way to analyze large systems incrementally.

5.
Improved multiprocessor performance can be attained by combining data flow and control flow concepts. This type of combined architecture is characterized and several examples of previously proposed machines are given. A new model that permits the analysis of such systems is presented and performance measures are defined. This model is then used to analyze the performance of the algorithms under a wide variety of combined systems. The results of these experiments show that partition size is a major factor in the performance of such systems and an optimal size may be found for given system parameters.

6.
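Data flow analysis is an important part of a compiler, and incremental analysis is of considerable practical value in program development environments and interprocedural optimizing compilers: when a program changes, it maintains the data flow information incrementally, so that data flow analysis need not be rerun for every small change to the program. An incremental elimination data flow algorithm is presented. It is based on the path simplification algorithm and has the same complexity and the same generality as that algorithm (it applies to irreducible flow graphs and to cases where the flow functions are incomplete), and it can conveniently maintain the existing data flow information when the program changes.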

7.
Efficient Flow Computation on Massive Grid Terrain Datasets   Cited by: 1 (self-citations: 0, citations by others: 1)
As detailed terrain data becomes available, GIS terrain applications target larger geographic areas at finer resolutions. Processing the massive datasets involved in such applications presents significant challenges to GIS systems and demands algorithms that are optimized for both data movement and computation. In this paper we present efficient algorithms for flow routing on massive grid terrain datasets, extending our previous work on flow accumulation. Our algorithms are developed in the framework of external memory algorithms and use I/O-techniques to achieve efficiency. We have implemented the algorithms in the Terraflow system, which is the first comprehensive terrain flow software system designed and optimized for massive data. We compare the performance of Terraflow with that of state-of-the-art commercial and open-source GIS systems. On large terrains, Terraflow outperforms existing systems by a factor of 2 to 1,000, and is capable of solving problems no system was previously able to solve.

8.
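This paper divides the operations of a class into three different levels and uses an incremental algorithm to perform data flow analysis on the operations at each level, obtaining three corresponding kinds of definition-use pairs; data-flow-based testing can then be carried out from the generated definition-use pairs. Performing fast and effective data flow analysis across operations that call one another is a difficult point in the data flow analysis of classes; this paper studies the problem in depth and proposes a corresponding solution.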

9.
In recent years, the widespread use of wireless sensor networks has drawn many researchers to distributed estimation. Two popular learning algorithms, incremental least mean square (ILMS) and diffusion least mean square (DLMS), have been reported for distributed estimation using data collected from sensor nodes. Being derivative based, however, these algorithms tend to converge to local minima, particularly when minimizing a multimodal cost function. Hence, for problems such as distributed parameter estimation of IIR systems, alternative distributed algorithms need to be developed. With this in view, the present paper proposes two population-based incremental particle swarm optimization (IPSO) algorithms for estimating the parameters of noisy IIR systems. The proposed IPSO algorithms perform poorly, however, when the training samples are contaminated with outliers. To alleviate this problem, the paper proposes a robust distributed algorithm (RDIPSO) for the IIR system identification task. Simulation results on benchmark IIR systems demonstrate that the proposed algorithms provide excellent identification performance in all cases, even when the training samples are contaminated with outliers.

10.
The prediction accuracy and generalization ability of neural and neurofuzzy models for chaotic time series prediction depend heavily on the network model employed as well as on the learning algorithm. In this study, several neural and neurofuzzy models with different learning algorithms are examined for the prediction of several benchmark chaotic systems and time series. The prediction performance of locally linear neurofuzzy models with the recently developed Locally Linear Model Tree (LoLiMoT) learning algorithm is compared with that of a Radial Basis Function (RBF) neural network with the Orthogonal Least Squares (OLS) learning algorithm, a MultiLayer Perceptron neural network with the error back-propagation learning algorithm, and the Adaptive Network based Fuzzy Inference System. In particular, cross-validation techniques based on the evaluation of error indices on multiple validation sets are used to optimize the number of neurons and to prevent overfitting in the incremental learning algorithms. To make a fair comparison between neural and neurofuzzy models, they are compared at their best structure on the basis of prediction accuracy, generalization, and computational complexity. The experiments are designed primarily to analyze the generalization capability and accuracy of the learning techniques when dealing with a limited number of training samples from deterministic chaotic time series, but the effect of noise on the performance of the techniques is also considered. Various chaotic systems and time series, including the Lorenz system, the Mackey-Glass chaotic equation, the Henon map, the AE geomagnetic activity index, and sunspot numbers, are examined as case studies. The results indicate the superior performance of incremental learning algorithms and their respective networks, such as OLS for the RBF network and LoLiMoT for the locally linear neurofuzzy model.

11.
In recent years, the spectral clustering method has gained attention because of its superior performance. To the best of our knowledge, the existing spectral clustering algorithms cannot incrementally update the clustering results given a small change in the data set. However, the capability of incremental updating is essential to some applications, such as the websphere or blogsphere. Unlike traditional stream data, these applications require incremental algorithms to handle not only insertion and deletion of data points but also similarity changes between existing points. In this paper, we extend standard spectral clustering to such evolving data by introducing the incidence vector/matrix to represent both kinds of dynamics in the same framework and by incrementally updating the eigen-system. Our incremental algorithm, initialized by a standard spectral clustering, continuously and efficiently updates the eigenvalue system and generates instant cluster labels as the data set evolves. The algorithm is applied to a blog data set. Compared with recomputation of the solution by standard spectral clustering, it achieves similar accuracy at a much lower computational cost. It can discover not only the stable blog communities but also the evolution of individual multi-topic blogs. The core technique of incrementally updating the eigenvalue system is a general algorithm with a wide range of applications beyond incremental spectral clustering, wherever dynamic graphs are involved. This demonstrates the wide applicability of our incremental algorithm.

12.
As the number of available multiprocessors increases, so does the importance of providing software support for these systems, including parallel compilers. Data flow analysis, an important component of software tools, may be computed many times during the compilation of a program, especially when compiling for a multiprocessor. Although converting a sequential data flow algorithm to a parallel algorithm can present some opportunities for computing data flow in parallel, more parallelism can be exposed by the development of new parallel data flow algorithms. We present a technique that computes rapid data flow problems in parallel and thus is applicable to commonly used classical data flow problems, including reaching definitions, reachable uses, available expressions, and very busy expressions. Unlike previous techniques, our technique exploits the inherent parallelism in the data flow computation that occurs across independent paths, within linear paths, and in paths through loops of a control flow graph. The technique first changes cyclic structures in a control flow graph to acyclic structures and then builds a combining directed acyclic graph (DAG) that represents the paths through the control flow graph needed to compute data flow. Data flow is then computed using two passes over the DAG by computing the data flow for the nodes on each level of the DAG in parallel. We also present experimental results comparing the performance of our algorithm with a sequential algorithm and a parallelized sequential algorithm.
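
Reaching definitions, one of the classical problems listed in this abstract, can be sketched with the standard sequential worklist iteration. This is the kind of sequential baseline the abstract compares against, not the parallel DAG-based technique it proposes; the function name, the dictionary-based graph encoding and the example in the usage note are assumptions.

```python
from typing import Dict, List, Set


def reaching_definitions(
    succ: Dict[str, List[str]],
    gen: Dict[str, Set[str]],
    kill: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Return the IN sets: the definitions that reach the entry of each node.

    succ maps every node to its successors; gen and kill give, per node, the
    definitions it creates and the definitions it overwrites.
    """
    preds: Dict[str, List[str]] = {n: [] for n in succ}
    for n, targets in succ.items():
        for t in targets:
            preds[t].append(n)

    in_sets: Dict[str, Set[str]] = {n: set() for n in succ}
    out_sets: Dict[str, Set[str]] = {n: set(gen[n]) for n in succ}
    worklist = list(succ)
    while worklist:
        n = worklist.pop()
        # IN[n] is the union of OUT over all predecessors of n.
        in_sets[n] = set().union(*(out_sets[p] for p in preds[n]))
        # OUT[n] = GEN[n] ∪ (IN[n] − KILL[n])
        new_out = gen[n] | (in_sets[n] - kill[n])
        if new_out != out_sets[n]:
            out_sets[n] = new_out
            worklist.extend(succ[n])  # successors must be revisited
    return in_sets
```

As a usage example, take a loop where `entry` defines `x` (definition `d1`) and `body` redefines it (`d2`): both definitions reach the loop header, the body and the exit, because the back edge carries `d2` around the loop.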

13.
When a multidatabase system contains textual database systems (i.e., information retrieval systems), queries against the global schema of the multidatabase system may contain a new type of join: joins between attributes of textual type. Three algorithms for processing this type of join are presented and their I/O costs are analyzed in this paper. Since such joins often involve very large document collections, it is very important to find efficient algorithms to process them. The three algorithms differ in whether the documents themselves or the inverted files on the documents are used to process the join. Our analysis and the simulation results indicate that the relative performance of these algorithms depends on the input document collections, system characteristics, and the input query. For each algorithm, the type of input document collection with which the algorithm is likely to perform well is identified. An integrated algorithm that automatically selects the best algorithm to use is also proposed.

14.
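The emergence and growth of high-dimensional streaming big data pose many challenges to traditional machine learning and data mining algorithms. Taking into account that such data arrives as a stream, this paper first builds an adaptive incremental feature extraction algorithm model. Then, for noisy environments, it builds an incremental manifold learning algorithm model based on feature space calibration, which addresses the small-sample problem. Finally, it constructs a regularized optimization framework for manifold learning to address the dimensionality reduction error that arises during feature extraction from high-dimensional data streams, and obtains the final optimal solution. Experimental results show that the proposed algorithm framework satisfies three evaluation criteria for manifold learning algorithms: stability, improvement, and a learning curve that rises quickly to a relatively stable level; it thereby achieves efficient learning on high-dimensional data streams.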

15.
The performance analysis of parallel algorithms and systems is considered. For these, numerical solution methods quickly show their limits because of the enormous state-space growth. The proposed methodology and software tool, the list-manipulation parallel-modeling package (LISPACK), uses string manipulation, lumping, and recursive elimination to define the large Markovian process, restructure it, and solve it efficiently. The analysis of a typical parallel system and algorithm model is developed as a case study to discuss the features of the method. The paper makes two contributions. The first is the symbolic-approach methodology proposed for the performance analysis of parallel algorithms and systems. The second is a tool that exploits the capabilities of the symbolic approach in the solution of parallel models, where numerical techniques reveal their limits.

16.
On-demand broadcast is an effective wireless data dissemination technique to enhance system scalability and capability to handle dynamic data access patterns. Previous studies on time-critical on-demand data broadcast were conducted under the assumption that each client requests only one data item at a time. With the rapid growth of time-critical information dissemination services in emerging applications, there is an increasing need for systems to support efficient processing of real-time multi-item requests. Little work, however, has been done. In this paper, we study the behavior of six representative single-item request based scheduling algorithms in time-critical multi-item request environments. The results show that the performance of all algorithms deteriorates when dealing with multi-item requests. We observe that data popularity, which is an effective factor to save bandwidth and improve performance in scheduling single-item requests, becomes a hindrance to performance in multi-item request environments. Most multi-item requests scheduled by these algorithms suffer from a starvation problem, which is the root of performance deterioration. Based on our analysis, a novel algorithm that considers both request popularity and request timing requirement is proposed. The performance results of our simulation study show that the proposed algorithm is superior to other classical algorithms under a variety of circumstances.

17.
18.
19.
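Zhao Xiaolong, Yang Yan. Control and Decision (《控制与决策》), 2019, 34(10): 2061-2072
Incremental attribute reduction is an important data mining method for dynamic data. Most incremental attribute reduction algorithms proposed so far are built on discrete data, and few studies address numerical data. In view of this, an incremental attribute reduction algorithm is proposed for numerical information systems in which objects are continually added. First, a hierarchical method for computing neighborhood granulation is established in numerical information systems, and an incremental computation of neighborhood granulation is proposed on that basis. Then, building on the incremental computation of neighborhood granulation, an incremental update method for the neighborhood-granulation conditional entropy is given, and a corresponding incremental attribute reduction algorithm is proposed based on this update mechanism. Finally, experimental analysis shows that the proposed algorithm is more effective and performs better for incremental attribute reduction on numerical data.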

20.
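Pointer analysis is one of the core foundational techniques for compiler optimization and bug detection in software. Existing classic pointer analysis frameworks such as Doop transform the program under analysis and the analysis algorithm into a Datalog evaluation problem and solve it; when the program is large, a single solving run is time-consuming. When programs are changed and released frequently, the cost of the associated program analysis becomes even harder to bear. In recent years, incremental analysis has attracted growing attention as a technique that improves analysis efficiency by effectively reusing existing analysis results when code changes frequently. However, current incremental pointer analysis techniques are usually designed for specific algorithms and support a limited set of pointer analysis options, which considerably limits their usability. To address these problems, this paper designs and implements DDoop (Differential Doop), an incremental pointer analysis framework based on differential Datalog solving. DDoop implements incremental input fact generation and automated rewriting of analysis rules for incremental analysis, and expresses the incremental analysis of multiple program versions as a differential Datalog evaluation problem. It can therefore make full use of mature differential Datalog engines such as DDlog to achieve end-to-end incremental pointer analysis, while reusing the pointer analysis implementations already in Doop as compatibly as possible and providing transparent incremental support. We evaluated DDoop experimentally on widely used real-world programs; the results show that DDoop has a significant performance advantage over the non-incremental Doop framework and is highly compatible with the various pointer analysis rules already in Doop.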
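
The idea of treating a program change as a change to Datalog input facts can be illustrated with a much simpler rule set than pointer analysis. The sketch below maintains a transitive-closure relation under fact insertions using semi-naive evaluation. It does not use Doop or DDlog and does not reflect their APIs; the `Reachability` class and its method are hypothetical, and it handles insertions only, whereas a differential Datalog engine also handles deletions.

```python
from typing import Iterable, Set, Tuple

Fact = Tuple[str, str]


class Reachability:
    """Maintains path(x, y) under edge insertions (hypothetical illustration).

    Rules:  path(x, y) :- edge(x, y).
            path(x, z) :- path(x, y), edge(y, z).
    """

    def __init__(self) -> None:
        self.edge: Set[Fact] = set()
        self.path: Set[Fact] = set()

    def insert_edges(self, new_edges: Iterable[Fact]) -> Set[Fact]:
        """Add edge facts and return only the newly derived path facts."""
        delta_edge = set(new_edges) - self.edge
        self.edge |= delta_edge
        # Seed the delta: the new edges, plus old paths extended by a new edge.
        delta = set(delta_edge)
        delta |= {(x, z) for (x, y) in self.path for (y2, z) in delta_edge if y == y2}
        delta -= self.path
        added: Set[Fact] = set()
        while delta:
            self.path |= delta
            added |= delta
            # Semi-naive step: join only the delta with the full edge relation.
            delta = {
                (x, z) for (x, y) in delta for (y2, z) in self.edge if y == y2
            } - self.path
        return added
```

After inserting `a→b` and `b→c`, a later insertion of `c→d` derives only the three new facts `(c, d)`, `(b, d)` and `(a, d)`; the previously computed paths are reused rather than recomputed, which is the source of the savings in incremental evaluation.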
