
1.

Speeding up multiple instance learning classification rules on GPUs

Alberto Cano Amelia Zafra Sebastián Ventura 《Knowledge and Information Systems》2015,44(1):127-145


2.

Speeding up global illumination computations using programmable GPUs

G. Fournier B. Péroche 《Simulation Modelling Practice and Theory》2005,13(8):727-740

In the context of realistic image synthesis, many stochastic methods have been proposed to sample direct and indirect radiance. We present new ways to use hardware graphics to sample direct and indirect lighting in a scene. Jittered sampling of light sources can easily be implemented on a fragment program to obtain soft shadow samples. Using a voxel representation of the scene, indirect illumination can be computed using hemispherical jittered sampling. These algorithms have been implemented in our rendering framework but can be used in other contexts like radiosity or final gathering of the photon map.

3.

Speeding up two string-matching algorithms (total citations: 9; self-citations: 0; citations by others: 9)

M. Crochemore A. Czumaj L. Gasieniec S. Jarominek T. Lecroq W. Plandowski W. Rytter 《Algorithmica》1994,12(4-5):247-267

We show how to speed up two string-matching algorithms: the Boyer-Moore algorithm (BM algorithm), and its version called here the reverse factor algorithm (RF algorithm). The RF algorithm is based on factor graphs for the reverse of the pattern. The main feature of both algorithms is that they scan the text right-to-left from the supposed right position of the pattern. The BM algorithm goes as far as the scanned segment (factor) is a suffix of the pattern. The RF algorithm scans while the segment is a factor of the pattern. Both algorithms make a shift of the pattern, forget the history, and start again. The RF algorithm usually makes bigger shifts than BM, but is quadratic in the worst case. We show that it is enough to remember the last matched segment (represented by two pointers to the text) to speed up the RF algorithm considerably (to make a linear number of inspections of text symbols, with small coefficient), and to speed up the BM algorithm (to make at most 2n comparisons). Only a constant additional memory is needed for the search phase. We give alternative versions of an accelerated RF algorithm: the first one is based on combinatorial properties of primitive words, and the other two use the power of suffix trees extensively. The paper demonstrates the techniques to transform algorithms, and also shows interesting new applications of data structures representing all subwords of the pattern in compact form. The work by M. Crochemore and T. Lecroq was partially supported by PRC Mathématiques-Informatique, M. Crochemore was also partially supported by NATO Grant CRG 900293, and the work by A. Czumaj, L. Gasieniec, S. Jarominek, W. Plandowski, and W. Rytter was supported by KBN of the Polish Ministry of Education.
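The right-to-left scan and "shift, forget the history, start again" loop described in this abstract can be illustrated with a minimal sketch. This is the well-known Horspool simplification of Boyer-Moore (bad-character shifts only), not the accelerated BM or RF algorithm from the paper; the function name `horspool_search` is a hypothetical choice for illustration.

```python
def horspool_search(text: str, pattern: str) -> list[int]:
    """Return the start index of every occurrence of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Shift table: for each character of the pattern except the last one,
    # the distance from its rightmost occurrence to the pattern's end.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    matches = []
    pos = 0  # index in text aligned with pattern[0]
    while pos <= n - m:
        j = m - 1
        # Scan the window right-to-left while characters agree.
        while j >= 0 and text[pos + j] == pattern[j]:
            j -= 1
        if j < 0:
            matches.append(pos)
        # Shift using the text character under the pattern's last position;
        # nothing about the previous window is remembered.
        pos += shift.get(text[pos + m - 1], m)
    return matches
```

For example, `horspool_search("abracadabra", "abra")` reports the two occurrences at indices 0 and 7. Because each window is scanned from scratch, this sketch shows the baseline behavior that the abstract's "remember the last matched segment" idea is meant to improve.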

4.

Speeding up backpropagation using multiobjective evolutionary algorithms (total citations: 3; self-citations: 0; citations by others: 3)

Abbass HA 《Neural computation》2003,15(11):2705-2726

The use of backpropagation for training artificial neural networks (ANNs) is usually associated with a long training process. The user needs to experiment with a number of network architectures; with larger networks, more computational cost in terms of training time is required. The objective of this letter is to present an optimization algorithm, comprising a multiobjective evolutionary algorithm and a gradient-based local search. In the rest of the letter, this is referred to as the memetic Pareto artificial neural network algorithm for training ANNs. The evolutionary approach is used to train the network and simultaneously optimize its architecture. The result is a set of networks, with each network in the set attempting to optimize both the training error and the architecture. We also present a self-adaptive version with lower computational cost. We show empirically that the proposed method is capable of reducing the training time compared to gradient-based techniques.

5.

A calibrated asymptotic framework for analyzing packet classification algorithms on GPUs

Abbasi M. Rafiee M. 《The Journal of supercomputing》2019,75(10):6574-6611

Packet classification is a computationally intensive, highly parallelizable task in many advanced network systems like high-speed routers and firewalls. Recently,...

6.

Speeding up evolutionary algorithms through asymmetric mutation operators (total citations: 1; self-citations: 0; citations by others: 1)

Doerr B Hebbinghaus N Neumann F 《Evolutionary computation》2007,15(4):401-410

Successful applications of evolutionary algorithms show that certain variation operators can lead to good solutions much faster than other ones. We examine this behavior observed in practice from a theoretical point of view and investigate the effect of an asymmetric mutation operator in evolutionary algorithms with respect to the runtime behavior. Considering the Eulerian cycle problem we present runtime bounds for evolutionary algorithms using an asymmetric operator which are much smaller than the best upper bounds for a more general one. In our analysis it turns out that a plateau which both algorithms have to cope with changes its structure in a way that allows the algorithm to obtain an improvement much faster. In addition, we present a lower bound for the general case which shows that the asymmetric operator speeds up computation by at least a linear factor.

7.

Implementing QR factorization updating algorithms on GPUs

Robert Andrew Nicholas Dingle 《Parallel Computing》2014

Linear least squares problems are commonly solved by QR factorization. When multiple solutions need to be computed with only minor changes in the underlying data, knowledge of the difference between the old data set and the new can be used to update an existing factorization at reduced computational cost. We investigate the viability of implementing QR updating algorithms on GPUs and demonstrate that GPU-based updating for removing columns achieves speed-ups of up to 13.5× compared with full GPU QR factorization. We characterize the conditions under which other types of updates also achieve speed-ups.

8.

Dealing with the evaluation of supervised classification algorithms

Guzman Santafe Iñaki Inza Jose A. Lozano 《Artificial Intelligence Review》2015,44(4):467-508


9.

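Research and evaluation of scalable packet classification algorithms

Zhou Jingdi (周粳迪) 《Application Research of Computers》2009,26(3):814-818

Focusing on the scalability of packet classification algorithms, this paper analyzes in depth the time and space complexity of typical scalable packet classification algorithms. An evaluation system for scalable packet classification algorithms is developed on top of the ClassBench tool suite and used to evaluate typical algorithms under different simulated scenarios, followed by a systematic analysis of the performance differences among the algorithms and the conditions under which each is applicable. Finally, future development trends of scalable packet classification algorithms are discussed.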

10.

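A standard evaluation system for the scalability of packet classification algorithms

Zhou Jingdi (周粳迪) Cheng Dongnian (程东年) Liu Qinrang (刘勤让) 《Computer Engineering and Design》2009,30(18)

To address the lack of a standard scalability evaluation tool for packet classification algorithms, a benchmark-based system for evaluating the scalability of packet classification algorithms is developed, drawing on the benchmark method from the field of computer architecture testing. The system uses benchmark parameter files to guide the generation of rule sets, lets users control the generation of rule sets and traces through high-level input parameters, and supports real-time monitoring of the scalability metrics of the algorithm under test. Finally, the system is used to evaluate typical packet classification algorithms by simulation under different simulated scenarios. The simulation results show that the system can accurately evaluate the scalability of the algorithms, providing a standard evaluation tool for research on the scalability of packet classification algorithms.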

11.

Speeding up the Marr-Hildreth edge operator

《Computer Vision, Graphics, and Image Processing》1988,41(2):172-185


12.

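Research on speeding up the construction of PMR quadtrees

Zhou Qiaolin (周巧临) Jiang Hua (蒋华) 《Computer and Modernization》2004,(12):94-96,99

The PMR quadtree spatial index structure is very effective for queries in spatial databases that involve spatial joins. This paper makes some improvements to the bucket-loading algorithm for PMR quadtrees, namely two complementary techniques: an improved insertion algorithm and a bucket-loading method. These techniques greatly increase quadtree construction speed compared with the traditional quadtree construction method. The approach can be applied to many spatial data structures based on regular decomposition to speed up their construction.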

13.

High performance evaluation of evolutionary-mined association rules on GPUs

Alberto Cano José María Luna Sebastián Ventura 《The Journal of supercomputing》2013,66(3):1438-1461

Association rule mining is a well-known data mining task, but it requires much computational time and memory when mining large scale data sets of high dimensionality. This is mainly due to the evaluation process, where the antecedent and consequent in each rule mined are evaluated for each record. This paper presents a novel methodology for evaluating association rules on graphics processing units (GPUs). The evaluation model may be applied to any association rule mining algorithm. The use of GPUs and the compute unified device architecture (CUDA) programming model enables the rules mined to be evaluated in a massively parallel way, thus reducing the computational time required. This proposal takes advantage of concurrent kernels execution and asynchronous data transfers, which improves the efficiency of the model. In an experimental study, we evaluate interpreter performance and compare the execution time of the proposed model with regard to single-threaded, multi-threaded, and graphics processing unit implementation. The results obtained show an interpreter performance above 67 billion giga operations per second, and speed-up by a factor of up to 454 over the single-threaded CPU model, when using two NVIDIA 480 GTX GPUs. The evaluation model demonstrates its efficiency and scalability according to the problem complexity, number of instances, rules, and GPU devices.
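The per-record evaluation that this abstract identifies as the bottleneck can be sketched sequentially as follows. This is a plain single-threaded illustration, not the paper's GPU model; the function name `evaluate_rule`, the use of predicates for the antecedent and consequent, and the choice of support and confidence as the reported measures are all assumptions made for the example.

```python
def evaluate_rule(records, antecedent, consequent):
    """Evaluate one rule 'antecedent -> consequent' over every record.

    antecedent and consequent are predicates taking a single record.
    Returns (support, confidence) as floats.
    """
    n_antecedent = 0  # records satisfying the antecedent
    n_both = 0        # records satisfying antecedent and consequent
    for record in records:
        if antecedent(record):
            n_antecedent += 1
            if consequent(record):
                n_both += 1
    support = n_both / len(records) if records else 0.0
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence
```

Each record is tested independently of the others, so the loop body has no cross-iteration dependency apart from the two counters. That independence is what makes this kind of evaluation a natural candidate for the massively parallel treatment the abstract describes.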

14.

Speeding up the detection of evolutive tandem repeats

Richard Groult Martine Léonard Laurent Mouchard 《Theoretical computer science》2004,310(1-3):309-328

We recently introduced evolutive tandem repeats with jump (using Hamming distance) (Proc. MFCS’02: the 27th Internat. Symp. Mathematical Foundations of Computer Science, Warszawa, Otwock, Poland, August 2002, Lecture Notes in Computer Science, Vol. 2420, Springer, Berlin, pp. 292–304), which consist of a series of almost contiguous copies having the following property: the Hamming distance between two consecutive copies is always smaller than a given parameter e. In this article, we present a significant improvement that speeds up the detection of evolutive tandem repeats. It is based on the progressive computation of distances between candidate copies participating in the evolutive tandem repeat. It leads to a new algorithm, still quadratic in the worst case, but much more efficient on average, allowing larger sequences to be processed.
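The defining property in this abstract (consecutive copies differ by a Hamming distance smaller than e) can be checked with a short sketch. It handles only strictly contiguous copies of a fixed length, so it ignores the "jump" between copies and is not the paper's detection algorithm; the names `hamming` and `is_evolutive_run` are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def is_evolutive_run(sequence: str, start: int, length: int,
                     copies: int, e: int) -> bool:
    """Check whether `copies` contiguous blocks of `length` characters,
    beginning at `start`, satisfy: every pair of consecutive blocks has
    Hamming distance strictly smaller than e."""
    blocks = [sequence[start + i * length: start + (i + 1) * length]
              for i in range(copies)]
    # Reject runs that would extend past the end of the sequence.
    if any(len(block) != length for block in blocks):
        return False
    return all(hamming(blocks[i], blocks[i + 1]) < e
               for i in range(copies - 1))
```

In `"ACGTACGAACCA"` with `length=4` and `copies=3`, the blocks are `ACGT`, `ACGA`, `ACCA`; each consecutive pair differs in one position, so the run qualifies for `e=2` but not for `e=1`. Note that the first and last blocks may differ by more than e, which is what makes the repeat "evolutive".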

15.

Analysis of symmetry groups of box-splines for evaluation on GPUs

《Graphical Models》2017

In this paper we analyze the symmetry groups of box-splines for efficient analytic evaluation of splines and their derivatives on GPUs (Graphics Processing Units). Given a box-spline, we first analyze its polynomial structure and find its space group, which is composed of a point group and a translational group on the domain lattice. To evaluate a spline generated by the box-spline (or its derivative) function, the point group is decomposed into right cosets such that all the polytopes in the same coset share the same analytic polynomial formula. Moreover, by leveraging their symmetries, a sufficient number of linearly independent derivative functions of the same order are chosen such that they have a change-of-variables relation with each other. Our OpenCL implementations show that our method is at least ≈ 30% faster and the kernel is at least ≈ 30% smaller compared with the other techniques.

16.

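A fast allocation algorithm for CDMA addresses based on the bisection method

Geng Xinmin (耿新民) Chen Shibing (陈仕兵) 《Microcomputer Information》2006,(12):262-263

This paper proposes an algorithm for CDMA address allocation. Compared with traditional algorithms, this fast allocation algorithm based on the bisection method has very low time complexity and does not need to check the mutual independence of address codes. The algorithm can also be used in other settings that require data independence.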

17.

Further results on the experimental evaluation of iterative learning control algorithms for non-minimum phase plants

C. Freeman P. L. Lewin E. Rogers 《International journal of control》2013,86(4):569-582

This paper builds on previous work in which simple structure ILC algorithms were experimentally evaluated on a non-minimum phase plant. The phase-lead update law, the most promising of those implemented, is examined using an established performance criterion. The use of a forgetting factor is found to overcome the problem of instability, but at the expense of increased final error. This is verified through experimentation. The phase-lead algorithm is improved using additional phase-leads to remove the instability and improve convergence and final error. This technique is generalized to produce an optimization routine which leads to greatly improved results. The learning law utilizing the plant adjoint is found to fit naturally into this framework and practical results are presented to compare their respective performance. This algorithm, which requires a model of the plant, is reformulated into one which does not. Results are presented using this technique and practical guidelines are produced and tested to improve its performance. A simple method of increasing the learning at higher frequencies is proposed and practical limitations are addressed and verified experimentally. All the algorithms tested are compared with respect to their performance, practicality and robustness.

18.

Performance evaluation of fuzzy rule-based classification systems obtained by multi-objective genetic algorithms

Hisao Ishibuchi Tadahiko Murata Mitsuo Gen 《Computers & Industrial Engineering》1998,35(3-4):575-578

In this paper, we examine the classification performance of fuzzy if-then rules selected by a GA-based multi-objective rule selection method. This rule selection method can be applied to high-dimensional pattern classification problems with many continuous attributes by restricting the number of antecedent conditions of each candidate fuzzy if-then rule. As candidate rules, we only use fuzzy if-then rules with a small number of antecedent conditions. Thus it is easy for human users to understand each rule selected by our method. Our rule selection method has two objectives: to minimize the number of selected fuzzy if-then rules and to maximize the number of correctly classified patterns. In our multi-objective fuzzy rule selection problem, there exist several solutions (i.e., several rule sets) called “non-dominated solutions” because two conflicting objectives are considered. In this paper, we examine the performance of our GA-based rule selection method by computer simulations on a real-world pattern classification problem with many continuous attributes. First we examine the classification performance of our method for training patterns by computer simulations. Next we examine the generalization ability for test patterns. We show that a fuzzy rule-based classification system with an appropriate number of rules has high generalization ability.
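The notion of "non-dominated solutions" under the two objectives named in this abstract (fewer rules, more correctly classified patterns) can be made concrete with a small filter. This sketch only shows the dominance test; it is not the GA-based selection method itself, and the function name `non_dominated` and the tuple layout `(num_rules, num_correct)` are assumptions for illustration.

```python
def non_dominated(solutions):
    """Return the non-dominated rule sets.

    Each solution is a tuple (num_rules, num_correct): the first value is
    to be minimized, the second maximized. Solution j dominates solution i
    if j is no worse in both objectives and strictly better in at least one.
    """
    front = []
    for i, (rules_i, correct_i) in enumerate(solutions):
        dominated = any(
            rules_j <= rules_i and correct_j >= correct_i
            and (rules_j < rules_i or correct_j > correct_i)
            for j, (rules_j, correct_j) in enumerate(solutions)
            if j != i
        )
        if not dominated:
            front.append((rules_i, correct_i))
    return front
```

With candidates `[(3, 90), (5, 95), (5, 92), (8, 95), (2, 80)]`, the rule sets `(5, 92)` and `(8, 95)` are both dominated by `(5, 95)`, while the remaining three represent different trade-offs between compactness and accuracy and are all kept.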

19.

Experimental evaluation of iterative learning control algorithms for non-minimum phase plants

C. T. Freeman P. L. Lewin E. Rogers 《International journal of control》2013,86(11):826-846

The purpose of this paper is two-fold. Firstly, it describes the development and modelling of an experimental test facility as a platform on which to assess the performance of Iterative Learning Control (ILC) schemes. This facility includes a non-minimum phase component. Secondly, P-Type, D-Type and phase-lead types of the algorithm have been implemented on the test-bed; results are presented for each method and their performance is compared. Although all the ILC strategies tested experience eventual divergence when applied to a non-minimum phase system, it is found that there is an optimum phase-lead ILC design that maximizes convergence and minimizes error. A general method of arriving at this phase-lead from knowledge of the plant model is described. A variety of filters have been applied and assessed in order to improve the overall performance of the algorithm.
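The abstract does not state the update laws, so the following is only a sketch of the common textbook discrete-time form of a phase-lead ILC update, in which the next trial's input is the current input plus a gain times the error shifted forward by a fixed number of samples. The function name `phase_lead_update`, the parameter names, and the zero-padding of the error beyond the end of the trial are assumptions; setting `lead=0` gives the P-type law.

```python
def phase_lead_update(u, e, gain, lead):
    """One ILC trial-to-trial update.

    u    : control input samples applied on the current trial
    e    : tracking error samples recorded on the current trial
    gain : learning gain applied to the error
    lead : phase lead in samples (0 reduces to a P-type update)

    Returns the input for the next trial:
        u_next[t] = u[t] + gain * e[t + lead]
    with the error taken as zero past the end of the trial.
    """
    if len(u) != len(e):
        raise ValueError("u and e must cover the same trial length")
    n = len(u)
    return [
        u[t] + gain * (e[t + lead] if t + lead < n else 0.0)
        for t in range(n)
    ]
```

For instance, with `u = [0.0, 0.0, 0.0, 0.0]`, `e = [1.0, 2.0, 3.0, 4.0]`, `gain = 0.5` and `lead = 1`, each input sample is corrected using the error one sample ahead, and the final sample is left unchanged because no later error exists.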

20.

Speeding up the learning of robot kinematics through function decomposition (total citations: 1; self-citations: 0; citations by others: 1)

de Angulo V.R. Torras C. 《Neural Networks, IEEE Transactions on》2005,16(6):1504-1512

The main drawback of using neural networks or other example-based learning procedures to approximate the inverse kinematics (IK) of robot arms is the high number of training samples (i.e., robot movements) required to attain an acceptable precision. We propose here a trick, valid for most industrial robots, that greatly reduces the number of movements needed to learn or relearn the IK to a given accuracy. This trick consists in expressing the IK as a composition of learnable functions, each having half the dimensionality of the original mapping. Off-line and on-line training schemes to learn these component functions are also proposed. Experimental results obtained by using nearest neighbors and parameterized self-organizing map, with and without the decomposition, show that the time savings granted by the proposed scheme grow polynomially with the precision required.