Similar Documents
20 similar documents found (search time: 15 ms)
1.
Using the Soil and Water Assessment Tool (SWAT) model and a Southern Ontario, Canada, watershed as an example, we conduct a set of calibration experiments using a manual approach, a parallelized version of the shuffled complex evolution (SCE) algorithm, Generalized Likelihood Uncertainty Estimation (GLUE) and Sequential Uncertainty Fitting (SUFI-2), and compare them to a simple parallel search over a finite set of gridded input parameter values that invokes the probably approximately correct (PAC) learning hypothesis. Based on the PAC hypothesis, we derive an estimate of the fitting error and a prior estimate of the probability of success. We conclude that, for effort equivalent to the initial setup required by the other algorithms, the gridded search already finds a good calibration parameter set directly. We further note that, with this algorithm, simultaneous co-calibration of flow and chemistry (total nitrogen and total phosphorus) is more likely to produce acceptable results than calibrating flow first, even with a simple weighted multiobjective approach. The approach is especially suited to parallel, distributed or cloud computing environments.
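As an illustration of the gridded-search idea above, the following sketch evaluates a small parameter grid in parallel with a weighted multiobjective score and reports a PAC-style sample-size estimate. The `run_swat` stub, the parameter names and the weights are illustrative assumptions, not the paper's actual setup.

```python
import itertools
import math
from multiprocessing import Pool

def run_swat(params):
    # Hypothetical stand-in for a SWAT run: it should return Nash-Sutcliffe
    # efficiencies for flow, total nitrogen and total phosphorus.
    return 0.5, 0.4, 0.3

def pac_sample_size(epsilon=0.01, delta=0.05):
    # PAC-style prior estimate: number of sampled parameter sets needed so that,
    # with probability >= 1 - delta, at least one lands in the best
    # epsilon-fraction of the parameter space.
    return math.ceil(math.log(delta) / math.log(1.0 - epsilon))

def weighted_score(params):
    # Simple weighted multiobjective score co-calibrating flow and chemistry.
    nse_flow, nse_tn, nse_tp = run_swat(params)
    return 0.5 * nse_flow + 0.25 * nse_tn + 0.25 * nse_tp

if __name__ == "__main__":
    grid = [dict(zip(("CN2", "ESCO", "SOL_AWC"), v))
            for v in itertools.product((-0.2, 0.0, 0.2), (0.2, 0.5, 0.8), (-0.1, 0.0, 0.1))]
    with Pool() as pool:                    # embarrassingly parallel evaluation
        scores = pool.map(weighted_score, grid)
    best = max(range(len(grid)), key=lambda i: scores[i])
    print("PAC sample-size estimate:", pac_sample_size())
    print("best parameter set:", grid[best], "score:", scores[best])
```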

2.
In this paper, a stream-based dataflow architecture is proposed, and its simulation model, which has helped to evaluate the effectiveness of the proposed architectural concept, is discussed. The machine integrates a conventional von Neumann control-flow subsystem with a dataflow processing element of the token-storage type. The control-flow unit handles the dynamic nature of the stream structure, including input/output, whereas the dataflow unit performs the computation in an applicative style. A pipelined version of the stream machine is also discussed. The effectiveness of the machine is studied by running a few example programs on the simulated machine. The machine is expected to be useful in real-time signal processing applications.

3.
The performance of a multiprocessor architecture is determined both by the way the program is partitioned into processes and by the way these processes are allocated to different processors. In the fine-grain dataflow model, where each process consists of a single instruction, decomposition of a program into processes is achieved automatically by compilation. This paper investigates the effectiveness of fine-grain decomposition in the context of the prototype dataflow machine now in operation at the University of Manchester. The current machine is a uniprocessor, known as the Single-Ring Dataflow Machine, comprising a single processing element which contains several units connected together in a pipelined ring. A Multi-ring Dataflow Machine (MDM), containing several such processing elements connected together via an interprocessor switching network, is currently under investigation. The paper describes a method of allocating dataflow instructions to processing elements in the MDM, and examines the influence of this method on the selection of a switching network. Results obtained from simulation of the MDM are presented. They show that programs are executed efficiently when their parallelism is matched to the parallelism of the machine hardware.
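To make the allocation problem concrete, the toy sketch below places dataflow instructions on processing elements with a simple round-robin rule and counts the tokens that must cross the interprocessor switch. It is only a stand-in for the paper's allocation method.

```python
def allocate(instructions, num_pes):
    # Round-robin placement: instruction k goes to PE k mod P. A stand-in only;
    # the paper's method also considers how tokens flow between instructions.
    return {instr: k % num_pes for k, instr in enumerate(instructions)}

def network_traffic(edges, placement):
    # Count arcs whose producer and consumer live on different PEs: these are
    # the tokens that must traverse the interprocessor switching network.
    return sum(1 for src, dst in edges if placement[src] != placement[dst])

instructions = ["i0", "i1", "i2", "i3", "i4", "i5"]
edges = [("i0", "i1"), ("i0", "i2"), ("i1", "i3"), ("i2", "i3"), ("i3", "i4"), ("i4", "i5")]
placement = allocate(instructions, num_pes=2)
print(placement, "cross-PE tokens:", network_traffic(edges, placement))
```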

4.
The authors describe several fundamentally useful primitive operations and routines and illustrate their usefulness in a wide range of familiar vision processes. These operations are described in terms of a vector machine model of parallel computation. A parallel vector model is used because vector models can be mapped onto a wide range of architectures. The authors also describe implementing these primitives on a particular fine-grained machine, the Connection Machine. These primitives are found to be applicable in a variety of vision tasks. Grid permutations are useful in many early vision algorithms, such as Gaussian convolution, edge detection, motion, and stereo computation. Scan primitives facilitate simple, efficient solutions of many problems in middle- and high-level vision. Pointer jumping, using permutation operations, permits construction of extended image structures in logarithmic time. Methods such as outer products, which rely on a variety of primitives, play an important role in many high-level algorithms.
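A sequential sketch of two of the primitives mentioned above (scan and permutation); on a data-parallel machine such as the Connection Machine each would operate on all elements simultaneously.

```python
def plus_scan(values):
    # Inclusive prefix sum ("scan"): out[i] = values[0] + ... + values[i].
    out, running = [], 0
    for v in values:
        running += v
        out.append(running)
    return out

def permute(values, index):
    # Permutation primitive: out[index[i]] = values[i].
    out = [None] * len(values)
    for i, v in enumerate(values):
        out[index[i]] = v
    return out

print(plus_scan([3, 1, 4, 1, 5]))           # [3, 4, 8, 9, 14]
print(permute(["a", "b", "c"], [2, 0, 1]))  # ['b', 'c', 'a']
```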

5.
Interval data offer a valuable way of representing the available information in complex problems where uncertainty, inaccuracy, or variability must be taken into account. This paper considers the learning of interval neural networks, whose inputs and outputs are vectors with interval components and whose weights are real numbers. The back-propagation (BP) learning algorithm is very slow for interval neural networks, just as for usual real-valued neural networks. The extreme learning machine (ELM) has a faster learning speed than the BP algorithm. In this paper, ELM is applied to the learning of interval neural networks, resulting in an interval extreme learning machine (IELM). The ELM for usual feedforward neural networks has two steps. The first step is to randomly generate the weights connecting the input and hidden layers, and the second is to use the Moore–Penrose generalized inverse to determine the weights connecting the hidden and output layers. The first step can be applied directly to interval neural networks, but the second cannot, due to the nonlinear constraint conditions involved in IELM. Instead, the same idea as in the BP algorithm is used to form a nonlinear optimization problem that determines the weights connecting the hidden and output layers of IELM. Numerical experiments show that IELM is much faster than the usual BP algorithm, and its generalization performance is much better, while its training error is slightly worse than that of BP, suggesting that BP may be over-fitting.
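For reference, a minimal sketch of the standard two-step ELM that IELM builds on: random input-to-hidden weights, then output weights from the Moore-Penrose generalized inverse. The interval version replaces the second step with a constrained nonlinear optimization, which is not shown here.

```python
import numpy as np

def elm_train(X, Y, hidden=50, rng=np.random.default_rng(0)):
    # Step 1: input-to-hidden weights and biases are drawn at random and never trained.
    W = rng.standard_normal((X.shape[1], hidden))
    b = rng.standard_normal(hidden)
    H = np.tanh(X @ W + b)                 # hidden-layer output matrix
    # Step 2: output weights in one shot via the Moore-Penrose generalized inverse.
    beta = np.linalg.pinv(H) @ Y
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy usage on a noisy sine regression problem.
X = np.linspace(0, 2 * np.pi, 200).reshape(-1, 1)
Y = np.sin(X) + 0.05 * np.random.default_rng(1).standard_normal(X.shape)
W, b, beta = elm_train(X, Y)
print("train MSE:", float(np.mean((elm_predict(X, W, b, beta) - Y) ** 2)))
```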

6.
The evolution of edge computing devices has enabled machine intelligence techniques to process data close to its producers (the sensors) and end-users. Although edge devices are usually resource-constrained, the distribution of processing services among several nodes enables a processing capacity similar to cloud environments. However, the edge computing environment is highly dynamic, impacting the availability of nodes in the distributed system. In addition, the processing workload for each node can change constantly. Thus, the scaling of processing services needs to be rapidly adjusted, avoiding bottlenecks or wasted resources while meeting the applications' QoS requirements. This paper presents an auto-scaling subsystem for container-based processing services using online machine learning. The auto-scaling follows the MAPE-K control loop to dynamically adjust the number of containers in response to workload changes. We designed the approach for scenarios where the number of processing requests is unknown beforehand. We developed a hybrid auto-scaling mechanism that behaves reactively while a predictive online machine learning model is continuously trained. When the prediction model reaches a desirable performance, the auto-scaling acts proactively, using predictions to anticipate scaling actions. An experimental evaluation has demonstrated the feasibility of the architecture. Using an actual application workload, our solution achieved fewer service level agreement (SLA) violations and fewer scaling operations to meet demand than purely reactive and no-scaling approaches. Our solution also wasted fewer resources than the other techniques.
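A simplified sketch of the hybrid reactive/proactive decision logic described above: scale on observed load until the online model's accuracy passes a threshold, then scale on its predictions. Class names, thresholds and the accuracy proxy are illustrative assumptions, not the paper's implementation.

```python
class HybridAutoScaler:
    """Sketch of the hybrid MAPE-K idea: act reactively on observed load until
    the continuously trained online model is accurate enough, then act
    proactively on its predictions. All numbers here are illustrative."""

    def __init__(self, capacity_per_replica=100, accuracy_target=0.9):
        self.capacity = capacity_per_replica
        self.accuracy_target = accuracy_target
        self.model_accuracy = 0.0          # updated as the online model is evaluated

    def update_model(self, observed, predicted):
        # Plug in any online learner here; we only track a rolling accuracy proxy.
        err = abs(observed - predicted) / max(observed, 1)
        self.model_accuracy = 0.9 * self.model_accuracy + 0.1 * (1 - err)

    def plan(self, current_load, predicted_load):
        # Proactive when the model is trustworthy, reactive otherwise.
        load = predicted_load if self.model_accuracy >= self.accuracy_target else current_load
        return max(1, -(-load // self.capacity))   # ceil(load / capacity) replicas

scaler = HybridAutoScaler()
print(scaler.plan(current_load=350, predicted_load=900))   # reactive: 4 replicas
scaler.model_accuracy = 0.95
print(scaler.plan(current_load=350, predicted_load=900))   # proactive: 9 replicas
```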

7.
This paper presents a new architecture for embedded systems and describes an appropriate method for programming a control system. A grinding machine control system was built and an experimental verification of the theoretical approach was performed. The efficiency of this novel system was compared with conventional control systems by grinding a workpiece to stringent quality requirements. The superior performance of the OR dataflow control system led to the encouraging conclusions presented in this paper.

8.
Logic programming languages have gained wide acceptance for two reasons: their clear declarative semantics, and the wide scope for parallelism they provide, which can be exploited by building suitable parallel architectures. In this paper, we propose a multi-ring dataflow machine to support the OR-parallelism and the argument parallelism of logic programs. A new scheme is suggested for handling the deferred-read mechanism of the dataflow architecture. The required data structures, the dataflow actors and the built-in dataflow procedures for OR-parallel execution are discussed. Multiple binding environments arising in OR-parallel execution are handled by a new scheme called the tagged variable scheme. Schemes for constrained OR-parallel execution are also discussed.

9.
曹嵘晖, 唐卓, 左知微, 张学东. 《智能系统学报》, 2021, 16(5): 919-930
The computation and iteration processes of machine learning and related algorithms are becoming increasingly complex, and sufficient computing power is key to the successful deployment of artificial intelligence applications. This paper first proposes a spatio-temporal task scheduling algorithm for skewed data in distributed heterogeneous environments, which effectively improves the average efficiency of tasks such as machine learning model training. Second, it proposes an efficient resource management system and an energy-saving scheduling algorithm for distributed heterogeneous environments, enabling dynamic-prediction-based cross-domain migration of computing resources and dynamic voltage/frequency scaling, thereby reducing the overall energy consumption of the system. It then builds a distributed heterogeneous optimization environment suited to iterative machine learning/deep learning algorithms, and presents basic methods for the distributed parallel optimization of machine learning and graph-iteration algorithms. Finally, an intelligent analysis system for domain applications was developed and deployed in manufacturing, transportation, education, healthcare and other fields, addressing the performance bottlenecks commonly encountered in efficient data collection, storage, cleaning, fusion and intelligent analysis.

10.
Analog neural network for support vector machine learning
An analog neural network for support vector machine learning is proposed, based on a partially dual formulation of the quadratic programming problem. It results in a simpler circuit implementation than existing neural solutions for the same application. The effectiveness of the proposed network is shown through computer simulations on benchmark problems.
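As a rough software analogue of such a network (not the paper's partially dual circuit), one can simulate projected-gradient dynamics on the standard bias-free SVM dual, as in the sketch below.

```python
import numpy as np

def svm_dual_dynamics(X, y, C=1.0, lr=0.01, steps=2000, gamma=1.0):
    # Euler simulation of gradient-ascent dynamics on the (bias-free) SVM dual:
    #   maximize  sum(alpha) - 0.5 * alpha^T Q alpha,   0 <= alpha <= C,
    # where Q_ij = y_i * y_j * K(x_i, x_j). An analog network would follow the
    # same flow in continuous time; this is only a software analogue.
    sq = np.sum(X**2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))   # RBF kernel
    Q = (y[:, None] * y[None, :]) * K
    alpha = np.zeros(len(y))
    for _ in range(steps):
        grad = 1.0 - Q @ alpha
        alpha = np.clip(alpha + lr * grad, 0.0, C)   # project back onto the box
    return alpha

rng = np.random.default_rng(0)
X = rng.standard_normal((40, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1.0, -1.0)
alpha = svm_dual_dynamics(X, y)
print("support vectors:", int(np.sum(alpha > 1e-6)))
```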

11.
12.
This article describes a framework for synchronization optimizations and a set of transformations for programs that implement critical sections using mutual exclusion locks. The basic synchronization transformations take constructs that acquire and release locks and move these constructs both within and between procedures. They also eliminate acquire and release constructs that use the same lock and are adjacent in the program. The article also presents a synchronization optimization algorithm, lock elimination, that uses these transformations to reduce the synchronization overhead. This algorithm locates computations that repeatedly acquire and release the same lock, then transforms the computations so that they acquire and release the lock only once. The goal of this algorithm is to reduce the lock overhead by reducing the number of times that computations acquire and release locks. But because the algorithm also increases the sizes of the critical sections, it may decrease the amount of available concurrency. The algorithm addresses this trade-off by providing several different optimization policies. The policies differ in the amount by which they increase the sizes of the critical sections. Experimental results from a parallelizing compiler for object-based programs illustrate the practical utility of the lock elimination algorithm. For three benchmark applications, the algorithm can dramatically reduce the number of times the applications acquire and release locks, which significantly reduces the amount of time processors spend acquiring and releasing locks. The resulting overall performance improvements for these benchmarks range from no observable improvement to up to 30% performance improvement. Copyright © 1999 John Wiley & Sons, Ltd.
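A minimal before/after illustration of the basic transformation, merging adjacent acquisitions of the same lock; the paper applies this at compile time to object-based parallel programs, so this Python fragment is only to make the idea concrete.

```python
import threading

lock = threading.Lock()
counter = 0

def before(updates):
    # Each iteration acquires and releases the same lock.
    global counter
    for u in updates:
        with lock:
            counter += u

def after(updates):
    # Lock elimination: the adjacent acquire/release pairs of the same lock are
    # merged, so the lock is taken once. Lock overhead drops, but the critical
    # section grows, which is exactly the concurrency trade-off the optimization
    # policies control.
    global counter
    with lock:
        for u in updates:
            counter += u
```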

13.
郭棉, 张锦友. 《计算机应用》, 2021, 41(9): 2639-2645
To address the diversity of Internet of Things (IoT) data sources, the non-IID nature of the data, and the heterogeneity of edge devices in computing capability and energy consumption, a computation offloading strategy is proposed for mobile edge computing (MEC) networks in which centralized learning and federated learning coexist. First, a computation offloading system model covering both centralized and federated learning is established, accounting for the network transmission delay, computation delay and energy consumption incurred by each learning model. Then, a machine-learning-oriented computation offloading optimization model is formulated with the average system delay as the objective and with energy consumption and the number of training rounds required for a target accuracy as constraints. A game-theoretic analysis of the offloading problem is performed, and based on the results an energy-constrained delay-greedy (ECDG) algorithm is proposed, which obtains an optimized solution through a two-stage update combining delay-greedy decisions and energy-constrained decisions. Compared with a centralized greedy algorithm and the federated-learning-oriented client selection (FedCS) algorithm, ECDG achieves the lowest average learning delay, about 1/10 that of the centralized greedy algorithm and 1/5 that of FedCS. Experimental results show that ECDG can automatically select the optimal machine learning model for each data source through computation offloading, thereby effectively reducing learning delay, improving the energy efficiency of edge devices, and meeting the quality of service (QoS) requirements of IoT applications.
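A toy sketch of the delay-greedy-under-an-energy-budget idea (not the paper's exact two-stage ECDG update): for each data source, choose the lowest-delay learning option whose energy cost fits the remaining budget. The option names and numbers are invented for illustration.

```python
def ecdg_sketch(options, energy_budget):
    """Toy delay-greedy choice under an energy constraint, loosely inspired by
    the ECDG idea above. `options` maps an option name -> (delay, energy)
    for one IoT data source."""
    feasible = {name: (d, e) for name, (d, e) in options.items() if e <= energy_budget}
    if not feasible:                      # fall back to the least-energy option
        return min(options, key=lambda n: options[n][1])
    return min(feasible, key=lambda n: feasible[n][0])

# Hypothetical numbers: delay (ms), energy (mJ) for one data source.
options = {
    "centralized (offload raw data)": (120.0, 80.0),
    "federated (train locally)":      (45.0, 140.0),
}
print(ecdg_sketch(options, energy_budget=100.0))   # centralized: federated exceeds the budget
print(ecdg_sketch(options, energy_budget=200.0))   # federated: lower delay, now affordable
```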

14.
Compute-intensive applications have gradually changed focus from massively parallel supercomputers to capacity as a resource obtained on-demand. This is particularly true for the large-scale adoption of cloud computing and MapReduce in industry, while it has been difficult for traditional high-performance computing (HPC) usage in scientific and engineering computing to exploit this type of resources. However, with the strong trend of increasing parallelism rather than faster processors, a growing number of applications target parallelism already on the algorithm level with loosely coupled approaches based on sampling and ensembles. While these cannot trivially be formulated as MapReduce, they are highly amenable to throughput computing. There are many general and powerful frameworks, but for sampling-based algorithms in scientific computing in particular there are clear advantages in having a platform and scheduler that are highly aware of the underlying physical problem. Here, we present how these challenges are addressed with combinations of dataflow programming, peer-to-peer techniques and peer-to-peer networks in the Copernicus platform. This allows automation of sampling-focused workflows, task generation, dependency tracking, and not least distributing these to a diverse set of compute resources ranging from supercomputers to clouds and distributed computing (across firewalls and fragile networks). Workflows are defined from modules using existing programs, which makes them reusable without programming requirements. The system achieves resiliency by handling node failures transparently, with minimal loss of computing time due to checkpointing, and a single server can manage hundreds of thousands of cores, e.g., for computational chemistry applications.

15.
16.
The data involved in applications are growing ever larger and more complex in structure, which makes running the extreme learning machine (ELM) on large-scale data a challenging task. To meet this challenge, a secure and practical ELM outsourcing mechanism for cloud computing environments is proposed. The mechanism explicitly splits ELM into a private part and a public part, which effectively reduces training time while keeping the algorithm's inputs and outputs secure. The private part is responsible for generating the random parameters and for some simple matrix computations; the public part is outsourced to cloud servers, where the cloud provider performs the most computation-intensive step of ELM, namely computing the Moore-Penrose generalized inverse. This generalized inverse also serves as evidence for verifying the correctness and reliability of the returned result. The security of the outsourcing mechanism is analyzed theoretically. Experimental results on the CIFAR-10 dataset show that the proposed mechanism effectively reduces the user's computational load.
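The verification step can be made concrete: the client keeps the random weights private, outsources only the hidden-layer matrix H, and checks the returned result against the Moore-Penrose identity H·H⁺·H = H. The sketch below omits the masking/encryption of H that the actual mechanism would apply before outsourcing.

```python
import numpy as np

def cloud_side(H):
    # Public part: the heavy computation the cloud provider performs.
    return np.linalg.pinv(H)

def client_side(X, Y, hidden=64, rng=np.random.default_rng(0)):
    # Private part: random input weights stay with the client.
    W = rng.standard_normal((X.shape[1], hidden))
    b = rng.standard_normal(hidden)
    H = np.tanh(X @ W + b)
    H_pinv = cloud_side(H)                       # outsourced step
    # Verification: a true Moore-Penrose inverse satisfies H @ H_pinv @ H == H.
    if not np.allclose(H @ H_pinv @ H, H, atol=1e-6):
        raise ValueError("cloud returned an incorrect generalized inverse")
    beta = H_pinv @ Y                            # cheap once H_pinv is verified
    return W, b, beta

X = np.random.default_rng(1).standard_normal((200, 10))
Y = np.random.default_rng(2).standard_normal((200, 3))
client_side(X, Y)
```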

17.
Zhu Xing, Xu Qiang, Tang Minggao, Li Huajin, Liu Fangzhou. Neural Computing & Applications, 2018, 30(12): 3825-3835

A novel hybrid model composed of least squares support vector machines (LSSVM) and double exponential smoothing (DES) was proposed and applied to calculate the one-step-ahead displacement of multifactor-induced landslides. Wavelet de-noising and the Hodrick-Prescott filter were used to decompose the original displacement time series into three components: a periodic term, a trend term and random noise, which respectively represent the periodic dynamic behaviour of landslides controlled by seasonal triggers, the geological conditions, and random measurement noise. LSSVM and DES models were constructed and trained to forecast the periodic component and the trend component, respectively. The models' inputs include the seasonal triggers (e.g. reservoir level and rainfall data) and displacement values, all of which are measurable over a given prior time window. The performance of the hybrid model was evaluated quantitatively. Displacement calculated by the hybrid model agrees closely with the actual monitored values. The results indicate that the hybrid model is a powerful tool for predicting the one-step-ahead displacement of landslides triggered by multiple factors.
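The DES half of the hybrid is a standard technique; a minimal one-step-ahead double exponential smoothing (Holt's linear trend) forecaster is sketched below with illustrative smoothing constants. The LSSVM component and the wavelet/Hodrick-Prescott decomposition are not shown.

```python
def des_forecast(series, alpha=0.5, beta=0.3):
    """One-step-ahead forecast from double exponential smoothing (Holt's linear
    trend form). Returns the forecast for the step after the last observation."""
    level, trend = series[0], series[1] - series[0]
    for x in series[1:]:
        prev_level = level
        level = alpha * x + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + trend

# Trend component of a landslide displacement series (illustrative numbers, mm).
trend_term = [10.0, 10.8, 11.5, 12.4, 13.1, 13.9]
print(des_forecast(trend_term))
```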


18.
19.
刘志刚, 许少华, 李盼池. 《控制与决策》, 2016, 31(12): 2241-2247
When the weight functions of a continuous process neural network are expanded over an orthogonal basis, the number of basis functions cannot be determined effectively, so the approximation accuracy is limited. To address this problem, a discrete process neural network is proposed in which cubic-spline numerical integration handles the time-domain aggregation of the discrete samples and weights. For training, a double-chain quantum particle swarm performs the global search for the input weights, with population evolution driven by quantum rotation gates and NOT gates. Locally, extreme learning is used, computing the output weights via the Moore-Penrose generalized inverse. Simulation experiments on time series prediction verify the effectiveness of the model, with improvements in both training convergence and approximation ability.
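The time-domain aggregation a discrete process neuron needs, the integral of w(t)·x(t) over the sampling window, can be approximated from the sampled values with cubic-spline integration, as sketched below using SciPy; the quantum particle swarm training is not shown.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def spline_aggregate(t, x_samples, w_samples):
    # Discrete process-neuron aggregation: approximate the time-domain integral
    # of w(t) * x(t) by fitting a cubic spline to the sampled product and
    # integrating it over [t[0], t[-1]].
    product = np.asarray(x_samples) * np.asarray(w_samples)
    return CubicSpline(t, product).integrate(t[0], t[-1])

t = np.linspace(0.0, 1.0, 11)
x = np.sin(2 * np.pi * t)          # sampled input signal
w = np.exp(-t)                     # sampled weight function
print(spline_aggregate(t, x, w))   # close to the analytic integral of exp(-t)*sin(2*pi*t)
```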

20.
罗庚合. 《计算机应用》, 2013, 33(7): 1942-1945
To address the problem that the extreme learning machine (ELM) algorithm selects its input-layer weights randomly, an extension-clustering-based ELM (EC-ELM) neural network is proposed, drawing on the clustering idea of the type-2 extension neural network (ENN-2). The network uses the radial-basis center vectors of the hidden neurons as the input-layer weights, applies an extension clustering algorithm to dynamically adjust the number of hidden nodes and the radial-basis centers, and, given the input-layer weights so determined, quickly solves for the output-layer weights using the Moore-Penrose generalized inverse. Tests on the standard Friedman#1 regression dataset and the Wine classification dataset show that EC-ELM offers a simple way to learn both the network structure and its parameters, and achieves higher modeling accuracy and faster learning than extension-theory-based radial basis function (ERBF) networks and standard ELM networks, providing a new approach to modeling complex processes.
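The second half of EC-ELM, an RBF hidden layer built from cluster centers with output weights obtained from the Moore-Penrose generalized inverse, can be sketched directly; here random center selection stands in for the extension clustering step that the paper actually contributes.

```python
import numpy as np

def rbf_elm_fit(X, Y, centers, sigma=1.0):
    # Hidden layer: one RBF unit per cluster center (the centers play the role
    # of the input-layer weights in EC-ELM).
    d2 = np.sum((X[:, None, :] - centers[None, :, :]) ** 2, axis=2)
    H = np.exp(-d2 / (2 * sigma**2))
    # Output weights in one shot via the Moore-Penrose generalized inverse.
    return np.linalg.pinv(H) @ Y

def rbf_elm_predict(X, centers, beta, sigma=1.0):
    d2 = np.sum((X[:, None, :] - centers[None, :, :]) ** 2, axis=2)
    return np.exp(-d2 / (2 * sigma**2)) @ beta

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(300, 2))
Y = (np.sin(3 * X[:, 0]) + X[:, 1] ** 2).reshape(-1, 1)
centers = X[rng.choice(len(X), size=20, replace=False)]   # stand-in for extension clustering
beta = rbf_elm_fit(X, Y, centers)
print("train MSE:", float(np.mean((rbf_elm_predict(X, centers, beta) - Y) ** 2)))
```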
