Similar literature
1.
The reinforcement and imitation learning paradigms have the potential to revolutionise robotics. Many successful developments have been reported in the literature; however, these approaches have not been explored widely in robotics for construction. The objective of this paper is to consolidate, structure, and summarise research knowledge at the intersection of robotics, reinforcement learning, and construction. A two-strand approach to the literature review was employed: a bottom-up strand, in which a selected number of relevant publications were analysed in detail, and a top-down strand, in which a large number of papers were analysed to identify common relevant themes and research trends. The study found that research on robotics for construction has not increased significantly since the 1980s in terms of number of publications. Robotics for construction also lacks dedicated systems, which limits its effectiveness. Moreover, unlike manufacturing, construction is unstructured and dynamic, which is a major challenge for reinforcement and imitation learning approaches. This paper provides a starting point for understanding research on robotics for construction by (i) identifying the strengths and limitations of the reinforcement and imitation learning approaches, and (ii) contextualising the construction robotics problem; both will help to kick-start research on the subject or boost existing research efforts.

2.
Fault diagnosis of rolling bearings is crucial for the safety of large rotating machinery. In practical engineering, however, rolling bearing faults are usually compound faults and the signals contain a large amount of noise, which makes diagnosis more difficult. A deep feature-enhanced reinforcement learning method is therefore proposed for rolling bearing fault diagnosis. First, to improve robustness, the neural network is modified to use the ELU activation function. Second, an attention model is used to improve feature enhancement and to acquire essential global information. Finally, a deep Q-network is established to diagnose the fault modes accurately. Experiments are conducted on the rolling bearing dataset. The test results show that the proposed method is superior to other intelligent diagnosis methods.
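For readers unfamiliar with the ELU activation mentioned in this abstract, the sketch below shows its standard definition. The abstract does not state which scale parameter the authors use, so `alpha=1.0` is an assumption of this illustration.

```python
import math

def elu(x, alpha=1.0):
    """Exponential linear unit: identity for x > 0, alpha*(exp(x)-1) otherwise.

    alpha=1.0 is an assumed default; the abstract does not give a value.
    """
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

# Negative inputs saturate smoothly towards -alpha instead of being cut to 0.
activations = [elu(v) for v in (-3.0, -1.0, 0.0, 2.0)]
```

Unlike ReLU, the negative branch keeps a non-zero gradient, which is one commonly cited reason for its use on noisy inputs.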

3.
徐郁, 朱韵攸, 刘筱, 邓雨婷, 廖勇. 《计算机应用》 (Journal of Computer Applications), 2022, 42(10): 3252-3258
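Existing optimization of the electric power material vehicle routing problem (EVRP) considers a rather narrow objective function and an incomplete set of constraints, and traditional solution algorithms are inefficient. To address this, a multi-objective route optimization model and solution algorithm for electric power material distribution based on deep reinforcement learning (DRL) is proposed. First, constraints such as the distribution of gas stations in the delivery area and the fuel consumption of the transport vehicles are taken fully into account, and a multi-objective distribution model is established whose objectives are the shortest total route length, the lowest cost, and the highest satisfaction of the material demand points. Second, a DRL-based route optimization algorithm, DRL-EVRP, is designed to solve the proposed model. DRL-EVRP uses a deep Q-network (DQN) that combines an improved pointer network (Ptr-Net) with the Q-learning algorithm, and takes the sum of the negative cumulative incremental path length and the satisfaction as the reward function. After training, the proposed algorithm can be used directly for route planning of electric power material distribution. Simulation results show that the total route length obtained by DRL-EVRP is shorter than that of the extended C-W (ECW) savings algorithm and the simulated annealing (SA) algorithm, and that its running time is within an acceptable range; the proposed algorithm can therefore optimize electric power material distribution routes more efficiently and quickly.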
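The reward described in this abstract (negative cumulative path length plus satisfaction) can be sketched as follows. This is only an illustration under stated assumptions: Euclidean distances, per-stop satisfaction scores supplied by the caller, and equal weighting of the two terms are all hypothetical choices, not details taken from the paper.

```python
import math

def path_length(points):
    """Total Euclidean length of the route visiting `points` in order."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def route_reward(route, satisfaction):
    """Reward = -(cumulative path length) + sum of satisfaction scores.

    Equal weighting of the two terms is an assumption of this sketch.
    """
    return -path_length(route) + sum(satisfaction)

# Hypothetical route: depot -> two demand points, each half satisfied.
example_reward = route_reward([(0, 0), (3, 4), (3, 0)], [0.5, 0.5])
```

Shorter routes and more satisfied demand points both raise the reward, so a learned policy is pushed towards the two objectives at once.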

4.
RRL is a relational reinforcement learning system based on Q-learning in relational state-action spaces. It aims to enable agents to learn how to act in an environment that has no natural representation as a tuple of constants. For relational reinforcement learning, the learning algorithm used to approximate the mapping between state-action pairs and their so-called Q(uality)-value has to be very reliable, and it has to be able to handle the relational representation of state-action pairs. In this paper we investigate the use of Gaussian processes to approximate the Q-values of state-action pairs. To employ Gaussian processes in a relational setting, we propose graph kernels as a covariance function between state-action pairs. The standard prediction mechanism for Gaussian processes requires a matrix inversion, which can become unstable when the kernel matrix has low rank. These instabilities can be avoided by employing QR-factorization, which leads to better and more stable performance of the algorithm and a more efficient incremental update mechanism. Experiments conducted in the blocks world and with the Tetris game show that Gaussian processes with graph kernels can compete with, and often improve on, regression trees and instance-based regression as a generalization algorithm for RRL. Editors: David Page and Akihiro Yamamoto
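The idea of replacing the explicit matrix inversion in Gaussian process prediction with a QR-based solve can be sketched as below. This is a minimal illustration, not the authors' implementation: a scalar RBF kernel stands in for the graph kernels of the paper, the noise term and length scale are assumed values, and the QR step uses plain modified Gram-Schmidt.

```python
import math

def rbf(a, b, length=1.0):
    """Stand-in kernel on scalars (the paper uses graph kernels instead)."""
    return math.exp(-((a - b) ** 2) / (2.0 * length ** 2))

def qr_solve(A, y):
    """Solve A x = y for square, full-rank A via modified Gram-Schmidt QR."""
    n = len(A)
    cols = [[A[i][j] for i in range(n)] for j in range(n)]
    Q = []                                   # orthonormal columns of Q
    R = [[0.0] * n for _ in range(n)]        # upper-triangular factor
    for j in range(n):
        v = cols[j][:]
        for i, q in enumerate(Q):
            R[i][j] = sum(qk * vk for qk, vk in zip(q, v))
            v = [vk - R[i][j] * qk for vk, qk in zip(v, q)]
        R[j][j] = math.sqrt(sum(vk * vk for vk in v))
        Q.append([vk / R[j][j] for vk in v])
    qty = [sum(qk * yk for qk, yk in zip(q, y)) for q in Q]   # Q^T y
    x = [0.0] * n
    for i in reversed(range(n)):             # back substitution on R x = Q^T y
        s = qty[i] - sum(R[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / R[i][i]
    return x

def gp_mean(xs, ys, x_new, noise=1e-6):
    """GP predictive mean k_*^T (K + noise*I)^{-1} y, using qr_solve."""
    n = len(xs)
    K = [[rbf(xs[i], xs[j]) + (noise if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    alpha = qr_solve(K, ys)
    return sum(rbf(x_new, xi) * ai for xi, ai in zip(xs, alpha))
```

With a small noise term the prediction at a training input stays close to its observed target, which gives a quick sanity check for the solver.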

5.
To accomplish diverse tasks successfully in a dynamic (i.e., changing over time) construction environment, robots should be able to prioritize assigned tasks to optimize their performance in a given state. Recently, deep reinforcement learning (DRL) has shown potential for addressing such adaptive task allocation. It remains unanswered, however, whether DRL can address adaptive task allocation problems in dynamic robotic construction environments. In this paper, we developed and tested a digital twin-driven DRL method to explore the potential of DRL for adaptive task allocation in robotic construction environments. Specifically, the digital twin synthesizes sensory data from physical assets and is used to simulate a variety of dynamic robotic construction site conditions with which a DRL agent can interact. As a result, the agent can learn an adaptive task allocation strategy that increases project performance. We tested this method with a case project in which a virtual robotic construction project (i.e., interlocking concrete bricks are delivered and assembled by robots) was digitally twinned for DRL training and testing. Results indicated that the DRL model's task allocation approach reduced construction time by 36% in three dynamic testing environments when compared to a rule-based imperative model. The proposed DRL method promises to be an effective tool for adaptive task allocation in dynamic robotic construction environments. Such an adaptive task allocation method can help construction robots cope with uncertainties and can ultimately improve construction project performance by efficiently prioritizing assigned tasks.

6.
Least squares support vector regression (LSSVR) is an effective and competitive approach for crude oil price prediction, but its performance suffers from parameter sensitivity and long tuning time. This paper treats the user-defined parameters as uncertain (or random) factors to construct an LSSVR ensemble learning paradigm in four major steps. First, probability distributions of the user-defined parameters in LSSVR are designed using a grid method for low upper bound estimation (LUBE). Second, random sets of parameters are generated according to the designed probability distributions to formulate diverse individual LSSVR members. Third, each individual member produces its own prediction. Finally, all individual results are combined into the final output via ensemble weighted averaging, with the probabilities serving as the corresponding weights. A computational experiment using the crude oil spot price of West Texas Intermediate (WTI) verifies the effectiveness of the proposed LSSVR ensemble learning paradigm with uncertain parameters compared with some existing LSSVR variants (using other popular parameter selection algorithms), in terms of prediction accuracy and time saving.
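The second and fourth steps of this abstract (drawing parameter sets from a designed distribution, then averaging member predictions with probabilities as weights) can be sketched as below. The parameter grid, its probabilities, and the member predictions are hypothetical placeholders; training the LSSVR members themselves is outside this sketch.

```python
import random

def sample_parameters(grid, probabilities, n_members, seed=0):
    """Draw n_members parameter values from a discrete distribution over grid."""
    rng = random.Random(seed)
    return rng.choices(grid, weights=probabilities, k=n_members)

def weighted_ensemble(predictions, probabilities):
    """Combine member predictions, using each member's probability as weight."""
    total = sum(probabilities)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    return sum(p * w for p, w in zip(predictions, probabilities)) / total

# Hypothetical grid of one regularisation parameter with assumed probabilities.
members = sample_parameters([0.1, 1.0, 10.0], [0.2, 0.5, 0.3], n_members=5)
combined = weighted_ensemble([10.0, 20.0], [0.25, 0.75])
```

Normalising by the sum of the weights keeps the combination a proper average even when the supplied probabilities do not sum to one.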

7.
The increasing demand for mobility in our society poses various challenges to traffic engineering, computer science in general, and artificial intelligence and multiagent systems in particular. As is often the case, it is not possible to provide additional capacity, so a more efficient use of the available transportation infrastructure is necessary. This relates closely to multiagent systems, as many problems in traffic management and control are inherently distributed. Also, many actors in a transportation system fit the concept of autonomous agents very well: the driver, the pedestrian, the traffic expert; in some cases, the intersection and the traffic signal controller can also be regarded as autonomous agents. However, the "agentification" of a transportation system is associated with some challenging issues: the number of agents is high; agents are typically highly adaptive; they react to changes in the environment at the individual level but cause an unpredictable collective pattern; and they act in a highly coupled environment. Therefore, this domain poses many challenges for standard techniques from multiagent systems such as coordination and learning. This paper has two main objectives: (i) to present problems, methods, approaches and practices in traffic engineering (especially regarding traffic signal control); and (ii) to highlight open problems and challenges so that future research in multiagent systems can address them.

8.
H.F., G.P., F.T., H.Y. 《Neurocomputing》, 2007, 70(16-18): 2913
This paper compares the predictive performance of ARIMA, an artificial neural network (ANN), and a linear combination model for forecasting the wheat price in the Chinese market. Empirical results show that, in terms of error evaluation measures, the combined model improves forecasting performance significantly over its counterparts. However, as far as turning points and profit criteria are concerned, the ANN model performs best and captures a significant number of turning points. The results therefore conflict when dissimilar forecasting criteria (quantitative error measures and turning-point measures) are used to evaluate the three models. Overall, the ANN model is the best model and can be used as an alternative method for modelling Chinese future food grain prices.

9.
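Evolutionary reinforcement learning and its application to robot path tracking   Total citations: 2, self-citations: 1, citations by others: 2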
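A path-tracking control method for mobile robots based on adaptive heuristic critic (AHC) reinforcement learning is studied. The critic element (ACE) of the AHC is implemented as a multilayer feedforward neural network, whose weights are updated by combining the TD(λ) algorithm with gradient descent. The action selection element (ASE) of the AHC consists of a fuzzy inference system (FIS) optimized by a genetic algorithm. The output of the ACE network forms a secondary reinforcement signal that guides the learning of the ASE. Finally, the proposed algorithm is applied to behavior learning for a mobile robot, and it handles the robot's complex path-tracking problem well.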
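The combination of TD(λ) with gradient descent mentioned in this abstract can be illustrated with a single critic update. The sketch below makes a simplifying assumption: it uses a linear value function over a feature vector rather than the multilayer network of the paper, so the gradient of the value with respect to the weights is just the feature vector. The step size, discount, and trace-decay values are also assumed.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def td_lambda_update(w, e, phi, phi_next, reward,
                     alpha=0.1, gamma=0.9, lam=0.8):
    """One TD(lambda) step for a linear critic V(s) = w . phi(s).

    Returns the new weights, the new eligibility trace, and the TD error.
    """
    delta = reward + gamma * dot(w, phi_next) - dot(w, phi)   # TD error
    e = [gamma * lam * ei + pi for ei, pi in zip(e, phi)]     # accumulate trace
    w = [wi + alpha * delta * ei for wi, ei in zip(w, e)]     # gradient step
    return w, e, delta

# One transition between two one-hot encoded states with reward 1.
w, e, delta = td_lambda_update([0.0, 0.0], [0.0, 0.0],
                               [1.0, 0.0], [0.0, 1.0], reward=1.0)
```

In an actor-critic arrangement such as AHC, the TD error returned here is the kind of internal signal a critic can pass on to guide the action-selection component.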

10.
A cognitive radio network (CRN) enables unlicensed users (or secondary users, SUs) to sense for and opportunistically operate in underutilized licensed channels, which are owned by the licensed users (or primary users, PUs). CRNs have been regarded as the next-generation wireless network centered on the application of artificial intelligence, which helps SUs to learn about, and to adaptively and dynamically reconfigure, their operating parameters, including the sensing and transmission channels, for network performance enhancement. This motivates the use of artificial intelligence to enhance security schemes for CRNs. Provisioning security in CRNs is challenging because existing techniques, such as entity authentication, require pre-registration and are therefore not feasible in the dynamic environment that a CRN presents. In addition, these techniques cannot prevent an authenticated node from acting maliciously. In this article, we advocate the use of reinforcement learning (RL) to achieve optimal or near-optimal solutions for security enhancement through the detection of various malicious nodes and their attacks in CRNs. RL, which is an artificial intelligence technique, has the ability to learn new attacks and to detect previously learned ones, and it has been perceived as a promising approach to enhance the overall security of CRNs. RL has already been applied to address the dynamic aspect of security schemes in other wireless networks, such as wireless sensor networks and wireless mesh networks, and can be leveraged to design security schemes in CRNs. We believe that these RL solutions will complement and enhance existing security solutions applied to CRNs. To the best of our knowledge, this is the first survey article that focuses on the use of RL-based techniques for security enhancement in CRNs.

11.
In this paper, we first discuss the meaning of physical embodiment and the complexity of the environment in the context of multi-agent learning. We then propose a vision-based reinforcement learning method that acquires cooperative behaviors in a dynamic environment. We use the robot soccer game initiated by RoboCup (Kitano et al., 1997) to illustrate the effectiveness of our method. Each agent works with other team members to achieve a common goal against opponents. Our method estimates the relationships between a learner's behaviors and those of other agents in the environment through interactions (observations and actions) using a technique from system identification. To identify the model of each agent, Akaike's Information Criterion is applied to the results of Canonical Variate Analysis to clarify the relationship between the observed data in terms of actions and future observations. Next, reinforcement learning based on the estimated state vectors is performed to obtain the optimal behavior policy. The proposed method is applied to a soccer-playing situation. The method successfully models a rolling ball and other moving agents and acquires the learner's behaviors. Computer simulations and real experiments are shown and a discussion is given.

12.
Accurate prediction of electricity consumption is essential for providing actionable insights to decision-makers for managing volume and potential trends in future energy consumption for efficient resource management. A single model might not be sufficient to solve the challenges that result from linear and non-linear problems that occur in electricity consumption prediction. Moreover, these models cannot be applied in practice because they are either not interpretable or poorly generalized. In this paper, a stacking ensemble model for short-term electricity consumption is proposed. We experimented with machine learning and deep models such as Random Forests, Long Short-Term Memory, Deep Neural Networks, and Evolutionary Trees as our base models. Based on the experimental observations, two different ensemble models are proposed, in which the predictions of the base models are combined using Gradient Boosting and Extreme Gradient Boosting (XGB). The proposed ensemble models were tested on a standard dataset that contains around 500,000 electricity consumption values, measured at periodic intervals over a span of 9 years. Experimental validation revealed that the proposed ensemble model built on XGB reduces the training time of the second layer of the ensemble by a factor of close to 10 compared to the state of the art, and is also more accurate. An average reduction of approximately 39% was observed in the root mean square error.

13.
Dam displacement is an important indicator of the overall dam health status. Numerical prediction of such displacement based on real-world monitoring data is a common practice for dam safety assessment. However, the existing methods are mainly based on statistical models or shallow machine learning models. Although they can capture the timing of the dam displacement sequence, it is difficult for them to characterize the complex coupling relationship between displacement and multiple influencing factors (e.g., water level, temperature, and time). In addition, the input factors of most dam displacement prediction models are artificially constructed based on modelers' personal experience, which leads to a loss of the valuable information, and thus prediction power, provided by the full set of raw monitoring data. To address these problems, this paper proposes a novel dual-stage deep learning approach based on a one-Dimensional Residual network and a Long Short-Term Memory (LSTM) unit, referred to herein as the DRLSTM model. In the first stage, the raw monitoring sequence is processed and spliced with convolution to form a combined sequence. After the timing information is extracted, the convolution direction is switched to learn the complex relationship between displacement and its influencing factors. LSTM is used to extract this relationship to obtain the Stage I prediction. The second stage takes the difference between the actual measurement and the Stage I prediction as input, and LSTM extracts the stochastic features of the monitoring system to obtain the Stage II prediction. The sum of the two stage predictions forms the final prediction. The DRLSTM model requires only raw monitoring data of water level and temperature to accurately predict displacement. In a real-world comparative study against four commonly used shallow learning models and three deep learning models, the root mean square error and mean absolute error of the proposed method are the smallest, at 0.198 mm and 0.149 mm respectively, while the correlation coefficient is the largest at 0.962. It is concluded that the DRLSTM model performs well for evaluating dam health status.
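The two-stage structure in this abstract (a first prediction plus a second prediction of the residual) and the two error measures it reports can be sketched as below. The networks themselves are not modelled here; the Stage I and Stage II outputs are hypothetical numbers chosen only to show how the pieces combine.

```python
import math

def combine_stages(stage1, stage2_residual):
    """Final prediction = Stage I prediction + Stage II residual prediction."""
    return [a + b for a, b in zip(stage1, stage2_residual)]

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted))
                     / len(actual))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical displacements (mm) and hypothetical stage outputs.
measured = [1.0, 2.0, 3.0]
stage1 = [0.8, 2.1, 2.5]
stage2 = [0.1, -0.1, 0.3]          # assumed predictions of measured - stage1
final = combine_stages(stage1, stage2)
improved = rmse(measured, final) < rmse(measured, stage1)
```

When the second stage captures part of the residual, the combined error falls below the Stage I error, which is the effect the dual-stage design aims for.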

14.
Load demand forecasting is a critical process in the planning of electric utilities. An ensemble method composed of the Empirical Mode Decomposition (EMD) algorithm and a deep learning approach is presented in this work. The load demand series are first decomposed into several intrinsic mode functions (IMFs). A Deep Belief Network (DBN) comprising two restricted Boltzmann machines (RBMs) is then used to model each of the extracted IMFs, so that the tendencies of these IMFs can be accurately predicted. Finally, the prediction results of all IMFs are combined by either unbiased or weighted summation to obtain an aggregated output for load demand. Electricity load demand data sets from the Australian Energy Market Operator (AEMO) are used to test the effectiveness of the proposed EMD-based DBN approach. Simulation results demonstrate the attractiveness of the proposed method compared with nine forecasting methods.
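The final aggregation step of this abstract, where per-component forecasts are summed with or without weights, can be sketched as below. The decomposition and the per-component models are not reproduced; the component forecasts and weights are hypothetical inputs, and plain summation is assumed to correspond to the unweighted case.

```python
def aggregate_forecasts(component_forecasts, weights=None):
    """Sum per-component forecasts time step by time step.

    With weights=None every component gets weight 1 (plain summation).
    """
    if not component_forecasts:
        raise ValueError("at least one component forecast is required")
    if weights is None:
        weights = [1.0] * len(component_forecasts)
    if len(weights) != len(component_forecasts):
        raise ValueError("one weight per component is required")
    horizon = len(component_forecasts[0])
    return [sum(w * f[t] for w, f in zip(weights, component_forecasts))
            for t in range(horizon)]

# Two hypothetical component forecasts over a two-step horizon.
plain = aggregate_forecasts([[1.0, 2.0], [0.5, 0.5]])
weighted = aggregate_forecasts([[1.0, 2.0], [0.5, 0.5]], weights=[1.0, 2.0])
```

Plain summation mirrors the fact that the decomposed components add back up to the original series; a weighted sum lets components that are forecast poorly count for less.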

15.
Personalized production has emerged as a result of increasing customer demand for more personalized products. Personalized production systems carry a greater amount of uncertainty and variability than traditional manufacturing systems. In this paper, we present a smart manufacturing system using a multi-agent system and reinforcement learning. It is characterized by machines with intelligent agents, which give the system autonomy in decision making, sociability to interact with other systems, and intelligence to learn dynamically changing environments. In the proposed system, machines with intelligent agents evaluate the priorities of jobs and distribute them through negotiation. In addition, we propose methods for machines with intelligent agents to learn to make better decisions. The performance of the proposed system and the dispatching rule is demonstrated by comparing the results of the scheduling problem in terms of early completion, productivity, and delay. The obtained results show that the manufacturing system with distributed artificial intelligence is competitive in a dynamic environment.

16.
17.
18.
In this paper, a dynamic fuzzy energy state based AODV (DFES-AODV) routing protocol for Mobile Ad-hoc NETworks (MANETs) is presented. In the DFES-AODV route discovery phase, each node uses a Mamdani fuzzy logic system (FLS) to decide the forwarding probability of its Route REQuests (RREQs). The FLS inputs are the residual battery level and the energy drain rate of the mobile node. Unlike in previous related work, the membership function of the residual energy input is made dynamic. Also, a zero-order Takagi-Sugeno FLS with the same inputs is used as a means of generalization over the state space in SARSA-AODV, a reinforcement learning based energy-aware routing protocol. The simulation study confirms that using a dynamic fuzzy system ensures more energy efficiency than its static counterpart. Moreover, DFES-AODV exhibits similar performance to SARSA-AODV and its fuzzy extension FSARSA-AODV. Therefore, the use of dynamic fuzzy logic for adaptive routing in MANETs is recommended.
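As background for the SARSA-based protocol named in this abstract, the sketch below shows the standard tabular SARSA update. It is not the routing protocol itself: the state and action labels, the reward, and the learning parameters are hypothetical, and the fuzzy generalization over states described in the abstract is left out.

```python
def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.5, gamma=0.9):
    """Standard on-policy SARSA update on a dict-based Q-table.

    Q(s,a) <- Q(s,a) + alpha * (r + gamma * Q(s',a') - Q(s,a))
    Missing entries are treated as 0.
    """
    current = q.get((state, action), 0.0)
    target = reward + gamma * q.get((next_state, next_action), 0.0)
    q[(state, action)] = current + alpha * (target - current)
    return q[(state, action)]

# Hypothetical labels: energy states as states, forward/drop as actions.
q_table = {}
first = sarsa_update(q_table, "high_energy", "forward", 1.0,
                     "mid_energy", "forward")
second = sarsa_update(q_table, "high_energy", "forward", 1.0,
                      "mid_energy", "forward")
```

Because SARSA bootstraps from the action actually taken next, the learned values reflect the behaviour policy, including its exploration.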

19.
In this work, reinforcement learning (RL) is used to find an optimal policy for a marketing campaign. The data show a complex characterization of the state and action spaces. Two approaches are proposed to circumvent this problem. The first approach is based on the self-organizing map (SOM), which is used to aggregate states. The second approach uses a multilayer perceptron (MLP) to carry out a regression of the action-value function. The results indicate that both approaches can improve a targeted marketing campaign. Moreover, the SOM approach allows an intuitive interpretation of the results, and the MLP approach yields robust results with generalization capabilities.

20.
This paper presents the application of a deep-learning-based model for short-term forecasting of the electric demand of a heating, ventilation, and air conditioning (HVAC) system for the demand response programs of utility companies. The deep learning model is applied through two different approaches, and their merits are compared. The approaches are: (i) a monolithic approach that applies a single large model to forecast the target variables, and (ii) a sequential approach that consists of multiple deep learning models coupled together, each targeting a specific energy load within the HVAC system. The model accuracy of both approaches is explored over two case studies applied to the same institutional building; however, the case studies differ in their data source. The first case study uses synthetic data obtained from an eQuest simulation, while the second case study uses measurement data obtained from the building automation system. Results show that the difference in forecasting error between these approaches is negligible; however, the monolithic approach required the least amount of calibration time. Next, this paper explores the application of off-site weather data to a building model calibrated with on-site data. The experiments demonstrated that the off-site weather data can be applied with a slight reduction in forecasting performance.
