Similar literature
 Found 20 similar documents; search took 921 ms
1.
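After the R-CNN framework was proposed, object detection frameworks based on deep learning gradually became mainstream. They fall into two categories: those based on candidate windows (region proposals) and those based on regression. In the past two years, a large number of excellent frameworks have emerged that build on classic deep-learning detection frameworks such as Faster R-CNN, YOLO, and SSD. This paper organizes and summarizes the frameworks proposed in recent years according to their optimization methods. The performance, strengths, and weaknesses of object detection methods are compared and analyzed on mainstream benchmarks such as PASCAL VOC and MS COCO. The difficulties and challenges currently facing the object detection field are discussed, and possible directions for future development are outlined.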

2.
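The CCDM 2014 data mining competition, built on medical diagnosis data, posed two problems that arise widely in practice: a multi-label problem and a multi-class classification problem. Both problems exhibit class imbalance and have few training samples. To handle these characteristics, this paper draws on the ideas of secondary learning and ensemble learning and proposes a new learning framework, secondary ensemble learning. The framework uses a first round of ensemble learning to obtain a number of high-confidence samples, adds them to the original training set, and performs a second round of learning on the new training set, yielding a classifier with better generalization performance. The competition results show that, compared with commonly used ensemble learning, secondary ensemble learning achieved very good results on both problems.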
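The two-round procedure in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the competition entry: the nearest-centroid base learner, the leave-one-out ensemble (a deterministic stand-in for bagging), the agreement-based confidence score, the 0.9 threshold, and the idea that high-confidence samples come from an unlabeled pool are all introduced here for illustration.

```python
from collections import Counter

def fit_centroids(samples):
    """Hypothetical base learner: the mean feature value of each class."""
    sums, counts = {}, Counter()
    for x, y in samples:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] += 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(centroids, x):
    """Assign x to the class with the nearest centroid."""
    return min(centroids, key=lambda y: abs(centroids[y] - x))

def fit_ensemble(samples):
    """One member per left-out sample (deterministic stand-in for bagging).

    Assumes every class has at least two samples, so no class disappears.
    """
    return [fit_centroids(samples[:i] + samples[i + 1:])
            for i in range(len(samples))]

def vote(members, x):
    """Return the majority label and the fraction of members that chose it."""
    tally = Counter(predict(m, x) for m in members)
    label, hits = tally.most_common(1)[0]
    return label, hits / len(members)

def secondary_ensemble(labeled, unlabeled, threshold=0.9):
    first = fit_ensemble(labeled)                # first round of ensemble learning
    confident = []
    for x in unlabeled:
        label, conf = vote(first, x)
        if conf >= threshold:                    # keep only high-confidence samples
            confident.append((x, label))
    second = fit_ensemble(labeled + confident)   # second round on the enlarged set
    return second, confident

labeled = [(0, "a"), (2, "a"), (4, "a"), (10, "b"), (12, "b"), (14, "b")]
unlabeled = [1, 13, 7.2]
second, confident = secondary_ensemble(labeled, unlabeled)
print(confident)
```

In this toy run the ambiguous point 7.2 splits the first-round vote and is left out, while the two clear-cut points are pseudo-labeled and added to the second-round training set.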

3.
A novel framework for intelligent structural control is proposed using reinforcement learning. In this approach, a deep neural network learns how to improve structural responses using feedback control. The effectiveness of the framework is demonstrated in a case study of a moment frame subjected to earthquake excitations. The performance of the learning method was improved by proposing a state-selector function that prevented the neural network from forgetting key states. Results show that the controller significantly improves structural responses not only to the earthquake records on which it was trained but also to earthquake records new to the controller. The controller also performs stably under environmental uncertainties. This capability distinguishes the proposed approach and makes it better suited to situations in which the controller is likely to be exposed to unpredictable external excitations and high degrees of uncertainty.

4.
Domain adaptation learning (DAL) methods have shown promising results by utilizing labeled samples from the source (or auxiliary) domain(s) to learn a robust classifier for the target domain, which has few or even no labeled samples. However, several key issues remain to be addressed in state-of-the-art DAL methods, such as sufficient and effective distribution discrepancy metric learning, effective kernel space learning, and transfer learning from multiple source domains. To address these issues, in this paper we propose a unified kernel learning framework for domain adaptation learning and its effective extension based on the multiple kernel learning (MKL) schema. The framework is regularized by a proposed new minimum distribution distance metric criterion, which minimizes both the distribution mean discrepancy and the distribution scatter discrepancy between the source and target domains, and many existing kernel methods (such as the support vector machine (SVM), ν-SVM, and least-squares SVM) can be readily incorporated into it. Our framework, referred to as kernel learning for domain adaptation learning (KLDAL), simultaneously learns an optimal kernel space and a robust classifier by minimizing both the structural risk functional and the distribution discrepancy between different domains. Moreover, we extend KLDAL to a multiple kernel learning framework, referred to as MKLDAL. Under the KLDAL or MKLDAL framework, we also propose three effective formulations: KLDAL-SVM or MKLDAL-SVM with respect to SVM; its variant μ-KLDALSVM or μ-MKLDALSVM with respect to ν-SVM; and KLDAL-LSSVM or MKLDAL-LSSVM with respect to the least-squares SVM. Comprehensive experiments on real-world data sets verify that the proposed frameworks achieve superior or comparable effectiveness.
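As a rough illustration of a distribution mean discrepancy measured in a kernel space, the sketch below computes the biased estimate of the squared maximum mean discrepancy (MMD) between source and target samples. It is an assumption that this standard estimator resembles the mean-discrepancy term of the criterion above; the scatter-discrepancy term and the paper's actual formulation are not reproduced, and the RBF kernel and `gamma` value are illustrative choices.

```python
import math

def rbf(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def mean_kernel(xs, ys, gamma):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(rbf(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))

def mmd_squared(source, target, gamma=1.0):
    """Biased estimate of the squared MMD: the squared distance between
    the kernel-space means of the two samples."""
    return (mean_kernel(source, source, gamma)
            + mean_kernel(target, target, gamma)
            - 2.0 * mean_kernel(source, target, gamma))

source = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
shifted = [[x + 3.0, y + 3.0] for x, y in source]   # same shape, different location
print(mmd_squared(source, source), mmd_squared(source, shifted))
```

The value is zero when the two samples coincide and grows as the target sample moves away from the source sample, which is the quantity a regularizer of this kind would push down.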

5.
This paper presents a new method for combining supervised learning and reinforcement learning (RL). Applying supervised learning to robot navigation encounters serious challenges such as inconsistent and noisy data, difficulty in gathering training data, and high error in the training data. RL capabilities such as training from only a single scalar evaluation signal and a high degree of exploration have encouraged researchers to use RL in robot navigation problems. However, RL algorithms are time-consuming and suffer from a high failure rate in the training phase. Here, we propose Supervised Fuzzy Sarsa Learning (SFSL) as a novel way to exploit the advantages of both supervised and reinforcement learning algorithms. A zero-order Takagi–Sugeno fuzzy controller with some candidate actions for each rule is considered as the main module of the robot's controller. The aim of training is to find the best action for each fuzzy rule. In the first step, a human supervisor drives an E-puck robot within the environment and the training data are gathered. In the second step, as a hard tuning, the training data are used to initialize the value (worth) of each candidate action in the fuzzy rules. Afterwards, the fuzzy Sarsa learning module, as a critic-only fuzzy reinforcement learner, fine-tunes the parameters of the conclusion parts of the fuzzy controller online. The proposed algorithm is used to drive an E-puck robot in an environment with obstacles. The experimental results show that the proposed approach decreases the learning time and the number of failures; it also improves the quality of the robot's motion in the testing environments.

6.

This paper presents a smart supervisory framework for a single process controller, designed for Industry 4.0 shop floors. This digitization of a full supervisory suite for a single process controller enables self-awareness, self-diagnosis, self-prognosis, and self-healing (by definition, these "self" elements are missing from other supervisory frameworks that diagnose numerous controllers in parallel). The proposed framework is aligned with the concept of a Cyber Physical System (CPS), since its implementation generates a rich cyber physical entity of the controlled process. This CPS entity can either be considered as the process digital twin or provide a solid basis for generating it. Finally, the framework includes the main characteristics of Industry 4.0, such as advanced use of Artificial Intelligence (AI) and big data analysis. The framework is based on four modules: (1) Control and Awareness module: performing both continuous process control and adjustments, as well as machine learning (ML) and statistical process control (SPC) for identifying abnormalities that require further diagnosis; (2) Process-diagnosis module: performing continual (recurrent) analysis of the process state and trends; (3) Prognosis and Healing module: performing prognosis and automated intervention via parameter changes, re-configurations, and automated maintenance; (4) External Interaction Platform: an interactive module for interfacing with experts, presenting them with the process analysis information and obtaining feedback from them as part of a learning process. Using an implementation showcase to illustrate the methodological framework’s applicability, we demonstrate its real-world potential. The proposed framework could serve as a guide for implementing smart process control and maintenance systems in Industry 4.0 shop floors. It could also provide a firm basis for comparison with future suggested frameworks.
Future research directions could include pursuing improvements to the proposed process control framework and validating the framework by case studies of its implementation.

7.
This work proposes a novel proportional-derivative (PD)-type state-dependent Riccati equation (SDRE) approach with iterative learning control (ILC) augmentation. On the one hand, the PD-type control gains can adopt many of the useful criteria and tools available for conventional PD controllers. On the other hand, the SDRE adds nonlinear and optimality characteristics to the controller, i.e., it increases the stability margins. These advantages, together with the ILC correction part, deliver a precise control law with...

8.
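As a variant of supervised learning, multi-instance learning (MIL) attempts to learn a classifier from the instances in bags. In MIL, labels are associated with bags rather than with individual instances: bag labels are known, while instance labels are unknown. MIL can resolve label ambiguity, but it does not readily handle problems with weak labels. In the weak-label problem, the labels of both bags and instances are unknown, but they are latent variables. Given multiple labels and instances, the labels of bags and instances can be approximately estimated by weighting the different labels. This paper proposes a new multi-instance learning framework based on transfer learning to solve the weak-label problem. First, a transfer learning model based on the multi-instance method is constructed; it transfers knowledge from the source task to the target task and thereby converts the weak-label problem into a multi-instance learning problem. On this basis, an iterative framework for solving the multi-instance transfer learning model is proposed. Experimental results show that the method outperforms existing multi-instance learning methods.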

9.
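Most current multi-agent reinforcement learning algorithms adopt the framework of centralized learning with decentralized execution. This framework suffers from long convergence times and may fail to converge. To shorten the collective learning time of multiple agents, a grouped learning strategy for multiple agents is proposed. A recurrent neural network predicts the grouping matrix of the agents, and a mechanism for sharing experience among the agents within a group improves the team's learning efficiency. To compensate for the agents' inability to share information across groups, the concept of trace information is introduced to pass part of the global information among all agents. To better retain good experience within a group, a birth-death process is proposed that delays the death time of the best agents in the group. Finally, training time is 12% lower than with MADDPG in a maze experiment and 17% lower in a capture-the-flag experiment.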

10.
The integration of reinforcement learning (RL) and imitation learning (IL) is an important problem that has long been studied in the field of intelligent robotics. RL optimizes policies to maximize the cumulative reward, whereas IL attempts to extract general knowledge about the trajectories demonstrated by experts, i.e., demonstrators. Because each has its own drawbacks, many methods that combine them and compensate for each set of drawbacks have been explored thus far. However, many of these methods are heuristic and do not have a solid theoretical basis. This paper presents a new theory for integrating RL and IL by extending the probabilistic graphical model (PGM) framework for RL, control as inference. We develop a new PGM for RL with multiple types of rewards, called the probabilistic graphical model for Markov decision processes with multiple optimality emissions (pMDP-MO). Furthermore, we demonstrate that the integrated learning method of RL and IL can be formulated as a probabilistic inference of policies on pMDP-MO by considering the discriminator in generative adversarial imitation learning (GAIL) as an additional optimality emission. We adapt GAIL and the task-achievement reward to our proposed framework, achieving significantly better performance than policies trained with baseline methods.

11.
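Imbalanced data are widespread in real life, and cost-sensitive learning can effectively address this problem. However, when label information is limited or insufficient, the classification accuracy of cost-sensitive classifiers drops sharply and classification performance cannot be guaranteed. To address this, this paper proposes a locality-preserving Laplacian cost-sensitive support vector machine (LPCS-LapSVM). Built on a semi-supervised learning framework, the model incorporates cost-sensitive learning and the idea of within-class locality-preserving scatter. It improves the classification performance of cost-sensitive SVMs when label information is limited by considering both the intrinsic discriminative information and the local geometric distribution of the samples. Experimental results on UCI datasets demonstrate the effectiveness of the algorithm.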

12.
Motion planning is an important problem in character animation and interactive simulation. However, few planning methods have considered the domain-specific knowledge that governs the agent's behaviors, and none of them is capable of planning an interactive task in which the agent interacts with objects in the virtual environment. This paper presents a novel method for planning interactive tasks based on Q-learning for intelligent characters. The approach can be described as a three-phase framework: a data preprocessing phase, a controller learning phase, and a motion-synthesis phase. In the data preprocessing phase, we abstract the motion clips as high-level behaviors and construct the interactive behavior graph (IBG) to define the interactive capabilities of the agent in terms of interactive features. In the controller learning phase, the Q-learning algorithm is employed with the IBG to train the control policy in the discrete domain with interactive features. In the motion-synthesis phase, the optimal motion sequences are generated by following the policy to accomplish the interactive task. The experimental results demonstrate that the framework can generate reasonable and realistic motion sequences for interactive tasks in complex environments. Copyright © 2012 John Wiley & Sons, Ltd.
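The controller learning and motion-synthesis phases can be illustrated with tabular Q-learning on a small graph. This is only a sketch under assumptions: the three-state graph, the action names, and the reward are hypothetical and are not the paper's IBG, and the code applies the Q-learning update while sweeping over every state-action pair instead of exploring at random, so that the result is deterministic.

```python
def q_learning(graph, reward, alpha=0.5, gamma=0.9, sweeps=200):
    """Tabular Q-learning over a deterministic behavior graph.

    graph[state][action] gives the next state; a state without actions is terminal.
    """
    q = {s: {a: 0.0 for a in actions} for s, actions in graph.items()}
    for _ in range(sweeps):
        for s, actions in graph.items():
            for a, nxt in actions.items():
                # Q-learning target: immediate reward plus discounted best next value
                best_next = max(q[nxt].values(), default=0.0)
                target = reward(s, a, nxt) + gamma * best_next
                q[s][a] += alpha * (target - q[s][a])
    return q

def greedy_plan(graph, q, start, max_steps=10):
    """Follow the highest-valued action from `start` until a terminal state."""
    plan, state = [], start
    while graph[state] and len(plan) < max_steps:
        action = max(q[state], key=q[state].get)
        plan.append(action)
        state = graph[state][action]
    return plan

# Hypothetical behavior graph: the agent must reach an object and grasp it.
GRAPH = {
    "idle": {"walk_to_object": "near_object", "wait": "idle"},
    "near_object": {"grasp": "holding", "walk_back": "idle"},
    "holding": {},
}

def reward(state, action, next_state):
    """Reward 1 for completing the interaction, 0 otherwise (an assumption)."""
    return 1.0 if next_state == "holding" else 0.0

Q = q_learning(GRAPH, reward)
print(greedy_plan(GRAPH, Q, "idle"))
```

Following the learned policy greedily from the start state yields the behavior sequence that a motion-synthesis step would then turn into motion clips.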

13.
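Zhao Hengjun, Li Quanzhong, Zeng Xia, Liu Zhiming. Journal of Software (《软件学报》), 2022, 33(7): 2538-2561
Safe controller design for cyber-physical systems (CPS) is an active research direction. Existing safe controller design based on formal methods suffers from problems such as excessive reliance on models and poor scalability. Intelligent control based on deep reinforcement learning can handle high-dimensional nonlinear complex systems and uncertain systems and is becoming a very promising CPS control technique, but it lacks safety guarantees. To address the safety shortcomings of reinforcement learning control, this paper studies safe reinforcement learning algorithms and their application to intelligent control, centered on a typical case study of an industrial oil pump control system. First, the safe reinforcement learning problem for industrial oil pump control is formalized and a simulation environment for the oil pump is built. Next, by designing the output layer structure and activation function, an oil pump controller in the form of a neural network is constructed so that the linear inequality constraints on the pump's switching times are satisfied. Finally, to better balance the safety and optimality control objectives, a new safe reinforcement learning algorithm is designed and implemented based on the augmented Lagrangian multiplier method. Comparative experiments on the industrial oil pump case show that the controllers generated by the algorithm surpass existing algorithms of the same kind in both safety and optimality. In further evaluation, the generated neural network controllers passed rigorous formal verification with a probability of 90%, while incurring a loss in the optimal objective value as low as 2% relative to the theoretically optimal controller. The proposed method is expected to extend to more application scenarios, and the case study may serve as a reference for other researchers in safe intelligent control and formal verification.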
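To show how an augmented Lagrangian method trades off an objective against an inequality constraint, the sketch below applies it to a toy scalar problem: minimize (x - 2)^2 subject to x <= 1. This is an assumption-laden illustration, not the paper's algorithm: the quadratic objective stands in for the optimality goal, the single constraint stands in for a safety requirement, and the penalty weight, step size, and iteration counts are arbitrary choices.

```python
def augmented_lagrangian(grad_f, g, grad_g, x0,
                         rho=10.0, outer=20, inner=200, lr=0.05):
    """Minimize f(x) subject to g(x) <= 0 for a scalar x.

    Inner loop: gradient descent on the augmented Lagrangian
        f(x) + (max(0, lam + rho * g(x))**2 - lam**2) / (2 * rho).
    Outer loop: multiplier update lam <- max(0, lam + rho * g(x)).
    """
    x, lam = x0, 0.0
    for _ in range(outer):
        for _ in range(inner):
            active = max(0.0, lam + rho * g(x))   # zero while the constraint is slack
            x -= lr * (grad_f(x) + active * grad_g(x))
        lam = max(0.0, lam + rho * g(x))          # keep the multiplier non-negative
    return x, lam

x_opt, lam_opt = augmented_lagrangian(
    grad_f=lambda x: 2.0 * (x - 2.0),   # objective (x - 2)^2
    g=lambda x: x - 1.0,                # constraint x <= 1
    grad_g=lambda x: 1.0,
    x0=0.0,
)
print(x_opt, lam_opt)
```

The unconstrained minimum x = 2 violates the constraint, so the iterates settle on the boundary x = 1, and the multiplier converges to the value that balances the objective gradient there.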

14.
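Question classification is an important component of question answering systems. At present, however, question classification requires manually devising feature-extraction strategies and continually refining feature rules. Deep learning methods are feasible for question classification: they represent and understand questions by learning features on their own, which avoids hand-crafting features and reduces manual cost. For question classification, this paper improves the long short-term memory (LSTM) network and convolutional neural network (CNN) models and combines the strengths of both into a new learning framework (LSTM-MFCNN) that strengthens the learning of word-order semantics and deep features. Experimental results show that the method performs well without requiring tedious feature rules, reaching an accuracy of 93.08%.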

15.
Although Unified Modeling Language (UML), ontology, and text categorization approaches have been used to automate the classification and selection of design patterns, certain issues still need to be addressed, such as the time and effort required for formal specification of new patterns, system context-awareness, and lack of knowledge. We propose a framework (a three-phase method) to address these issues, which can help novice developers organize and select the correct design pattern(s) for a given design problem in a systematic way. Subsequently, we propose an evaluation model to gauge the efficacy of the proposed framework via certain unsupervised learning techniques. We performed three case studies to describe the working procedure of the proposed framework in the context of three widely used design pattern catalogs and 103 design problems. We find that Fuzzy c-means and Partitioning Around Medoids (PAM) yield significant results compared with other unsupervised learning techniques. The promising results encourage the application of the proposed framework to design pattern organization and selection for a given design problem.

16.
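Most existing evolutionary algorithms start searching for the optimal solution from zero initial information about the problem and do not use the historical information gained from previously solved similar problems, which wastes computing resources to some extent. Extending the idea of transfer learning to evolutionary optimization, this paper studies an evolutionary optimization framework based on transfer learning from similar historical information. A historical problem that matches the new problem is found in a model library of solved problems, and the knowledge associated with it is transferred to the solution process of the new problem to improve the search efficiency of the population. First, a maximum mean discrepancy index based on multi-distribution estimation is defined to evaluate how well the new problem matches the historical models. Next, the knowledge of the matching historical problems is transferred to the new problem, and a population initialization strategy based on the degree of model matching is given to speed up the search. Then, a representative-individual preservation strategy based on iterative clustering is given, which retains the advantageous information produced during the solution process and uses it to update the historical model library. Finally, an adaptive bare-bones particle swarm optimization algorithm is embedded in the proposed framework, giving a bare-bones particle swarm optimization algorithm based on transfer learning from similar historical information. Experimental results on several modified typical test functions show that the proposed transfer strategy accelerates the particle swarm's search and significantly improves the algorithm's convergence speed and search efficiency.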

17.
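In traditional classifier ensembles, each iteration usually adds only the single best individual classifier to the strong classifier, while other individual classifiers that might play an auxiliary role are simply discarded. To address this, a non-sparse multiple kernel learning method based on the Boosting framework, MKL-Boost, is proposed, drawing on the idea of classifier ensemble learning. In each iteration, a training subset is first selected from the training set, and the optimal individual classifier is then trained with a regularized non-sparse multiple kernel learning method. The resulting individual classifier takes into account the optimal non-sparse linear convex combination of M base kernels. By imposing an Lp-norm constraint on the kernel combination coefficients, good kernels are retained, preserving more useful feature information, while poor kernels are removed, which ensures selective kernel fusion. The optimal individual classifier based on the kernel combination is then integrated into the strong classifier. The proposed algorithm has the advantages of both Boosting ensemble learning and regularized non-sparse multiple kernel learning. Experiments show that, compared with other Boosting algorithms, MKL-Boost achieves higher classification accuracy within fewer iterations.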

18.
On one hand, multiple object detection approaches of the Hough transform (HT) type and randomized HT (RHT) type have been extended into a general problem-solving framework featuring evidence accumulation, with five key mechanisms elaborated and several extensions of HT and RHT presented. On the other hand, another framework is proposed to integrate typical multi-learner based approaches for problem solving, particularly Gaussian mixture based data clustering and local subspace learning, multi-sets mixture based object detection and motion estimation, and multi-agent coordinated problem solving. Typical learning algorithms, especially those based on rival penalized competitive learning (RPCL) and Bayesian Ying-Yang (BYY) learning, are summarized from a unified perspective with new extensions. Furthermore, the two frameworks are not only examined crosswise, each viewed from the perspective of the other, with new insights and extensions, but also further unified into a general problem-solving paradigm that consists of five basic mechanisms: acquisition, allocation, amalgamation, admission, and affirmation, or the A5 paradigm for short.

19.
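Click fraud has been one of the most common forms of cybercrime in recent years, and the Internet advertising industry suffers huge losses from it every year. To detect fraudulent clicks effectively among massive numbers of clicks, a variety of features that fully combine the relationship between ad clicks and time attributes are constructed, and an ensemble learning framework for click fraud detection, the CAT-RFE ensemble learning framework, is proposed. The framework has three parts: a base classifier, recursive feature elimination (RFE), and voting ensemble learning. CatBoost (categorical boosting), a gradient boosting model suited to categorical features, serves as the base classifier; RFE is a feature selection method based on a greedy strategy that can pick a good feature combination from multiple groups of features; and voting ensemble learning combines the results of multiple base classifiers by voting. The framework uses CatBoost and RFE to obtain several good feature combinations in the feature space, and then integrates the results trained on these feature combinations through voting to obtain the ensemble click fraud detection result. Because the framework uses the same base classifier and ensemble learning method throughout, it overcomes both the problem that widely differing classifiers constrain one another and yield poor ensemble results and the tendency of RFE to fall into local optima when selecting features, giving it better detection capability. Performance evaluation and comparative experiments on a real-world Internet click fraud dataset show that the CAT-RFE ensemble learning framework detects click fraud better than the CatBoost model, the combined CatBoost and RFE model, and other machine learning models, demonstrating that the framework is competitive. The framework offers a feasible solution for click fraud detection in Internet advertising.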
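Two of the three parts named above, greedy recursive feature elimination and voting, can be sketched in a few lines. This is a simplified illustration under assumptions: the `importance` callable is a hypothetical stand-in for refitting the base classifier and reading its feature importances (no CatBoost call is made here), and the feature names and scores are invented for the example.

```python
from collections import Counter

def rfe(features, importance, keep):
    """Greedy recursive feature elimination.

    Repeatedly rescore the remaining features and drop the least important
    one until only `keep` features are left.
    """
    selected = list(features)
    while len(selected) > keep:
        scores = importance(selected)            # rescored on the current subset
        worst = min(selected, key=lambda f: scores[f])
        selected.remove(worst)
    return selected

def hard_vote(predictions):
    """Majority vote per sample; `predictions` holds one label list per classifier."""
    return [Counter(column).most_common(1)[0][0] for column in zip(*predictions)]

# Hypothetical importance table standing in for a refitted base classifier.
TABLE = {"click_hour": 0.5, "clicks_per_ip": 0.3, "device": 0.1, "os": 0.05}

def importance(selected):
    return {f: TABLE[f] for f in selected}

print(rfe(list(TABLE), importance, keep=2))

# Predictions of three base classifiers (1 = fraud) on four clicks.
votes = [[1, 0, 1, 0],
         [1, 1, 0, 0],
         [0, 1, 1, 0]]
print(hard_vote(votes))
```

In the framework described above, each voter would be the same kind of base classifier trained on a different RFE-selected feature combination, which is why a simple majority vote is enough to combine them.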

20.
In this paper, a data-driven conflict-aware safe reinforcement learning (CAS-RL) algorithm is presented for control of autonomous systems. Existing safe RL results with predefined performance functions and safe sets can only provide safety and performance guarantees for a single environment or circumstance. By contrast, the presented CAS-RL algorithm provides safety and performance guarantees across a variety of circumstances that the system might encounter. This is achieved by utilizing a bilevel learning control architecture: a higher metacognitive layer leverages a data-driven receding-horizon attentional controller (RHAC) to adapt relative attention to the system's different safety and performance requirements, and a lower-layer RL controller designs control actuation signals for the system. The presented RHAC makes its meta decisions based on the reaction curve of the lower-layer RL controller using a meta-model or knowledge. More specifically, it leverages a prediction meta-model (PMM) which spans the space of all future meta trajectories using a given finite number of past meta trajectories. The RHAC adapts the system's aspiration towards performance metrics (e.g., performance weights) as well as safety boundaries to resolve conflicts that arise as mission scenarios develop. This guarantees safety and feasibility (i.e., performance boundedness) of the lower-layer RL-based control solution. It is shown that the interplay between the RHAC and the lower-layer RL controller is a bilevel optimization problem in which the leader (RHAC) operates at a lower rate than the follower (the RL-based controller), and its solution guarantees feasibility and safety of the control solution. The effectiveness of the proposed framework is verified through a simulation example.
