
1.

赵铭慧张雪波郭宪欧勇盛《控制理论与应用》2021,38(12):1901-1910

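To solve the sequence planning problem for complex assembly models and to make the algorithm adapt well to arbitrary initial states, this paper proposes BASPW–DQN, an integrated bidirectional assembly sequence planning method that covers both forward assembly and reverse disassembly. For complex assembly models, the integrated assembly sequence planning problem is first described and formalized. On this basis, curriculum learning and transfer learning are introduced to study a bidirectional planning method that comprises forward assembly and the reverse removal of wrongly installed parts. The method was validated on a simulation platform that combines ROS-Gazebo with TensorFlow. The test results show that, for any initial state (including zero-assembly, partially assembled, and mis-assembled states), the bidirectional network completes the assembly task in a small number of steps, which confirms the effectiveness and adaptability of the proposed method for assembly sequence planning.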
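As a rough illustration of how forward assembly and the removal of wrongly installed parts can share one state/action model, consider the sketch below. It is not the authors' BASPW–DQN: the three-part product, the `PRECEDENCE` table, the reward values, and the first-legal-action rollout (standing in for a learned policy) are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical 3-part product: each part lists the parts that must be in place first.
PRECEDENCE = {"base": set(), "shaft": {"base"}, "cover": {"shaft"}}

@dataclass(frozen=True)
class State:
    installed: frozenset   # parts currently mounted correctly
    misplaced: frozenset   # parts mounted in error; these must be removed first

def legal_actions(state):
    """Removal actions while errors remain, otherwise precedence-respecting additions."""
    if state.misplaced:
        return [("remove", p) for p in sorted(state.misplaced)]
    return [("add", p) for p in sorted(PRECEDENCE)
            if p not in state.installed and PRECEDENCE[p] <= state.installed]

def step(state, action):
    """Apply one action and return (next_state, reward, done)."""
    kind, part = action
    if kind == "remove":
        nxt = State(state.installed, state.misplaced - {part})
    else:
        nxt = State(state.installed | {part}, state.misplaced)
    done = not nxt.misplaced and nxt.installed == frozenset(PRECEDENCE)
    return nxt, (10.0 if done else -1.0), done   # assumed reward shaping

def rollout(state):
    """Follow the first legal action until the product is complete (policy stand-in)."""
    plan = []
    done = not state.misplaced and state.installed == frozenset(PRECEDENCE)
    while not done:
        action = legal_actions(state)[0]
        state, _, done = step(state, action)
        plan.append(action)
    return plan
```

Starting from a mis-assembled state such as `State(frozenset({"base"}), frozenset({"cover"}))`, the rollout first removes the cover and then adds the shaft and the cover, which is the kind of mixed disassembly/assembly trajectory the abstract describes.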

2.

An intelligent optimization algorithm library for product assembly sequence planning

敬石开李连升曾森刘继红《计算机辅助设计与图形学学报》2010,22(9)

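To address the combinatorial explosion and blind search that arise in assembly planning for complex products, as well as the individual weaknesses of single intelligent optimization algorithms, a method is proposed that solves assembly sequence planning with a library of intelligent optimization algorithms. The library consists mainly of an algorithm advisor and an algorithm pool. Based on the description of the assembly planning problem, the main quantifiable performance indicators of the algorithms, and empirical formulas, the algorithm advisor recommends to the assembly planner the most suitable algorithm for the planning task. The algorithm pool contains three intelligent optimization algorithms: improved genetic, ant colony, and simulated annealing algorithms. A unified optimization model for assembly sequence planning and an evaluation index system for the intelligent algorithms are established, and the operating procedure of the library is given. Finally, a cork-opening machine example verifies that the algorithm the library recommends to the planner is reasonable.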

3.

Aero-engine assembly sequence planning based on discrete-time optimal control

汤新民钟诗胜《控制与决策》2008,23(11)

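To control maintenance errors in aero-engines, the assembly sequence of engine components is modeled with an assembly subnet based on precedence constraint relations. Under a given assembly evaluation criterion, the assembly sequence planning problem is transformed into an optimal transition firing sequence problem. The discrete-time Pontryagin minimum principle (DTPMP) is introduced, and minimizing the Hamiltonian function, a necessary condition for global optimality, is used as heuristic information for solving the component assembly sequence. To avoid potential deadlock, an optimal transition firing sequence algorithm is given. Finally, an analysis of the optimal assembly sequence planning algorithm shows that it has polynomial time complexity.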

4.

Assembly sequence planning based on an improved glowworm swarm optimization algorithm

陆屹程培源齐悦程月蒙《测控技术》2016,35(3):140-144

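Assembly is an important part of equipment maintenance, and reassembling parts removed for maintenance efficiently and without damage is especially important on the battlefield. To find the optimal assembly sequence, a discrete SA-GSO algorithm based on glowworm swarm optimization is proposed according to the characteristics of assembly sequence planning. First, an interference matrix is used to analyze the feasibility of assembly sequences, and a fitness function is defined according to operational practice. Then, to counter defects of glowworm swarm optimization such as premature convergence, the algorithm is improved using the simulated annealing principle and discretized so that it applies to the optimal assembly sequence problem. Finally, a case study is carried out, and the experimental results demonstrate the feasibility and effectiveness of the algorithm.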
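The feasibility analysis with an interference matrix can be illustrated with a minimal sketch. The matrix convention used here (a single assembly direction, with `interference[i][j] == 1` meaning that an already installed part `j` blocks part `i`) and the three-part example are assumptions for illustration; the abstract does not give the paper's exact definition.

```python
def is_feasible(sequence, interference):
    """Return True if every part can be inserted without hitting installed parts.

    interference[i][j] == 1 means: part j, if already installed, blocks part i.
    """
    installed = []
    for part in sequence:
        # Reject the sequence as soon as one installed part blocks the new one.
        if any(interference[part][other] for other in installed):
            return False
        installed.append(part)
    return True

# Hypothetical example: part 0 is blocked by parts 1 and 2, part 1 by part 2.
INTERFERENCE = [
    [0, 1, 1],
    [0, 0, 1],
    [0, 0, 0],
]

print(is_feasible([0, 1, 2], INTERFERENCE))  # the only feasible order here
print(is_feasible([2, 1, 0], INTERFERENCE))  # part 1 collides with part 2
```

A check of this kind would typically be used to discard or penalize infeasible candidates before a fitness function ranks the remaining sequences.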

5.

Multi-station assembly sequence planning based on the fruit fly optimization algorithm

袁文兵常亮徐周波古天龙《计算机科学》2017,44(4):246-251

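To solve product assembly sequence planning and multi-station allocation at the same time, a multi-station assembly sequence planning method for complex products is proposed, based on the fruit fly optimization algorithm. First, an encoding scheme for candidate sequences is designed on the basis of the fruit fly optimization algorithm. Second, the search process of the algorithm is redesigned to use a parallel search mode with multiple sub-populations. Then, to account for the cost of the related assembly operations across multiple stations, a new fitness function is proposed and combined with the precedence sequence matrix to guide the evolutionary process, which optimizes both the product assembly sequence and the station allocation order. Finally, an aircraft landing gear example verifies the effectiveness of the proposed method for multi-objective optimization problems.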

6.

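A CSP model of the assembly sequence planning problem and its symbolic OBDD solving technique Total citations: 1, self-citations: 0, citations by others: 1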

徐周波古天龙《计算机辅助设计与图形学学报》2010,22(5)

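A complete and correct representation and generation of feasible assembly sequences is a prerequisite for evaluating, optimizing, and selecting assembly sequences. To this end, a constraint satisfaction problem (CSP) model is established for feasible assembly sequence planning in the monotone nonlinear assembly sense, and a symbolic solving algorithm based on ordered binary decision diagrams (OBDD) is given. First, taking the assembly liaison graph and the translation vector function as the assembly model, the paper gives a shared binary decision diagram (SBDD) representation of the assembly liaison graph model, an OBDD representation of the translation vector function, and a CSP description of the assembly sequence planning problem. Then, the problem of generating all feasible assembly sequences is converted into finding all solutions of the CSP, and a backtracking algorithm solves the CSP symbolically with OBDDs, yielding all feasible assembly sequences that satisfy the geometric feasibility constraints. Finally, experiments on assemblies verify the correctness and feasibility of the assembly sequence generation technique based on the CSP model and OBDD reasoning.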
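The idea of treating "all feasible sequences" as "all solutions of a CSP found by backtracking" can be sketched with plain explicit search. This sketch deliberately swaps the paper's symbolic OBDD machinery for ordinary Python sets, and the precedence pairs in `must_precede` are a hypothetical stand-in for the geometric feasibility constraints.

```python
def all_feasible_sequences(parts, must_precede):
    """Enumerate every ordering of `parts` that respects `must_precede`.

    must_precede is a set of (a, b) pairs meaning a must be placed before b.
    """
    results = []

    def extend(prefix, remaining):
        if not remaining:
            results.append(tuple(prefix))
            return
        for part in sorted(remaining):
            # A part is allowed only if none of its predecessors is still waiting.
            if any(a in remaining for (a, b) in must_precede if b == part):
                continue
            prefix.append(part)
            extend(prefix, remaining - {part})
            prefix.pop()  # backtrack and try the next candidate

    extend([], frozenset(parts))
    return results

# Hypothetical four-part assembly: A first, D last, B and C in either order.
constraints = {("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")}
for seq in all_feasible_sequences("ABCD", constraints):
    print(seq)
```

Explicit enumeration like this grows factorially with the number of parts, which is the motivation for compact symbolic representations such as decision diagrams.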

7.

Target disassembly sequence planning based on disassemblability analysis and graph methods

《微型机与应用》2015,(17):85-88

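Existing methods for solving target disassembly sequences fall short in planning disassembly parallelism and in computational efficiency, so an optimization algorithm for target disassembly sequences is proposed. A disassemblability analysis method for parts is studied by applying and improving spherical mapping, non-orthogonal interference matrices, and compound disassembly path identification. A hierarchical graph method is used to locate the target part and, combined with subassembly identification, to remove redundant disassembly steps and increase the parallelism of the disassembly process. The effectiveness of the method is verified with an example.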

8.

Aero-engine assembly sequence planning based on discrete-time optimal control

汤新民钟诗胜《控制与决策》2008,23(11):1221-1225

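To control maintenance errors in aero-engines, the assembly sequence of engine components is modeled with an assembly subnet based on precedence constraint relations. Under a given assembly evaluation criterion, the assembly sequence planning problem is transformed into an optimal transition firing sequence problem. The discrete-time Pontryagin minimum principle (DTPMP) is introduced, and minimizing the Hamiltonian function, a necessary condition for global optimality, is used as heuristic information for solving the component assembly sequence. To avoid potential deadlock, an optimal transition firing sequence algorithm is given. Finally, an analysis of the optimal assembly sequence planning algorithm shows that it has polynomial time complexity.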

9.

A micro-assembly task planning method based on multi-agent reinforcement learning

徐兴辉唐大林顾书豪左佳祺王晓东任同群《计算机测量与控制》2023,31(8):217-223

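Existing assembly task planning is mostly done manually, which is inefficient, costly, and prone to operating errors. To address this, the task characteristics of micro-assembly operations and the cooperative and competitive relations among multiple manipulators in micro-assembly are analyzed in detail, and methods are proposed for constructing an action space, a state space, and a reward function for multi-agent reinforcement learning that suit micro-assembly tasks. A simulation model is built in the CoppeliaSim simulation software, with physical models of the existing equipment, and a learning model based on the multi-agent deep deterministic policy gradient algorithm is constructed and trained. The designed state space, action space, and reward function are verified one by one in the simulation environment, finally yielding stable paths and a complete task execution plan. The simulation results show that the proposed environment construction method better fits micro-assembly tasks whose main framework is Cartesian motion, overcomes the shortcomings of existing planning methods, enables multi-arm cooperative operation that can be engineered in practice, and improves task efficiency and the degree of planning automation.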

10.

Solving assembly sequence planning using directed connector knowledge Total citations: 5, self-citations: 0, citations by others: 5

董天阳童若锋张玲董金祥《计算机辅助设计与图形学学报》2004,16(1):128-133

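Based on an analysis of connector features, the concept of the directed connector is proposed and a part expression based on connector knowledge is established. Directed connector knowledge is then used to solve assembly sequence planning, which combines a knowledge reasoning algorithm with a geometric constraint reasoning algorithm, effectively reduces the computational complexity of assembly planning, and ensures that the resulting assembly sequences are reasonable and practical. In addition, a free-space graph of the assembly is built, which makes it possible to solve non-monotone assembly sequence planning problems.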

11.

A multi-objective car-following decision algorithm based on reinforcement learning

邓小豪侯进谭光鸿万斌杨曹婷婷《控制与决策》2021,36(10):2497-2503

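To meet comfort requirements in the car-following mode of adaptive cruise systems while also accounting for vehicle safety and driving efficiency, and to address the poor generalization and comfort of existing algorithms, a new multi-objective car-following decision algorithm is proposed based on the deep deterministic policy gradient (DDPG) algorithm. A Markov decision process (MDP) model of the car-following process is established from the relative longitudinal kinematics of the following and leading vehicles. Combined with a minimum safe distance model, an efficient, comfortable, and safe car-following decision algorithm is designed. To speed up model convergence, the way DDPG stores and samples experience is improved: samples are stored and drawn by class according to their importance. The reward function is designed modularly to match the multi-objective structure of car following. Finally, tests in a simulation environment show that the algorithm still completes the following task when the test environment differs from the training environment, and that it outperforms existing car-following algorithms.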

12.

Learning classifier systems from a reinforcement learning perspective

P. L. Lanzi 《Soft Computing - A Fusion of Foundations, Methodologies and Applications》2002,6(3-4):162-170

We analyze learning classifier systems in the light of tabular reinforcement learning. We note that although genetic algorithms are the most distinctive feature of learning classifier systems, it is not clear whether genetic algorithms are important to learning classifier systems. In fact, there are models which are strongly based on evolutionary computation (e.g., Wilson's XCS) and others which do not exploit evolutionary computation at all (e.g., Stolzmann's ACS). To find some clarifications, we try to develop learning classifier systems “from scratch”, i.e., starting from one of the best-known reinforcement learning techniques, Q-learning. We first consider the basics of reinforcement learning: a problem modeled as a Markov decision process and tabular Q-learning. We introduce a formal framework to define a general purpose rule-based representation which we use to implement tabular Q-learning. We formally define generalization within rules and discuss the possible approaches to extend our rule-based Q-learning with generalization capabilities. We suggest that genetic algorithms are probably the most general approach for adding generalization although they might not be the only solution.
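Tabular Q-learning, the starting point named in this abstract, reduces to a single update on a table of state-action values. The sketch below is a generic textbook version, not the paper's rule-based formulation; the function name `q_update`, the state and action labels, and the default values of `alpha` and `gamma` are assumptions.

```python
from collections import defaultdict

def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value."""
    # Value of the best action available in the next state (0.0 if terminal).
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]

# Unseen (state, action) pairs start at 0.0.
q = defaultdict(float)
q_update(q, "s0", "right", 1.0, "s1", ["left", "right"])
print(q[("s0", "right")])
```

Each table entry corresponds to one fully specific (state, action) rule, which is why adding generalization to such a representation is a separate design question.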

13.

Continuous-action reinforcement learning with fast policy search and adaptive basis function selection

Xin Xu Chunming Liu Dewen Hu 《Soft Computing - A Fusion of Foundations, Methodologies and Applications》2011,15(6):1055-1070

As an important approach to solving complex sequential decision problems, reinforcement learning (RL) has been widely studied in the community of artificial intelligence and machine learning. However, the generalization ability of RL is still an open problem and it is difficult for existing RL algorithms to solve Markov decision problems (MDPs) with both continuous state and action spaces. In this paper, a novel RL approach with fast policy search and adaptive basis function selection, which is called Continuous-action Approximate Policy Iteration (CAPI), is proposed for RL in MDPs with both continuous state and action spaces. In CAPI, based on the value functions estimated by temporal-difference learning, a fast policy search technique is suggested to search for optimal actions in continuous spaces, which is computationally efficient and easy to implement. To improve the generalization ability and learning efficiency of CAPI, two adaptive basis function selection methods are developed so that sparse approximation of value functions can be obtained efficiently both for linear function approximators and kernel machines. Simulation results on benchmark learning control tasks with continuous state and action spaces show that the proposed approach not only can converge to a near-optimal policy in a few iterations but also can obtain comparable or even better performance than Sarsa-learning, and previous approximate policy iteration methods such as LSPI and KLSPI.

14.

A computation offloading scheduling method based on deep reinforcement learning in mobile edge computing

詹文翰王瑾朱清新段翰聪叶娅兰《计算机应用研究》2021,38(1):241-245,263

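For the offloading decision problem of tasks with dependencies in mobile edge computing, a task offloading scheduling method based on deep reinforcement learning is proposed to minimize the execution time of the application. The task scheduling process is described as a Markov decision process whose scheduling policy is represented by the proposed sequence-to-sequence deep neural network and trained with proximal policy optimization. Simulation experiments show that the proposed algorithm converges well and outperforms the six baseline algorithms it is compared with in different environments, demonstrating the effectiveness and reliability of the method.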

15.

Asynchronous Stochastic Approximation and Q-Learning Total citations: 21, self-citations: 6, citations by others: 15

Tsitsiklis John N. 《Machine Learning》1994,16(3):185-202

We provide some general results on the convergence of a class of stochastic approximation algorithms and their parallel and asynchronous variants. We then use these results to study the Q-learning algorithm, a reinforcement learning method for solving Markov decision problems, and establish its convergence under conditions more general than previously available.
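For orientation, the Q-learning iteration whose convergence is studied can be written in its usual textbook form. The notation below ($Q_t$, $\alpha_t$, $\gamma$, $r_t$) is an assumption of this sketch rather than the paper's own, and the step-size conditions shown are the classical stochastic approximation ones; the paper's conditions are described as more general and are not reproduced here.

```latex
Q_{t+1}(s_t, a_t) = Q_t(s_t, a_t)
  + \alpha_t(s_t, a_t)\Bigl[ r_t + \gamma \max_{a'} Q_t(s_{t+1}, a') - Q_t(s_t, a_t) \Bigr],
\qquad
\sum_{t} \alpha_t(s, a) = \infty, \quad \sum_{t} \alpha_t(s, a)^2 < \infty .
```

In the asynchronous setting, only the entry for the visited pair $(s_t, a_t)$ changes at step $t$, so different entries are updated at different times and rates.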

16.

Automatic generation of assembly system configuration with equipment selection for automotive battery manufacturing

Sha Li Hui Wang S. Jack Hu Yhu-Tin Lin Jeffrey A. Abell 《Journal of Manufacturing Systems》2011,30(4):188-195

High power and high capacity lithium-ion batteries are being adopted for electric and hybrid electric vehicle (EV/HEV) applications. An automotive Li-ion battery pack usually has a hierarchical composition of components assembled in some repetitive patterns. Such a product assembly hierarchy may facilitate automatic configuration of assembly systems including assembly task grouping, sequence planning, and equipment selection. This paper utilizes such a hierarchical composition in generating system configurations with equipment selection for optimal assembly system design. A recursive algorithm is developed to generate feasible assembly sequences and the initial configurations including hybrid configurations. The generated configurations are embedded in an optimal assembly system design problem for simultaneous equipment selection and task assignment by minimizing equipment investment cost. The complexity of the computational algorithm is also discussed.

17.

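A virtual assembly process information model oriented to interleaved assembly operations Total citations: 5, self-citations: 1, citations by others: 4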

王念东刘毅李文正周军李澍《计算机辅助设计与图形学学报》2009,21(9)

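Virtual assembly process planning currently mostly describes the assembly process by assembly order, which cannot express interleaved assembly operations or adjustments to the installation layout during design. Considering the assembly requirements of both the design and manufacturing stages, an assembly process information model based on assembly tasks is established. The model expresses the product assembly process through a multi-level assembly progress structure and expresses interleaved operations and installation layout adjustments through the interleaved installation of task objects; an assembly adjustment strategy is also given. A method for merging assembly tasks is proposed to handle redundant assembly tasks. Finally, an example verifies the correctness and effectiveness of the model.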

18.

Assembly sequence planning of rigid and flexible parts

《Journal of Manufacturing Systems》2015

Assembly sequence planning (ASP) is the process of computing a sequence of assembly motions for constituent parts of an assembled final product. ASP is proven to be NP-hard and thus its effective and efficient solution has been a challenge for the researchers in the field. Despite the fact that most assembled products like ships, aircraft and automobiles are composed of rigid and flexible parts, no work exists for assembly/disassembly sequence planning of flexible parts. This paper lays out a theoretical ground for modeling the deformability of flexible assembly parts by introducing the concept of Assembly stress matrix (ASM) to describe interference relations between parts of an assembly and the amount of compressive stress needed for assembling flexible parts. Also, the Scatter Search (SS) optimization algorithm is customized for this problem to produce high-quality solutions by simultaneously minimizing both the maximum applied stress exerted for performing assembly operations and the number of assembly direction changes. The parameters of this algorithm are tuned by a TOPSIS-Taguchi based tuning method. A number of ASP problems with rigid and flexible parts were solved by the presented SS and other algorithms like Genetic and Memetic algorithms, Simulated Annealing, Breakout Local Search, Iterated Local Search, and Multistart Local Search, and the results and their in-depth statistical analyses showed that the SS outperformed other algorithms by producing the best-known or optimal solutions with highest success rates.
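The two objectives named in this abstract, maximum applied stress and number of assembly direction changes, can be evaluated for a candidate sequence as sketched below. The reading of the stress matrix (`stress[i][j]` is the stress needed to insert part `i` when part `j` is already installed), the per-part directions, and all numbers are hypothetical; the abstract does not give the paper's actual ASM definition.

```python
def direction_changes(directions):
    """Count how often consecutive assembly directions differ."""
    return sum(1 for prev, cur in zip(directions, directions[1:]) if prev != cur)

def evaluate(sequence, direction_of, stress):
    """Return (max stress applied at any step, number of direction changes)."""
    worst = 0.0
    installed = []
    for part in sequence:
        # Stress for this step depends on which parts are already in place.
        worst = max([worst] + [stress[part][other] for other in installed])
        installed.append(part)
    dirs = [direction_of[p] for p in sequence]
    return worst, direction_changes(dirs)

# Hypothetical 3-part example.
STRESS = [
    [0.0, 2.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 3.0, 0.0],
]
DIRECTION = {0: "+z", 1: "+z", 2: "-x"}

print(evaluate([0, 1, 2], DIRECTION, STRESS))
print(evaluate([2, 0, 1], DIRECTION, STRESS))
```

In this toy data the second ordering needs less peak stress for the same number of direction changes, which shows why the two objectives have to be evaluated per sequence rather than per part.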

19.

The Loss from Imperfect Value Functions in Expectation-Based and Minimax-Based Tasks Total citations: 1, self-citations: 0, citations by others: 1

Heger Matthias 《Machine Learning》1996,22(1-3):197-225

Many reinforcement learning (RL) algorithms approximate an optimal value function. Once the function is known, it is easy to determine an optimal policy. For most real-world applications, however, the value function is too complex to be represented by lookup tables, making it necessary to use function approximators such as neural networks. In this case, convergence to the optimal value function is no longer guaranteed and it becomes important to know to what extent performance diminishes when one uses approximate value functions instead of optimal ones. This problem has recently been discussed in the context of expectation-based Markov decision problems. Our analysis generalizes this work to minimax-based Markov decision problems, yields new results for expectation-based tasks, and shows how minimax-based and expectation-based Markov decision problems relate.

20.

Learning cooperative grasping with the graph representation of a state-action space

Markus Jianwei 《Robotics and Autonomous Systems》2002,38(3-4):183-195

In this paper we present a method for two robot manipulators to learn cooperative tasks. If a single robot is unable to grasp an object in a certain orientation, it can only continue with the help of other robots. The grasping can be realized by a sequence of cooperative operations that re-orient the object. Several sequences are needed to handle the different situations in which an object is not graspable for the robot. It is shown that a distributed learning method based on a Markov decision process is able to learn the sequences for the involved robots, a master robot that needs to grasp and a helping robot that supports it with the re-orientation. A novel state-action graph is used to store the reinforcement values of the learning process. Furthermore, an example of aggregate assembly shows the generality of this approach.