20 similar documents found; search took 10 ms.
1.
This paper presents industrial applications for improving the capability of the fine-pitch stencil printing process (SPP) based on the DMAIC framework and Taguchi-based methodologies. SPP is widely recognized as the main contributor to soldering defects in surface mount assembly (SMA). An inadequate volume of deposited solder paste or poor printing quality can cause soldering defects and lead to significant rework and repair costs. In practice, both the solder paste volume (a quantitative index) and the printing quality (a qualitative index) are used to monitor the SPP for the reduction of soldering defects during statistical process control (SPC), particularly for fine-pitch solder paste printing operations. To continuously improve SPP capability, the DMAIC framework is followed and Taguchi-based methodologies are proposed for both a single characteristic performance index (SCPI) and multiple characteristic performance indices (MCPI). The SCPI is optimized using the conventional Taguchi method. A Taguchi fuzzy-based model is then developed to optimize the SPP with the MCPI property, since optimizing a multi-response problem by the Taguchi method alone relies on the engineer's judgment, which increases the degree of uncertainty. The performance of the two approaches is compared through a process capability metric, and the material and process factors that significantly affect fine-pitch SPP performance are reported.
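As a concrete illustration of the Taguchi side of this workflow, the sketch below computes the two signal-to-noise ratios typically used for such responses: nominal-the-best for hitting a target paste volume, and larger-the-better for a printing-quality score. The data, units, and factor settings are hypothetical, not taken from the paper.

```python
import numpy as np

def sn_nominal_the_best(y):
    """Nominal-the-best S/N ratio (dB): suited to hitting a target
    solder paste volume rather than maximizing or minimizing it."""
    y = np.asarray(y, dtype=float)
    return 10.0 * np.log10(y.mean() ** 2 / y.var(ddof=1))

def sn_larger_the_better(y):
    """Larger-the-better S/N ratio (dB), e.g. for a quality score."""
    y = np.asarray(y, dtype=float)
    return -10.0 * np.log10(np.mean(1.0 / y ** 2))

# Hypothetical replicated paste-volume measurements for two runs of an
# orthogonal-array experiment (same mean, different spread):
run_a = [4.9, 5.1, 5.0, 4.8]
run_b = [4.2, 5.6, 5.9, 4.5]
print(sn_nominal_the_best(run_a))  # higher S/N -> less variation about the mean
print(sn_nominal_the_best(run_b))
```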
2.
Surface mount assembly defect problems can cause significant production-time losses. About 60% of surface mount assembly defects can be attributed to the solder paste stencil printing process. This paper proposes a neurofuzzy-based quality-control system for the fine-pitch stencil printing process. The neurofuzzy approach is used to model the nonlinear behavior of the stencil printing process. Eight control variables are defined for process planning and control: stencil thickness, component pitch, aperture area, snap-off height, squeegee speed, squeegee pressure, solder paste viscosity, and solder paste particle size. The response variables are the volume and height of the deposited solder paste; their values provide indicators for identifying potential quality problems. A 3^(8−3) fractional factorial experiment is conducted to collect structured data that augment data collected from the production line for neurofuzzy learning and modeling. The Visual Basic programming language is then used for both rule retrieval and graphical-user-interface modeling. The effectiveness of the proposed system is illustrated through a real-world application.
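The 3^(8−3) design mentioned above can be constructed by crossing five base three-level factors and generating the remaining three columns modulo 3. A minimal sketch follows, with illustrative generators that need not match the paper's:

```python
from itertools import product

# Five independent three-level factors give 3^5 = 243 runs; the remaining
# three columns are modulo-3 linear combinations of the base columns.
base = list(product(range(3), repeat=5))  # levels coded 0, 1, 2

def make_run(levels):
    a, b, c, d, e = levels
    f = (a + b + c) % 3        # generator F = A+B+C (mod 3), assumed
    g = (b + c + d) % 3        # generator G = B+C+D (mod 3), assumed
    h = (a + c + e) % 3        # generator H = A+C+E (mod 3), assumed
    return (a, b, c, d, e, f, g, h)

design = [make_run(lv) for lv in base]
print(len(design))   # 243 runs covering 8 factors at 3 levels each
print(design[0])
```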
3.
Solder paste printing (SPP) is a critical procedure in a surface mount technology (SMT) assembly line and one of the major contributors to printed circuit board (PCB) defects. The quality of SPP is influenced by multiple factors, such as squeegee speed and pressure, stencil separation speed, cleaning frequency, and cleaning profile. During printing, the printer environment varies dynamically because of physical changes in the solder paste, so the relationships between the printing results and the influential factors also vary over time. To reduce printing defects, it is critical to understand these dynamic relationships. This research focuses on determining printing performance during printing by implementing a wavelet filtering-based temporal recurrent neural network. To reduce noise in the solder paste inspection (SPI) data, a three-dimensional dual-tree complex wavelet transformation is applied for low-pass noise filtering and signal reconstruction. A recurrent neural network then models performance prediction with low noise interference, taking both the printing sequence and the process-setting information into account. The proposed approach is validated on a practical dataset and compared with other commonly used data mining approaches. The results show that the proposed wavelet-based multi-dimensional temporal recurrent neural network can effectively predict printing process performance and is a promising approach for reducing defects and controlling cleaning frequency. The proposed model is expected to advance research on smart manufacturing in surface mount technology.
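A rough sketch of this pipeline is shown below, using PyWavelets soft-thresholding as a simplified 1-D stand-in for the paper's 3-D dual-tree complex wavelet transform, and a small GRU for the temporal model. Layer sizes, feature names, and the denoising scheme are assumptions, not the paper's implementation.

```python
import numpy as np
import pywt          # PyWavelets; stand-in for the paper's 3-D dual-tree CWT
import torch
import torch.nn as nn

def denoise(signal, wavelet="db4", level=3):
    """Soft-threshold wavelet denoising of one SPI feature series."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745        # noise estimate
    thr = sigma * np.sqrt(2 * np.log(len(signal)))
    coeffs[1:] = [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[: len(signal)]

class PrintQualityRNN(nn.Module):
    """GRU over a window of past prints, concatenated with static
    process settings before the prediction head."""
    def __init__(self, n_spi_feats, n_settings, hidden=32):
        super().__init__()
        self.gru = nn.GRU(n_spi_feats, hidden, batch_first=True)
        self.head = nn.Linear(hidden + n_settings, 1)

    def forward(self, spi_seq, settings):
        _, h = self.gru(spi_seq)          # h: (num_layers, batch, hidden)
        z = torch.cat([h[-1], settings], dim=1)
        return self.head(z)               # predicted print performance
```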
4.
In this paper, we use data mining techniques for the automatic discovery of useful temporal abstractions in reinforcement learning. This idea is motivated by the ability of data mining algorithms to automatically discover structures and patterns when applied to large data sets. The state transitions and action trajectories of the learning agent are stored as the data sets for the mining techniques. The proposed state clustering algorithms partition the state space into different regions; policies for reaching different parts of the space are learned separately and added to the model in the form of options (macro-actions). The main idea of the proposed action sequence mining is to search for patterns that occur frequently within an agent's accumulated experience; the mined action sequences are also added to the model as options. Our experiments with different data sets indicate a significant speedup of the Q-learning algorithm using the options discovered by the state clustering and action sequence mining algorithms.
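The action-sequence half of this approach can be sketched as frequent n-gram mining over stored episode logs, with sequences above a support threshold promoted to candidate options. Thresholds and data below are illustrative:

```python
from collections import Counter

def mine_action_options(trajectories, min_len=2, max_len=4, min_support=5):
    """Mine frequently recurring action subsequences from stored
    trajectories and return them as candidate options (macro-actions)."""
    counts = Counter()
    for actions in trajectories:
        for n in range(min_len, max_len + 1):
            for i in range(len(actions) - n + 1):
                counts[tuple(actions[i:i + n])] += 1
    return [seq for seq, c in counts.items() if c >= min_support]

# Hypothetical usage: each trajectory is the action list of one episode.
logs = [[0, 1, 1, 2, 0, 1, 1, 2], [1, 1, 2, 3, 1, 1, 2]]
options = mine_action_options(logs, min_support=2)
print(options)  # e.g. [(1, 1), (1, 2), (1, 1, 2), ...] added as macro-actions
```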
5.
Stock trading is an important decision-making problem that involves both stock selection and asset management. Though many promising results have been reported for predicting prices, selecting stocks, and managing assets using machine-learning techniques, considering all of them simultaneously is challenging because of their complexity. In this paper, we present a new stock trading method that incorporates dynamic asset allocation in a reinforcement-learning framework. The proposed asset allocation strategy, called meta policy (MP), is designed to utilize the temporal information from both stock recommendations and the ratio of stock funds to total assets. Local traders are constructed with pattern-based multiple predictors and used to decide the purchase amount per recommendation. Formulating the MP in the reinforcement learning framework is achieved by a compact design of the environment and the learning agent. Experimental results on the Korean stock market show that the proposed MP method outperforms other fixed asset-allocation strategies and reduces the risks inherent in local traders.
6.
Efficiency and accuracy are critical in the motion control of a batch process. This paper proposes a new intelligent motion control method for a batch process based on reinforcement learning (RL) and iterative learning control (ILC). The proposed learning-based motion control method enables the system to learn from its previous experience. The motion control method can be divided into two parts: (1) RL-based trajectory optimization and (2) ILC-based positioning control. Experiments were conducted to demonstrate the effectiveness of the proposed method. The results indicate that the proposed method not only reduces the process time effectively while ensuring system stability, but also achieves excellent positioning accuracy.
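As a worked illustration of the ILC half of such a method (not the paper's exact law), a P-type update refined over repeated batch runs of a toy plant:

```python
import numpy as np

def ilc_update(u_prev, e_prev, gain=0.5):
    """One P-type iterative learning control update:
        u_{k+1}(t) = u_k(t) + L * e_k(t)
    where e_k is the positioning error recorded on the previous batch run
    and L is a learning gain. Both the law and the gain are illustrative."""
    return u_prev + gain * e_prev

# Hypothetical run-to-run loop: the same motion is repeated every batch,
# and the feedforward input is refined from the stored error of the last run.
ref = np.sin(np.linspace(0.0, np.pi, 100))   # desired position profile
u = np.zeros_like(ref)
for batch in range(20):
    y = 0.9 * u                  # toy static plant standing in for the axis
    e = ref - y
    u = ilc_update(u, e)
print(np.abs(e).max())           # error contracts by |1 - 0.9*0.5| per run
```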
7.
Adaptive immunity based reinforcement learning
Jungo Ito, Kazushi Nakano, Kazunori Sakurama, Shu Hosokawa. Artificial Life and Robotics, 2008, 13(1): 188-193
Recently, much attention has been paid to intelligent systems that can adapt themselves to dynamic and/or unknown environments through learning. However, traditional learning methods have the disadvantage that learning time grows enormously with the complexity of the systems and environments considered. We therefore propose a novel reinforcement learning method based on adaptive immunity. The proposed method can provide a near-optimal solution with less learning time by self-learning using the concept of adaptive immunity. The validity of our method is demonstrated through simulations on Sutton's maze problem.
This work was presented in part at the 13th International Symposium on Artificial Life and Robotics, Oita, Japan, January 31–February 2, 2008.
8.
This paper presents results from a study of biped dynamic walking using reinforcement learning. During this study, a hardware biped robot was built, and a new reinforcement learning algorithm as well as a new learning architecture were developed. The biped learned dynamic walking without any previous knowledge of its dynamic model. The self-scaling reinforcement (SSR) learning algorithm was developed to deal with reinforcement learning in continuous action domains. The learning architecture was developed to solve complex control problems; it uses modules consisting of simple controllers and small neural networks, and allows easy incorporation of new modules that represent new knowledge or new requirements for the desired task.
9.
In this article, we propose a new control method that combines reinforcement learning (RL) with the concept of sliding mode control (SMC). Remarkable characteristics of SMC are its robustness and stability against deviations from the control conditions; RL, on the other hand, is applicable to complex systems that are difficult to model. However, applying reinforcement learning to a real system has a serious drawback: many trials are required for learning. We aim to develop a new control method that inherits the good characteristics of both approaches. To realize this, we employ the actor-critic method, a kind of RL, and unite it with SMC. We verify the effectiveness of the proposed control method through a computer simulation of inverted pendulum control that makes no use of the pendulum dynamics. In particular, it is shown that the proposed method learns in fewer trials than conventional reinforcement learning.
This work was presented in part at the 13th International Symposium on Artificial Life and Robotics, Oita, Japan, January 31–February 2, 2008.
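A minimal sketch of the actor-critic ingredient is given below, with a linear critic and Gaussian exploration updated by the same TD error. The feature map, gains, and the coupling to SMC are assumptions, not the authors' design.

```python
import numpy as np

rng = np.random.default_rng(0)

class GaussianActorCritic:
    """Model-free linear actor-critic: updates use only observed
    transitions (state features, action, reward), no plant model."""
    def __init__(self, n_feats, alpha=0.01, beta=0.05, gamma=0.98, sigma=0.3):
        self.w = np.zeros(n_feats)       # critic weights: V(s) = w . phi(s)
        self.theta = np.zeros(n_feats)   # actor weights: mean action
        self.alpha, self.beta, self.gamma, self.sigma = alpha, beta, gamma, sigma

    def act(self, phi):
        return self.theta @ phi + self.sigma * rng.standard_normal()

    def update(self, phi, a, r, phi_next, done):
        v, v_next = self.w @ phi, 0.0 if done else self.w @ phi_next
        td = r + self.gamma * v_next - v          # TD error drives both parts
        self.w += self.beta * td * phi            # critic step
        # Gaussian policy-gradient step, scaled by the same TD error:
        self.theta += self.alpha * td * (a - self.theta @ phi) / self.sigma**2 * phi
```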
10.
To address adaptive routing toward mobile sink nodes in wireless sensor networks, and to optimize the use of node energy as well as computation, storage, and communication resources while improving quality-of-service metrics such as transmission delay and delivery ratio, this paper proposes a reinforcement learning-based adaptive routing method with a composite reward function that jointly optimizes energy, delay, and delivery ratio. The routing protocol is designed in detail in terms of packet structure, route initialization, and path selection; sink-node announcements together with periodic flooding are used to accelerate convergence so that fast-moving sink nodes can be supported. Theoretical analysis shows that the reinforcement learning-based routing method converges quickly, has low protocol overhead, and requires little storage and computation, making it suitable for energy- and resource-constrained sensor nodes. Performance evaluation and comparative analysis on a simulation platform verify the feasibility and advantages of the proposed adaptive routing algorithm.
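The kind of Q-learning hop selection and composite reward described here can be sketched as follows; the weights and exact reward shaping are assumptions:

```python
def q_routing_update(Q, node, neighbor, reward, next_best, alpha=0.2, gamma=0.9):
    """One Q-learning step for next-hop selection:
        Q(node, nb) += alpha * (r + gamma * max_k Q(nb, k) - Q(node, nb))
    """
    q = Q.get((node, neighbor), 0.0)
    Q[(node, neighbor)] = q + alpha * (reward + gamma * next_best - q)

def composite_reward(energy, delay, delivered, w_e=0.4, w_d=0.3, w_r=0.3):
    """Composite reward trading off residual energy (higher is better),
    per-hop delay (lower is better), and delivery success. Weights are
    illustrative stand-ins for the paper's reward design."""
    return w_e * energy - w_d * delay + w_r * (1.0 if delivered else -1.0)

# Hypothetical per-packet step at node 3 forwarding via neighbor 7:
Q = {}
r = composite_reward(energy=0.8, delay=0.05, delivered=True)
q_routing_update(Q, node=3, neighbor=7, reward=r, next_best=0.0)
print(Q)
```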
11.
Batch or semi-batch processing is becoming more prevalent in industrial chemical manufacturing, but it has not benefited from advanced control technologies to the same degree as continuous processing. This is due to several unique aspects of batch processing that pose challenges for model-based optimal control, such as highly nonstationary operation and significant run-to-run variability. While existing advanced control methods like model predictive control (MPC) have been extended to address some of these challenges, they still suffer from limitations that have prevented widespread industrial adoption. Reinforcement learning (RL), in which the agent learns the optimal policy by interacting with the system, offers an alternative to the existing model-based methods and has the potential to bring significant improvements to industrial batch process control practice. With this motivation, this paper examines the advantages that RL offers over traditional model-based optimal control methods and how it can be tailored to the characteristics of industrial batch process control problems. After a brief review of existing batch control methods, the basic concepts and algorithms of RL are introduced and issues in applying them to batch process control are discussed. The nascent literature on RL in batch process control, both in recipe optimization and in tracking control, is briefly reviewed, and our perspectives on future research directions are shared.
12.
A reinforcement learning method based on an immune network adapted to a semi-Markov decision process
Nagahisa Kogawa, Masanao Obayashi, Kunikazu Kobayashi, Takashi Kuremoto. Artificial Life and Robotics, 2009, 13(2): 538-542
The immune system is attracting attention as a new paradigm of biological information processing. It is a large-scale system equipped with a complicated biological defense function, and it has memory and learning functions that rely on interactions such as stimulation and suppression between immune cells. In this article, we propose and construct a reinforcement learning method based on an immune network adapted to a semi-Markov decision process (SMDP). We show through computer simulation that the proposed method can deal with a problem modeled as an SMDP environment.
This work was presented in part at the 13th International Symposium on Artificial Life and Robotics, Oita, Japan, January 31–February 2, 2008.
13.
This paper studies evolutionary programming and adopts reinforcement learning to learn individual mutation operators. A novel algorithm named RLEP (Evolutionary Programming based on Reinforcement Learning) is proposed. In this algorithm, each individual learns its optimal mutation operator from the immediate and delayed performance of the candidate operators. Mutation operator selection is thus mapped to a reinforcement learning problem, and reinforcement learning methods are used to learn optimal policies by maximizing the accumulated rewards: for each candidate operator a Q-function value is calculated, and the operator that maximizes the learned Q-function value is selected. Four mutation operators are employed as the basic candidates in RLEP, and one is selected for each individual in each generation. Our simulations show that the performance of RLEP is as good as or better than the best of the four basic mutation operators.
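The per-individual operator selection can be sketched as a small Q-learning problem over four candidate mutation operators, rewarded by fitness improvement. The operator set and constants below are illustrative, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(1)

def mutate(x, op, scale=0.5):
    """Four candidate mutation operators (the paper's set may differ)."""
    if op == 0: return x + scale * rng.standard_normal(x.shape)   # Gaussian
    if op == 1: return x + scale * rng.standard_cauchy(x.shape)   # Cauchy
    if op == 2: return x + scale * (rng.random(x.shape) - 0.5)    # uniform
    return x * (1 + 0.1 * rng.standard_normal(x.shape))           # relative

def select_op(q, eps=0.1):
    """Pick the operator with the highest learned Q-value (epsilon-greedy)."""
    return int(rng.integers(4)) if rng.random() < eps else int(np.argmax(q))

def q_update(q, op, reward, alpha=0.1):
    """Credit the operator that produced the offspring with the
    fitness improvement it achieved."""
    q[op] += alpha * (reward - q[op])

f = lambda x: np.sum(x ** 2)          # toy fitness (minimization)
pop = [rng.standard_normal(5) for _ in range(10)]
qs = [np.zeros(4) for _ in pop]
for gen in range(100):
    for i, x in enumerate(pop):
        op = select_op(qs[i])
        child = mutate(x, op)
        reward = f(x) - f(child)      # positive if the child improved
        q_update(qs[i], op, reward)
        if reward > 0:
            pop[i] = child
```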
14.
In this paper, we propose fuzzy logic-based cooperative reinforcement learning for sharing knowledge among autonomous robots. The ultimate goal is to entice bio-insects toward desired goal areas using artificial robots without any human aid. To achieve this goal, we found an interaction mechanism using a specific odor source and performed simulations and experiments [1]. For efficient learning without human aid, we employ cooperative reinforcement learning in a multi-agent domain. Additionally, we design a fuzzy logic-based expertise measurement system to enhance the learning ability. This structure enables the artificial robots to share knowledge while evaluating and measuring the performance of each robot. The performance of the proposed learning algorithms is evaluated through numerous experiments.
15.
16.
Wei-Song Lin. Automatica, 2011, (5): 1047-1052
Adaptive Optimal Control (AOC) by reinforcement synthesis is proposed to facilitate the application of optimal control theory to feedback control. Reinforcement synthesis uses the critic–actor architecture of reinforcement learning to carry out sequential optimization. Optimality conditions for AOC are formulated using the discrete minimum principle, and a proof of the convergence conditions for the reinforcement synthesis algorithm is presented. As the final time extends to infinity, the reinforcement synthesis algorithm becomes equivalent to the Dual Heuristic dynamic Programming (DHP) algorithm, a version of approximate dynamic programming; formulating DHP with the AOC approach thus provides rigorous proofs of optimality and convergence. The efficacy of AOC by reinforcement synthesis is demonstrated by solving a linear quadratic regulator problem.
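For the linear quadratic regulator case, critic–actor sequential optimization reduces to classical policy iteration: evaluate a quadratic critic P for the current gain K, then improve K greedily against P. A self-contained numerical sketch with illustrative system matrices (not from the paper):

```python
import numpy as np

# Discrete double-integrator-like plant, chosen only for illustration.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Qc, Rc = np.eye(2), np.eye(1)

K = np.array([[1.0, 2.0]])                 # initial stabilizing actor gain
for _ in range(50):
    # Critic evaluation: fixed-point iteration on the closed-loop
    # Lyapunov equation P = Q + K'RK + (A-BK)' P (A-BK).
    Acl = A - B @ K
    P = np.zeros((2, 2))
    for _ in range(500):
        P = Qc + K.T @ Rc @ K + Acl.T @ P @ Acl
    # Actor improvement: greedy gain with respect to the critic.
    K = np.linalg.solve(Rc + B.T @ P @ B, B.T @ P @ A)
print(K)   # converges to the optimal LQR gain
```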
17.
Vijaykumar Gullapalli. Robotics and Autonomous Systems, 1995, 15(4): 237-246
Complexity and uncertainty in modern robots and other autonomous systems make it difficult to design controllers that achieve the desired levels of precision and robustness. Learning methods are therefore being incorporated into controllers for such systems, providing the adaptability needed to meet the performance demands of the task. We argue that for learning tasks arising frequently in control applications, the most useful methods in practice are probably those we call direct associative reinforcement learning methods. We describe direct reinforcement learning methods and illustrate with an example their utility for learning skilled robot control under uncertainty.
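Gullapalli's direct associative methods center on stochastic real-valued units. The sketch below captures the flavor (Gaussian exploration around a learned mean, nudged when reinforcement exceeds a learned prediction) with illustrative constants rather than the paper's exact algorithm:

```python
import numpy as np

rng = np.random.default_rng(2)

class SRVUnit:
    """Sketch of a stochastic real-valued (SRV) unit: call act(x) to get
    an exploratory action, then learn(x, r) with the received reinforcement."""
    def __init__(self, n_in, lr=0.1, sigma=0.2):
        self.w = np.zeros(n_in)       # mean-action weights
        self.v = np.zeros(n_in)       # reinforcement-prediction weights
        self.lr, self.sigma = lr, sigma

    def act(self, x):
        self.mu = self.w @ x
        self.a = self.mu + self.sigma * rng.standard_normal()
        return self.a

    def learn(self, x, r):
        r_hat = self.v @ x            # expected reinforcement for this input
        # Move the mean toward perturbations that beat the expectation:
        self.w += self.lr * (r - r_hat) * (self.a - self.mu) / self.sigma * x
        self.v += self.lr * (r - r_hat) * x   # update the predictor too
```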
18.
We describe a new pre-teaching method for reinforcement learning using a self-organizing map (SOM). The purpose is to increase the learning rate using a small amount of teaching data generated by a human expert. In the proposed method, the SOM generates the initial teaching data for the reinforcement learning agent from that small data set. The agent's reinforcement learning function is initialized with the SOM-generated teaching data so as to increase the probability of selecting the actions it estimates to be optimal. Because the agent can obtain high rewards from the start of learning, the learning rate is expected to increase. The results of a mobile robot simulation show that the learning rate increased even though the human expert provided only a small amount of teaching data.
This work was presented in part at the 7th International Symposium on Artificial Life and Robotics, Oita, Japan, January 16–18, 2002.
19.
Learning complex control behavior for autonomous mobile robots is an active research topic. This article presents an intelligent control architecture that integrates learning methods with available domain knowledge. The architecture is based on reinforcement learning and supports continuous input and output parameters, hierarchical learning, multiple goals, self-organizing network topologies, and online learning. As a testbed, the architecture is applied to the six-legged walking machine LAURON to learn leg control and leg coordination.
20.
Design and verification of reinforcement learning attitude control for an unmanned helicopter
To address attitude control of a small unmanned helicopter, and considering that existing model-based control methods depend heavily on prior knowledge of the helicopter's dynamic model and are affected by unmodeled dynamics, a reinforcement learning (RL) based flight control algorithm is designed that compensates for unmodeled uncertainties using only the helicopter's online flight data. To suppress external disturbances and improve robustness, a robust integral of the sign of the error (RISE) control algorithm is also designed. The two algorithms are combined, and a Lyapunov-based analysis proves semi-global asymptotic convergence of the attitude tracking error. Finally, real-time attitude control experiments were conducted on an unmanned helicopter flight control testbed. The experimental results show that the proposed control method achieves good control performance and is robust to system uncertainties and external wind disturbances.
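The RISE term has a standard structure in the literature; a discrete-time sketch with illustrative gains (not the paper's tuned values) is given below:

```python
import numpy as np

def rise_controller(e2, dt, ks=5.0, alpha=2.0, beta=1.0):
    """Discrete sketch of a RISE (Robust Integral of the Sign of the
    Error) feedback term, as commonly written in the literature:
        u(t) = (ks+1)*e2(t) - (ks+1)*e2(0)
               + int_0^t [(ks+1)*alpha*e2(tau) + beta*sgn(e2(tau))] dtau
    where e2 is the filtered tracking error sequence sampled at dt."""
    integ = np.cumsum(((ks + 1) * alpha * e2 + beta * np.sign(e2)) * dt)
    return (ks + 1) * (e2 - e2[0]) + integ

# Hypothetical usage on a recorded filtered-error trace:
t = np.linspace(0.0, 2.0, 200)
e2 = 0.2 * np.exp(-t) * np.sin(5 * t)
u = rise_controller(e2, dt=t[1] - t[0])
print(u[-1])
```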