1.
In recent years, researchers and practitioners alike have devoted a great deal of attention to supply chain management (SCM). The main focus of SCM is the need to integrate operations along the supply chain as part of an overall logistic support function. At the same time, the need for globalization requires that the solution of SCM problems be performed in an international context as part of what we refer to as Global Supply Chain Management (GSCM). This paper proposes an approach to study GSCM problems using an artificial intelligence framework called reinforcement learning (RL). The RL framework allows the management of global supply chains under an integration perspective. The RL approach has remarkable similarities to that of an autonomous agent network (AAN); a similarity that we shall discuss. The RL approach is applied to a case example, namely a networked production system that spans several geographic areas and logistics stages. We discuss the results and provide guidelines and implications for practical applications.
2.
3.
Hendrik Schäbe 《Quality and Reliability Engineering International》1994,10(3):229-236
In this paper a stochastic approach to consequence tree analysis is considered. A consequence tree is a set of events logically combined by OR and AND connections that occur in sequence, some being prerequisites for others. Consequence trees are applicable to failure propagation in plants. Facilitating paths and inhibiting cuts are defined and considered. The distribution of the time the system needs to reach a certain top event is obtained. Probability weights are defined that can be used to obtain the weakest link in the consequence tree.
4.
This paper addresses a stochastic economic lot scheduling problem (SELSP) for a single machine make-to-stock production system in which the demands and the processing times for N types of products are random. The sequence-independent setup times and costs are explicitly considered and may have different values for various types of products. The SELSP is to decide when, what, and how much (the lot size) to produce so that the long-run average total cost, including setup, holding and backorder costs, is minimised. We develop a mathematical model and propose two reinforcement learning (RL) algorithms for real-time decision-making, in which a decision agent is assigned to the machine and improves the accuracy of its action-selection decisions via a ‘learning’ process. Specifically, one is a Q-learning algorithm for a semi-Markov decision process (QLS) and another is a Q-learning algorithm with a learning-improvement heuristic (QLIH) to further improve the performance of QLS. We compare the performance of QLS and QLIH with a benchmarking Brownian policy and the first-come-first-served policy. The numerical results show that QLIH outperforms QLS and both benchmarking policies.
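As a rough illustration of the kind of learning loop this abstract refers to, here is a minimal sketch of plain tabular Q-learning on a toy make-to-stock machine. It is not the QLS or QLIH algorithm from the paper: it assumes a discounted, discrete-time setting instead of the paper's semi-Markov average-cost formulation, and the environment `toy_step`, its cost figures, and all parameter values are hypothetical choices made for the example.

```python
import random


def toy_step(stock, action, rng):
    """Hypothetical one-product machine: action 1 produces a unit, 0 idles.

    A unit of demand arrives with probability 0.5. Unmet demand costs 10,
    and each unit left in stock costs 1 per period (capacity is 3 units).
    """
    stock = min(stock + action, 3)
    demand = 1 if rng.random() < 0.5 else 0
    reward = 0.0
    if demand > stock:
        reward -= 10.0          # backorder / lost-sale penalty
    else:
        stock -= demand
    reward -= 1.0 * stock       # holding cost on ending inventory
    return stock, reward


def q_learning(n_states, n_actions, step, episodes=500, horizon=50,
               alpha=0.05, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy action selection."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        for _ in range(horizon):
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)            # explore
            else:
                action = max(range(n_actions), key=lambda a: q[state][a])
            next_state, reward = step(state, action, rng)
            # One-step temporal-difference update toward the Bellman target.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


if __name__ == "__main__":
    table = q_learning(n_states=4, n_actions=2, step=toy_step)
    for stock_level, values in enumerate(table):
        print(stock_level, [round(v, 2) for v in values])
```

In this toy setting the learned values should favour producing when the stock is empty, since idling risks the large shortage penalty.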
5.
This paper gives a compact, self-contained tutorial survey of reinforcement learning, a tool that is increasingly finding
application in the development of intelligent dynamic systems. Research on reinforcement learning during the past decade has
led to the development of a variety of useful algorithms. This paper surveys the literature and presents the algorithms in
a cohesive framework.
6.
A physical approach to structural stochastic optimal controls (cited by: 3; self-citations: 0; citations by others: 3)
The generalized density evolution equation proposed in recent years profoundly reveals the intrinsic connection between deterministic systems and stochastic systems by introducing physical relationships into stochastic systems. On this basis, a physical stochastic optimal control scheme of structures is developed in this paper, which extends the classical stochastic optimal control methods, and can govern the evolution details of system performance, while the classical stochastic optimal control schemes, such as the LQG control, essentially hold the system statistics since there is still a lack of efficient methods to solve the response process of the stochastic systems with strong nonlinearities in the context of classical random mechanics. It is practically useful to general nonlinear systems driven by non-stationary and non-Gaussian stochastic processes. The celebrated Pontryagin’s maximum principle is employed to conduct the physical solutions of the state vector and the control force vector of stochastic optimal controls of closed-loop systems by synthesizing deterministic optimal control solutions of a collection of representative excitation driven systems using the generalized density evolution equation. Further, the selection strategy of weighting matrices of stochastic optimal controls is discussed to construct optimal control policies based on a control criterion of system second-order statistics assessment. The stochastic optimal control of an active tension control system is investigated, subjected to the random ground motion represented by a physical stochastic earthquake model. The investigation reveals that the structural seismic performance is significantly improved when the optimal control strategy is applied. A comparative study, meanwhile, between the advocated method and the LQG control is carried out, indicating that the LQG control using nominal Gaussian white noise as the external excitation cannot be used to design a reasonable control system for civil engineering structures, while the advocated method can reach the desirable objective performance. The optimal control strategy is then further employed in the investigation of the stochastic optimal control of an eight-storey shear frame. Numerical examples elucidate the validity and applicability of the developed physical stochastic optimal control methodology.
7.
This paper proposes and tests an approximation of the solution of a class of piecewise deterministic control problems, typically used in the modeling of manufacturing flow processes. This approximation uses a stochastic programming approach on a suitably discretized and sampled system. The method proceeds through two stages: (i) the Hamilton-Jacobi-Bellman (HJB) dynamic programming equations for the finite horizon continuous time stochastic control problem are discretized over a set of sampled times; this defines an associated discrete time stochastic control problem which, due to the finiteness of the sample path set for the Markov disturbance process, can be written as a stochastic programming problem; and (ii) the very large event tree representing the sample path set is replaced with a reduced tree obtained by randomly sampling over the set of all possible paths. It is shown that the solution of the stochastic program defined on the randomly sampled tree converges toward the solution of the discrete time control problem when the sample size increases to infinity. The discrete time control problem solution converges to the solution of the flow control problem when the discretization mesh tends to zero. A comparison with a direct numerical solution of the dynamic programming equations is made for a single part manufacturing flow control model in order to illustrate the convergence properties. Applications to larger models affected by the curse of dimensionality in standard dynamic programming techniques show the possible advantages of the method.
8.
9.
A reinforcement learning approach to a single leg airline revenue management problem with multiple fare classes and overbooking (cited by: 1; self-citations: 0; citations by others: 1)
The airline industry strives to maximize the revenue obtained from the sale of tickets on every flight. This is referred to as revenue management and it forms a crucial aspect of airline logistics. Ticket pricing, seat or discount allocation, and overbooking are some of the important aspects of a revenue management problem. Though ticket pricing is usually heavily influenced by factors beyond the control of an airline company, a significant amount of control can be exercised over the seat allocation and the overbooking aspects. A realistic model for a single leg of a flight should consider multiple fare classes, overbooking of the flight, concurrent demand arrivals of passengers from the different fare classes, and class-dependent, random cancellations. Accommodating all these factors in one optimization model is a challenging task because that makes it a very large-scale stochastic optimization problem. Almost all papers in the existing literature either accommodate only a subset of these factors or use a discrete approximation in order to make the model tractable. We consider all these factors and cast the single leg problem as a semi-Markov Decision Problem (SMDP) under the average reward optimizing criterion over an infinite time horizon. We solve it using a stochastic optimization technique called Reinforcement Learning. Not only is Reinforcement Learning able to scale up to a huge state-space but because it is simulation-based it can also handle complex modeling assumptions such as the ones mentioned above. The state-space of the numerical test problem scenarios considered here is non-denumerable; its countable part being of the order of 10^9. Our solution procedure involves a multi-step extension of the SMART algorithm which is based on the one-step Bellman equation. Numerical results presented here show that our approach is able to outperform a heuristic, namely the nested version of the EMSR heuristic, which is widely used in the airline industry. We also present a detailed study of the sensitivity of some modeling parameters via a full factorial experiment.
10.
Statistical Process Control (SPC) techniques have been successfully used in manufacturing industries to trigger and identify the root cause of variations so as to promote quality improvement. This paper develops an SPC framework to identify important changes deserved in business activity monitoring. To model and track thousands of diversified customer behaviors, the proposed SPC system consists of efficient and robust profiling methods to accommodate different behavior patterns including business changes, structural breakdowns, and unnecessary errors. Several customer profiling techniques are discussed and the activity monitoring performance based on the profiling algorithms is compared in a simulation example and a customer churn detection example in a telecommunications setting. The enhanced system will allow business managers and engineers to establish successful customer loyalty programs for churn prevention and fraud detection.
11.
Bayesian forecasting models provide distributional estimates for random parameters, and relative to classical schemes, have the advantage that they can rapidly capture changes in nonstationary systems using limited historical data. Unlike deterministic optimization, stochastic programs explicitly incorporate distributions for random parameters in the model formulation, and thus have the advantage that the resulting solutions more fully hedge against future contingencies. In this paper, we exploit the strengths of Bayesian prediction and stochastic programming in a rolling-horizon approach that can be applied to solve real-world problems. We illustrate the methodology on an employee production scheduling problem with uncertain up-times of manufacturing equipment and uncertain production rates. Computational results indicate the value of our approach.
12.
Stochastic inventory control in multi-echelon systems poses hard problems in optimisation under uncertainty. Stochastic programming can solve small instances optimally, and approximately solve larger instances via scenario reduction techniques, but it cannot handle arbitrary nonlinear constraints or other non-standard features. Simulation optimisation is an alternative approach that has recently been applied to such problems, using policies that require only a few decision variables to be determined. However, to find optimal or near-optimal solutions we must consider exponentially large scenario trees with a corresponding number of decision variables. We propose instead a neuroevolutionary approach: using an artificial neural network to compactly represent the scenario tree, and training the network by a simulation-based evolutionary algorithm. We show experimentally that this method can quickly find high-quality plans using networks of a very simple form.
13.
A. Barreiros 《Engineering Optimization》2013,45(5):475-488
A new numerical approach to the solution of two-stage stochastic linear programming problems is described and evaluated. The approach avoids the solution of the first-stage problem and uses the underlying deterministic problem to generate a sequence of values of the first-stage variables which lead to successive improvements of the objective function towards the optimal policy. The model is evaluated using an example in which randomness is described by two correlated factors. The dynamics of these factors are described by stochastic processes simulated using lattice techniques. In this way, discrete distributions of the random parameters are assembled. The solutions obtained with the new iterative procedure are compared with solutions obtained with a deterministic equivalent linear programming problem. It is concluded that they are almost identical. However, the computational effort required for the new approach is negligible compared with that needed for the deterministic equivalent problem.
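To make the two-stage structure mentioned in this abstract concrete, the following minimal sketch evaluates a toy two-stage problem with a discrete scenario distribution by brute-force enumeration. It is not the iterative procedure described in the paper: the newsvendor-style model, the scenario list `SCENARIOS`, and the cost and price values are all hypothetical, chosen only to show a first-stage decision being scored against probability-weighted second-stage outcomes.

```python
def expected_profit(order_qty, scenarios, unit_cost, unit_price):
    """Expected profit of a first-stage order quantity.

    First stage: pay unit_cost per unit ordered, before demand is known.
    Second stage: once demand is revealed, sell min(order_qty, demand),
    which is the optimal recourse of the tiny LP
    "maximise unit_price * s subject to s <= order_qty, s <= demand".
    """
    revenue = sum(prob * unit_price * min(order_qty, demand)
                  for demand, prob in scenarios)
    return revenue - unit_cost * order_qty


def best_first_stage(scenarios, unit_cost, unit_price, max_qty):
    """Enumerate integer order quantities and keep the best one."""
    return max(range(max_qty + 1),
               key=lambda x: expected_profit(x, scenarios,
                                             unit_cost, unit_price))


# Hypothetical discrete demand distribution: (demand, probability) pairs.
SCENARIOS = [(10, 0.3), (20, 0.4), (30, 0.3)]

if __name__ == "__main__":
    x_star = best_first_stage(SCENARIOS, unit_cost=1.0, unit_price=3.0,
                              max_qty=40)
    print(x_star, expected_profit(x_star, SCENARIOS, 1.0, 3.0))
```

Enumeration is only workable because the example has a single integer first-stage variable; realistic instances need a solver or a decomposition scheme.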
14.
15.
《Composites Science and Technology》2002,62(10-11):1381-1395
A three-dimensional stochastic finite element technique is presented herein for simulating the nonlinear behaviour of strand-based wood composites with strands of varying grain-angle. The approach is based on the constitutive properties of the individual strands to study the effects of varying strand characteristics (such as species or geometry) on the performance of the member. The constitutive properties of the strands are found empirically and are subsequently used in a three-dimensional finite element program. The program is formulated in a probabilistic manner using random variable material properties as input. The constitutive model incorporates classic plasticity theory whereby anisotropic hardening and eventual failure of the material is established by the Tsai–Wu criterion with an associated flow rule. Failure is marked by an upper bound surface whereupon either perfect plasticity (i.e. ductile behavior) or an abrupt loss of strength and stiffness (i.e. brittle behavior) ensues. The ability of this technique to reproduce experimental findings for the stress–strain curves of angle-ply laminates in tension, compression as well as three-point bending is validated.
16.
Y. Emre Kılıç 《International Journal of Production Research》2013,51(4):1291-1306
In many industries, distribution activities are realised in a dynamic environment including uncertainties. Besides, adding transportation mode alternatives, inventory-stocking opportunities in wholesalers, unmet demand permission in distribution centres, etc. increases the difficulty of problem modelling and solving for large-scale networks. In this study, the problem of physical distribution network (DN) design with a profit maximisation objective function is modelled to tackle realistic cases. A two-stage stochastic mixed-integer programming method is used to handle the uncertainties and to consider the probable scenarios. The first-stage decisions of the proposed model are related to the selection of facility locations at the strategic level, and the second-stage decisions are related to the transported and stocked products or unmet demand quantities. Here, a multi-product, two-echelon, multi-mode and multi-period network model is applied to a hypothetically created problem, inspired by the physical DN of home appliance companies. Various scenarios including stochastic demand and price data with different realisation probabilities are used in the model. The motivation of this study is the lack of reaching a global optimum result using transportation modes as stochastic parameters, considering their own lead times and capacities. Finally, various results are obtained for different cases and analysed in detail.
17.
In the paper, we show how some basic informational quality measures (e.g. the Shannon entropy and the relative entropy/Kullback-Leibler divergence) defined for stochastic dynamical systems change in time and how they depend on the system properties and intensity of random disturbances. First, the Liouvillian systems (when randomness is present in the initial states only) are discussed and then various (linear and nonlinear) systems with random external excitations are treated in detail. Both general and specific systems are considered, including numerical and graphical illustrations.
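The two measures named in this abstract can be illustrated on a much simpler object than the continuous systems the paper treats. The sketch below assumes a finite-state, discrete-time Markov chain (an assumption made for this example, not the paper's setting) and tracks the Kullback-Leibler divergence between the state distribution and the stationary distribution over time. The transition matrix `P`, its stationary distribution `PI`, and the initial distribution are hypothetical.

```python
import math


def shannon_entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)


def kl_divergence(p, q):
    """Relative entropy D(p || q) in nats; infinite if q misses mass of p."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi == 0:
                return math.inf
            total += pi * math.log(pi / qi)
    return total


def step_distribution(p, transition):
    """Advance a row-vector distribution one step: p_next = p @ transition."""
    n = len(p)
    return [sum(p[i] * transition[i][j] for i in range(n)) for j in range(n)]


def kl_trajectory(p0, transition, stationary, n_steps):
    """D(p_t || stationary) for t = 0, 1, ..., n_steps."""
    p, values = list(p0), []
    for _ in range(n_steps + 1):
        values.append(kl_divergence(p, stationary))
        p = step_distribution(p, transition)
    return values


# Hypothetical two-state chain and its stationary distribution.
P = [[0.9, 0.1],
     [0.2, 0.8]]
PI = [2 / 3, 1 / 3]

if __name__ == "__main__":
    for t, value in enumerate(kl_trajectory([0.1, 0.9], P, PI, 10)):
        print(t, value)
```

For a chain like this the divergence from the stationary distribution cannot increase from one step to the next, which gives a simple discrete analogue of an informational measure changing in time.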
18.
Yield improvement is one of the most important topics in semiconductor manufacturing. Traditional statistical methods are no longer feasible or, where feasible, efficient for analysing the vast amounts of data in a modern semiconductor manufacturing process. For instance, a typical wafer fabrication process has more than 1000 process parameters to record on a single wafer and one manufacturing plant may produce tens of thousands of wafers a day. Traditional approaches have limits in extracting the full benefits of the data. Therefore, the manufacturing data is poorly exploited even in the most sophisticated processes. Now it is widely accepted that machine learning techniques can provide powerful tools for continuous quality improvement in a large and complex process such as semiconductor manufacturing. In this work, memory based reasoning (MBR) and neural network (NN) learning are combined for yield improvement and an integrated framework is proposed for a yield management system based on hybrid machine learning techniques. In this hybrid system of NN and MBR, the feature weight set which is calculated from the trained neural network plays the core role in connecting both learning strategies, and an explanation of a prediction can be given by obtaining and presenting the most similar examples from the case base. The proposed system has advantages in typical semiconductor manufacturing problems such as scalability to large datasets, high dimensions and adaptability to dynamic situations.
19.
Interactions among individuals in natural populations often occur in a dynamically changing environment. Understanding the role of environmental variation in population dynamics has long been a central topic in theoretical ecology and population biology. However, the key question of how individuals, in the middle of challenging social dilemmas (e.g. the ‘tragedy of the commons’), modulate their behaviours to adapt to the fluctuation of the environment has not yet been addressed satisfactorily. Using evolutionary game theory, we develop a framework of stochastic games that incorporates the adaptive mechanism of reinforcement learning to investigate whether cooperative behaviours can evolve in the ever-changing group interaction environment. When the action choices of players are just slightly influenced by past reinforcements, we construct an analytical condition to determine whether cooperation can be favoured over defection. Intuitively, this condition reveals why and how the environment can mediate cooperative dilemmas. Under our model architecture, we also compare this learning mechanism with two non-learning decision rules, and we find that learning significantly improves the propensity for cooperation in weak social dilemmas, and, in sharp contrast, hinders cooperation in strong social dilemmas. Our results suggest that in complex social–ecological dilemmas, learning enables the adaptation of individuals to varying environments.