Similar literature
 Found 20 similar documents; search took 46 ms
1.
2.
In this paper, we first propose a received signal strength (RSS)-based single-attribute handoff decision algorithm and investigate a handoff decision model based on connection lifetime, which keeps mobile terminals (MTs) in the preferred network long enough. Since the preferred quality of service (QoS) parameters may differ among MTs, we then formulate the vertical handoff decision problem as a Markov decision process, with the objectives of maximizing the expected total reward and minimizing the average number of handoffs. A reward function is constructed to assess the QoS during each connection, and the G1 and entropy methods are applied iteratively to obtain a stationary deterministic handoff decision policy. Numerical results demonstrate the superiority of our proposed schemes over other existing algorithms.
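
As a rough illustration of how a Markov decision process yields a stationary deterministic handoff policy, the sketch below runs value iteration on a toy model. Everything in it is an assumption made for illustration and not the paper's model: the state is the currently connected network, the action is the network to use next, transitions are deterministic, the reward is a hypothetical per-network QoS score minus a fixed handoff cost, and a discount factor `gamma` is used. The G1 and entropy weighting steps are not reproduced.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Solve a finite discounted MDP.

    P[a][s][t] is the probability of moving from state s to state t under
    action a; R[s][a] is the immediate reward. Returns (V, policy).
    """
    n_states = len(R)
    n_actions = len(R[0])
    V = [0.0] * n_states
    for _ in range(max_iter):
        # Bellman backup: Q(s, a) = R(s, a) + gamma * sum_t P(t | s, a) * V(t)
        Q = [[R[s][a] + gamma * sum(P[a][s][t] * V[t] for t in range(n_states))
              for a in range(n_actions)]
             for s in range(n_states)]
        V_new = [max(row) for row in Q]
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            break
    # Greedy (stationary, deterministic) policy with respect to the last Q.
    policy = [max(range(n_actions), key=lambda a: Q[s][a])
              for s in range(n_states)]
    return V, policy


def toy_handoff_mdp(qos, handoff_cost):
    """Hypothetical toy model: state = current network, action = next network."""
    n = len(qos)
    # Choosing network a always leads to being connected to network a.
    P = [[[1.0 if t == a else 0.0 for t in range(n)] for _s in range(n)]
         for a in range(n)]
    # Reward: QoS of the chosen network, minus a penalty when a handoff occurs.
    R = [[qos[a] - (handoff_cost if a != s else 0.0) for a in range(n)]
         for s in range(n)]
    return P, R


if __name__ == "__main__":
    P, R = toy_handoff_mdp(qos=[1.0, 0.8], handoff_cost=0.5)
    V, policy = value_iteration(P, R)
    print("values:", V, "policy:", policy)
```

With a small handoff cost the greedy policy moves every terminal to the higher-QoS network; raising the cost makes staying put optimal, which is the trade-off between total reward and number of handoffs that the abstract describes.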

3.
We consider the design and analysis of algorithms that learn from the consequences of their actions with the goal of maximizing their cumulative reward, when the consequence of a given action is felt immediately, and a linear function, which is unknown a priori, (approximately) relates a feature vector for each action/state pair to the (expected) associated reward. We focus on two cases, one in which a continuous-valued reward is (approximately) given by applying the unknown linear function, and another in which the probability of receiving the larger of binary-valued rewards is obtained. For these cases we provide bounds on the per-trial regret for our algorithms that go to zero as the number of trials approaches infinity. We also provide lower bounds that show that the rate of convergence is nearly optimal.

4.
We consider a class of queueing networks referred to as "generalized constrained queueing networks", which form the basis of several different communication networks and information systems. These networks consist of a collection of queues such that only certain sets of queues can be served concurrently. Whenever a queue is served, the system receives a certain reward. Different rewards are obtained for serving different queues, and furthermore, the reward obtained for serving a queue depends on the set of concurrently served queues. We demonstrate that the dependence of the rewards on the schedules alters fundamental relations between performance metrics such as throughput and stability. Specifically, maximizing the throughput is no longer equivalent to maximizing the stability region; we therefore need to maximize one subject to certain constraints on the other. Since stability is critical for bounding packet delays and buffer overflow, we focus on maximizing the throughput subject to stabilizing the system. We design provably optimal scheduling strategies that attain this goal by scheduling the queues for service based on the queue lengths and the rewards provided by different selections. The proposed scheduling strategies are, however, computationally complex. We subsequently develop techniques that reduce the complexity and yet attain the same throughput and stability region. We demonstrate that our framework is general enough to accommodate random rewards and random scheduling constraints.

5.
We study the design of incentive contracts based on customer satisfaction (CS) surveys with reward budget limits. We extend principal‐agent models to consider budget constraints, survey response rates, and correlation between CS measure and demand. We derive the optimal incentive contract and study the impacts of these factors on contract performance. In contrast to the common belief that customer future values are the drivers of CS incentives, we show that CS incentives can benefit principals even in a single‐period setting where customers bring no future value. Improvements can be achieved without increasing total reward, because the CS incentive program reveals additional information about agents' service effort and diversifies their risk. Such effects are overlooked in existing CS research. With consideration of correlation between sales and CS measures, we provide a metrics selection rule regarding which reward(s) (CS, sales commission, or both) should be included in an incentive plan. We also study cumulative incentive schemes based on commonly used average CS measures and show that such incentive schemes may fail to motivate agents to increase service effort. Therefore, designing proper reward schemes is a critical issue for effective CS management and deserves future research.

6.
This paper presents a comprehensive evaluation of aperiodic time-based checkpointing and rejuvenation schemes that maximize the steady-state system availability in an operational software system. We consider two kinds of maintenance policies: checkpointing prior to rejuvenating (CPTR) and rejuvenating prior to checkpointing (RPTC). These schemes complement each other in scheduling checkpoints and rejuvenation points. In addition, under a periodic full maintenance operation, we show by applying dynamic programming that an aperiodic checkpointing or rejuvenation scheme is optimal for maximizing the steady-state system availability. In numerical examples, CPTR and RPTC are compared under the same overhead parameters, and their effects on maximizing the steady-state system availability are investigated.

7.
Optimal parameter selection is an important aspect of optimizing system performance. This paper examines the effect of different incentive structures, including reward- and penalty-based structures, for employees within an engineering firm on the value captured by that firm. Incentives are used to communicate the firm's values to the employee without revealing the firm's value function. We use a high-speed milling example to illustrate the approach and derive results. We show that, in certain cases, simple incentive structures can be aligned such that they induce profit-maximizing behaviour. In other cases, we show that incentive structures result in a loss of value that we term the value gap. In the milling case considered, reward-based incentives coincide with optimal parameters, while penalty-based incentives result in a greater than four-fold increase in costs. The effect of uncertainty within a system can also be analysed. We consider uncertainty in the process dynamics as well as in tool life, and show that including uncertainty in the analysis may not change the results in some cases.

8.
Combinatorial reverse auctions represent a popular business model in procurement. For multiple buyers, different procurement models based on combinatorial reverse auctions may be applied. For example, each buyer may hold one combinatorial reverse auction independently. Alternatively, the buyers may delegate the auction to a group-buyer and let the group-buyer hold a single combinatorial reverse auction on behalf of all the buyers. Combining combinatorial reverse auctions with the group-buying model makes it possible to significantly reduce the overall cost of acquiring the required items, owing to complementarities between items. However, combinatorial reverse auctions suffer from high computational complexity. To assess the advantage of combining group-buying with combinatorial reverse auctions, three issues must be addressed: performance, computational efficiency and the scheme to reward the buyers. This motivates us to compare the performance and efficiency of the two aforementioned combinatorial reverse auction models and to study possible schemes to reward the buyers. To achieve these objectives, we first illustrate the advantage of group-buying-based combinatorial reverse auctions over multiple independent combinatorial reverse auctions. We then formulate the problems for these two models and propose solution algorithms for them. We compare the performance and computational efficiency of the two models. Our analysis indicates that a group-buying-based combinatorial reverse auction not only outperforms multiple independent combinatorial reverse auctions but is also more computationally efficient. We also propose a non-uniform scheme to reward the buyers in group-buying-based combinatorial reverse auctions.

9.
In this paper, we introduce triangular subdivision operators that are composed of a refinement operator and several averaging operators. The refinement operator splits each triangle uniformly into four congruent triangles, and in each averaging operation every vertex is replaced by a convex combination of itself and its neighboring vertices. These operators form an infinite class of triangular subdivision schemes that includes Loop's algorithm with a restricted parameter range and the midpoint schemes for triangular meshes. We analyze the smoothness of the resulting subdivision surfaces at their regular and extraordinary points by generalizing an established technique for analyzing midpoint subdivision on quadrilateral meshes. General triangular midpoint subdivision surfaces are smooth at all regular points, and they are also smooth at extraordinary points under certain conditions. We show some general triangular subdivision surfaces and compare them with Loop subdivision surfaces.
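
The two building blocks named in this abstract, a 1-to-4 refinement and a vertex-averaging pass, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's operators: the mesh is an indexed triangle list, and the averaging step uses a single hypothetical weight `alpha` on the vertex itself with the remaining weight spread uniformly over its edge neighbors, which is only one possible convex combination.

```python
def refine(vertices, triangles):
    """Split every triangle into four by inserting one midpoint per edge."""
    vertices = list(vertices)
    midpoint_index = {}  # undirected edge (lo, hi) -> index of its midpoint

    def midpoint(i, j):
        key = (min(i, j), max(i, j))
        if key not in midpoint_index:
            a, b = vertices[i], vertices[j]
            vertices.append(tuple((p + q) / 2.0 for p, q in zip(a, b)))
            midpoint_index[key] = len(vertices) - 1
        return midpoint_index[key]

    new_triangles = []
    for i, j, k in triangles:
        ij, jk, ki = midpoint(i, j), midpoint(j, k), midpoint(k, i)
        # Three corner triangles plus the central one, same orientation.
        new_triangles += [(i, ij, ki), (ij, j, jk), (ki, jk, k), (ij, jk, ki)]
    return vertices, new_triangles


def average(vertices, triangles, alpha=0.5):
    """Replace each vertex by alpha * itself + (1 - alpha) * neighbor mean.

    alpha in [0, 1] keeps the result a convex combination (assumed weights).
    """
    neighbors = {i: set() for i in range(len(vertices))}
    for i, j, k in triangles:
        neighbors[i] |= {j, k}
        neighbors[j] |= {i, k}
        neighbors[k] |= {i, j}
    result = []
    for i, v in enumerate(vertices):
        nbrs = neighbors[i]
        if not nbrs:  # isolated vertex: nothing to average with
            result.append(tuple(v))
            continue
        mean = tuple(sum(vertices[n][d] for n in nbrs) / len(nbrs)
                     for d in range(len(v)))
        result.append(tuple(alpha * p + (1.0 - alpha) * m
                            for p, m in zip(v, mean)))
    return result


def subdivide(vertices, triangles, averaging_passes=1, alpha=0.5):
    """One subdivision step: refine once, then apply the averaging passes."""
    vertices, triangles = refine(vertices, triangles)
    for _ in range(averaging_passes):
        vertices = average(vertices, triangles, alpha)
    return vertices, triangles
```

Calling `subdivide` repeatedly on a coarse mesh gives successively finer meshes; varying the number of averaging passes is what generates a family of schemes in this sketch.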

10.
Unbalanced energy consumption is an inherent problem in wireless sensor networks characterized by multihop routing and a many-to-one traffic pattern, and this uneven energy dissipation can significantly reduce network lifetime. In this paper, we study the problem of maximizing network lifetime by balancing energy consumption in uniformly deployed data-gathering sensor networks. We formulate the energy consumption balancing problem as an optimal transmitting data distribution problem by combining the ideas of corona-based network division and a mixed-routing strategy together with data aggregation. We first propose a localized zone-based routing scheme that guarantees balanced energy consumption among nodes within each corona. We then design an offline centralized algorithm with time complexity O(n) (n is the number of coronas) to solve the transmitting data distribution problem, aimed at balancing energy consumption among nodes in different coronas. An approach for computing the number of coronas that maximizes network lifetime is also presented. Based on the mathematical model, an energy-balanced data gathering (EBDG) protocol is designed, and a solution for extending EBDG to large-scale data-gathering sensor networks is also presented. Simulation results demonstrate that EBDG significantly outperforms conventional multihop transmission schemes, direct transmission schemes, and cluster-head rotation schemes in terms of network lifetime.

11.
In facility layout design, the problem of locating facilities with material flow between them is formulated as a quadratic assignment problem (QAP), in which the total cost of moving the required material between the facilities is minimized and the cost is defined by a quadratic function. In this paper, we propose a modification of the iterated fast local search (IFLS) algorithm with a new recombination crossover operator; the modified IFLS is referred to as NIFLS. The idea incorporated in NIFLS is iterated self-improvement with an evolutionary perturbation tool, which includes (i) recombination crossover as the perturbation tool and (ii) self-improvement in the mutation operation followed by a local search. Three schemes of NIFLS are proposed, and the solution qualities obtained by the three schemes are compared. We test our algorithm on all the benchmark instances of QAPLIB, a well-known library of QAP instances. The proposed recombination crossover with sliding mutation (RCSM) scheme of NIFLS clearly outperforms the other two schemes.

12.
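To address the problem that mobile terminals in heterogeneous network environments must select a network during vertical handoff, a QoS-oriented Markov selection decision algorithm is proposed. By closely tying the construction of the algorithm's model to the characteristics of the heterogeneous environment, and by correctly defining and solving the reward function, the algorithm selects a suitable access network for the user and maximizes the long-term QoS benefit for users in heterogeneous networks. Simulation results show that the algorithm effectively improves decision quality and the QoS of services.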

13.
This paper develops a method for solving multiple attribute decision-making problems with single-valued neutrosophic information or interval neutrosophic information. We first propose two discrimination functions, referred to as the score function and the accuracy function, for ranking neutrosophic numbers. An optimization model for determining attribute weights that are partly known is established based on the maximizing deviation method. For the special situation where the information about attribute weights is completely unknown, we propose another optimization model. A practical formula for determining the attribute weights is obtained by solving a proposed nonlinear optimization problem. To aggregate the neutrosophic information corresponding to each alternative, we utilize the neutrosophic weighted averaging operators, namely the single-valued neutrosophic weighted averaging operator and the interval neutrosophic weighted averaging operator. We can thus determine the order of the alternatives and choose the most desirable one(s) based on the score function and accuracy function. Finally, some illustrative examples are presented to verify the proposed approach and to demonstrate its effectiveness and practicality.

14.
Video services are likely to dominate the traffic in future broadband networks. Most of these services will be provided by large-scale public-access video servers. Research to date has shown that disk arrays are a promising technology for providing the storage and throughput required to serve many independent video streams to a large customer population. Large disk arrays, however, are susceptible to disk failures, which can greatly affect their reliability. In this paper, we discuss suitable redundancy mechanisms to increase the reliability of disk arrays and compare the performance of the RAID-3 and RAID-5 redundancy schemes. We use cost and performability analyses to rigorously compare the two schemes over a variety of conditions. Accurate cost models and Markov reward models (with time-dependent reward structures) are developed and used to give insight into the tradeoffs between system cost and revenue-earning potential. The paper concludes that for large-scale video servers, coarse-grained striping in a RAID-5 style of disk array is most cost effective.

15.
We study the decision-making problem with Dempster-Shafer theory of evidence. We analyze how to deal with this model when the available information is uncertain and it can be represented with fuzzy numbers. We use different types of aggregation operators that aggregate fuzzy numbers such as the fuzzy weighted average (FWA), the fuzzy ordered weighted averaging (FOWA) operator and the fuzzy hybrid averaging (FHA) operator. As a result, we get the belief structure fuzzy weighted average (BS-FWA), the belief structure fuzzy ordered weighted averaging (BS-FOWA) operator and the belief structure fuzzy hybrid averaging (BS-FHA) operator. We further generalize this new approach by using generalized and quasi-arithmetic means. We also develop an illustrative example regarding the selection of investments where we can see the different results obtained by using different types of fuzzy aggregation operators.
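
To make the difference between the FWA and the FOWA operator concrete, here is a minimal sketch for triangular fuzzy numbers written as `(a, b, c)` tuples. Two choices are assumptions made only for illustration and are not taken from the paper: the fuzzy numbers are ranked by their centroid `(a + b + c) / 3`, and aggregation is done componentwise. The belief-structure (Dempster-Shafer) layer and the hybrid operator are not covered.

```python
def centroid(tfn):
    """Ranking value of a triangular fuzzy number (assumed ranking rule)."""
    a, b, c = tfn
    return (a + b + c) / 3.0


def _check_weights(weights, n):
    if len(weights) != n:
        raise ValueError("need exactly one weight per fuzzy number")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")


def fwa(weights, tfns):
    """Fuzzy weighted average: weight i is attached to argument i."""
    _check_weights(weights, len(tfns))
    return tuple(sum(w * t[d] for w, t in zip(weights, tfns))
                 for d in range(3))


def fowa(weights, tfns, rank=centroid):
    """Fuzzy ordered weighted average: weight i goes to the i-th largest
    argument, so the arguments are sorted in descending order first."""
    ordered = sorted(tfns, key=rank, reverse=True)
    return fwa(weights, ordered)


if __name__ == "__main__":
    payoffs = [(1, 2, 3), (4, 5, 6), (2, 3, 4)]  # hypothetical fuzzy payoffs
    weights = [0.5, 0.3, 0.2]
    print("FWA :", fwa(weights, payoffs))
    print("FOWA:", fowa(weights, payoffs))
```

The same weights give different aggregates because FWA ties each weight to a particular source, whereas FOWA ties it to a rank position; with a decreasing weight vector, FOWA favors the largest arguments.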

16.
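Because link capacities and destination parking capacities in real traffic networks are limited, a fixed-point model is established for the logit stochastic user equilibrium problem with both link-flow and destination-demand constraints. An effective Lagrangian multiplier method is designed to solve it, in which the multipliers are adjusted so that the algorithm converges quickly. Within each iteration, the ordinary logit equilibrium subproblem is solved with an improved self-adaptive method of successive weighted averages, which keeps link flows within the corresponding link capacities, avoids tedious path enumeration, and improves computational efficiency. Numerical experiments verify the effectiveness of the algorithm and the feasibility of the results.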

17.
This paper describes a novel cost-driven disk scheduling algorithm for environments consisting of multipriority requests. An example application is a video-on-demand (VOD) system that provides high- and low-quality services, termed priority 2 and 1, respectively. Customers ordering a high-quality (priority 2) service pay a higher fee and are assigned a higher priority by the underlying system. Our proposed algorithm minimizes costs by maintaining a single queue and managing requests intelligently in order to meet the deadlines of as many priority 1 requests as possible while maximizing the number of priority 2 requests that meet their deadlines. Our algorithm is general enough to accommodate an arbitrary number of priority levels. Prior schemes, collectively termed "multiqueue" schemes, maintain a separate queue for each priority level in order to optimize the performance of the high-priority requests only. Compared with these schemes, our technique provides, in certain cases, more than an order of magnitude improvement in total cost.

18.
The ability to analyze the effectiveness of agent reward structures is critical to the successful design of multiagent learning algorithms. Though final system performance is the best indicator of the suitability of a given reward structure, it is often preferable to analyze the reward properties that lead to good system behavior (i.e., properties promoting coordination among the agents and providing agents with strong signal-to-noise ratios). This step is particularly helpful in continuous, dynamic, stochastic domains ill-suited to the simple table backup schemes commonly used in TD(λ)/Q-learning, where the effectiveness of the reward structure is difficult to distinguish from the effectiveness of the chosen learning algorithm. In this paper, we present a new reward evaluation method that provides a visualization of the tradeoff between the level of coordination among the agents and the difficulty of the learning problem each agent faces. This method is independent of the learning algorithm and is only a function of the problem domain and the agents' reward structure. We use this reward property visualization method to determine an effective reward without performing extensive simulations. We then test this method in both a static and a dynamic multi-rover learning domain where the agents have continuous state spaces and take noisy actions (e.g., the agents' movement decisions are not always carried out properly). Our results show that in the more difficult dynamic domain, the reward efficiency visualization method provides a speedup of two orders of magnitude in selecting good rewards, compared to running a full simulation. In addition, this method facilitates the design and analysis of new rewards tailored to the observational limitations of the domain, providing rewards that combine the best properties of traditional rewards.

19.
The authors consider a controlled Markov chain whose transition probabilities and initial distribution are parametrized by an unknown parameter θ belonging to some known parameter space Θ. There is a one-step reward associated with each pair of control and the following state of the process. The objective is to maximize the expected value of the sum of one-step rewards over an infinite horizon. The loss associated with a control scheme at a parameter value is the function of time giving the difference between the maximum reward that could have been achieved if the parameter were known and the reward achieved by the scheme. Since it is impossible to minimize the loss uniformly for all parameter values, the authors define uniformly good adaptive control schemes and restrict attention to these schemes. They develop a lower bound on the loss associated with any uniformly good control scheme. They construct an adaptive control scheme whose loss equals the lower bound for every parameter value and is therefore asymptotically efficient.

20.
We present a non-equilibrium analysis and control approach for the Active Queue Management (AQM) problem in communication networks. Using simplified fluid models, we carry out a bifurcation study of the complex dynamic queue behavior to show that non-equilibrium methods are essential for analysis and optimization in the AQM problem. We investigate an ergodic theoretic framework for stochastic modeling of the non-equilibrium behavior in deterministic models and use it to identify parameters of a fluid model from packet level simulations. For computational tractability, we use set-oriented numerical methods to construct finite-dimensional Markov models, including control Markov chains and hidden Markov models. Subsequently, we develop and analyze an example AQM algorithm using a Markov Decision Process (MDP) based control framework. The control scheme developed is optimal with respect to a reward function, defined over the queue size and aggregate flow rate. We implement and simulate our illustrative AQM algorithm in the ns-2 network simulator. The results obtained confirm the theoretical analysis and exhibit promising performance when compared with well-known alternative schemes under persistent non-equilibrium queue behavior.
