期刊界 (All Journals): professional journal and academic search

46 results found (search time: 187 ms)

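A product pricing algorithm based on the multi-armed bandit

毕文杰 郭乐薇 《计算机工程与应用》2021,57(11):224-231

For the single-product pricing problem faced by an online retailer with incomplete demand information, a product pricing algorithm based on the multi-armed bandit is proposed. To improve the performance of multi-armed bandit algorithms on pricing problems, the algorithm exploits the monotonicity of the demand curve and incorporates consumer preference identification. Consumers' reservation prices are analyzed to obtain the purchase probability, the online retailer's pricing problem is modeled as a multi-armed bandit, and the corresponding pricing algorithm is given together with a theoretical analysis. Finally, simulation experiments compare the pricing performance of related algorithms. The simulation results show that the algorithm increases the online retailer's revenue.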

Exploiting channel memory for multiuser wireless scheduling without channel measurement: Capacity regions and algorithms

Chih-ping Li Michael J. Neely 《Performance Evaluation》2011,68(8):631-657

Green cell association for multimedia transmission in cognitive heterogeneous networks

Xi Li Shanzhi Chen Dan Chen Hong Ji Victor C. M. Leung 《International Journal of Communication Systems》2013,26(4):530-548

With the introduction of low-powered pico/femto base stations and relay nodes into the macro-cell, recent heterogeneous networks provide an attractive approach for future wireless communication. Although such networks may achieve better coverage and higher capacity, several problems remain unsolved before practical deployment. For example, selecting the proper cell from neighboring low-powered cells and then occupying the radio resource without interfering with macro-users is both important and challenging, especially for demanding multimedia applications. Traditional cell access algorithms and quality-control parameters such as delay or throughput are no longer well suited to this complex environment, so a more effective approach is needed. In this paper, we investigate this cell association problem and propose a complete green solution on the basis of a thorough discussion of multimedia transmission under these concerns. Cognitive radio is introduced to share spectrum between the macro-cell and low-powered cells while securing the transmission of authorized macro-users. We also introduce the concept of 'interference balance' to better manage the overall interference and energy consumption in the network. A restless bandit model is formulated on the basis of channel state, data rate, interference control, and a carefully chosen intra-refreshing rate for multimedia traffic. The resulting cell association scheme is efficient and practical because of the simple index property of the model output. Simulation results compare the proposed solution with existing algorithms in terms of interference constraint, multimedia distortion, and overall network energy consumption balance. Copyright © 2013 John Wiley & Sons, Ltd.

Simple learning rules to cope with changing environments

Roderich Groß Alasdair I Houston Edmund J Collins John M McNamara François-Xavier Dechaume-Moncharmont Nigel R Franks 《Journal of the Royal Society Interface》2008,5(27):1193-1202

We consider an agent that must choose repeatedly among several actions. Each action has a certain probability of giving the agent an energy reward, and costs may be associated with switching between actions. The agent does not know which action has the highest reward probability, and the probabilities change randomly over time. We study two learning rules that have been widely used to model decision-making processes in animals, one deterministic and the other stochastic. In particular, we examine the influence of the rules' 'learning rate' on the agent's energy gain. We compare the performance of each rule with the best performance attainable when the agent has either full knowledge or no knowledge of the environment. Over relatively short periods of time, both rules are successful in enabling agents to exploit their environment. Moreover, under a range of effective learning rates, both rules are equivalent, and can be expressed by a third rule that requires the agent to select the action for which the current run of unsuccessful trials is shortest. However, the performance of both rules is relatively poor over longer periods of time, and under most circumstances no better than the performance an agent could achieve without knowledge of the environment. We propose a simple extension to the original rules that enables agents to learn about and effectively exploit a changing environment for an unlimited period of time.

Certainty equivalence control with forcing: revisited

Rajeev Agrawal 《Systems & Control Letters》1989,13(5):405-412

Certainty equivalence control with forcing has been shown to be optimal for several stochastic adaptive control problems with the average cost per unit time criterion. Recently researchers have started looking at stochastic adaptive control problems with a view to minimizing the rate of increase of the learning loss. This criterion is stronger than the average cost per unit time criterion. Certainty equivalence control with forcing does not usually suffice for the learning loss criterion and one has to develop fairly complicated schemes in order to achieve optimality. The objective of this paper is to see how well one might be able to do with a certainty-equivalence-control-with-forcing type of scheme. In particular we construct a class of such schemes whose learning loss is O((log n)^(1+δ)) for δ > 0, whereas optimal schemes typically have an O(log n) learning loss.

Strict greedy design paradigm applied to the stochastic multi-armed bandit problem

Joey Hong 《机床与液压》2015,43(6):1-6

The process of making decisions is something humans do inherently and routinely, to the extent that it appears commonplace. However, in order to achieve good overall performance, decisions must take into account both the outcomes of past decisions and the opportunities of future ones. Reinforcement learning, which is fundamental to sequential decision-making, consists of the following components: (1) a set of decision epochs; (2) a set of environment states; (3) a set of available actions for transitioning between states; (4) state-action dependent immediate rewards for each action. At each decision epoch, the environment state provides the decision maker with a set of available actions from which to choose. As a result of selecting a particular action in that state, the environment generates an immediate reward for the decision maker and moves to a new state and decision epoch. The ultimate goal for the decision maker is to maximize the total reward after a sequence of time steps. This paper will focus on an archetypal example of reinforcement learning, the stochastic multi-armed bandit problem. After introducing the dilemma, I will briefly cover the most common methods used to solve it, namely the UCB and εn-greedy algorithms. I will also introduce my own greedy implementation, the strict-greedy algorithm, which more tightly follows the greedy pattern in algorithm design, and show that it performs comparably to the two accepted algorithms.
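
The two baseline methods named in this abstract, UCB and εn-greedy, can be sketched as follows. This is only a minimal illustration on Bernoulli arms, using the textbook UCB1 index and the commonly cited exploration schedule εn = min(1, cK/(d²n)); it is not the paper's strict-greedy algorithm, and the function names, the arm means, and the constants `c` and `d` are assumptions chosen for the example.

```python
import math
import random


def _pull(means, arm, rng):
    """Draw a Bernoulli reward from the chosen arm (hypothetical test bed)."""
    return 1.0 if rng.random() < means[arm] else 0.0


def ucb1(means, horizon, rng):
    """Run UCB1 for `horizon` steps; return the number of pulls per arm."""
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull every arm once so each index is defined
        else:
            # empirical mean plus an exploration bonus that shrinks with pulls
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        sums[arm] += _pull(means, arm, rng)
        counts[arm] += 1
    return counts


def epsilon_n_greedy(means, horizon, rng, c=5.0, d=0.1):
    """Run epsilon_n-greedy with epsilon_n = min(1, c*k / (d^2 * n)).

    The constants c and d are illustrative assumptions, not tuned values.
    """
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    for n in range(1, horizon + 1):
        eps = min(1.0, c * k / (d * d * n))
        if rng.random() < eps:
            arm = rng.randrange(k)  # explore uniformly at random
        else:
            # exploit: highest empirical mean; unpulled arms are tried first
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a] if counts[a] else float("inf"),
            )
        sums[arm] += _pull(means, arm, rng)
        counts[arm] += 1
    return counts


if __name__ == "__main__":
    arm_means = [0.2, 0.5, 0.8]  # assumed success probabilities
    print("UCB1 pulls:", ucb1(arm_means, 5000, random.Random(0)))
    print("eps_n-greedy pulls:", epsilon_n_greedy(arm_means, 5000, random.Random(0)))
```

Both routines return pull counts rather than cumulative reward so that the concentration of play on the best arm is easy to inspect; with the assumed means, most pulls should go to the third arm under either rule.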

Multi-armed Bandit processes with optimal selection of the operating times

Pilar Ibarrola Ricardo Vélez 《TEST》2005,14(1):239-255

A multi-armed bandit problem is considered in which, at each decision epoch, one decides both the next project to undertake and the span of time to spend on it, instead of reconsidering the choice of project at every stage. This extended model, inspired by sequentially planned decision procedures (Schmitz, 1993), is formulated in Section 1 and seeks to exploit the cost reduction produced by longer periods dedicated to the same activity. Following the method of Whittle (1980), Section 2 introduces a retirement option with a variable reward M, and Section 3 extends Gittins indexes to this case. Another relevant conclusion is that the optimal period of activity for each project does not depend on the retirement reward M. Finally, we show that the optimal strategy is to choose the project with the highest Gittins index.

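A device selection algorithm for federated learning on class-imbalanced data

王惜民 范睿 《计算机应用研究》2021,38(10):2968-2973

Federated learning under mobile edge computing is considered, in which a global server connects a large number of mobile devices over a network to jointly train a deep neural network model. Data distributions with global class imbalance and on-device local class imbalance often degrade the performance of the standard federated averaging algorithm. A device selection algorithm based on a combinatorial multi-armed bandit online learning framework is proposed, together with a class estimation scheme. In each communication round, the subset of devices that best complements the per-class test-performance deviation of the previous global model is selected, so that the linearly combined global model after training has more balanced per-class test performance. This yields faster convergence, a more stable training process, and better test performance. Numerical experiments examine how different parameters affect federated averaging on class-imbalanced data and verify the effectiveness of the proposed device selection algorithm.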

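Discussion of fine interpretation methods for forty-arm caliper logging data

夏竹君 田烨 李俊舫 蔡冬梅 向素利 《石油仪器》2011,25(5):39-41,103

Because the forty-arm caliper logging tool centralizes poorly in highly deviated wells, improvements such as increasing the supporting force of the centralizer arms and optimizing the logging procedure are proposed, which raise the tool's measurement accuracy in highly deviated wells. To improve the accuracy of log interpretation, a deviated-well correction method, an optimized method for selecting the minimum and maximum caliper values in deformed sections, and a quantitative classification standard for casing deformation are tentatively proposed. Field application shows that the improved technique meets the Zhongyuan Oilfield's requirements for detecting and evaluating casing-damaged wells. The method may also serve as a reference for other multi-arm caliper logging techniques.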

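Synthesis and characterization of multi-arm ester-terminated glycidyl azide polymer (total citations: 1; self-citations: 0; citations by others: 1)

王晓 罗运军 葛震 柴春鹏 《高分子材料科学与工程》2012,(2):1-4

Using glycerol and pentaerythritol respectively as initiators and tin tetrachloride as the catalyst, two different multi-arm hydroxyl-terminated polyepichlorohydrins (PECH-OH) were obtained by cationic ring-opening polymerization of epichlorohydrin. Their terminal hydroxyl groups were modified by esterification and the chlorine of the chloromethyl side groups was substituted by azide groups, giving two different multi-arm GAPE. The multi-arm GAPE were characterized by Fourier transform infrared spectroscopy (FT-IR), gel permeation chromatography (GPC), thermogravimetry (TG), and differential scanning calorimetry (DSC). The results show that the molecular weights of GAPE-3 and GAPE-4 are 890 and 1260, and their glass transition temperatures are -55.2 °C and -53.9 °C, respectively. The thermal decomposition of GAPE proceeds in two relatively independent stages: decomposition of the azide groups and decomposition of the polyether main chain.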
