17 similar documents found; search took 93 ms
1.
2.
3.
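This paper first introduces the background against which cognitive radio technology emerged, along with the development of reinforcement learning and its advantages when applied to the cognitive domain. It then presents the basic principles of reinforcement learning and two of its common models, Q-Learning and POMDP, describing in some detail each model's definition, underlying idea, the problems it addresses, and the scenarios in which it is used. Next, it analyzes the main content of papers from top conferences and journals in this area over the past few years. Drawing on the research status and results reported in those papers, it argues that the main characteristics of reinforcement learning are that it can learn an optimal policy accurately and quickly, can model real environments, is highly adaptive, and improves the efficiency of spectrum sensing and allocation, thereby maximizing system throughput. These advantages indicate that reinforcement learning is a promising technique for the cognitive domain.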
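The abstract above names Q-Learning without stating its update rule. As a minimal sketch, assuming the standard tabular form, one update step could look like the following. The state and action names (`"idle"`, `"sense"`, `"transmit"`) and the values of `alpha` and `gamma` are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning step and return the updated Q-value."""
    # Value of the best action available in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference target: immediate reward plus discounted future value.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical cognitive-radio example: all Q-values start at zero.
q = defaultdict(float)
actions = ["sense", "transmit"]
value = q_learning_update(q, "idle", "transmit", 1.0, "idle", actions)
```

Starting from an all-zero table, a single update with reward 1.0 moves the estimate to `alpha * 1.0`, which is why `value` is 0.1 here.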
4.
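Previous studies considered only a single cognitive Internet of Vehicles communication environment, which does not make full use of spectrum resources. To improve spectrum utilization, a method is proposed for cognitive Internet of Vehicles environments in which multiple cognitive vehicles coexist. In addition, to raise the success rate of spectrum access for cognitive vehicles, an improved deep reinforcement learning method is proposed in which different feedback (reward) functions are designed from the throughput of licensed vehicles and cognitive vehicles under different conditions. The proposed method clearly outperforms the traditional Q-learning algorithm, improves spectrum utilization more markedly, and meets the growing communication demands of the Internet of Vehicles.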
5.
6.
7.
8.
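In cognitive radio networks, existing reinforcement learning based methods for power control under underlay access suffer from a low channel access success rate and low throughput for secondary users. To address this problem, a power control algorithm based on A3C is proposed. Simulation results show that, compared with the existing DQN-based power control algorithm, the proposed A3C-based algorithm effectively improves the channel access success rate and throughput of secondary users. To further optimize secondary-user throughput, the power selection space of the secondary users is made continuous. Simulation results show that, in the continuous power scenario, the proposed A3C-based algorithm further improves secondary-user throughput.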
9.
10.
11.
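To increase the average throughput of Internet of Things (IoT) devices in backscatter networks, a resource allocation mechanism is proposed, and a resource allocation model that jointly optimizes user pairing and time slot allocation is constructed. Because solving this model directly with a deep reinforcement learning (DRL) algorithm leads to a high-dimensional action space and a complex neural network, the model is decomposed into two layers of subproblems to reduce the dimension of the action space. First, a DRL algorithm uses historical channel information to infer the current channel information and perform optimal user pairing. Then, with the user pairing fixed, a convex optimization algorithm performs optimal time slot allocation with the objective of maximizing the total throughput of the IoT devices. Simulation results show that, compared with other resource allocation methods, the proposed method effectively improves system throughput and has good channel adaptability and convergence.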
12.
Yi Li, Hong Ji, Xi Li, Victor C.M. Leung 《International Journal of Communication Systems》2012,25(8):1077-1090
The Internet of Things (IoT) is the next big possibility and challenge for future information networks. It makes the interaction between people and things more active and provides connections among different existing networks. Ubiquitous short-range wireless access and cognitive radio are key technologies for realizing the IoT. This paper deals with some problems in an integrated system of wireless local area network (WLAN) and cognitive radio: cognitive WLAN over fiber (CWLANoF). CWLANoF is a cost-effective and efficient architecture that combines radio over fiber and cognitive radio technologies to provide centralized radio resource management and equal spectrum access in infrastructure-based IEEE 802.11 WLANs. In this paper, a reinforcement learning approach is applied to implement dynamic channel selection in CWLANoF. The cognitive access points select the best channels in the industrial, scientific, and medical band for data packet transmission, with the objective of minimizing external interference and achieving better network-wide performance. The reinforcement learning method avoids solving complex optimization problems while being able to explore the states of a CWLANoF system during normal operation. Simulation results reveal that the proposed strategy is effective in avoiding aggregated interference, reducing outage probability, and improving network throughput. Copyright © 2012 John Wiley & Sons, Ltd.
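The idea of learning which channel to use from observed interference, rather than solving an optimization problem, can be illustrated with a small stateless sketch. This is not the authors' CWLANoF algorithm: the epsilon-greedy rule, the per-channel interference probabilities, and the 0/1 reward are all assumptions made for illustration.

```python
import random


def learn_channel_values(interference_prob, steps=2000, alpha=0.1,
                         epsilon=0.1, seed=0):
    """Estimate per-channel success values with epsilon-greedy learning."""
    rng = random.Random(seed)
    n = len(interference_prob)
    q = [0.0] * n  # one value estimate per channel
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random channel.
            ch = rng.randrange(n)
        else:
            # Exploit: pick the channel with the highest current estimate.
            ch = max(range(n), key=lambda c: q[c])
        # Hypothetical reward: 1 if the packet avoided interference, else 0.
        reward = 0.0 if rng.random() < interference_prob[ch] else 1.0
        # Move the estimate toward the observed reward.
        q[ch] += alpha * (reward - q[ch])
    return q


# Three hypothetical ISM-band channels with different interference levels.
q_values = learn_channel_values([0.9, 0.5, 0.1])
best_channel = max(range(len(q_values)), key=lambda c: q_values[c])
```

Each estimate is a running average of 0/1 rewards, so it stays between 0 and 1 and can be read as an approximate success rate for that channel.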
13.
Pei Zhang, Xiaohui Wang, Zhiguo Ma, Shuaijun Liu, Junde Song 《International Journal of Satellite Communications and Networking》2020,38(5):450-461
Dynamic power allocation (DPA) is the key technique for improving system throughput by matching the offered capacity with the required capacity among distributed beams in multibeam satellite systems. Existing power allocation studies tend to adopt metaheuristic optimization algorithms such as the genetic algorithm. The resulting DPA cannot adapt to dynamic environments with varying traffic demands and channel conditions. To solve this problem, an online algorithm named deep reinforcement learning-based dynamic power allocation (DRL-DPA) is proposed in this paper. The key idea of the proposed DRL-DPA is to make power allocation decisions online rather than offline, as the traditional metaheuristic methods do. Simulation results show that the proposed DRL-DPA algorithm can improve system performance in terms of system throughput and power consumption in multibeam satellite systems.
14.
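To address the high computation cost and long computation time of traditional Internet of Things (IoT) edge computing methods, this paper introduces deep reinforcement learning to optimize the IoT edge computing method. The edge computing cycle is set according to the IoT topology, and the data upload speed is obtained. The edge computing execution process is designed to improve the efficiency of edge computing resource allocation. The CNN model from deep reinforcement learning is introduced to perform the convolution computation and complete resource allocation for IoT edge computing. With this, the method achieves...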
15.
To solve the multi-objective optimization problem, a resource allocation algorithm based on deep reinforcement learning for cellular networks is proposed. First, a deep neural network (DNN) is built to optimize the transmission rate of the cellular system and to complete the forward transmission process of the algorithm. Then, the Q-learning mechanism is used to construct the error function, with energy efficiency as the reward. The gradient descent method is used to train the DNN weights, completing the reverse training process of the algorithm. Simulation results show that the proposed algorithm can determine the degree of optimization of the optimal resource allocation scheme and converges rapidly, and that it clearly outperforms the other algorithms in optimizing transmission rate and system energy consumption.
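The abstract does not give the error function itself. As a sketch only, assuming the standard deep Q-learning squared temporal-difference error, it would take the form below, where the reward $r$ is the energy efficiency, $\theta$ denotes the DNN weights, and the discount factor $\gamma$ and learning rate $\eta$ are assumed symbols that do not appear in the source.

```latex
L(\theta) = \Bigl( r + \gamma \max_{a'} Q(s', a'; \theta) - Q(s, a; \theta) \Bigr)^{2},
\qquad
\theta \leftarrow \theta - \eta \, \nabla_{\theta} L(\theta)
```

The first expression corresponds to the error function constructed by the Q-learning mechanism, and the second to the gradient descent step used to train the DNN weights.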
16.