TY - JOUR
T1 - Make Smart Decisions Faster
T2 - Deciding D2D Resource Allocation via Stackelberg Game Guided Multi-Agent Deep Reinforcement Learning
AU - Shi, Dian
AU - Li, Liang
AU - Ohtsuki, Tomoaki
AU - Pan, Miao
AU - Han, Zhu
AU - Poor, H. Vincent
N1 - Publisher Copyright:
© 2002-2012 IEEE.
PY - 2022/12/1
Y1 - 2022/12/1
N2 - Device-to-Device (D2D) communication, which enables direct data transmission between two mobile users, has emerged as a vital component of 5G cellular networks for improving spectrum utilization and enhancing system capacity. A critical issue in realizing these benefits in D2D-enabled networks is properly allocating radio resources while coordinating co-channel interference in a time-varying communication environment. In this paper, we propose a Stackelberg game (SG) guided multi-agent deep reinforcement learning (MADRL) approach, which allows D2D users to make smart power control and channel allocation decisions in a distributed manner. In particular, we define a crucial Stackelberg Q-value (ST-Q) to guide the learning direction, which can be calculated based on the equilibrium achieved in the Stackelberg game. With the guidance of the Stackelberg equilibrium, our approach converges faster, with fewer iterations, than the general MADRL method and thereby exhibits better performance in handling network dynamics. After the initial training, each agent can infer timely D2D resource allocation strategies with distributed execution. Extensive simulations are conducted to validate the efficacy of the proposed scheme in developing timely resource allocation strategies. The results also show that our method outperforms the general MADRL-based approach in terms of average utility, channel capacity, and training time.
AB - Device-to-Device (D2D) communication, which enables direct data transmission between two mobile users, has emerged as a vital component of 5G cellular networks for improving spectrum utilization and enhancing system capacity. A critical issue in realizing these benefits in D2D-enabled networks is properly allocating radio resources while coordinating co-channel interference in a time-varying communication environment. In this paper, we propose a Stackelberg game (SG) guided multi-agent deep reinforcement learning (MADRL) approach, which allows D2D users to make smart power control and channel allocation decisions in a distributed manner. In particular, we define a crucial Stackelberg Q-value (ST-Q) to guide the learning direction, which can be calculated based on the equilibrium achieved in the Stackelberg game. With the guidance of the Stackelberg equilibrium, our approach converges faster, with fewer iterations, than the general MADRL method and thereby exhibits better performance in handling network dynamics. After the initial training, each agent can infer timely D2D resource allocation strategies with distributed execution. Extensive simulations are conducted to validate the efficacy of the proposed scheme in developing timely resource allocation strategies. The results also show that our method outperforms the general MADRL-based approach in terms of average utility, channel capacity, and training time.
KW - D2D communications
KW - Deep reinforcement learning
KW - resource allocation
KW - Stackelberg game
UR - http://www.scopus.com/inward/record.url?scp=85107334653&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85107334653&partnerID=8YFLogxK
U2 - 10.1109/TMC.2021.3085206
DO - 10.1109/TMC.2021.3085206
M3 - Article
AN - SCOPUS:85107334653
SN - 1536-1233
VL - 21
SP - 4426
EP - 4438
JO - IEEE Transactions on Mobile Computing
JF - IEEE Transactions on Mobile Computing
IS - 12
ER -