AoI-Aware Resource Allocation for Platoon-Based C-V2X Networks via Multi-Agent Multi-Task Reinforcement Learning

Citation Author(s):: Mohammad Parvini
Submitted by:: Mohammad Parvini
Last updated:: Mon, 05/10/2021 - 08:27
DOI:: 10.21227/3kfr-ct25

1107 views

Categories:

Keywords:

ACCESS DATASET CITE

Abstract

The simulation code for the paper:

"AoI-Aware Resource Allocation for Platoon-Based C-V2X Networks via Multi-Agent Multi-Task Reinforcement Learning"

The overall architecture of the proposed MARL framework is shown in the figure.

Modified MADDPG: This algorithm trains two critics (different from legacy MADDPG) with the following functionalities:

The global critic which estimates the global expected reward and motivates the agents toward a cooperating behavior and an exclusive local critic for each agent that estimates the local individual reward.

Modified MADDPG with Task decomposition: This algorithm is similar to the Modified MADDPG; however, in this algorithm, the local holistic reward function of each agent is further decomposed into multiple sub-reward functions based on the tasks each agent has to accomplish, and the task-wise value functions are learned separately.

"For both algorithms, the global critic is built upon the twin delayed policy gradient (TD3)."