Optimal control of HalfCheetah benchmark

Citation Author(s): Shanwu Li (Harbin Institute of Technology); Yongchao Yang (Eastern Institute of Technology, Ningbo)
Submitted by: Yongchao Yang
Last updated: Thu, 05/01/2025 - 03:57
DOI: 10.21227/2ak7-wr69
Data Format: *.mp4

Categories:

Artificial Intelligence

Keywords:

optimal control

reinforcement learning

Abstract

This video demonstration corresponds to the 100th time step in Figure 13 and shows the HalfCheetah controlled by the random policy and by the policies learned with different methods. MDDPG(5) denotes the model-free counterpart with a 5-step TD target. FNN-Model-MDDPG(5) and ResNet-Model-MDDPG(5) denote the FNN-model-based scheme and our ResNet-model-based scheme, respectively, each with 5 dynamics models.
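
For readers unfamiliar with the term, a 5-step TD target can be illustrated with the textbook n-step return: the discounted sum of n observed rewards plus a discounted bootstrap value from the critic. The sketch below is only an illustration of that general formula; the function name, the discount factor, the reward values, and the bootstrap value are all hypothetical, and the exact target used by MDDPG(5) is not specified in this record.

```python
def n_step_td_target(rewards, bootstrap_value, gamma=0.99):
    """Textbook n-step TD target (illustrative, not the authors' code).

    Computes sum_{k=0}^{n-1} gamma^k * rewards[k] + gamma^n * bootstrap_value,
    where n = len(rewards) and bootstrap_value stands in for the critic's
    estimate at the state reached after n steps.
    """
    target = bootstrap_value
    # Fold the rewards in from last to first (Horner-style accumulation).
    for r in reversed(rewards):
        target = r + gamma * target
    return target


# Hypothetical 5-step example: five rewards and a made-up critic estimate.
rewards = [1.0, 0.5, 0.0, 2.0, 1.0]
target = n_step_td_target(rewards, bootstrap_value=10.0, gamma=0.9)
print(round(target, 3))
```

With n = 1 this reduces to the usual one-step TD target, so the number in parentheses can be read as how many real or model-predicted rewards are accumulated before bootstrapping, under the assumption that the methods follow this standard form.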