Optimal control of HalfCheetah benchmark

- Submitted by: Yongchao Yang
- DOI: 10.21227/2ak7-wr69
Abstract
This video demonstration corresponds to the 100th time step in Figure 13. It shows the HalfCheetah controlled by the random policy and by the policies learned with different methods. MDDPG(5) denotes the model-free counterpart with a 5-step TD target. FNN-Model-MDDPG(5) and ResNet-Model-MDDPG(5) denote the FNN-model-based scheme and our ResNet-model-based scheme, respectively, each with 5 dynamics models.
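For readers unfamiliar with the term, the following is a minimal sketch of how a 5-step TD target is commonly computed. It assumes the standard n-step form (discounted sum of n rewards plus a discounted bootstrap value from a target critic). The exact formulation used for MDDPG(5) is not given here, and the function name, the argument names, and the discount factor are hypothetical.

```python
def n_step_td_target(rewards, bootstrap_value, gamma=0.99):
    """Compute sum_{k=0}^{n-1} gamma^k * r_k + gamma^n * bootstrap_value.

    rewards:         the n rewards observed along the trajectory segment
                     (n = 5 for a 5-step target).
    bootstrap_value: critic estimate at the state reached after n steps
                     (assumed to come from a target network).
    gamma:           discount factor (illustrative default, not from the source).
    """
    target = bootstrap_value
    # Fold the rewards in from the last step back to the first,
    # so each step applies one more factor of gamma to what follows.
    for r in reversed(rewards):
        target = r + gamma * target
    return target


if __name__ == "__main__":
    # Five unit rewards, bootstrap value 10, discount 0.5.
    print(n_step_td_target([1.0] * 5, 10.0, gamma=0.5))
```

With a single reward in the list, the same function reduces to the usual one-step TD target.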
Instructions:
None