Datasets
Standard Dataset
A Gradient-Based Reinforcement Learning Algorithm for Multiple Cooperative Agents
- Citation Author(s):
- Zhen Zhang
- Submitted by:
- Zhen Zhang
- Last updated:
- Thu, 11/08/2018 - 10:34
- DOI:
- 10.21227/fpkq-za03
Abstract
The dataset contains the simulation results on two stochastic games -- box pushing and distributed sensor network (DSN). The setting of parameters is given in the manuscript named "A Gradient-Based Reinforcement Learning Algorithm for Multiple Cooperative Agents".
There are two folders -- box pushing and DSN -- in the data folder. The simulation results for each task are in the corresponding folder.
Each of these two folders contains four subfolders -- EMA, PMR-EGA, single-agent RL and WoLF-PHC -- holding the results of the four algorithms.
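The layout described above can be enumerated with a few lines of Python. This is only a sketch: the root folder name `data` and the subfolder names are taken from the description, while the helper name `algorithm_dirs` is made up for illustration and the archive may unpack under a different root.

```python
from pathlib import Path

# Folder names as given in the description above.
TASKS = ("box pushing", "DSN")
ALGORITHMS = ("EMA", "PMR-EGA", "single-agent RL", "WoLF-PHC")


def algorithm_dirs(root="data"):
    """Map each (task, algorithm) pair to its result folder under root.

    `root` is an assumption; pass the location where the archive was unpacked.
    """
    return {(task, alg): Path(root) / task / alg
            for task in TASKS for alg in ALGORITHMS}


if __name__ == "__main__":
    # Report which of the eight expected folders are present locally.
    for (task, alg), path in algorithm_dirs().items():
        print(f"{task} / {alg}: {path} (exists: {path.is_dir()})")
```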
An example:
Box pushing: PMR-EGA:
Within data/box pushing/PMR-EGA, the folders named 10w, 20w, 40w, 80w and 120w store the results after learning with PMR-EGA for 10w, 20w, 40w, 80w and 120w episodes, respectively. The EMA, single-agent RL and WoLF-PHC folders contain similar subfolders.
Under the path of data/box pushing/PMR-EGA/10w, there are files as follows:
all_avr_step.txt -- the average steps during evaluation, which is shown in Table I of the paper.
averageReward.dat -- the sliding average reward during learning
averageStep.dat -- the sliding average steps during learning
avr_successRate.dat -- the average success rate during evaluation. It is shown in Table II of the paper.
avr_successTimes.dat -- the average success times during evaluation.
completeRecord.dat -- for debugging
Q_single_Sarsa.dat -- the Q-table used to evaluate the gradient. Each row represents a state, and each column represents the Q-value of a joint action in that state.
QTable_agent1.dat -- the Q-table of agent 1, which is used as the strategy of agent 1. Each row represents a state, and each column represents the Q-value of one of the agent's own actions in that state.
QTable_agent2.dat -- the Q-table of agent 2, which is used as the strategy of agent 2.
QTable_agent3.dat -- the Q-table of agent 3, which is used as the strategy of agent 3.
QTable_agent4.dat -- the Q-table of agent 4, which is used as the strategy of agent 4.
successRate.dat -- Each row represents the average success rate during evaluation for each run.
successTimes.dat -- Each row represents the average success times during evaluation for each run.
updateQTime.dat -- for debugging
The above files in the other folders within box pushing have the same meaning, except that for single-agent RL, the following file has a different meaning:
Q_single_Sarsa.dat -- the Q-table of the joint actions, which is used as the joint strategy of all agents. Each row represents a state, and each column represents the Q-value of a joint action in that state.
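To illustrate the row/column convention of the Q-table files, the sketch below parses a table and reads off the highest-valued action in each state. It rests on two assumptions that the description does not confirm: that the .dat files are plain text with whitespace-separated numbers, and that a greedy readout is a meaningful way to inspect the stored strategy. The function names are illustrative only.

```python
from pathlib import Path


def parse_table(text):
    """Parse whitespace-separated numbers into a list of rows of floats.

    Assumes a plain-text layout: one state per line, one action per column.
    """
    return [[float(x) for x in line.split()]
            for line in text.splitlines() if line.strip()]


def load_q_table(path):
    """Read a Q-table file such as QTable_agent1.dat (text format assumed)."""
    return parse_table(Path(path).read_text())


def greedy_actions(q_table):
    """For each state (row), return the column index of the largest Q-value.

    Ties resolve to the lowest column index.
    """
    return [max(range(len(row)), key=row.__getitem__) for row in q_table]


if __name__ == "__main__":
    path = Path("data/box pushing/PMR-EGA/10w/QTable_agent1.dat")
    if path.is_file():
        q = load_q_table(path)
        print(len(q), "states;", "first greedy actions:", greedy_actions(q)[:10])
```

For QTable_agentN.dat a column index refers to that agent's own action, whereas for Q_single_Sarsa.dat it refers to a joint action. How a joint-action index maps to individual agents' actions is not stated here, so the manuscript is the place to check.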
An example:
DSN: PMR-EGA:
Within data/DSN/PMR-EGA, the folders named 20w, 30w and 40w store the results after learning with PMR-EGA for 20w, 30w and 40w episodes, respectively. The EMA, single-agent RL and WoLF-PHC folders contain similar subfolders.
Under the path of data/DSN/PMR-EGA/20w, there are files as follows:
all_avr_reward.dat -- the average cumulative reward during evaluation, which is shown in Table IV of the paper.
all_avr_step.dat -- the average steps during evaluation, which is shown in Table V of the paper.
all_avr_successRate.dat -- the average success rate during evaluation, which is shown in Table III of the paper.
averageReward.dat -- the sliding average reward during learning
averageStep.dat -- the sliding average steps during learning
avr_reward.dat -- Each row represents the average cumulative reward during evaluation for each run.
avr_step.dat -- Each row represents the average steps during evaluation for each run.
completeRecord.dat -- for debugging
QTable_agent0.dat -- the Q-table of agent 0, which is used as the strategy of agent 0. Each row represents a state, and each column represents the Q-value of one of the agent's own actions in that state.
QTable_agent1.dat -- the Q-table of agent 1, which is used as the strategy of agent 1.
QTable_agent2.dat -- the Q-table of agent 2, which is used as the strategy of agent 2.
QTable_agent3.dat -- the Q-table of agent 3, which is used as the strategy of agent 3.
QTable_agent4.dat -- the Q-table of agent 4, which is used as the strategy of agent 4.
QTable_agent5.dat -- the Q-table of agent 5, which is used as the strategy of agent 5.
QTable_agent6.dat -- the Q-table of agent 6, which is used as the strategy of agent 6.
QTable_agent7.dat -- the Q-table of agent 7, which is used as the strategy of agent 7.
successRate.dat -- Each row represents the average success rate during evaluation for each run.
successTimes.dat -- Each row represents the average success times during evaluation for each run.
The above files in the other folders within DSN have the same meaning.
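The per-run files (for example successRate.dat, avr_reward.dat and avr_step.dat) hold one row per run, so a quick summary across runs can be computed as sketched below. The sketch assumes plain-text files with the per-run value in the first column. Whether the all_avr_* files are exactly the mean over these rows is not stated here and would need to be checked against the data. The function names are illustrative.

```python
from pathlib import Path
from statistics import mean, stdev


def parse_per_run(text):
    """Return the first number on each non-empty line (one line per run)."""
    return [float(line.split()[0]) for line in text.splitlines() if line.strip()]


def summarize(values):
    """Number of runs, mean, and sample standard deviation across runs."""
    return {
        "runs": len(values),
        "mean": mean(values),
        # The sample standard deviation needs at least two runs.
        "stdev": stdev(values) if len(values) > 1 else 0.0,
    }


if __name__ == "__main__":
    path = Path("data/DSN/PMR-EGA/20w/successRate.dat")
    if path.is_file():
        print(summarize(parse_per_run(path.read_text())))
```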
Documentation
Attachment | Size |
---|---|
The detail of dataset | 4.39 KB |