We assume you have access to a GPU that can run CUDA 9.2. Given that, the simplest way to install all required dependencies is to create an Anaconda environment.
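A typical invocation looks like the following (a sketch; the environment file name `conda_env.yml` is an assumption, so check the repository root for the actual `.yml` file):
```
# create the environment from the spec file shipped with the repo
# (file name is an assumption; check the repo root for the actual .yml)
conda env create -f conda_env.yml
```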
This will produce a `log` folder, where all outputs are stored, including train/eval logs, tensorboard blobs, and evaluation episode videos. One can attach a tensorboard to monitor training.
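For example (assuming training writes to the default `log` folder described above):
```
tensorboard --logdir log
```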
The evaluation log reports the expected reward `ER` of the current policy after `S` training steps. Note that `ER` is the average evaluation performance over `num_eval_episodes` episodes (usually 10).
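For illustration, such an entry might look like this (the exact format is an assumption based on the field names above; values are made up):
```
| eval | S: 10000 | ER: 21.1676
```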
## Running on the cluster
You can find the `run_cluster.sh` script, which lets you run training on the cluster. It is a simple bash script that is easy to modify (a sketch of its likely structure follows the example below). We usually run 10 different seeds for each configuration to get reliable results. For example, to schedule 10 runs of `walker walk`, simply run:
```
./run_cluster.sh walker walk
```
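For reference, the script's structure is presumably along these lines (a sketch assuming slurm's `sbatch` and the folder layout shown below; `sac_states` stands in for the configuration name, and the real script may differ):
```
#!/bin/bash
# Illustrative sketch of run_cluster.sh, not the actual script.
# Schedules one slurm job per seed for the given domain/task pair.
DOMAIN=$1
TASK=$2
for SEED in $(seq 1 10); do
  DIR=./runs/${DOMAIN}_${TASK}/sac_states/seed_${SEED}
  mkdir -p "${DIR}"
  JOB=$(sbatch --output="${DIR}/stdout" --error="${DIR}/stderr" \
    run.slrm "${DOMAIN}" "${TASK}" "${SEED}" "${DIR}")
  # sbatch prints "Submitted batch job <id>"; keep just the id
  echo "${JOB##* }" > "${DIR}/id"
done
```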
This script will schedule 10 jobs and all the output will be stored under `./runs/walker_walk/{configuration_name}/seed_i`. The folder structure looks like this:
```
runs/
  walker_walk/
    sac_states/
      seed_1/
        id          # slurm job id
        stdout      # standard output of your job
        stderr      # standard error of your job
        run.sh      # starting script
        run.slrm    # slurm script
        eval.log    # log file for evaluation
        train.log   # log file for training
        tb/         # folder that stores tensorboard output
        video/      # folder that stores evaluation videos
          10000.mp4 # video of one episode after 10000 steps
      seed_2/
        ...
```
Again, you can attach tensorboard to a particular configuration, for example:
```
tensorboard --logdir runs/walker_walk/sac_states
```
For convenience, you can also use an iPython notebook to aggregate results over the 10 seeds. An example of such a notebook is `runs.ipynb`.
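If you only need a quick sanity check from the command line, something like the following (a sketch, assuming `eval.log` entries end with an `ER: <value>` field as illustrated above) averages the latest evaluation reward across seeds:
```
# average the last ER value of each seed's eval.log (log format is an assumption)
for f in runs/walker_walk/sac_states/seed_*/eval.log; do
  tail -n 1 "$f"
done | awk -F'ER: ' '{ sum += $2; n += 1 } END { if (n > 0) print sum / n }'
```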
## Running the entire testbed
Another script allows you to run all 10 dm_control tasks on the cluster:
```
./run_all.sh
```
It will call `run_cluster.sh` for each task, so you only need to modify `run_cluster.sh` to change the hyperparameters.
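Its body is presumably just one `run_cluster.sh` call per task, along these lines (a sketch; apart from `walker walk`, the task list here is an assumption):
```
#!/bin/bash
# illustrative sketch of run_all.sh; task list is an assumption
./run_cluster.sh walker walk
./run_cluster.sh cheetah run
# ... the remaining dm_control tasks
```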