Running Experiments

Launching from the Command Line

See spinup/run.py for the command-line entry point.

The standard way to run a Spinning Up algorithm from the command line is

python -m spinup.run [algo name] [experiment flags]

# e.g.
# python -m spinup.run ppo --env Walker2d-v2 --exp_name walker
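
The command pattern above can also be assembled from Python, which may be handy when scripting launches. This is only a sketch: build_spinup_command is a hypothetical helper written for this note, not part of Spinning Up, and it assumes every flag is passed as --name followed by one or more values.

```python
import sys

def build_spinup_command(algo, **flags):
    """Assemble an argument list for `python -m spinup.run` (hypothetical helper)."""
    cmd = [sys.executable, "-m", "spinup.run", algo]
    for name, value in flags.items():
        cmd.append(f"--{name}")
        # A list or tuple means several values for the same flag.
        values = value if isinstance(value, (list, tuple)) else [value]
        cmd.extend(str(v) for v in values)
    return cmd

cmd = build_spinup_command("ppo", env="Walker2d-v2", exp_name="walker")
print(" ".join(cmd[1:]))

# To actually launch (requires Spinning Up to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems if it is later handed to subprocess.run.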

Detailed Quickstart Guide

python -m spinup.run ppo --exp_name ppo_ant --env Ant-v2 --clip_ratio 0.1 0.2 \
    --hid[h] [32,32] [64,32] --act torch.nn.Tanh --seed 0 10 20 --dt \
    --data_dir path/to/data

This command runs PPO in the Ant-v2 Gym environment, with various settings controlled by the flags.

By default, the PyTorch version will run (except for TRPO, since Spinning Up doesn’t have a PyTorch TRPO yet). Substitute ppo with ppo_tf1 for the TensorFlow version.

clip_ratio, hid, and act are flags that set algorithm hyperparameters. You can provide multiple values for a hyperparameter to run multiple experiments. Check the docs to see which hyperparameters you can set (for example, the PPO documentation).
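
To get a feel for how many experiments a command with multiple values produces, here is a rough sketch. It assumes that every combination of the supplied values is run (a full grid), which the passage above does not state explicitly. expand_grid is a hypothetical helper, not a Spinning Up function; the values are the ones from the example command.

```python
from itertools import product

# Values taken from the example command in this section.
grid = {
    "clip_ratio": [0.1, 0.2],
    "hid": [[32, 32], [64, 32]],
    "seed": [0, 10, 20],
}

def expand_grid(grid):
    """Return one dict per combination of values (hypothetical helper)."""
    names = list(grid)
    combos = product(*(grid[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]

variants = expand_grid(grid)
print(len(variants))  # 2 * 2 * 3 combinations
for variant in variants[:3]:
    print(variant)
```

Under that assumption the example command corresponds to 12 separate runs, so adding values to several flags at once multiplies the amount of compute needed.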

Flag	Description
hid and act	Special shortcut flags for setting the hidden sizes and activation function for the neural networks trained by the algorithm.
seed	Sets the seed for the random number generator. RL algorithms have high variance, so try multiple seeds to get a feel for how performance varies.
dt / datestamp	Ensures that the save directory names will have timestamps in them.
data_dir	Allows you to set the base save folder for results. The default value is set by `DEFAULT_DATA_DIR` in `spinup/user_config.py`.
exp_name	The experiment name. This is used in naming the save directory for each experiment. The default is "cmd" + `[algo name]`.
env / env_name	The name of an environment in the OpenAI Gym. This is converted to a function that builds the correct gym environment.
cpu / num_cpu	Sets the number of processes to launch for the experiment. Some algorithms are amenable to this sort of parallelization but not all.
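
The env row says the environment name is converted to a function that builds the environment. A minimal sketch of that idea, assuming the function simply wraps gym.make; make_env_fn is a hypothetical name used for illustration and is not taken from the Spinning Up source.

```python
def make_env_fn(env_name):
    """Return a zero-argument function that builds the named Gym environment."""
    def env_fn():
        # Imported lazily so the function can be created without gym loaded.
        import gym
        return gym.make(env_name)
    return env_fn

env_fn = make_env_fn("Walker2d-v2")
# Calling env_fn() would construct the environment (requires gym and MuJoCo):
# env = env_fn()
```

Passing a builder function instead of an environment instance lets each process create its own copy, which fits with the cpu / num_cpu flag launching several processes.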

Launching Multiple Experiments at Once

Simply provide more than one value for a given knob (parameter).

e.g. multiple random seeds: