What’s Included

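The following algorithms are implemented in the Spinning Up package: Vanilla Policy Gradient (VPG), Trust Region Policy Optimization (TRPO), Proximal Policy Optimization (PPO), Deep Deterministic Policy Gradient (DDPG), Twin Delayed DDPG (TD3), and Soft Actor-Critic (SAC).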

On-Policy Algorithms

Don’t reuse old data: every update needs fresh samples from the current policy → weaker sample efficiency.

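Directly optimize the objective you care about (policy performance); it works out mathematically that the updates need on-policy data.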

Tradeoff: sample efficiency is given up in favor of stability.
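To make the on-policy constraint concrete, here is a minimal sketch of a vanilla policy-gradient loop on a two-armed bandit. It is not Spinning Up's implementation: the function name `train_on_policy_bandit`, the reward values, and the hyperparameters are all assumptions chosen for illustration. The point is only that each update samples a new batch from the current policy and then throws it away.

```python
import numpy as np

def softmax(logits):
    z = np.exp(logits - logits.max())
    return z / z.sum()

def train_on_policy_bandit(num_updates=200, batch_size=32, lr=0.5, seed=0):
    """Hypothetical toy example: REINFORCE-style updates on a 2-armed bandit."""
    rng = np.random.default_rng(seed)
    arm_rewards = np.array([0.0, 1.0])  # assumed deterministic rewards per arm
    logits = np.zeros(2)                # parameters of a softmax policy

    for _ in range(num_updates):
        probs = softmax(logits)
        # On-policy: collect a fresh batch with the *current* policy.
        actions = rng.choice(2, size=batch_size, p=probs)
        rewards = arm_rewards[actions]
        # For a softmax policy, grad of log pi(a) w.r.t. logits is onehot(a) - probs.
        grad_logp = np.eye(2)[actions] - probs
        grad = (grad_logp * rewards[:, None]).mean(axis=0)
        logits += lr * grad
        # The batch is discarded here; the next update samples again.

    return softmax(logits)

final_probs = train_on_policy_bandit()
```

Every gradient step costs `batch_size` new environment interactions, which is where the sample-efficiency penalty comes from.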

Off-Policy Algorithms

Can reuse old data (collected under earlier policies) → stronger sample efficiency.

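This works by exploiting Bellman’s equations for optimality: a Q-function can be trained to satisfy them using any environment interaction data, as long as there is enough experience from high-reward areas.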

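DDPG is closely connected to Q-learning: it concurrently learns a Q-function and a policy, each updated to improve the other.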
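The off-policy notes above can be sketched with tabular Q-learning, which is a simpler relative of DDPG rather than DDPG itself. The names `q_learning_update` and `replay_old_data`, the two-state environment, and the stored transitions are hypothetical and chosen only to show a Bellman optimality backup being applied repeatedly to the same old data.

```python
import numpy as np

def q_learning_update(Q, transition, gamma=0.99, lr=0.5):
    """Move Q[s, a] toward the Bellman optimality target for one transition."""
    s, a, r, s_next, done = transition
    # Target: r + gamma * max_a' Q(s', a'), with no bootstrap on terminal steps.
    target = r + gamma * (1.0 - float(done)) * np.max(Q[s_next])
    Q[s, a] += lr * (target - Q[s, a])
    return Q

def replay_old_data(num_passes=50):
    """Hypothetical toy setup: 2 states, 2 actions, a fixed set of stored transitions."""
    Q = np.zeros((2, 2))
    # (state, action, reward, next_state, done), collected by some earlier policy.
    old_data = [
        (0, 1, 1.0, 1, True),   # action 1 in state 0 ends the episode with reward 1
        (0, 0, 0.0, 0, False),  # action 0 in state 0 stays in state 0 with reward 0
    ]
    # Off-policy: the same stored transitions are reused on every pass.
    for _ in range(num_passes):
        for transition in old_data:
            q_learning_update(Q, transition)
    return Q

Q = replay_old_data()
```

No new environment interaction is needed after the transitions are stored. The update only asks that the Q-table become consistent with the Bellman equation on whatever data is available.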

Algorithm Function for a PyTorch Implementation