Tools: Reinforcement Learning ML

OpenAI Gym

OpenAI Gym is an open-source Python library that provides environments for reinforcement learning algorithms. It is gradually being replaced by the Gymnasium library, which I plan to try in the future.
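To make "providing environments" concrete, here is a minimal sketch of the agent-environment loop. `ToyCorridorEnv` is a made-up stand-in (not part of Gym) so the example runs without installing anything; it only mimics the `reset`/`step` return shapes used by Gym >= 0.26, where `reset` returns `(obs, info)` and `step` returns `(obs, reward, terminated, truncated, info)`.

```python
from typing import Any, Callable, Dict, Optional, Tuple


class ToyCorridorEnv:
    """Hypothetical stand-in environment: walk from cell 0 to cell `length`."""

    def __init__(self, length: int = 5, max_steps: int = 20) -> None:
        self.length = length
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self, seed: Optional[int] = None) -> Tuple[int, Dict[str, Any]]:
        # `seed` is accepted only to mirror the Gym signature; this toy is deterministic.
        self.position = 0
        self.steps = 0
        return self.position, {}

    def step(self, action: int) -> Tuple[int, float, bool, bool, Dict[str, Any]]:
        # Action 1 moves right, anything else moves left (clamped at cell 0).
        move = 1 if action == 1 else -1
        self.position = max(0, self.position + move)
        self.steps += 1
        terminated = self.position >= self.length           # goal reached
        truncated = not terminated and self.steps >= self.max_steps  # time limit hit
        reward = 1.0 if terminated else 0.0
        return self.position, reward, terminated, truncated, {}


def run_episode(env: ToyCorridorEnv, policy: Callable[[int], int]) -> float:
    """Run one episode and return the total reward."""
    obs, _info = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        action = policy(obs)
        obs, reward, terminated, truncated, _info = env.step(action)
        total_reward += reward
        done = terminated or truncated
    return total_reward


always_right_return = run_episode(ToyCorridorEnv(), lambda obs: 1)
always_left_return = run_episode(ToyCorridorEnv(), lambda obs: 0)
```

With a real Gym environment the loop has the same shape; only the environment object and the policy change.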

I first used Gym in my final project for the course Intelligent Control.

bipedal-walker normal and hardcore

For the course's final project, we implemented PPO and FORK to solve the normal and hardcore environments, respectively.

Normal mode code

Hardcore mode code

We also presented our work to the class.



Later, as side projects, I followed the tutorial to implement DQN (Deep Q-Network) and DDPG (Deep Deterministic Policy Gradient) in different environments using Stable Baselines3.

Stable Baselines3 is a set of reliable implementations of reinforcement learning algorithms in PyTorch. To support Gym versions >= 0.26, you will need the pull request here.

highway-fast-v0

Code

The first task is a highway environment in which the car must avoid crashing into other cars by changing lanes smoothly.

If we let the car idle, it will crash.

Original

After training with DQN for 50,000 steps, the best model reaches a success rate of 0.84 over 25 tests.
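A training run like this can be sketched with Stable Baselines3 roughly as follows. This is not the project's actual script: the hyperparameters are the library defaults, `train_and_evaluate` is a hypothetical helper name, and it assumes mutually compatible versions of `gym` (or `gymnasium`), `highway-env`, and `stable-baselines3` are installed. Note that `evaluate_policy` reports mean episode reward, not the success rate quoted above, whose exact criterion is not stated here.

```python
TOTAL_TIMESTEPS = 50_000  # training budget quoted above
N_EVAL_EPISODES = 25      # number of test episodes quoted above


def train_and_evaluate(env_id: str = "highway-fast-v0"):
    """Train DQN with default settings and return (model, mean_reward, std_reward)."""
    # Imports are local so this file can be loaded without the RL stack installed.
    import gym  # assumption: use `import gymnasium as gym` with newer SB3 releases
    import highway_env  # noqa: F401  assumption: importing registers the highway envs
    from stable_baselines3 import DQN
    from stable_baselines3.common.evaluation import evaluate_policy

    env = gym.make(env_id)
    model = DQN("MlpPolicy", env, verbose=1)  # library defaults, not tuned values
    model.learn(total_timesteps=TOTAL_TIMESTEPS)
    mean_reward, std_reward = evaluate_policy(
        model, env, n_eval_episodes=N_EVAL_EPISODES
    )
    return model, mean_reward, std_reward
```

Calling `train_and_evaluate()` starts the full 50,000-step run, so it is left uncalled here.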

DQN

The learning curve (rollout) and the evaluation results (eval) are shown below.

rollout

eval

PandaReach-v2

Code

The second task is positioning a Panda robot arm at a target location. A Hindsight Experience Replay (HER) buffer is added to help the robot learn from sparse rewards.
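The idea behind hindsight relabeling can be sketched in a few lines of plain Python. This is a simplified illustration, not the Stable Baselines3 implementation: the `Transition` class, the 0.05 distance threshold, the 0/-1 reward convention, and the "use the final achieved position as the goal" strategy are all assumptions chosen for the example.

```python
import math
from dataclasses import dataclass, replace
from typing import List, Tuple

Vec = Tuple[float, float, float]


@dataclass(frozen=True)
class Transition:
    achieved_goal: Vec  # end-effector position after the action
    desired_goal: Vec   # target the episode was aiming for
    reward: float


def sparse_reward(achieved: Vec, desired: Vec, threshold: float = 0.05) -> float:
    """Return 0 when within `threshold` of the goal, otherwise -1 (assumed convention)."""
    return 0.0 if math.dist(achieved, desired) < threshold else -1.0


def relabel_with_final_goal(episode: List[Transition]) -> List[Transition]:
    """Pretend the last achieved position was the goal and recompute rewards."""
    new_goal = episode[-1].achieved_goal
    return [
        replace(
            t,
            desired_goal=new_goal,
            reward=sparse_reward(t.achieved_goal, new_goal),
        )
        for t in episode
    ]


# A failed episode: the arm never gets within the threshold of the real target.
target: Vec = (0.3, 0.0, 0.2)
positions: List[Vec] = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.1), (0.2, 0.0, 0.1)]
episode = [Transition(p, target, sparse_reward(p, target)) for p in positions]

# After relabeling, the last transition counts as a success for the substituted goal.
relabeled = relabel_with_final_goal(episode)
```

Storing the relabeled transitions alongside the originals gives the agent some non-failing examples even when every real episode misses the target, which is why it helps with sparse rewards.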

Let’s first move the arm using intuition.

Original

After training with DDPG for 300,000 steps, the final success rate is 0.97 over 30 tests.
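For reference, the reported rates correspond to whole-episode counts. The per-episode outcomes are not listed here, so the counts below are inferred from the rates (29 of 30 is the only count that rounds to 0.97, and 21 of 25 gives exactly 0.84 for the DQN result); `success_rate` is a hypothetical helper.

```python
from typing import Iterable


def success_rate(outcomes: Iterable[bool]) -> float:
    """Fraction of test episodes that succeeded."""
    results = list(outcomes)
    if not results:
        raise ValueError("need at least one test episode")
    return sum(1 for ok in results if ok) / len(results)


# Inferred outcome lists (order is arbitrary; only the counts matter).
ddpg_outcomes = [True] * 29 + [False] * 1  # PandaReach-v2, 30 tests
dqn_outcomes = [True] * 21 + [False] * 4   # highway-fast-v0, 25 tests

ddpg_rate = round(success_rate(ddpg_outcomes), 2)
dqn_rate = round(success_rate(dqn_outcomes), 2)
```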

DDPG

The arm still vibrates after reaching the goal. Future work will focus on imposing a speed limit or combining the learned policy with traditional control to improve performance and safety.
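One simple way a speed limit might be imposed is to rescale each commanded displacement so its magnitude never exceeds a cap. This is only a sketch of the idea and has not been tried on the arm; `limit_speed` and the `max_norm` value are hypothetical.

```python
import math
from typing import Sequence, Tuple


def limit_speed(action: Sequence[float], max_norm: float) -> Tuple[float, ...]:
    """Scale `action` down so its Euclidean norm is at most `max_norm`."""
    norm = math.sqrt(sum(a * a for a in action))
    if norm <= max_norm:
        return tuple(action)  # already within the limit, leave unchanged
    scale = max_norm / norm   # shrink uniformly, keeping the direction
    return tuple(a * scale for a in action)


# A large commanded step is shrunk to length 1.0 in the same direction.
clipped = limit_speed((3.0, 4.0, 0.0), max_norm=1.0)
# A small step passes through unchanged.
unchanged = limit_speed((0.1, 0.0, 0.0), max_norm=1.0)
```

Such a limiter would sit between the policy output and the environment, so the policy itself would not need retraining to test the effect.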