OpenAI Gym
OpenAI Gym is an open-source Python library that provides environments for reinforcement learning algorithms. It is gradually being superseded by the Gymnasium library, which I plan to try in the future.
I first used Gym in the final project for my Intelligent Control course.
bipedal-walker normal and hardcore
In the course’s final project, we implemented PPO and FORK to solve the normal and hardcore environments, respectively.
We also presented our work to the class.
Later, as side projects, I followed the tutorial to implement DQN (Deep Q-Network) and DDPG (Deep Deterministic Policy Gradient) in different environments using Stable-Baselines3.
Stable-Baselines3 is a set of reliable implementations of reinforcement learning algorithms in PyTorch. To support Gym version >= 0.26, you will need the pull request here.
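The version constraint matters because Gym 0.26 changed the core environment API: reset() now returns (observation, info), and step() returns five values, splitting the old done flag into terminated and truncated. The sketch below illustrates the difference with a hypothetical stub environment (CountdownEnv is not part of Gym) and a small adapter that collapses the new five-tuple back into the old four-tuple. It only illustrates the API change and is not how the Stable-Baselines3 pull request is implemented.

```python
class CountdownEnv:
    """Hypothetical stub that follows the Gym >= 0.26 method signatures."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0

    def reset(self, seed=None, options=None):
        self.t = 0
        return self.t, {}  # (observation, info)

    def step(self, action):
        self.t += 1
        terminated = False                  # this stub has no natural end state
        truncated = self.t >= self.horizon  # time limit reached
        return self.t, 1.0, terminated, truncated, {}


def to_old_step(result):
    """Collapse a new-style 5-tuple into the pre-0.26 (obs, reward, done, info)."""
    obs, reward, terminated, truncated, info = result
    return obs, reward, terminated or truncated, info


env = CountdownEnv()
obs, info = env.reset(seed=0)
done = False
total_reward = 0.0
while not done:
    obs, reward, done, info = to_old_step(env.step(action=0))
    total_reward += reward
```

Code written against the older API expects the four-tuple, which is why a library has to be updated before it can drive newer environments directly.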
highway-fast-v0
The first task is a highway environment in which the car must avoid crashing into other cars by switching lanes smoothly.
If we let the car idle, it will crash.
After training with DQN for 50,000 steps, the best model achieves a success rate of 0.84 over 25 test episodes.
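A success rate like this can be computed by running the policy for a fixed number of episodes and counting those that end without a crash. The sketch below assumes that definition of success and uses hypothetical stand-ins (ToyHighway, keep_lane) instead of the real environment and trained model; only the counting logic is the point.

```python
import random


class ToyHighway:
    """Hypothetical stand-in for a driving environment (Gym >= 0.26 style API)."""

    def __init__(self, crash_prob, horizon=10, seed=0):
        self.crash_prob = crash_prob
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.t = 0

    def reset(self):
        self.t = 0
        return 0, {}

    def step(self, action):
        self.t += 1
        crashed = self.rng.random() < self.crash_prob
        truncated = self.t >= self.horizon and not crashed
        return 0, 0.0, crashed, truncated, {"crashed": crashed}


def success_rate(env, policy, n_episodes):
    """Fraction of episodes that reach the time limit without crashing."""
    successes = 0
    for _ in range(n_episodes):
        obs, info = env.reset()
        terminated = truncated = False
        while not (terminated or truncated):
            obs, reward, terminated, truncated, info = env.step(policy(obs))
        successes += int(not info["crashed"])
    return successes / n_episodes


def keep_lane(obs):
    return 1  # hypothetical action index meaning "stay in the current lane"


rate = success_rate(ToyHighway(crash_prob=0.02), keep_lane, n_episodes=25)
```

With 25 test episodes, a rate of 0.84 corresponds to 21 successful episodes.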
The learning curve and the evaluation result.
PandaReach-v2
The second task is reaching a target position with a Panda robot arm. A Hindsight Experience Replay (HER) buffer is added to help the robot learn from sparse rewards.
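The idea behind Hindsight Experience Replay is to reuse failed episodes by pretending that the position the arm actually reached was the goal all along, so that some stored transitions carry a non-failure reward. Below is a minimal sketch of the "final" relabeling strategy in plain Python. The transition layout, the sparse_reward function, and the 0.05 distance threshold are illustrative assumptions, not the Stable-Baselines3 implementation.

```python
import math


def sparse_reward(achieved, goal, threshold=0.05):
    """0 when the achieved position is within `threshold` of the goal, else -1."""
    return 0.0 if math.dist(achieved, goal) < threshold else -1.0


def relabel_final(episode):
    """Return copies of the transitions with the goal replaced by the
    position achieved at the end of the episode ("final" strategy)."""
    new_goal = episode[-1]["achieved_goal"]
    relabeled = []
    for tr in episode:
        relabeled.append({
            **tr,
            "desired_goal": new_goal,
            "reward": sparse_reward(tr["achieved_goal"], new_goal),
        })
    return relabeled


# A failed episode: the arm never gets near the goal at (1, 1, 1).
goal = (1.0, 1.0, 1.0)
episode = [
    {"achieved_goal": p, "desired_goal": goal, "reward": sparse_reward(p, goal)}
    for p in [(0.1, 0.0, 0.0), (0.2, 0.0, 0.0), (0.3, 0.0, 0.0)]
]

# Store both the original and the relabeled transitions.
replay_buffer = episode + relabel_final(episode)
```

Every original transition has reward -1, while the relabeled copy of the last transition has reward 0, which gives the learner at least one informative signal from an otherwise failed episode.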
Let’s first move the arm using intuition.
After training with DDPG for 300,000 steps, the final success rate is 0.97 over 30 tests.
There is still vibration after the arm reaches the goal. Future work would focus on imposing a speed limit or combining the learned policy with traditional control to improve performance and safety.
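One simple way to impose a speed limit, sketched below under the assumption that the action is a Cartesian displacement command for the end effector, is to rescale any action whose norm exceeds a chosen bound before sending it to the environment. The function name and the bound are hypothetical, and this has not been tested on the robot task above.

```python
import math


def limit_speed(action, max_norm):
    """Scale `action` down so its Euclidean norm does not exceed `max_norm`."""
    norm = math.sqrt(sum(a * a for a in action))
    if norm <= max_norm or norm == 0.0:
        return list(action)
    scale = max_norm / norm
    return [a * scale for a in action]


slow = limit_speed([0.3, 0.4, 0.0], max_norm=0.1)    # norm 0.5, scaled down
small = limit_speed([0.01, 0.0, 0.0], max_norm=0.1)  # already within the limit
```

Rescaling keeps the direction of the commanded motion and only shortens the step, which is gentler than clipping each component independently.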