Laker worked on the blog, while Naomi looked into PPO (https://huggingface.co/docs/trl/main/en/ppo_trainer) and got the blog working on her end. We forked the repository so we can push changes there.

TODOs: