Laker worked on the blog while Naomi looked at TRL's PPO trainer (https://huggingface.co/docs/trl/main/en/ppo_trainer) and got the blog working on her end. We forked the repository so we can push changes there.
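For reference while digging into the TRL PPO trainer, here is a minimal sketch of the core PPO clipped surrogate objective in plain PyTorch. This is an illustration of the formula, not TRL's actual implementation; the tensor values are made up.

```python
import torch

def ppo_clip_loss(logp_new, logp_old, advantages, eps=0.2):
    """PPO clipped surrogate loss: -mean(min(r*A, clip(r, 1-eps, 1+eps)*A))."""
    ratio = torch.exp(logp_new - logp_old)          # importance ratio r
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - eps, 1 + eps) * advantages
    return -torch.min(unclipped, clipped).mean()

# Toy values (hypothetical, just to exercise the function)
logp_old = torch.tensor([-1.0, -2.0])
logp_new = torch.tensor([-0.9, -2.5])
adv = torch.tensor([1.0, -0.5])
loss = ppo_clip_loss(logp_new, logp_old, adv)
print(loss.item())
```

The clipping keeps the updated policy from moving too far from the old one in a single step, which is the piece TRL wires up around the language model's per-token log-probs.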
TODOs:
- Work on the blog (literature review, explaining SAEs, Laker)
- Look at features in SAE 5
- Surgically insert SAE 5 into Pythia-6.9B, with HookedTransformer (Naomi)
- Do PPO on overall new model (Naomi)
- Do PPO on only the SAE in the new model? (Laker + Naomi, when the time comes)
- First step, if helpful: try PPO on a toy 2-parameter model where only 1 parameter is allowed to change
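For the "surgically insert" step: a minimal sketch of splicing an autoencoder into a model's forward pass with a plain PyTorch forward hook, standing in for the HookedTransformer hook-point version. Everything here (the tiny model, the SAE dimensions) is hypothetical, not the actual Pythia-6.9B setup.

```python
import torch
import torch.nn as nn

class SAE(nn.Module):
    """Toy sparse autoencoder: linear encode, ReLU sparsity, linear decode."""
    def __init__(self, d_model=8, d_hidden=32):
        super().__init__()
        self.enc = nn.Linear(d_model, d_hidden)
        self.dec = nn.Linear(d_hidden, d_model)

    def forward(self, x):
        return self.dec(torch.relu(self.enc(x)))

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
sae = SAE(d_model=8)

# Forward hook on the mid-layer activations: returning a value from the hook
# replaces that layer's output with the SAE reconstruction downstream.
def splice_sae(module, inputs, output):
    return sae(output)

handle = model[1].register_forward_hook(splice_sae)
out = model(torch.randn(3, 4))
print(out.shape)  # activations now flow through the SAE; output shape unchanged
handle.remove()   # detach the hook to restore the original model
```

With TransformerLens the same idea would use a hook on the relevant `hook_` point (e.g. the residual stream or MLP output at the target layer) instead of `register_forward_hook`.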
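The "2-parameter model, 1 trainable parameter" first step can be sketched with ordinary gradient descent rather than full PPO: the mechanism that matters (freezing everything except the part you want to optimize, as in PPO-on-only-the-SAE) is just controlling `requires_grad` and which parameters the optimizer sees. All names and numbers here are made up for illustration.

```python
import torch

torch.manual_seed(0)

# Toy 2-parameter model: y = w * x + b
w = torch.tensor(0.5, requires_grad=True)
b = torch.tensor(0.0, requires_grad=False)  # frozen, like the base model under SAE-only training

# Only the trainable parameter is handed to the optimizer
opt = torch.optim.SGD([w], lr=0.1)

x = torch.tensor([1.0, 2.0, 3.0])
target = 2.0 * x  # data generated with true w = 2, b = 0

for _ in range(100):
    opt.zero_grad()
    loss = ((w * x + b - target) ** 2).mean()
    loss.backward()
    opt.step()

print(w.item())  # converges toward 2.0
print(b.item())  # stays exactly 0.0
```

In the real experiment the same pattern would apply: set `requires_grad_(False)` on all Pythia parameters, leave the spliced-in SAE trainable, and pass only the SAE's parameters to the PPO optimizer.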