Goal for today: interpret and become happy with our sparse autoencoder.
While it trains, we can learn how to use the blog format well.
Progress today:
- Naomi found a PPO notebook that she is working on understanding and running. https://github.com/huggingface/trl/tree/a60ceefa694d62565789b03e8fa35244bc46c9ba
- We interpreted the top 2 most frequent features in the 4_checkpoint_7.pt SAE.
- Laker implemented dead neuron resampling and changed to a lower learning rate. We are training a new sparse autoencoder now!
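For reference, here is a minimal sketch of what feature frequency and dead neuron resampling can look like. It is not Laker's actual implementation: the function names, the plain-list weight layout, and the 0.2 scale factor are all assumptions made for illustration, loosely following one commonly used recipe (re-initialize dead features from inputs the SAE currently reconstructs poorly). It uses only the standard library so it runs anywhere; real code would operate on the SAE's weight tensors and would usually also reset the optimizer state for the resampled features.

```python
import math
import random


def l2_norm(vec):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(v * v for v in vec))


def firing_frequencies(activations):
    """Fraction of inputs on which each feature is active (activation > 0).

    activations: list of rows, one row of feature activations per input.
    """
    n_inputs = len(activations)
    counts = [0] * len(activations[0])
    for row in activations:
        for j, a in enumerate(row):
            if a > 0:
                counts[j] += 1
    return [c / n_inputs for c in counts]


def resample_dead_features(enc_w, enc_b, dec_w, inputs, losses, freqs, rng,
                           scale=0.2):
    """Re-initialize features that never fired; returns the resampled indices.

    enc_w, dec_w: one weight vector per feature (hypothetical layout).
    enc_b: one encoder bias per feature.
    inputs, losses: model activations and the SAE's loss on each of them.
    All three weight containers are modified in place.
    """
    dead = [j for j, f in enumerate(freqs) if f == 0.0]
    alive = [j for j, f in enumerate(freqs) if f > 0.0]
    if not dead or not alive:
        return dead

    # Target encoder norm: a fraction of the average norm of alive features.
    avg_alive_norm = sum(l2_norm(enc_w[j]) for j in alive) / len(alive)

    # Prefer inputs with high loss (probability proportional to loss squared).
    weights = [loss * loss for loss in losses]
    if sum(weights) == 0:
        weights = None  # fall back to uniform sampling

    for j in dead:
        x = rng.choices(inputs, weights=weights, k=1)[0]
        n = l2_norm(x)
        if n == 0:
            continue  # cannot normalize an all-zero input; leave feature as is
        unit = [v / n for v in x]
        dec_w[j] = unit                                    # unit-norm decoder direction
        enc_w[j] = [scale * avg_alive_norm * v for v in unit]
        enc_b[j] = 0.0
    return dead
```

Sorting the output of `firing_frequencies` is also one way to pick the most frequent features to interpret, and features with frequency exactly 0 over the sampled inputs are the ones treated as dead here.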
Next steps:
- Naomi had trouble compiling the blog format and needs help from TAs (try on Tuesday).
- On Monday, Laker will begin writing our blog post. He will focus on the literature review and maybe explore interactive Plotly figures.
- For the rest of Sunday, Naomi will continue exploring PPO in the notebook she found.
We will meet on Tuesday at 11 AM.