Sunday, 12/3/2023

Goal for today: interpret and become happy with our sparse autoencoder.

While it trains, we can learn how to use the blog format well.

Progress today:

Naomi found a PPO notebook that she is working to understand and run: https://github.com/huggingface/trl/tree/a60ceefa694d62565789b03e8fa35244bc46c9ba
We interpreted the two most frequent features in the 4_checkpoint_7.pt SAE.
Laker implemented dead neuron resampling and lowered the learning rate. We are training a new sparse autoencoder now!
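
For reference, here is a rough sketch of what ranking features by frequency and resampling dead neurons can look like. It is not Laker's actual code. It is pure Python on toy lists, and every name in it (firing_frequency, resample_dead_features, scale, the toy weights) is made up for illustration. The resampling rule is an assumption about one common scheme: point each dead feature at an input the SAE reconstructs badly, give its encoder row a small norm relative to the live features, and zero its bias.

```python
import math
import random


def l2_norm(v):
    return math.sqrt(sum(x * x for x in v))


def firing_frequency(feature_acts):
    """Fraction of inputs on which each feature is active (> 0)."""
    n_inputs = len(feature_acts)
    n_features = len(feature_acts[0])
    return [
        sum(1 for row in feature_acts if row[j] > 0.0) / n_inputs
        for j in range(n_features)
    ]


def resample_dead_features(enc, dec, bias, inputs, losses, dead, rng, scale=0.2):
    """Re-initialise dead features in place (assumed scheme, see lead-in).

    enc and dec hold one weight row per feature; losses holds one
    reconstruction loss per input.
    """
    dead_set = set(dead)
    alive_norms = [l2_norm(enc[j]) for j in range(len(enc)) if j not in dead_set]
    avg_alive = sum(alive_norms) / len(alive_norms) if alive_norms else 1.0
    # Inputs with high reconstruction loss are more likely to be picked.
    weights = [loss * loss for loss in losses]
    for j in dead:
        x = rng.choices(inputs, weights=weights, k=1)[0]
        n = l2_norm(x) or 1.0  # guard against an all-zero input
        unit = [xi / n for xi in x]
        dec[j] = list(unit)  # unit-norm decoder direction
        enc[j] = [scale * avg_alive * u for u in unit]  # small encoder row
        bias[j] = 0.0


# Toy data: 3 inputs, 3 features, input dimension 2. Feature 1 never fires.
feature_acts = [[0.5, 0.0, 0.0], [0.0, 0.0, 1.2], [0.3, 0.0, 0.0]]
freqs = firing_frequency(feature_acts)
top_two = sorted(range(len(freqs)), key=lambda j: freqs[j], reverse=True)[:2]
dead = [j for j, f in enumerate(freqs) if f == 0.0]

enc = [[1.0, 0.0], [0.0, 0.0], [0.0, 2.0]]
dec = [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]]
bias = [0.1, -5.0, 0.2]
inputs = [[3.0, 4.0], [1.0, 0.0], [0.0, 2.0]]
losses = [0.9, 0.1, 0.1]
resample_dead_features(enc, dec, bias, inputs, losses, dead, random.Random(0))
```

In a real training loop the activations would be collected over many batches, and the optimizer state for the resampled rows would probably need resetting too; neither is shown here.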

Next steps:

Naomi had trouble compiling the blog format and needs help from the TAs (to try on Tuesday).
On Monday, Laker will begin writing our blog post. He will focus on the literature review and maybe explore interactive Plotly figures.
For the rest of Sunday, Naomi will continue exploring PPO in the notebook she found.

We will meet on Tuesday at 11 AM.