Log:
- Today we (mostly) finished yesterday’s work: we figured out how to upload models to HuggingFace (go Naomi!) and we made a functional untied autoencoder (go Laker!). (Sketches of both are below.)
- Insight: software engineering practices will become important to manage complexity as our project progresses (docstrings, good variable names, useful functions).
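A minimal sketch of what the untied autoencoder might look like, in case it helps future-us remember the idea. This is an assumed architecture, not a copy of Laker’s code; the bias handling and init are guesses.

```python
import torch
import torch.nn as nn

class UntiedAutoencoder(nn.Module):
    """Sparse autoencoder where encoder and decoder weights are separate
    parameters ("untied"), rather than W_dec being W_enc transposed."""

    def __init__(self, d_in: int, d_autoencoder: int):
        super().__init__()
        self.W_enc = nn.Parameter(torch.randn(d_in, d_autoencoder) * 0.01)
        self.W_dec = nn.Parameter(torch.randn(d_autoencoder, d_in) * 0.01)
        self.b_enc = nn.Parameter(torch.zeros(d_autoencoder))
        self.b_dec = nn.Parameter(torch.zeros(d_in))

    def forward(self, x):
        # Encode with a ReLU to get sparse feature activations.
        acts = torch.relu((x - self.b_dec) @ self.W_enc + self.b_enc)
        recon = acts @ self.W_dec + self.b_dec
        return recon, acts
```

And a sketch of the model upload (the repo name and dims are placeholders; requires `huggingface-cli login` or an HF_TOKEN env var first):

```python
import torch
from huggingface_hub import HfApi

model = UntiedAutoencoder(d_in=512, d_autoencoder=2048)  # dims are guesses
torch.save(model.state_dict(), "untied_autoencoder.pt")

api = HfApi()
api.create_repo("your-username/chess-untied-autoencoder", exist_ok=True)
api.upload_file(
    path_or_fileobj="untied_autoencoder.pt",
    path_in_repo="untied_autoencoder.pt",
    repo_id="your-username/chess-untied-autoencoder",
)
```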
Todos:
- Train an untied autoencoder on Pythia 70M Chess. Run it through Neel’s visualizations.
- Subtask: upload the chess dataset to HuggingFace (a sketch of the upload is after this list).
- Naomi will try to do these today.
- Naomi could train a sparse autoencoder on Pythia 6B overnight.
- We can use the small dataset of chess games to train an autoencoder. Even though the encoder won’t be great, it’ll let us build the fine-tuning pipeline.
- Fine-tuning:
- To let biases change, set requires_grad = True on b_encoder.
- To let feature weights change, create a float vector of shape (d_autoencoder,) with requires_grad = True.
- Goal: create an extension class on top of the untied autoencoder that stores the extra feature_weights vector (or add this vector to the original autoencoder class), and write a training loop for fine-tuning. (Sketched after this list.)
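A sketch of the fine-tuning setup described above, assuming the UntiedAutoencoder sketch from the log section; the L1 coefficient, learning rate, and step count are placeholders, not tuned values.

```python
import torch
import torch.nn as nn

class FineTunableAutoencoder(nn.Module):
    """Extension class: wraps a trained UntiedAutoencoder (from the sketch
    above), freezes its weights, and adds a trainable per-feature scaling
    vector plus a trainable encoder bias."""

    def __init__(self, base: UntiedAutoencoder):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False           # freeze the base weights...
        self.base.b_enc.requires_grad = True  # ...but let the encoder bias move
        d_autoencoder = self.base.b_enc.shape[0]
        # Start at 1 so fine-tuning begins from the identity scaling.
        self.feature_weights = nn.Parameter(torch.ones(d_autoencoder))

    def forward(self, x):
        _, acts = self.base(x)
        acts = acts * self.feature_weights  # rescale each feature
        recon = acts @ self.base.W_dec + self.base.b_dec
        return recon, acts

def finetune(model, acts_batch, l1_coeff=1e-3, lr=1e-4, steps=1000):
    # Usual SAE objective: reconstruction MSE plus an L1 sparsity penalty.
    opt = torch.optim.Adam(
        [p for p in model.parameters() if p.requires_grad], lr=lr
    )
    for _ in range(steps):
        recon, feats = model(acts_batch)
        loss = ((recon - acts_batch) ** 2).mean() + l1_coeff * feats.abs().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
```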
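And for the dataset subtask, a hedged sketch using the datasets library (the file name, format, and repo name are placeholders):

```python
from datasets import load_dataset

# Assumes the games live in a plain-text file, one game per line.
# Requires `huggingface-cli login` (or an HF_TOKEN env var) beforehand.
dataset = load_dataset("text", data_files={"train": "chess_games.txt"})
dataset.push_to_hub("your-username/chess-games")
```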
Feature 29 on the version 3 model seems to be the move number!
It mostly activates on <space><number>.
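A quick way to sanity-check this, sketched under assumptions: the model name, layer, and hook point below are guesses, and `autoencoder` stands for a trained instance of the sketch above whose input dim matches the hooked activations.

```python
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("pythia-70m")  # chess checkpoint is a placeholder
tokens = model.to_tokens("1. e4 e5 2. Nf3 Nc6 3. Bb5 a6")
_, cache = model.run_with_cache(tokens)
mlp_acts = cache["blocks.0.mlp.hook_post"]  # layer/hook point is a guess

_, feature_acts = autoencoder(mlp_acts)  # trained SAE from the sketch above
for tok, act in zip(model.to_str_tokens(tokens), feature_acts[0, :, 29].tolist()):
    print(f"{tok!r}: {act:.3f}")  # expect spikes on " 2", " 3", ... move numbers
```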
I (Naomi) had to use a V100 because a T4 was running out of memory…