Log:
- Today we (mostly) finished yesterday’s work: we figured out how to upload models to HuggingFace (go Naomi!) and we made a functional untied autoencoder (go Laker!). (Sketches of both are below.)
- Insight: software engineering practices will become important to manage complexity as our project progresses (docstrings, good variable names, useful functions).
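A minimal sketch of what the untied autoencoder might look like, in case it helps future-us remember the idea. This is an assumed architecture, not a copy of Laker’s code; the bias handling and init are guesses.

```python
import torch
import torch.nn as nn

class UntiedAutoencoder(nn.Module):
    """Sparse autoencoder where encoder and decoder weights are separate
    parameters ("untied"), rather than W_dec being W_enc transposed."""

    def __init__(self, d_in: int, d_autoencoder: int):
        super().__init__()
        self.W_enc = nn.Parameter(torch.randn(d_in, d_autoencoder) * 0.01)
        self.W_dec = nn.Parameter(torch.randn(d_autoencoder, d_in) * 0.01)
        self.b_enc = nn.Parameter(torch.zeros(d_autoencoder))
        self.b_dec = nn.Parameter(torch.zeros(d_in))

    def forward(self, x):
        # Encode with a ReLU to get sparse feature activations.
        acts = torch.relu((x - self.b_dec) @ self.W_enc + self.b_enc)
        recon = acts @ self.W_dec + self.b_dec
        return recon, acts
```

And a sketch of the model upload (the repo name and dims are placeholders; requires `huggingface-cli login` or an HF_TOKEN env var first):

```python
import torch
from huggingface_hub import HfApi

model = UntiedAutoencoder(d_in=512, d_autoencoder=2048)  # dims are guesses
torch.save(model.state_dict(), "untied_autoencoder.pt")

api = HfApi()
api.create_repo("your-username/chess-untied-autoencoder", exist_ok=True)
api.upload_file(
    path_or_fileobj="untied_autoencoder.pt",
    path_in_repo="untied_autoencoder.pt",
    repo_id="your-username/chess-untied-autoencoder",
)
```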
Todos:
- Train an untied autoencoder on Pythia 70M Chess. Run it through Neel’s visualizations.
- Subtask: upload the chess dataset to HuggingFace (a sketch of the upload is after this list).
- Naomi will try to do these today.
- Naomi could train a sparse autoencoder on Pythia 6B overnight.
- We can use the small dataset of chess games to train an autoencoder. Even though the encoder won’t be great, it’ll let us build the fine-tuning pipeline.
- Fine-tuning:
- To let biases change, set requires_grad = True on b_encoder.
- To let feature weights change, create a float vector of shape (d_autoencoder,) with requires_grad = True.
- Goal: create an extension class on top of the untied autoencoder that stores the extra feature_weights vector (or add this vector to the original autoencoder class), and write a training loop for fine-tuning. (Sketched after this list.)
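A sketch of the fine-tuning setup described above, assuming the UntiedAutoencoder sketch from the log section; the L1 coefficient, learning rate, and step count are placeholders, not tuned values.

```python
import torch
import torch.nn as nn

class FineTunableAutoencoder(nn.Module):
    """Extension class: wraps a trained UntiedAutoencoder (from the sketch
    above), freezes its weights, and adds a trainable per-feature scaling
    vector plus a trainable encoder bias."""

    def __init__(self, base: UntiedAutoencoder):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False           # freeze the base weights...
        self.base.b_enc.requires_grad = True  # ...but let the encoder bias move
        d_autoencoder = self.base.b_enc.shape[0]
        # Start at 1 so fine-tuning begins from the identity scaling.
        self.feature_weights = nn.Parameter(torch.ones(d_autoencoder))

    def forward(self, x):
        _, acts = self.base(x)
        acts = acts * self.feature_weights  # rescale each feature
        recon = acts @ self.base.W_dec + self.base.b_dec
        return recon, acts

def finetune(model, acts_batch, l1_coeff=1e-3, lr=1e-4, steps=1000):
    # Usual SAE objective: reconstruction MSE plus an L1 sparsity penalty.
    opt = torch.optim.Adam(
        [p for p in model.parameters() if p.requires_grad], lr=lr
    )
    for _ in range(steps):
        recon, feats = model(acts_batch)
        loss = ((recon - acts_batch) ** 2).mean() + l1_coeff * feats.abs().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
```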
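And for the dataset subtask, a hedged sketch using the datasets library (the file name, format, and repo name are placeholders):

```python
from datasets import load_dataset

# Assumes the games live in a plain-text file, one game per line.
# Requires `huggingface-cli login` (or an HF_TOKEN env var) beforehand.
dataset = load_dataset("text", data_files={"train": "chess_games.txt"})
dataset.push_to_hub("your-username/chess-games")
```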
Feature 29 on the version 3 model seems to be the move number!
It mostly activates on <space><number>.
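A quick way to sanity-check this, sketched under assumptions: the model name, layer, and hook point below are guesses, and `autoencoder` stands for a trained instance of the sketch above whose input dim matches the hooked activations.

```python
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("pythia-70m")  # chess checkpoint is a placeholder
tokens = model.to_tokens("1. e4 e5 2. Nf3 Nc6 3. Bb5 a6")
_, cache = model.run_with_cache(tokens)
mlp_acts = cache["blocks.0.mlp.hook_post"]  # layer/hook point is a guess

_, feature_acts = autoencoder(mlp_acts)  # trained SAE from the sketch above
for tok, act in zip(model.to_str_tokens(tokens), feature_acts[0, :, 29].tolist()):
    print(f"{tok!r}: {act:.3f}")  # expect spikes on " 2", " 3", ... move numbers
```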
I (Naomi) had to use a V100 because a T4 was running out of memory…