Simple starter code for experiments on open-source LLMs. Built for my SPAR project participants, but anyone is welcome to use it.
```bash
# optional: create a virtual environment
python3 -m venv venv
source venv/bin/activate
# run from the root of the repo; this will install everything you need
pip install -e .
```

To download Llama models from Hugging Face and/or use the Claude API, add a `.env` file in the root of the repo with your API keys (see `.env.example`).
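For reference, the `.env` file will look something like the sketch below. The exact variable names the code expects are listed in `.env.example`; the ones here are common defaults, not confirmed from the repo:

```
# hypothetical variable names -- check .env.example for the ones this repo expects
HUGGINGFACE_TOKEN=hf_...
ANTHROPIC_API_KEY=sk-ant-...
```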
All code is in `lmexp/`:
Example data, plus generation scripts that use the Claude API.
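As a rough illustration of the kind of call such a script makes, here is a minimal sketch using the `anthropic` SDK (not the repo's actual code; the model name and prompt are just examples):

```python
# Minimal sketch of a Claude data-generation call. Assumes ANTHROPIC_API_KEY
# is set in your environment (e.g. loaded from your .env file).
import anthropic

client = anthropic.Anthropic()  # picks up ANTHROPIC_API_KEY automatically

response = client.messages.create(
    model="claude-3-5-sonnet-20240620",  # example model name
    max_tokens=512,
    messages=[
        {"role": "user", "content": "Generate 5 short example sentences about dogs."}
    ],
)
print(response.content[0].text)
```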
An example Llama 3 fine-tuning implementation that quantizes the model to 8-bit. You may also want to try LoRA / other PEFT methods / torchtune; Meta's fine-tuning example code (the llama-recipes repo) is another useful reference.
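For orientation, here is a minimal sketch of one common way to fine-tune with 8-bit quantization (not the repo's script, which may differ). Purely 8-bit-quantized weights aren't trainable in `transformers`, so this variant attaches LoRA adapters via `peft` on top of the 8-bit base model:

```python
# Sketch: 8-bit base model + LoRA adapters. Assumes a GPU, bitsandbytes and
# peft installed, and access to the gated Llama 3 weights on Hugging Face.
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "meta-llama/Meta-Llama-3-8B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)
model = get_peft_model(
    model,
    LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
               task_type="CAUSAL_LM"),
)
model.print_trainable_parameters()
# ...then train as usual, e.g. with transformers.Trainer on your dataset.
```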
Implementation of model-internals techniques such as CAA (contrastive activation addition) and linear probing, written against an abstract HookedModel class (see models/implementations/gpt2small.py for an example of how to use this class). The idea is that we can write a single implementation of a technique and then apply it to any model we want. This is very similar to the TransformerLens paradigm, but pared down to just the functionality we're likely to use; feel free to use TransformerLens if you want more features.
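The core mechanism behind these techniques is a hook that reads or edits activations at a chosen layer. Here is a self-contained sketch of CAA-style steering on plain GPT-2 using a PyTorch forward hook (the repo's HookedModel interface may differ; the layer index and random vector are placeholders):

```python
# Sketch of activation steering via a forward hook -- illustrative only,
# not the repo's HookedModel implementation.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# In real CAA this would be a difference of mean activations over
# contrastive prompt pairs; a random vector is a stand-in here.
steering_vector = 0.1 * torch.randn(model.config.n_embd)

def steer(module, inputs, output):
    # GPT-2 blocks return a tuple whose first element is the hidden states
    return (output[0] + steering_vector,) + output[1:]

layer = 6  # arbitrary example layer
handle = model.transformer.h[layer].register_forward_hook(steer)
try:
    ids = tokenizer("The weather today is", return_tensors="pt")
    out = model.generate(**ids, max_new_tokens=20)
    print(tokenizer.decode(out[0]))
finally:
    handle.remove()  # remove the hook so later runs are unaffected
```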
Model implementations. Currently only GPT-2 (small) is implemented, as a basic example that will load on your laptop. You can add more models by following the same pattern, as sketched below.
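Hypothetically, adding another model could look like the following; every name here is illustrative rather than the repo's actual interface, so check models/implementations/gpt2small.py for the real pattern:

```python
# Illustrative only: a stand-in abstract class and a new-model subclass.
from abc import ABC, abstractmethod

from transformers import AutoModelForCausalLM


class HookedModel(ABC):
    """Stand-in for the repo's abstract class (names are hypothetical)."""

    @abstractmethod
    def get_block(self, layer: int):
        """Return the transformer block to hook at `layer`."""


class HookedPythia(HookedModel):
    """Example of wrapping another small model, EleutherAI's pythia-70m."""

    def __init__(self):
        self.model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-70m")

    def get_block(self, layer: int):
        # GPT-NeoX-style models expose their blocks at model.gpt_neox.layers
        return self.model.gpt_neox.layers[layer]
```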
Jupyter notebooks demonstrating basic use-cases.
TODO:

- Integrate with OpenAI's GPT-2 sparse autoencoders
- Implement more activation-modification approaches, such as projection/clamping and token-ID-aware steering (the projection idea is sketched below)
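A quick sketch of the projection idea from the last TODO item: instead of adding a steering vector, remove the component of the activations along a direction (the function and tensor shapes here are assumptions, not repo code):

```python
import torch

def project_out(activations: torch.Tensor, direction: torch.Tensor) -> torch.Tensor:
    """Remove the component of `activations` along `direction`.

    activations: (batch, seq, d_model); direction: (d_model,)
    """
    unit = direction / direction.norm()
    coeffs = activations @ unit  # (batch, seq) projection coefficients
    return activations - coeffs.unsqueeze(-1) * unit
```

Clamping is the same idea, except the coefficient along the direction is capped at a fixed value rather than zeroed out.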