Awesome Open Source
Search results for mechanistic interpretability
9 search results found
Pyvene
⭐ 101
Stanford NLP Python Library for Understanding and Improving PyTorch Models via Interventions
DecisionTransformerInterpretability
⭐ 48
Interpreting how transformers simulate agents performing RL tasks
Sparse Probing Paper
⭐ 29
Full code for the sparse probing paper.
Interpretability Starter
⭐ 26
🧠 Starter templates for doing interpretability research
Automated Explanations
⭐ 26
Explain a black-box module in natural language.
Codebook Features
⭐ 14
Sparse and discrete interpretability tool for neural networks
Universal Neurons
⭐ 8
Universal Neurons in GPT2 Language Models
DeepDecipher
⭐ 7
🦠 DeepDecipher: An open source API to MLP neurons
Steering Vectors
⭐ 5
Steering vectors for transformer language models in PyTorch / Hugging Face
Copyright 2018-2024 Awesome Open Source. All rights reserved.