🧠 Starter templates for doing interpretability research
Updated Jul 16, 2023
📊 Benchmarking the safety of AI systems
This Alignment Jam Hackathon project explores whether the concept of "logit lens" applies to the encoder and decoder layers in Whisper, an end-to-end speech recognition model.