mines-opt-ml/decoding-gpt


(copy of syllabus.pdf)

Course Code: MATH 598B
Credit Hours: 3
Meeting Times: 12:30pm-1:45pm, Tuesdays/Thursdays
Location: 141 Alderson Hall
Instructors: Samy Wu Fung, Daniel McKenzie, Michael Ivanitskiy
Contact: mivanits@mines.edu
Office Hours: 269 Chauvenet Hall, or Zoom by request. Time TBD, poll here
Course materials: github.com/mines-opt-ml/decoding-gpt
Course website: miv.name/decoding-gpt

Course Description

Since the public release of GPT-3 in 2020, large language models have made rapid progress on a wide variety of tasks once thought to be exclusively in the domain of human reasoning. However, the internal mechanisms by which these models perform such tasks are not well understood. A large fraction of machine learning researchers believe that training and deploying such models carries significant risks, ranging from mass unemployment and societal harms due to misinformation to existential risks from misaligned AI systems. This course will explore the mathematical foundations of transformer networks, the challenges of imparting human values onto such systems, and the current state of the art in interpretability and alignment research.

Learning Outcomes

Over the duration of the course, students will gain:

  1. A solid theoretical understanding of the mechanics of transformer networks and attention heads
  2. Practical experience with implementing, training, and deploying GPTs for simple tasks
  3. Understanding of the fundamentals of the AI alignment problem, present and future risks and harms, and a broad overview of the current state of the field
  4. Familiarity with current results and techniques in interpretability research for GPT systems

Prerequisites

  • Linear Algebra: Students should have a strong grasp of linear algebra, including matrix multiplication, vector spaces, matrix decompositions, and eigenvalues/eigenvectors. MATH 500 recommended.
  • Machine Learning: Students should be familiar with basic Deep Neural Networks and stochastic gradient descent via backpropagation. CSCI 470 or above recommended.
  • Software: Students should be very comfortable writing software in Python. Familiarity with setting up virtual environments, dependency management, and version control via git is recommended. Experience with PyTorch or another deep learning framework is highly recommended (a short sketch of the expected level follows this list).
  • Research Skills: Students should be comfortable finding and reading relevant papers. How you read papers and whether you take notes is up to you, but you should be able to understand novel material from a paper in depth and explain it to others.
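
For calibration, here is a minimal sketch (not course material; the network, data, and hyperparameters are illustrative) of the level of PyTorch fluency assumed: defining a small network and taking one stochastic gradient descent step via backpropagation.

```python
# Minimal sketch of assumed PyTorch fluency: build a tiny network,
# run a forward pass on a toy batch, and take one SGD step.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.SGD(model.parameters(), lr=1e-2)

x, y = torch.randn(64, 10), torch.randn(64, 1)   # toy batch
loss = nn.functional.mse_loss(model(x), y)       # forward pass
opt.zero_grad()
loss.backward()                                  # backpropagation
opt.step()                                       # one SGD update
print(loss.item())
```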

Course Materials

This field moves too quickly for there to be an up-to-date textbook on interpretability and alignment for transformers. Some useful introductory materials, which we will cover in part, are provided below. Reading, or at least skimming, some of these before the start of the course is recommended -- they are listed in rough order of priority, but feel free to skip around. We will also read a wide variety of papers throughout the course, and you will be expected to find interesting and useful ones.

Evaluation

(Subject to change)

  • Paper presentations (30%): Students will select and present relevant papers throughout the semester. Presentations should be ~30 minutes long, with a further 15 minutes allotted for questions, and students should participate in paper discussions. Papers should be selected with the aim of giving background for the final projects.
  • Mini project (10%): Working individually, students will complete a tutorial on using transformers and write a short report on their findings. Further details will be provided.
  • Final project (50%): Working in groups, students will select a topic related to the course material and write a 10-15 page report on their findings. Example topics will be provided, but topic selection is flexible as long as it relates to alignment or interpretability for ML systems.
  • Class participation (10%): Students will be expected to attend lectures, participate in discussions, and ask questions. Allowances will be made for absences.

Tentative Course Outline

  • Background
    • Neural Networks
    • Optimization theory
    • Architectures
    • Language Modeling
  • Attention Heads & the Transformer
    • attention heads (a minimal code sketch follows this outline)
    • positional encodings, causal attention
    • Transformers
    • Lab: using transformers
  • Interpretability & Alignment
    • the AI Alignment problem
    • AI safety, ethics, and policy
    • Intro to interpretability
    • Interpretability papers
  • Student Presentations
    • paper presentations
    • final project presentations
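
As a preview of the "Attention Heads & the Transformer" unit, here is a minimal sketch (not course-provided code; shapes and names are illustrative) of a single causal self-attention head: inputs are projected to queries, keys, and values, the scaled dot products softmax(QK^T / sqrt(d_head)) are masked so each position attends only to itself and earlier positions, and the resulting weights average the values.

```python
# Minimal sketch of one causal self-attention head (illustrative shapes/names).
import torch
import torch.nn.functional as F

def causal_attention_head(x, W_q, W_k, W_v):
    """x: (seq_len, d_model); W_q, W_k, W_v: (d_model, d_head)."""
    q, k, v = x @ W_q, x @ W_k, x @ W_v               # project to queries/keys/values
    scores = q @ k.T / k.shape[-1] ** 0.5             # scaled dot products
    mask = torch.triu(torch.ones_like(scores), diagonal=1).bool()
    scores = scores.masked_fill(mask, float("-inf"))  # causal: no attending ahead
    return F.softmax(scores, dim=-1) @ v              # attention-weighted sum of values

seq_len, d_model, d_head = 8, 16, 4
x = torch.randn(seq_len, d_model)
out = causal_attention_head(
    x,
    torch.randn(d_model, d_head),
    torch.randn(d_model, d_head),
    torch.randn(d_model, d_head),
)
print(out.shape)  # torch.Size([8, 4])
```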
