The ToyAI repo contains toy implementations of ML and AI concepts like gradient descent and
- ToyGrad: A toy implementation of automatic differentiation, inspired by A. Karpathy's
- ToyGPT: A toy implementation of the GPT-2 transformer architecture in PyTorch, based on Karpathy's tutorial:
For more on Autograd's basic concepts, architecture, and implementation, see the basic concept of automatic differentiation.
See the demo Jupyter notebook for how to use ToyGrad.
class Model(Module):
    def __init__(self, input_features, output_features):
        self.layer1 = Linear(input_features, 8)
        self.layer2 = Linear(8, 4)
        self.output = Linear(4, output_features)

    def forward(self, X):
        X = self.layer1(X)
        X = [xi.relu() for xi in X]
        X = self.layer2(X)
        X = [xi.relu() for xi in X]
        output = self.output(X)
        return output
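The per-element `relu()` calls above suggest that each activation is a scalar node carrying its own gradient rule. As a rough illustration of that idea, here is a minimal sketch of reverse-mode automatic differentiation on scalars. The `Value` class and its methods are hypothetical names chosen for this example and are not ToyGrad's actual API.

```python
class Value:
    """A scalar node that remembers how it was computed (hypothetical, not ToyGrad's API)."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaf nodes have nothing to propagate

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def relu(self):
        out = Value(max(0.0, self.data), (self,))

        def _backward():
            # The gradient passes through only where the input was positive
            self.grad += (1.0 if self.data > 0 else 0.0) * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Order the graph so every node is processed after all nodes that consume it
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()


x, w, b = Value(2.0), Value(-3.0), Value(10.0)
y = (x * w + b).relu()  # relu(2 * -3 + 10) = relu(4) = 4
y.backward()
print(y.data, x.grad, w.grad, b.grad)  # 4.0 -3.0 2.0 1.0
```

Gradients are accumulated with `+=` so that a node used in several places receives the sum of all contributions, which is the chain rule for a graph rather than a simple chain.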
ToyGPT implements the GPT-2 architecture, mostly following Karpathy's step-by-step tutorial on re-implementing GPT-2 from
It's a great way to learn the inner workings of the transformer architecture, from tokenization, embeddings, and the attention mechanism to generation concepts like sampling.
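To make the attention mechanism mentioned above a little more concrete, here is a minimal single-head sketch of causal scaled dot-product attention. It is written in plain Python rather than PyTorch to stay dependency-free, and the function names are made up for this example. It leaves out what a real GPT-2 block adds on top: learned query/key/value projections, multiple heads, and an output projection.

```python
import math


def softmax(scores):
    # Subtract the max for numerical stability before exponentiating
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(q, k, v):
    """Single-head attention; q, k, v are lists of vectors, one per position."""
    d = len(q[0])
    out = []
    for i, qi in enumerate(q):
        # Position i may only attend to positions 0..i (the causal mask)
        scores = [
            sum(a * b for a, b in zip(qi, k[j])) / math.sqrt(d)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        # The output is the attention-weighted average of the visible value vectors
        out.append([
            sum(w * v[j][c] for j, w in enumerate(weights))
            for c in range(len(v[0]))
        ])
    return out


q = k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
v = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
result = causal_attention(q, k, v)
print(result[0])  # the first position can only see itself, so this is v[0]
```

Because of the mask, the output at each position depends only on earlier positions, which is what lets the model generate text one token at a time.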
# initializing
def __init__(self, ...):
    self.transformer = nn.ModuleDict(
        # embedding layer, translates input ids to embedding representation
        wte=nn.Embedding(config.vocab_size, config.n_embed),
        # positional encoding layer
        wpe=nn.Embedding(config.block_size, config.n_embed),
        # multi-head attention layers
        h=nn.ModuleList([Block(config) for _ in range(config.n_layers)]),
        # layer normalization
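The `wte` and `wpe` tables in the excerpt above are both plain lookups: in GPT-2, the input to the first block is the token's vector plus the vector for its position. The following is a small sketch of that step in plain Python, with tiny made-up sizes and random values standing in for learned weights. The `embed` helper is hypothetical and not part of ToyGPT.

```python
import random

random.seed(0)
vocab_size, block_size, n_embed = 10, 4, 3  # tiny illustrative sizes

# Lookup tables standing in for the learned wte and wpe weights
wte = [[random.gauss(0, 0.02) for _ in range(n_embed)] for _ in range(vocab_size)]
wpe = [[random.gauss(0, 0.02) for _ in range(n_embed)] for _ in range(block_size)]


def embed(token_ids):
    if len(token_ids) > block_size:
        raise ValueError("sequence is longer than block_size")
    # Each position gets its token's vector plus the vector for that position
    return [
        [t + p for t, p in zip(wte[tok], wpe[pos])]
        for pos, tok in enumerate(token_ids)
    ]


x = embed([7, 2, 7])
print(len(x), len(x[0]))  # 3 3
```

Token 7 appears twice in the example, yet its two rows differ because the positional term differs. That is how the model can tell apart identical tokens at different positions.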
It currently supports inference on CPU.
Example generation experiments:
import sys

from modeling_gpt import GPT

model = GPT.from_pretrained('gpt2')

def gen(prompt, **kwargs):
    for t in model.generate(prompt=prompt, **kwargs):
        # Assumes generate() yields decoded text pieces; stream them to stdout
        sys.stdout.write(t)
        sys.stdout.flush()

prompt = '''Hello world how are'''
gen(prompt, temperature=1.0, sampling=True)
you doing? How are you doing? You don't have a job?
It's the most wonderful thing in the world. We're living in this world, you're so wonderful, I can't believe you're here. I'm so happy. It's the happiest day of my life
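The `temperature` and `sampling` arguments in the call above control how the next token is chosen from the model's output scores. Here is a minimal sketch of that choice in plain Python. The `sample_next` helper is hypothetical: its parameter names mirror the call above, but its behavior is an assumption and not necessarily how ToyGPT's `generate` works.

```python
import math
import random


def sample_next(logits, temperature=1.0, sampling=True, rng=random):
    """Pick a token index from raw scores (illustrative; behavior is assumed)."""
    if not sampling:
        # Greedy decoding: always take the highest-scoring token
        return max(range(len(logits)), key=lambda i: logits[i])
    if temperature <= 0:
        raise ValueError("temperature must be positive when sampling")
    # Dividing by the temperature sharpens (<1) or flattens (>1) the distribution
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]
    # random.choices normalizes the weights, so no explicit softmax division is needed
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


logits = [2.0, 1.0, 0.1]
greedy = sample_next(logits, sampling=False)
rng = random.Random(0)
draws = [sample_next(logits, temperature=0.7, rng=rng) for _ in range(1000)]
print(greedy, [draws.count(i) for i in range(3)])
```

Lower temperatures concentrate the draws on the top-scoring token, while higher temperatures spread them out. This is why sampled generations like the one above vary from run to run.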