The Pipeline

Here is the path from raw text to a model that generates new sentences:

training data  ->  tokenizer  ->  model  ->  training  ->  saved weights
                                                               |
                   tokenizer  ->  model  ->  generation  <-  load weights

We build each piece from scratch, in order. Each chapter introduces one component and ends with a Complete Code page containing the finished source for that stage.
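To make the two rows of the diagram concrete before the real components exist, here is a minimal sketch of the same data flow using toy stand-ins. Every name in it (`buildVocab`, `encode`, `decode`, `train`, `saveWeights`, `loadWeights`, `generate`) is hypothetical and chosen for illustration; it is not the API built in later chapters. The "model" is only a table of bigram counts, assumed here as a placeholder for the GPT, and "saving" produces a JSON string rather than a file on disk.

```typescript
// Toy stand-ins for each pipeline stage. All names are hypothetical.
// weights[a][b] = how often token b followed token a in the training data.
type Weights = number[][];

// Tokenizer: assign each distinct word an integer id (its index in vocab).
function buildVocab(sentences: string[]): string[] {
  return Array.from(new Set(sentences.join(" ").split(" ")));
}
function encode(vocab: string[], text: string): number[] {
  return text.split(" ").map((word) => vocab.indexOf(word));
}
function decode(vocab: string[], ids: number[]): string {
  return ids.map((id) => vocab[id]).join(" ");
}

// "Training": count bigrams. This stands in for gradient-based training.
function train(vocabSize: number, data: number[][]): Weights {
  const w: Weights = Array.from({ length: vocabSize }, () =>
    new Array<number>(vocabSize).fill(0)
  );
  for (const ids of data) {
    for (let i = 0; i + 1 < ids.length; i++) {
      w[ids[i]][ids[i + 1]] += 1;
    }
  }
  return w;
}

// Saving and loading: round-trip through a JSON string instead of a file.
const saveWeights = (w: Weights): string => JSON.stringify(w);
const loadWeights = (s: string): Weights => JSON.parse(s) as Weights;

// Generation: greedily follow the most frequent next token.
function generate(w: Weights, start: number, maxTokens: number): number[] {
  const out = [start];
  while (out.length < maxTokens) {
    const row = w[out[out.length - 1]];
    const best = row.indexOf(Math.max(...row));
    if (row[best] === 0) break; // no known continuation for this token
    out.push(best);
  }
  return out;
}

// Top row: training data -> tokenizer -> model -> training -> saved weights
const corpus = ["the cat sat", "the cat ran home"];
const vocab = buildVocab(corpus);
const trained = train(vocab.length, corpus.map((s) => encode(vocab, s)));
const saved = saveWeights(trained);

// Bottom row: tokenizer -> model -> generation <- load weights
const startId = encode(vocab, "the")[0];
const sentence = decode(vocab, generate(loadWeights(saved), startId, 4));
console.log(sentence); // prints "the cat sat"
```

The point of the sketch is the shape of the flow: training and generation share the tokenizer and the model definition, and the saved weights are the only thing that passes from the first row to the second.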

What You Will Build

Chapter	You will create	Pipeline stage
The Tokenizer	`tokenizer.ts`	Turns words into numbers and back
The Autograd Engine	`autograd.ts`	Automatic differentiation (makes training possible)
Neural Network Primitives	`nn.ts`	Linear layers, softmax, normalization
The Model	`model.ts`, `rng.ts`	The GPT architecture: config, weights, forward pass
Training	`train.ts`	The training loop and optimizer
Saving the Model	`saveModel`, `loadModel`	Serialize trained weights to disk and load them back
Generation	`generate.ts`	Inference: turning a trained model into sentences
Smoke Test	`phrases-train.ts`, `phrases-generate.ts`	Entry points to train and generate
Fine-Tuning	`phrases-fine-tune.ts`	Adapt a trained model to new data
