Table of Contents
Concepts
Architecture
Normalization & Stability
Efficiency & Pruning
Scaling
Concepts
Definitions and explanations of machine learning terms, organized as an interconnected graph.
Architecture
LLM
Softmax Attention
Scaled Dot-Product Attention
Multi-Head Attention
Residual Connections
Gradient Highway
Mixture of Experts
Expert Routing
Linear Attention
RNN
Normalization & Stability
Layer Normalization
PreNorm
PostNorm
Hidden-State Growth
Vanishing Gradients
Efficiency & Pruning
Model Pruning
Layer Pruning
Block AttnRes
Pipeline Communication
Scaling
Scaling Laws
Chinchilla Scaling
Neural Scaling
Kimi Linear