AI Math Roadmap: From Zero to Building GPT From Scratch
Phase 1: Foundations (Start Here)
1. Basic Math Refresher (if needed)
- Arithmetic, exponents, logarithms
- Algebra: solving equations, factorization
- Functions and graphs
- Inequalities, absolute value
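If the refresher is needed, a few lines of Python make the log rules concrete; they come back constantly later in log-likelihoods and cross-entropy. A minimal sanity check (all standard identities, no assumptions beyond the stdlib):

```python
import math

# Log rules that show up constantly in ML (log-likelihoods, log-softmax):
a, b = 3.0, 7.0
assert math.isclose(math.log(a * b), math.log(a) + math.log(b))  # log(ab) = log a + log b
assert math.isclose(math.log(a ** b), b * math.log(a))           # log(a^b) = b log a
assert math.isclose(math.log(a / b), math.log(a) - math.log(b))  # log(a/b) = log a - log b

# Change of base: log2(x) = ln(x) / ln(2)
x = 10.0
assert math.isclose(math.log2(x), math.log(x) / math.log(2))
```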
2. Linear Algebra (Core for All AI Models)
- Vectors and vector spaces
- Dot product, cross product
- Matrices: multiplication, inverse, transpose
- Matrix rank, determinant, trace
- Systems of linear equations (Gaussian elimination)
- Eigenvalues and eigenvectors
- Diagonalization and Singular Value Decomposition (SVD)
- Norms, projections, orthogonality
Resources:
- 3Blue1Brown's Essence of Linear Algebra
- MIT OCW - Linear Algebra (Gilbert Strang)
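To make these abstractions tangible before moving on, here is a minimal NumPy sketch (matrix sizes and values chosen arbitrarily) connecting SVD, eigenvalues, norms, and low-rank approximation:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 3))

# SVD: A = U @ diag(S) @ Vt, with orthonormal U, Vt and non-negative S.
U, S, Vt = np.linalg.svd(A, full_matrices=False)
assert np.allclose(A, U @ np.diag(S) @ Vt)

# Keeping only the top singular value gives the best rank-1 approximation
# in the Frobenius norm (Eckart-Young); this is the idea behind low-rank
# compression of weight matrices.
A1 = S[0] * np.outer(U[:, 0], Vt[0, :])
print("rank-1 approx error:", np.linalg.norm(A - A1))

# Eigendecomposition of the symmetric matrix A^T A recovers S**2.
evals = np.linalg.eigvalsh(A.T @ A)
assert np.allclose(np.sort(evals)[::-1], S**2)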
3. Calculus (Backprop Depends on It)
- Limits and continuity
- Derivatives (single-variable, then partial)
- Chain rule
- Gradients, Jacobians, Hessians
- Optimization: critical points, max/min
- Multivariable integrals
- Taylor series
Resources:
- MIT OCW Calculus I–III
- Khan Academy
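Backprop is the chain rule applied mechanically, so it pays to get in the habit of checking a hand-derived gradient against finite differences. A minimal sketch, using an arbitrary two-variable function:

```python
import numpy as np

def f(x):
    # f(x) = sin(x0) * x1^2 ; hand-derived gradient below
    return np.sin(x[0]) * x[1] ** 2

def grad_f(x):
    # [df/dx0, df/dx1] = [cos(x0)*x1^2, 2*sin(x0)*x1]
    return np.array([np.cos(x[0]) * x[1] ** 2, 2 * np.sin(x[0]) * x[1]])

def numerical_grad(f, x, eps=1e-6):
    # Central differences: (f(x + eps*e_i) - f(x - eps*e_i)) / (2*eps)
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return g

x = np.array([0.7, -1.3])
assert np.allclose(grad_f(x), numerical_grad(f, x), atol=1e-5)
```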
Phase 2: Probabilistic Thinking
4. Probability & Statistics
- Probability rules: union, intersection, conditional
- Random variables (discrete and continuous)
- Expectation, variance, covariance
- Distributions: Bernoulli, Binomial, Gaussian, Poisson
- Bayes' Theorem
- Law of Large Numbers, Central Limit Theorem
- Entropy, cross-entropy, KL divergence
- MLE, MAP estimation
- Markov chains
Resources:
- StatQuest (YouTube)
- Khan Academy - Probability & Stats
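A quick numerical sketch of Bayes' Theorem, maximum likelihood, and the Law of Large Numbers; the test probabilities below are made up for illustration:

```python
import numpy as np

# Bayes' theorem: P(disease | positive test).
p_d = 0.01          # prior P(disease)
p_pos_d = 0.95      # sensitivity P(+ | disease)
p_pos_nd = 0.05     # false positive rate P(+ | no disease)
p_pos = p_pos_d * p_d + p_pos_nd * (1 - p_d)      # law of total probability
print("P(disease | +):", p_pos_d * p_d / p_pos)   # ~0.161, not 0.95

# MLE for a Gaussian: sample mean and (biased) sample variance maximize
# the log-likelihood. np.var defaults to ddof=0, which *is* the MLE.
rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.5, size=100_000)
print("MLE mean:", data.mean(), " MLE var:", data.var())  # ~2.0, ~2.25

# Law of Large Numbers: sample averages concentrate as n grows.
print([rng.normal(2.0, 1.5, n).mean() for n in (10, 1_000, 100_000)])
```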
Phase 3: Learning to Train
5. Optimization
- Gradient descent: SGD, Adam, RMSProp
- Loss functions and minimization
- Convex vs non-convex functions
- Saddle points, local/global minima
- Momentum, learning rate schedules
- Lagrange multipliers
- Lipschitz continuity
Resources:
- Convex Optimization by Boyd & Vandenberghe
- Deep Learning (Goodfellow, Bengio & Courville) - Chapter 4
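As a small illustration (not a full optimizer library), here is vanilla gradient descent vs. heavy-ball momentum on a deliberately ill-conditioned quadratic; the learning rate and momentum values are arbitrary but stable choices:

```python
import numpy as np

# Minimize f(x) = 0.5 * x^T A x, gradient = A x.
A = np.diag([1.0, 50.0])          # condition number 50: a narrow valley
grad = lambda x: A @ x

def descend(lr, beta, steps=200):
    x, v = np.array([1.0, 1.0]), np.zeros(2)
    for _ in range(steps):
        v = beta * v - lr * grad(x)   # beta = 0 recovers plain GD
        x = x + v
    return 0.5 * x @ A @ x            # final loss

print("vanilla GD:", descend(lr=0.02, beta=0.0))
print("momentum  :", descend(lr=0.02, beta=0.9))  # much smaller loss
```

Momentum wins here because the slow direction (eigenvalue 1) would otherwise shrink by only a factor of 0.98 per step at this learning rate.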
6. Numerical Methods
- Floating point precision
- Numerical differentiation
- Gradient checking
- LU, QR decomposition
- Efficient matrix operations
Resources:
- Numerical Recipes (book)
- MIT OCW Numerical Methods
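A short sketch of why these topics matter in practice: floating-point surprises, the forward- vs. central-difference trade-off behind gradient checking, and solving a linear system the numerically sane way:

```python
import numpy as np

# Floating point is not real arithmetic: machine epsilon and absorption.
print(np.finfo(np.float32).eps)     # ~1.19e-07
print(0.1 + 0.2 == 0.3)             # False
print((1e16 + 1.0) - 1e16)          # 0.0: the 1.0 is absorbed entirely

# Truncation vs. rounding error: central differences are more accurate,
# but both break down when eps gets too small.
f, df = np.sin, np.cos
x = 1.0
for eps in (1e-2, 1e-6, 1e-12):
    fwd = (f(x + eps) - f(x)) / eps
    ctr = (f(x + eps) - f(x - eps)) / (2 * eps)
    print(f"eps={eps:.0e}  fwd err={abs(fwd - df(x)):.2e}  ctr err={abs(ctr - df(x)):.2e}")

# Solving Ax = b: np.linalg.solve factors A (LU with pivoting) rather than
# forming the explicit inverse, which is both faster and more accurate.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x_sol = np.linalg.solve(A, b)
assert np.allclose(A @ x_sol, b)
```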
Phase 4: Building Intelligence
7. Information Theory
- Entropy
- Joint, marginal, conditional entropy
- Cross-entropy loss
- KL divergence
- Mutual information
- Perplexity
Resources:
- Elements of Information Theory - Cover & Thomas
- StatQuest - Info Theory
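All of these quantities fit in a few lines of NumPy. A sketch with a hand-picked pair of distributions (probabilities are powers of two so the bit counts come out exact):

```python
import numpy as np

def entropy(p):
    p = np.asarray(p)
    return -(p * np.log2(p)).sum()       # in bits; assumes p > 0 everywhere

p = np.array([0.5, 0.25, 0.125, 0.125])  # "true" distribution
q = np.array([0.25, 0.25, 0.25, 0.25])   # model distribution

H = entropy(p)                            # H(p) = 1.75 bits
CE = -(p * np.log2(q)).sum()              # cross-entropy H(p, q) = 2 bits
KL = (p * np.log2(p / q)).sum()           # D_KL(p || q) = CE - H = 0.25 bits
assert np.isclose(CE, H + KL)

# Perplexity is 2 ** cross-entropy (in bits): the effective number of
# equally likely choices a language model is "deciding" between.
print("perplexity:", 2 ** CE)             # 4.0: q is no better than uniform
```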
8. Graph Theory
- Directed Acyclic Graphs (DAGs)
- Topological sorting
- Adjacency matrices
- Attention graphs
Resources:
- MIT 6.006 - Intro to Algorithms
- Khan Academy
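Topological order is what an autodiff engine uses to schedule the backward pass over its computation DAG. A minimal sketch of Kahn's algorithm on a toy graph (node names are made up for illustration):

```python
from collections import deque

def topo_sort(nodes, edges):
    """Kahn's algorithm: repeatedly peel off nodes with no remaining parents.
    Autodiff engines do this on the computation DAG, then run the backward
    pass in reverse topological order."""
    indeg = {n: 0 for n in nodes}
    adj = {n: [] for n in nodes}
    for u, v in edges:               # edge u -> v: u must come before v
        adj[u].append(v)
        indeg[v] += 1
    queue = deque(n for n in nodes if indeg[n] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if len(order) != len(nodes):
        raise ValueError("graph has a cycle")
    return order

# y = (a * b) + a : 'a' feeds two downstream nodes, as in weight sharing.
print(topo_sort(["a", "b", "mul", "add"],
                [("a", "mul"), ("b", "mul"), ("mul", "add"), ("a", "add")]))
```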
9. Combinatorics & Discrete Math
- Set theory, logic
- Permutations and combinations
- Recursion and recurrence relations
- Trees and graphs
- Symbolic representations
Resources:
- Discrete Mathematics by Rosen
- MIT OCW - Mathematics for CS
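Recurrence relations are best internalized by writing one. A minimal sketch using Pascal's rule for binomial coefficients, memoized with the standard library:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def C(n, k):
    """Binomial coefficient via Pascal's recurrence:
    C(n, k) = C(n-1, k-1) + C(n-1, k), with C(n, 0) = C(n, n) = 1."""
    if k == 0 or k == n:
        return 1
    return C(n - 1, k - 1) + C(n - 1, k)

print(C(10, 3))                            # 120 ways to choose 3 of 10
print(sum(C(10, k) for k in range(11)))    # 2**10 = 1024: the power set
```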
Phase 5: Advanced & Specialized Topics
10. Fourier Analysis & Signal Processing
- Fourier series, transforms
- Discrete Fourier Transform (DFT, FFT)
- Convolutions (1D, 2D)
- Frequency domain filtering
Resources:
- The Scientist & Engineer's Guide to DSP
- 3Blue1Brown - Fourier Intuition
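The convolution theorem is the punchline of this section: circular convolution in the time domain equals pointwise multiplication of DFTs, which is why FFT-based convolution runs in O(n log n). A sketch verifying it numerically (signal length and contents are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=64)    # signal
h = rng.normal(size=64)    # filter (same length: circular convolution)

# Direct circular convolution: (x * h)[n] = sum_m x[m] * h[(n - m) mod N]
direct = np.array([sum(x[m] * h[(n - m) % 64] for m in range(64))
                   for n in range(64)])

# Convolution theorem: DFT, multiply pointwise, inverse DFT.
via_fft = np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)).real
assert np.allclose(direct, via_fft)
```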
11. Topology & Manifolds (Advanced)
- Metric spaces, continuity
- Manifolds and embeddings
- Lipschitz mappings
- Generalization theory
Resources:
- Topology by Munkres
- Geometric Deep Learning literature
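One concrete bridge from this material to deep learning: for a linear layer x -> Wx, the Lipschitz constant in the 2-norm is the largest singular value of W, which is why spectral norms show up in generalization and robustness bounds. A quick empirical sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(5, 5))

# The tightest Lipschitz constant of x -> Wx in the 2-norm is the
# spectral norm of W, i.e. its largest singular value.
L = np.linalg.svd(W, compute_uv=False)[0]

for _ in range(1000):
    x, y = rng.normal(size=5), rng.normal(size=5)
    assert np.linalg.norm(W @ x - W @ y) <= L * np.linalg.norm(x - y) + 1e-9

print("Lipschitz constant (spectral norm):", L)
```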
12. Category Theory (God Mode)
- Sets, categories, morphisms
- Functors, monoids, natural transformations
- Compositionality of learning systems
Resources:
- Category Theory for Programmers - Bartosz Milewski
- nLab (for deep research)
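The functor laws can even be checked in plain Python using the "list functor" (mapping a function over a list). A toy sketch, with `fmap` and `compose` defined here purely for illustration:

```python
# Functor laws: mapping preserves identity and composition.
compose = lambda f, g: (lambda x: f(g(x)))
fmap = lambda f: (lambda xs: [f(x) for x in xs])

f = lambda x: x + 1
g = lambda x: x * 2
xs = [1, 2, 3]

assert fmap(lambda x: x)(xs) == xs                                # fmap(id) == id
assert fmap(compose(f, g))(xs) == compose(fmap(f), fmap(g))(xs)   # fmap(f.g) == fmap(f).fmap(g)
```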
Final Phase: Put It All Together
You'll be able to:
- Build a neural net from scratch (forward + backprop)
- Write your own autodiff engine
- Train it with your own optimizer
- Implement transformers and attention
- Build a GPT-style model from raw math and code
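As a taste of the destination, here is a minimal scalar autodiff sketch in the spirit of micrograd. It is a toy, not a full engine, but it contains the essential idea: a DAG of values plus the chain rule applied in reverse topological order.

```python
import math

class Value:
    """A minimal scalar autodiff node: each Value records its parents and
    a closure that pushes gradients back to them (the chain rule)."""
    def __init__(self, data, parents=()):
        self.data, self.grad = data, 0.0
        self._parents = parents
        self._backward = lambda: None

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def backward():
            self.grad += out.grad        # d(a+b)/da = 1
            other.grad += out.grad       # d(a+b)/db = 1
        out._backward = backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def backward():
            self.grad += other.data * out.grad   # product rule
            other.grad += self.data * out.grad
        out._backward = backward
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))
        def backward():
            self.grad += (1 - t * t) * out.grad  # d/dx tanh = 1 - tanh^2
        out._backward = backward
        return out

    def backprop(self):
        # Depth-first topological order, then chain rule in reverse.
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            v._backward()

# One neuron: y = tanh(w*x + b). Gradients match the hand-derived chain rule:
# dy/dw = (1 - tanh^2(wx+b)) * x = 0.5 at these values.
x, w, b = Value(0.5), Value(-2.0), Value(1.0)
y = (w * x + b).tanh()
y.backprop()
print(y.data, w.grad, x.grad, b.grad)   # 0.0, 0.5, -2.0, 1.0
```

Every topic above feeds into this one file: linear algebra for the tensor version, calculus for the local derivatives, graph theory for the topological order, and optimization for what you do with the gradients once you have them.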