AI Backbone — Complete Study Guide
10 articles covering the fundamental concepts of modern AI
Reading Order
Article Index
| # | Topic | Core Formula | Key Concept |
|---|-------|--------------|-------------|
| 01 | Forward Pass | $\mathbf{z} = \mathbf{W}\mathbf{x} + \mathbf{b}$ | Data flows left → right through layers |
| 02 | Loss Functions | $L = (y - \hat{y})^2$ | Measuring how wrong predictions are |
| 03 | Backpropagation | $\frac{\partial L}{\partial w} = \frac{\partial L}{\partial a}\cdot\frac{\partial a}{\partial z}\cdot\frac{\partial z}{\partial w}$ | Chain rule distributes blame |
| 04 | Gradient Descent | $w \leftarrow w - \alpha\nabla L$ | Walk downhill on the loss landscape |
| 05 | Activation Functions | $\text{ReLU}(z) = \max(0,z)$ | Non-linearity enables complex learning |
| 06 | Embeddings | $\mathbf{e}_i = \mathbf{E}[i]$ | Discrete tokens → dense meaning vectors |
| 07 | Attention & Transformers | $\text{softmax}(\mathbf{QK}^T/\sqrt{d_k})\mathbf{V}$ | Every token attends to every token |
| 08 | RLHF | $r - \beta D_{KL}(\pi_\theta \| \pi_{SFT})$ | Align model with human values |
| 09 | Regularization | $L + \lambda\sum w^2$ | Generalize, don't memorize |
| 10 | Tokenization | $1\text{ token} \approx 0.75\text{ words}$ | Text → numbers the model can process |
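As a quick taste of two formulas from the index (articles 01 and 06), here is a minimal NumPy sketch; all sizes and the token id are made-up illustrative values:

```python
import numpy as np

rng = np.random.default_rng(0)

# Article 06: embedding lookup e_i = E[i] turns a discrete token id
# into a dense vector (a row of the embedding matrix E).
vocab_size, d_model = 1000, 16          # hypothetical sizes
E = rng.normal(size=(vocab_size, d_model))
token_id = 42
x = E[token_id]                          # dense vector for token 42

# Article 01: forward pass z = W x + b through one linear layer.
W = rng.normal(size=(4, d_model))
b = np.zeros(4)
z = W @ x + b
assert z.shape == (4,)
```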
Quick-Glance Cheat Sheet
The Training Loop
$$\underbrace{x \to \hat{y}}_{\text{forward pass}} \;\to\; \underbrace{L(\hat{y}, y)}_{\text{loss}} \;\to\; \underbrace{\nabla_W L}_{\text{backprop}} \;\to\; \underbrace{W \leftarrow W - \alpha\nabla_W L}_{\text{gradient descent}}$$
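The four-step loop above can be sketched end to end with toy linear regression; the data, learning rate, and iteration count are illustrative choices, not prescriptions:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))            # toy inputs
true_w = np.array([1.5, -2.0, 0.5])      # weights we hope to recover
y = X @ true_w                           # targets

w = np.zeros(3)
alpha = 0.1                              # learning rate

for _ in range(200):
    y_hat = X @ w                                   # forward pass: x -> y_hat
    loss = np.mean((y_hat - y) ** 2)                # loss: L(y_hat, y)
    grad = 2 * X.T @ (y_hat - y) / len(y)           # backprop: grad_w L
    w -= alpha * grad                               # descent: w <- w - alpha * grad

# after training, w closely approximates true_w
```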
Key Activations
| Function | Formula | Use |
|----------|---------|-----|
| ReLU | $\max(0,z)$ | Hidden layers |
| Sigmoid | $\frac{1}{1+e^{-z}}$ | Binary output |
| Softmax | $\frac{e^{z_i}}{\sum_j e^{z_j}}$ | Multi-class output |
| GELU | $z\cdot\Phi(z)$ | Transformers |
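These four activations are a few lines each in NumPy. Note that GELU is shown here via the common tanh approximation of $z\cdot\Phi(z)$ rather than the exact normal CDF:

```python
import numpy as np

def relu(z):
    # max(0, z), elementwise
    return np.maximum(0.0, z)

def sigmoid(z):
    # 1 / (1 + e^{-z}), squashes to (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # e^{z_i} / sum_j e^{z_j}; subtract the max for numerical stability
    e = np.exp(z - np.max(z))
    return e / e.sum()

def gelu(z):
    # tanh approximation of z * Phi(z), widely used in Transformer code
    return 0.5 * z * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (z + 0.044715 * z**3)))
```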
Attention Formula
$$\text{Attention}(Q,K,V) = \text{softmax}\!\left(\frac{QK^T}{\sqrt{d_k}}\right)V$$
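Scaled dot-product attention is a direct transcription of this formula. The sketch below is single-head and unbatched, with made-up matrix sizes:

```python
import numpy as np

def softmax(x, axis=-1):
    # subtract the row max for numerical stability
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)       # (n_q, n_k) similarity scores
    weights = softmax(scores, axis=-1)    # each query's weights sum to 1
    return weights @ V                    # weighted mix of value vectors

rng = np.random.default_rng(0)
n, d_k = 4, 8                             # hypothetical sequence length and head dim
Q, K, V = (rng.normal(size=(n, d_k)) for _ in range(3))
out = attention(Q, K, V)
assert out.shape == (n, d_k)
```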
Regularization at a Glance
| Method | Prevents | How |
|--------|----------|-----|
| L2 | Large weights | $+\lambda\sum w^2$ added to loss |
| Dropout | Co-dependency | Random zeroing of activations |
| Early Stopping | Over-training | Monitor validation loss |
| Batch Norm | Covariate shift | Normalize activations |
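The first two rows can be sketched in a few lines. The dropout version here is "inverted" dropout (survivors scaled by $1/(1-p)$ so the expected activation is unchanged); the default `lam` and `p` are illustrative values:

```python
import numpy as np

def l2_penalty(weights, lam=1e-3):
    # lambda * sum(w^2), added to the loss during training
    return lam * sum(np.sum(w ** 2) for w in weights)

def dropout(a, p=0.5, training=True, rng=None):
    # Randomly zero activations with probability p; identity at eval time.
    if not training:
        return a
    rng = rng or np.random.default_rng()
    mask = rng.random(a.shape) >= p
    return a * mask / (1.0 - p)   # rescale survivors by 1/(1-p)
```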
Generated from a full conversational deep-dive into AI foundations.
Each article contains: intuition, real-life analogy, math derivations, examples, and diagrams.