Week 26 — Coding Projects

Core

Implement deterministic gradient descent (GD) and stochastic gradient descent (SGD).

  • NumPy: Fit linear regression with full-batch GD and mini-batch SGD. Compare the effect of batch size on convergence. Track loss over iterations.
  • Metal: Compute batch gradients on the GPU; keep the optimizer update on the CPU at first. Use a multi-pass reduction to sum per-sample gradients. · Reading: MBT — batched compute, reductions, multi-pass update structure.
  • Vulkan: Implement compute passes that evaluate the objective and its gradient. · Reading: Vulkan Book — compute passes for objective + gradient evaluation.
  • CUDA: Write a mini-batch SGD kernel that reduces per-thread partial gradients. · Reading: CUDA Book — mini-batch compute, reduction of partial gradients.
  • Stretch: Add momentum or Adam. Compare convergence traces.
  • Verify: SGD is noisier but often cheaper per step · The learning rate matters dramatically · The GPU batch gradient matches the CPU reference.
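
The NumPy exercise above can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the synthetic data, function names, and hyperparameters (learning rate 0.1, 100 epochs, batch size 32) are my own choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic linear-regression data: y = X @ w_true + small noise.
n, d = 512, 4
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.01 * rng.normal(size=n)

def mse_loss(w):
    r = X @ w - y
    return 0.5 * np.mean(r * r)

def grad(w, idx):
    # MSE gradient over the rows in idx (full batch when idx covers all rows).
    Xi, yi = X[idx], y[idx]
    return Xi.T @ (Xi @ w - yi) / len(idx)

def fit(batch_size, lr=0.1, epochs=100):
    w = np.zeros(d)
    losses = []
    for _ in range(epochs):
        perm = rng.permutation(n)  # reshuffle each epoch
        for start in range(0, n, batch_size):
            w -= lr * grad(w, perm[start:start + batch_size])
        losses.append(mse_loss(w))
    return w, losses

w_gd, loss_gd = fit(batch_size=n)    # deterministic full-batch GD
w_sgd, loss_sgd = fit(batch_size=32) # mini-batch SGD
```

Comparing `loss_gd` against `loss_sgd` shows the trade-off named in the Verify bullet: the SGD trace is noisier, but each of its steps touches only a fraction of the data. The same loss traces double as the CPU reference when checking the Metal, Vulkan, and CUDA batch gradients.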
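
For the stretch goal, one Adam update step can be sketched like this. Hyperparameter defaults follow the commonly used convention; `adam_step` and the quadratic smoke test are illustrative assumptions, not from the source.

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    # One Adam update: exponential moving averages of the gradient (m)
    # and its elementwise square (v), with bias correction for step t >= 1.
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Smoke test: minimize f(w) = 0.5 * ||w||^2, whose gradient is w itself.
w = np.ones(3)
m = np.zeros(3)
v = np.zeros(3)
for t in range(1, 2001):
    w, m, v = adam_step(w, w, m, v, t, lr=0.01)
```

Swapping `adam_step` in place of the plain `w -= lr * grad(...)` update is enough to compare convergence traces; momentum alone is the same idea with only the `m` accumulator.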