GPU Kernels & Compilers: A Compressed Interview Course
From linear algebra to FlashAttention, MoE, and the XLA lowering pipeline — the working set for a GPU-kernel / ML-compiler interview, told with diagrams, the underlying math, and real kernels. Weighted toward the compiler / XLA / PTX axis.