Tensors & Matrix Ops
Deep learning is fundamentally about moving numbers through structured containers called tensors — n-dimensional arrays. A 0D tensor (scalar) is a single number like 3.14. A 1D tensor (vector) is a list with shape (n,). A 2D tensor (matrix) has shape (rows, cols). Deep networks routinely work with 4D tensors (batch × channels × height × width).
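These ranks and shapes can be sketched concretely in NumPy (the variable names here are illustrative, not from the text):

```python
import numpy as np

scalar = np.array(3.14)             # 0D tensor: shape ()
vector = np.array([1.0, 2.0, 3.0])  # 1D tensor: shape (3,)
matrix = np.zeros((2, 3))           # 2D tensor: shape (2, 3)
batch = np.zeros((8, 3, 32, 32))    # 4D tensor: batch x channels x height x width

# ndim counts the dimensions, shape gives the size along each one
print(scalar.ndim, vector.ndim, matrix.ndim, batch.ndim)  # 0 1 2 4
print(batch.shape)  # (8, 3, 32, 32)
```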
The most critical operation in all of deep learning is matrix multiplication. When input data flows through a neural network layer, the input matrix of shape (m × n) is multiplied by a weight matrix of shape (n × k), producing an output of shape (m × k). The inner dimensions must always match — this is the one rule you cannot violate.
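The shape rule can be checked directly; a minimal sketch in NumPy (the sizes m, n, k are arbitrary examples):

```python
import numpy as np

m, n, k = 4, 5, 3
x = np.random.rand(m, n)  # input: (m, n)
w = np.random.rand(n, k)  # weights: (n, k) — inner dimension n must match
out = x @ w               # matrix multiplication
print(out.shape)          # (4, 3) — the outer dimensions (m, k)
```

Swapping `w` for a matrix whose first dimension is not n raises a shape-mismatch error, which is exactly the rule in action.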
The dot product is the atomic building block: multiply corresponding elements, then sum. Every linear layer — from a simple regression to GPT-4 — reduces to exactly this operation, scaled up by billions of parameters. Understanding this is understanding the heartbeat of neural networks.
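The "multiply corresponding elements, then sum" definition can be verified against NumPy's built-in dot product (the example vectors are illustrative):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

# The atomic operation: elementwise multiply, then sum
manual = (a * b).sum()  # 1*4 + 2*5 + 3*6 = 32.0
print(manual)           # 32.0

# Identical to the library dot product
assert manual == np.dot(a, b)
```

A linear layer's matrix multiply is just this dot product repeated once per (input row, weight column) pair.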