Chapter 4 - Neural Networks: When Simplicity Failed
The Crux
For decades, ML was linear models and hand-crafted features. Then we hit a wall: some patterns are too complex to engineer by hand. Neural networks didn't win because they're better in all cases; they won because they scale to complexity that breaks classical methods.
Why Deep Learning Was Inevitable
The Limits of Linearity
Linear models assume: output = w₁·feature₁ + w₂·feature₂ + ... + b
This works if patterns are linear. But reality isn't linear.
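The classic concrete case of a non-linear pattern is XOR. A minimal sketch with NumPy: fit the best possible linear model to the four XOR points by least squares, and watch it fail completely.

```python
import numpy as np

# XOR: output is 1 exactly when the two inputs differ.
# No linear combination of the inputs can separate these four points.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0., 1., 1., 0.])

# Best-fit linear model via least squares: y ≈ X_b @ w
X_b = np.hstack([X, np.ones((4, 1))])       # append a bias column
w, *_ = np.linalg.lstsq(X_b, y, rcond=None)

preds = X_b @ w
print(preds)  # every prediction collapses to 0.5 -- pure indecision
```

The optimal weights turn out to be w₁ = w₂ = 0 with bias 0.5: the best a linear model can do on XOR is predict 0.5 for every input, so no threshold separates the classes.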
Example: Image classification. Raw pixels → "is this a cat?"
A linear model on pixels learns: "if pixel 237 is bright and pixel 1842 is dark, probably a cat."
But cats appear at different positions, scales, and orientations. Pixel 237 sometimes contains a cat's ear, sometimes background. No fixed linear combination of pixels works.
The Classical Fix: Feature engineering. Extract edges, textures, shapes (SIFT, HOG, etc.). These are manually designed.
The Problem: For images, we figured out edges and textures. For speech? Video? 3D point clouds? Feature engineering is domain-specific, labor-intensive, and eventually impossible.
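To make "manually designed features" concrete, here is a sketch of one of the simplest hand-crafted features for images: a Sobel filter for vertical edges, the kind of building block that pipelines like SIFT and HOG elaborate on. The convolution loop is written out by hand for clarity; it is illustrative, not a production extractor.

```python
import numpy as np

# Hand-crafted feature: the Sobel kernel for vertical edges.
# A human decided these nine numbers; nothing is learned.
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

def correlate2d(image, kernel):
    """Valid-mode 2D cross-correlation via an explicit sliding window."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy image: dark left half, bright right half -> one vertical edge.
img = np.zeros((5, 6))
img[:, 3:] = 1.0

edges = correlate2d(img, SOBEL_X)
print(edges)  # responds only at the dark/bright boundary
```

The point of the example is the asymmetry: this filter had to be designed by a person who already understood what an edge is. For speech or point clouds, no one knows the equivalent nine numbers, which is exactly the wall the chapter describes.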
The Neural Network Promise
Instead of hand-crafting features, learn them.
Input → Layer 1 (learns edges) → Layer 2 (learns textures) → Layer 3 (learns parts) → Layer 4 (learns objects) → Output
Each layer is a learned feature transformation. The model discovers useful representations automatically.
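A minimal sketch of that idea, again on XOR: a one-hidden-layer network trained with plain NumPy gradient descent. The hidden layer learns a feature transformation under which XOR becomes linearly separable, which is exactly what no amount of weight-tuning on the raw inputs could achieve. Layer sizes, learning rate, and step count are illustrative choices, not tuned values.

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR again -- the pattern the linear model could not fit.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0.], [1.], [1.], [0.]])

# One hidden layer of 8 tanh units, sigmoid output.
W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for _ in range(20000):
    # Forward pass: the hidden activations h are the *learned* features.
    h = np.tanh(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)

    # Backward pass: binary cross-entropy gradient w.r.t. the logits
    # is simply (p - y), averaged over the batch.
    dp = (p - y) / len(X)
    dW2 = h.T @ dp;  db2 = dp.sum(axis=0)
    dh = (dp @ W2.T) * (1 - h ** 2)      # tanh derivative
    dW1 = X.T @ dh;  db1 = dh.sum(axis=0)

    W2 -= lr * dW2;  b2 -= lr * db2
    W1 -= lr * dW1;  b1 -= lr * db1

preds = (sigmoid(np.tanh(X @ W1 + b1) @ W2 + b2) > 0.5).astype(int)
print(preds.ravel())
```

Nothing in the code tells the network what XOR is; the hidden weights W1 discover a representation in which the problem is easy, which is the "learn the features" promise in miniature.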
When it works: You have lots of data and patterns too complex for manual features.
When it doesn't: small data, simple patterns, or a need for interpretability.