Chapter 4 - Neural Networks: When Simplicity Failed
The Crux
For decades, ML was linear models and hand-crafted features. Then we hit a wall: some patterns are too complex to engineer by hand. Neural networks didn't win because they're better in all cases; they won because they scale to complexity that breaks classical methods.
Why Deep Learning Was Inevitable
The Limits of Linearity
Linear models assume: output = w₁·feature₁ + w₂·feature₂ + ... + b
This works if patterns are linear. But reality isn't linear.
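The classic concrete case of a non-linear pattern is XOR. A minimal sketch with NumPy: fit the best possible linear model to the four XOR points by least squares, and watch it fail completely.

```python
import numpy as np

# XOR: output is 1 exactly when the two inputs differ.
# No linear combination of the inputs can separate these four points.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0., 1., 1., 0.])

# Best-fit linear model via least squares: y ≈ X_b @ w
X_b = np.hstack([X, np.ones((4, 1))])       # append a bias column
w, *_ = np.linalg.lstsq(X_b, y, rcond=None)

preds = X_b @ w
print(preds)  # every prediction collapses to 0.5 -- pure indecision
```

The optimal weights turn out to be w₁ = w₂ = 0 with bias 0.5: the best a linear model can do on XOR is predict 0.5 for every input, so no threshold separates the classes.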
Example: Image classification. Raw pixels → "is this a cat?"
A linear model on pixels learns: "if pixel 237 is bright and pixel 1842 is dark, probably a cat."
But cats appear at different positions, scales, and orientations. Pixel 237 sometimes contains a cat's ear, sometimes background. No fixed linear combination of pixels works.
The Classical Fix: Feature engineering. Extract edges, textures, shapes (SIFT, HOG, etc.). These are manually designed.
The Problem: For images, we figured out edges and textures. For speech? Video? 3D point clouds? Feature engineering is domain-specific, labor-intensive, and eventually impossible.
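To make "manually designed features" concrete, here is a sketch of one of the simplest hand-crafted features for images: a Sobel filter for vertical edges, the kind of building block that pipelines like SIFT and HOG elaborate on. The convolution loop is written out by hand for clarity; it is illustrative, not a production extractor.

```python
import numpy as np

# Hand-crafted feature: the Sobel kernel for vertical edges.
# A human decided these nine numbers; nothing is learned.
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

def correlate2d(image, kernel):
    """Valid-mode 2D cross-correlation via an explicit sliding window."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy image: dark left half, bright right half -> one vertical edge.
img = np.zeros((5, 6))
img[:, 3:] = 1.0

edges = correlate2d(img, SOBEL_X)
print(edges)  # responds only at the dark/bright boundary
```

The point of the example is the asymmetry: this filter had to be designed by a person who already understood what an edge is. For speech or point clouds, no one knows the equivalent nine numbers, which is exactly the wall the chapter describes.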
The Neural Network Promise
Instead of hand-crafting features, learn them.
Input → Layer 1 (learns edges) → Layer 2 (learns textures) → Layer 3 (learns parts) → Layer 4 (learns objects) → Output
Each layer is a learned feature transformation. The model discovers useful representations automatically.
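A minimal sketch of that idea, again on XOR: a one-hidden-layer network trained with plain NumPy gradient descent. The hidden layer learns a feature transformation under which XOR becomes linearly separable, which is exactly what no amount of weight-tuning on the raw inputs could achieve. Layer sizes, learning rate, and step count are illustrative choices, not tuned values.

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR again -- the pattern the linear model could not fit.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0.], [1.], [1.], [0.]])

# One hidden layer of 8 tanh units, sigmoid output.
W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for _ in range(20000):
    # Forward pass: the hidden activations h are the *learned* features.
    h = np.tanh(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)

    # Backward pass: binary cross-entropy gradient w.r.t. the logits
    # is simply (p - y), averaged over the batch.
    dp = (p - y) / len(X)
    dW2 = h.T @ dp;  db2 = dp.sum(axis=0)
    dh = (dp @ W2.T) * (1 - h ** 2)      # tanh derivative
    dW1 = X.T @ dh;  db1 = dh.sum(axis=0)

    W2 -= lr * dW2;  b2 -= lr * db2
    W1 -= lr * dW1;  b1 -= lr * db1

preds = (sigmoid(np.tanh(X @ W1 + b1) @ W2 + b2) > 0.5).astype(int)
print(preds.ravel())
```

Nothing in the code tells the network what XOR is; the hidden weights W1 discover a representation in which the problem is easy, which is the "learn the features" promise in miniature.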
When it works: You have lots of data and patterns too complex for manual features.
When it doesn't: small data, simple patterns, or a need for interpretability.