Federated Learning
A machine learning approach where a model is trained across multiple decentralized devices or servers holding local data, without ever exchanging the raw data itself: only model updates are shared.
Each hospital trains a diagnostic model on its own patient data, then shares only what the model learned, not the patient records. The central server combines these lessons into a smarter model and sends it back. No hospital ever sees another hospital's data.
Federated learning is a training technique where a shared model is improved collaboratively across many participants without centralizing their data. Instead of sending raw data to a central server, each participant trains the model locally on their own data and sends only the model updates (gradients or weight changes) back to a coordinating server. The server aggregates these updates to produce an improved global model, then sends it back to participants for the next round.
Google introduced federated learning in 2016 to improve the Gboard keyboard predictions on Android phones. Each phone trained a local model on the user's typing patterns, sent only the model update to Google's servers, and the aggregated model improved predictions for all users without Google ever seeing individual keystrokes.
The core algorithm is Federated Averaging (FedAvg). Each participant trains for several local epochs, then the server averages the resulting weights proportional to each participant's dataset size. This reduces communication rounds significantly compared to sending gradients after every batch.
Federated learning addresses three practical problems: privacy (raw data never leaves the device), regulation (data stays within jurisdictional boundaries, relevant for GDPR and HIPAA), and bandwidth (sending model updates is far cheaper than transmitting raw datasets).
Challenges include non-IID data (participants have different data distributions, which can cause the global model to diverge), communication overhead (even compressed updates are expensive at scale), and security (malicious participants can poison the model by sending adversarial updates). Techniques like secure aggregation and differential privacy are used to mitigate these risks.
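Of the mitigations mentioned above, the differential-privacy side is often implemented by clipping each client update to a fixed norm and adding calibrated noise before it leaves the device. The sketch below shows only that sanitization step, with illustrative parameter names (`clip_norm`, `noise_std`); a real DP-FL deployment would also track the cumulative privacy budget, which is omitted here.

```python
import numpy as np

def privatize_update(update, clip_norm=1.0, noise_std=0.1, rng=None):
    """Clip an update's L2 norm, then add Gaussian noise (DP-style sanitization).

    Clipping bounds any single client's influence on the aggregate;
    the noise masks what remains of the individual contribution.
    """
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / max(norm, 1e-12))
    return clipped + rng.normal(0.0, noise_std, size=update.shape)

# A large update is scaled down to the clip norm before noise is added.
sanitized = privatize_update(np.array([3.0, 4.0]), clip_norm=1.0)
```

Updates already inside the norm bound pass through unscaled; only the noise term perturbs them.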
Federated learning is used in healthcare (hospitals collaborating on diagnostic models without sharing patient records), finance (banks building fraud detection models without sharing transaction data), and mobile devices (improving autocomplete, voice recognition, and recommendations locally).
Types of Federated Learning
- Horizontal Federated Learning (HFL): Used when datasets across clients share the same feature space but differ in samples (e.g., two regional banks with different customers but identical financial data structures).
- Vertical Federated Learning (VFL): Used when datasets share the same sample space but differ in feature space (e.g., a bank and an e-commerce platform collaborating to build a better profile of shared users without revealing underlying data).
- Federated Transfer Learning (FTL): Applied when clients have very little overlap in both sample and feature space.
- Cross-device vs. Cross-silo: Cross-device involves millions of highly unreliable nodes (like smartphones). Cross-silo involves a smaller number of highly reliable participants (like hospitals or corporate data centers).
Examples of Federated Learning
- Healthcare and Genomics: Hospitals collaborating to train predictive models (such as tumor detection or COVID-19 outcome prediction) on regional patient records without moving protected health information (PHI) across jurisdictions, maintaining HIPAA compliance.
- Mobile Keyboards & Voice Assistants: Google's Gboard or Apple's QuickType training auto-complete and next-word-prediction models solely on device, meaning private chat conversations never touch a central server.
- Autonomous Vehicles: Fleets of self-driving cars continuously adapting to weather conditions and sharing model updates with the fleet without transmitting continuous 4K video feeds back to HQ, thus saving immense bandwidth.
- Financial Fraud Detection: Banks building more robust fraud detection systems by securely federating insights across institutions, countering international money laundering rings while adhering to strict financial privacy regulations.
Key Federated Learning Algorithms
- FedAvg (Federated Averaging): The foundational algorithm, developed by Google. Each client trains a local model for several epochs, then the central server averages the client updates, weighting each by the size of its local dataset, to form the new global model.
- FedProx: Designed to address statistical heterogeneity (non-IID data) and system heterogeneity (some devices dropping out or computing fewer epochs). It adds a proximal term to the local objective function to restrict local updates from moving too far from the global model.
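FedProx's change to FedAvg is small enough to show directly: the local gradient gains a proximal term μ(w − w_global) that pulls each step back toward the global model. The sketch below is a single local update under that modified objective; `grad_fn` and the function name are illustrative stand-ins, not library API.

```python
import numpy as np

def fedprox_local_step(w, w_global, grad_fn, mu=0.1, lr=0.05):
    """One FedProx local step.

    The local objective is f_k(w) + (mu/2) * ||w - w_global||^2,
    so the gradient is grad_fn(w) + mu * (w - w_global): the second
    term penalizes drifting far from the current global model.
    """
    grad = grad_fn(w) + mu * (w - w_global)
    return w - lr * grad

# With a zero task gradient, the proximal term alone drags w toward w_global.
w_next = fedprox_local_step(np.array([1.0]), np.array([0.0]),
                            lambda w: np.zeros_like(w), mu=1.0, lr=0.5)
```

Setting μ = 0 recovers plain FedAvg local training; larger μ constrains stragglers and non-IID clients more tightly.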
- Adaptive Federated Optimization (FedOpt): A framework that applies adaptive optimizers (like Adagrad, Adam, and Yogi) to the server-side update step, creating variants like FedAdagrad, FedAdam, and FedYogi, often yielding faster convergence and better performance on complex models than standard FedAvg.
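The server-side idea behind FedOpt can be sketched as follows: treat the gap between the current global model and the averaged client model as a pseudo-gradient, and feed it to an adaptive optimizer. This is a simplified FedAdam-style update with illustrative hyperparameter defaults; a faithful implementation would follow the published variant's exact bias-correction and τ settings.

```python
import numpy as np

class FedAdamServer:
    """Server-side Adam applied to the FedAvg pseudo-gradient (FedAdam-style)."""

    def __init__(self, w, lr=0.1, b1=0.9, b2=0.99, eps=1e-3):
        self.w, self.lr, self.b1, self.b2, self.eps = w, lr, b1, b2, eps
        self.m = np.zeros_like(w)  # first-moment estimate
        self.v = np.zeros_like(w)  # second-moment estimate

    def step(self, avg_client_w):
        # Pseudo-gradient: how far the averaged client model moved away
        # from the current global model this round.
        g = self.w - avg_client_w
        self.m = self.b1 * self.m + (1 - self.b1) * g
        self.v = self.b2 * self.v + (1 - self.b2) * g * g
        self.w = self.w - self.lr * self.m / (np.sqrt(self.v) + self.eps)
        return self.w

server = FedAdamServer(np.array([1.0]))
server.step(np.array([0.0]))  # clients pulled toward 0, server follows adaptively
```

Plain FedAvg corresponds to simply replacing the global model with the average; the adaptive server step instead damps noisy rounds and accelerates consistent ones.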
- Secure Aggregation: While technically a cryptographic protocol rather than a learning algorithm, it is critical in practical FL. It ensures the central server sees only the sum of updates from thousands of devices, never an individual client's parameters, drastically improving privacy.
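The cancellation trick at the heart of secure aggregation can be demonstrated with pairwise masks: each pair of clients shares a random mask that one adds and the other subtracts, so individual submissions look random but the masks vanish in the sum. This is a toy sketch; real protocols derive the shared masks from a key agreement such as Diffie–Hellman and handle dropouts, whereas here a shared seed (`seed_base`) stands in for that machinery.

```python
import numpy as np

def pairwise_mask(client_id, other_id, shape, seed_base=42):
    """Mask shared by a pair of clients; the lower-id client adds it,
    the higher-id client subtracts it, so the pair contributes zero."""
    lo, hi = min(client_id, other_id), max(client_id, other_id)
    rng = np.random.default_rng(seed_base + 1000 * lo + hi)
    mask = rng.normal(size=shape)
    return mask if client_id == lo else -mask

def masked_update(client_id, update, all_ids, seed_base=42):
    """What the client actually sends: its update plus all its pairwise masks."""
    out = update.copy()
    for other in all_ids:
        if other != client_id:
            out += pairwise_mask(client_id, other, update.shape, seed_base)
    return out

ids = [0, 1, 2]
updates = [np.full(3, float(i + 1)) for i in ids]
masked = [masked_update(i, updates[i], ids) for i in ids]
# The server can compute sum(masked) == sum(updates), but each masked[i]
# on its own reveals nothing useful about updates[i].
```

The server learns the aggregate it needs for FedAvg while every individual submission stays hidden behind its masks.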
- FedSVRG & SCAFFOLD: Variance-reduction variants; SCAFFOLD uses control variates to correct the client drift that non-IID data induces in local models.
Last updated: March 11, 2026