>_TheQuery

LoRA

MLOps

Low-Rank Adaptation, a PEFT method that fine-tunes large models by learning small low-rank update matrices instead of modifying the full weight matrices.

Like adding a lightweight lens in front of a camera instead of rebuilding the whole camera body. The base system stays the same, but the output changes in a targeted way.

LoRA stands for Low-Rank Adaptation. It is one of the most widely used Parameter-Efficient Fine-Tuning (PEFT) techniques for adapting large pretrained models. Instead of updating a model's full weight matrices during fine-tuning, LoRA freezes the original weights and learns two much smaller matrices whose product approximates the desired update.

The core idea is that many useful task-specific changes to a model can be represented as a low-rank update rather than a full dense rewrite. Where full fine-tuning would update a weight matrix W by some dense matrix ΔW of the same shape, LoRA constrains that update to a product B × A of two low-rank factors, so the effective weight becomes W + B × A. Because B and A together contain far fewer parameters than W, this dramatically reduces the number of trainable parameters while preserving much of the benefit of full fine-tuning.
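The parameter arithmetic above is easy to see in a minimal numpy sketch. The shapes and rank here are illustrative, not from any particular model, and the usual alpha/r scaling factor is omitted for brevity. Following the standard convention, B is initialized to zero so the adapter starts as a no-op:

```python
import numpy as np

rng = np.random.default_rng(0)

d, k, r = 512, 512, 8              # full dims vs. a rank r much smaller than d, k

W = rng.standard_normal((d, k))    # frozen pretrained weight (never trained)
B = np.zeros((d, r))               # LoRA factor, initialized to zero
A = rng.standard_normal((r, k))    # LoRA factor

# Only A and B are trained; the effective weight is W + B @ A.
delta_W = B @ A

full_params = d * k                # what full fine-tuning would update
lora_params = d * r + r * k        # what LoRA actually trains
print(full_params, lora_params)    # 262144 vs 8192: a 32x reduction here
```

Because B starts at zero, B @ A is zero at initialization, so training begins from exactly the pretrained model's behavior.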

LoRA is especially common in transformer models, where it is typically applied to the attention projection matrices and sometimes to the feed-forward layers. Because the base model remains frozen, training requires less memory, and the learned LoRA weights can be saved as a compact adapter checkpoint rather than a full copy of the model.

One reason LoRA became so popular is operational simplicity. You can keep one base model and load different LoRA adapters for different tasks, styles, or domains. At inference time, the LoRA weights can either be applied dynamically or merged into the base weights ahead of deployment, depending on the serving setup.
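The two serving options are mathematically equivalent, which a short numpy check makes explicit. This is a sketch with arbitrary shapes; alpha is the usual LoRA scaling hyperparameter, applied as alpha/r:

```python
import numpy as np

rng = np.random.default_rng(2)
d, k, r, alpha = 64, 64, 4, 8
W = rng.standard_normal((d, k))    # frozen base weight
A = rng.standard_normal((r, k))    # trained LoRA factors
B = rng.standard_normal((d, r))
x = rng.standard_normal((3, k))    # a batch of inputs

scale = alpha / r

# Option 1 -- dynamic: base path plus a low-rank side path at inference time.
y_dynamic = x @ W.T + (x @ A.T) @ B.T * scale

# Option 2 -- merged: fold the update into the base weight once, before deployment.
W_merged = W + scale * (B @ A)
y_merged = x @ W_merged.T

print(np.allclose(y_dynamic, y_merged))  # True: identical outputs
```

Merging removes the extra matmuls at inference but gives up the ability to hot-swap adapters on a shared base model, which is the trade-off the serving setup decides.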

LoRA is also the foundation for variants such as QLoRA, which combines low-rank adaptation with quantized base weights to make fine-tuning even more memory efficient. In practice, when people say they "fine-tuned" an open model on consumer hardware, they often mean they trained a LoRA rather than updating the full model.
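The QLoRA idea can be caricatured in a few lines: the frozen base weight is stored quantized, dequantized on the fly for the forward pass, while the LoRA factors stay in full precision and carry all the gradients. This sketch uses crude per-tensor absmax int8 quantization purely for illustration; real QLoRA uses a 4-bit NormalFloat format with block-wise scales:

```python
import numpy as np

rng = np.random.default_rng(3)
d, k, r = 64, 64, 4
W = rng.standard_normal((d, k)).astype(np.float32)

# Crude per-tensor absmax int8 quantization of the frozen base weight.
scale = np.abs(W).max() / 127.0
W_q = np.round(W / scale).astype(np.int8)   # stored at 1 byte per weight

# LoRA factors stay in full precision -- the only trained parameters.
A = rng.standard_normal((r, k)).astype(np.float32)
B = np.zeros((d, r), dtype=np.float32)

x = rng.standard_normal((2, k)).astype(np.float32)

# Forward pass: dequantize the base weight on the fly, add the LoRA side path.
y = x @ (W_q.astype(np.float32) * scale).T + (x @ A.T) @ B.T
```

The memory win comes from storing W in low precision; the quality is preserved because the trainable update B @ A never goes through the quantizer.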

A useful rule of thumb: PEFT is the umbrella category, and LoRA is the best-known tool under that umbrella.

Last updated: March 31, 2026