Fine-Tuning & Adaptation

When to teach a model new tricks, and the cheaper alternatives that usually work first

Fine-tuning is the process of further training a pretrained model on your own data so it adopts your tone, format, domain knowledge, or task structure. It's powerful, but it's also rarely the first thing you should reach for.

The Adaptation Ladder

Climb in this order. Stop when the result is good enough.

Better prompting — clearer instructions, better examples, structured output.
Few-shot examples — paste 3–10 high-quality examples in the prompt.
Retrieval (RAG) — let the model pull in fresh, specific facts at runtime.
Tool use — give the model the ability to call functions instead of guessing.
Fine-tuning — actually update the weights.

Fine-tuning is at the bottom because everything above it is faster to ship, easier to update, and easier to debug.

When Fine-Tuning Actually Helps

Style and format — making outputs consistently match a specific voice or schema.
Latency / cost — distilling a frontier model's behavior into a smaller, cheaper model.
Task specialization — narrow, repetitive tasks where prompting hits a ceiling.
Tool-use reliability — teaching a model to use your specific tools precisely.

When fine-tuning doesn't help: adding factual knowledge, fixing reasoning errors, keeping up with changing data. Use retrieval instead.

Techniques

Full fine-tuning — update every parameter. Expensive, requires careful data and infrastructure.
LoRA / QLoRA — train a small set of adapter weights on top of a frozen base. Much cheaper, almost always good enough.
Instruction tuning — supervised fine-tuning on instruction/response pairs.
Preference tuning (DPO, RLHF) — teach the model which of two outputs is preferred. Used to align tone and behavior.

Data Quality Beats Data Volume

A few hundred high-quality examples often outperforms tens of thousands of mediocre ones. Time spent curating, deduplicating, and labeling beats time spent collecting more.

The Operational Cost

A fine-tuned model is a new artifact you have to version, evaluate, monitor, and re-train when the base model deprecates. Don't take that on unless prompting and retrieval genuinely can't get you there.

The Adaptation Ladder

When Fine-Tuning Actually Helps

Techniques

Data Quality Beats Data Volume

The Operational Cost

On this page