LLM Frameworks

LangChain, LlamaIndex, Haystack, DSPy — when they help, when they get in the way

The LLM framework ecosystem moves fast and opinions are loud. Strip away the marketing and these libraries do roughly the same things: wrap provider APIs, give you primitives for chains and agents, ship retrievers and parsers and memory implementations. Whether they're worth pulling in depends on what you're building and how much abstraction you can stomach.

What They Actually Provide

Provider abstraction — call any model the same way.
Composable primitives — chains, runnables, pipelines for stitching steps together.
Built-in retrievers — out-of-the-box vector store integrations.
Document loaders and parsers — PDFs, HTML, code, you name it.
Memory implementations — buffer, summary, vector-backed.
Output parsers — turn model text into typed objects.
Agent loops — ReAct and friends, ready to use.

You can build all of this yourself in a few hundred lines. The frameworks bundle it.

The Major Players

LangChain — the largest, most general-purpose. Strong ecosystem (LangGraph, LangSmith, LangServe), broad integrations, big learning curve. Reputation for over-abstraction; recent versions are more focused. Good when you want batteries included; less good when you want to understand exactly what's happening.

LlamaIndex — started as a RAG-focused library, now broader. Strongest in retrieval and document processing — the fastest path from "here's a folder of PDFs" to "ask questions about them." Cleaner abstractions than LangChain in the RAG path; lighter for agentic patterns.

Haystack — production-oriented, pipeline-centric. Component-based design feels closer to traditional software engineering. Strong for search-style applications, less popular than the above but steady.

DSPy — different paradigm. Instead of writing prompts, you write programs that define inputs, outputs, and metrics; DSPy compiles them into prompts and few-shot examples by optimizing against your data. Steeper conceptual curve, real wins for evaluable, modular pipelines.

Provider SDKs directly — Anthropic SDK, OpenAI SDK, Google GenAI SDK. The "no framework" option. Honest about what's happening; you write the loop yourself.

When a Framework Helps

Prototyping — get to a working demo fast.
Mixed providers — swap models without rewriting application code.
Heavy retrieval workloads — LlamaIndex's document/parsing layer is genuinely useful.
DSPy-shaped problems — pipelines you can evaluate end-to-end and want to optimize.
Team without LLM experience — a framework gives you patterns rather than blank-page choices.

When It Gets in the Way

Simple feature — a chat endpoint with one prompt doesn't need a framework. Provider SDK + 50 lines is faster, cleaner, and easier to debug.
Custom requirements — when your needs deviate from framework assumptions, fighting the abstraction costs more than rewriting it.
Production debugging — deeply nested abstractions hide what actually went over the wire. Add tracing yourself or accept slower diagnosis.
Performance-sensitive code paths — frameworks add latency. Most teams don't notice; some do.

Choosing

A pragmatic decision tree:

One feature, one provider, simple flow — provider SDK directly.
Heavy RAG, document-centric — LlamaIndex.
Pipeline-shaped, evaluable — DSPy.
Many features, many providers, want batteries — LangChain.
Production search system — Haystack.

Most teams end up using more than one — DSPy or LlamaIndex inside a larger LangChain app, or provider SDK plus a single specific framework component.

A Note on Lock-In

Frameworks evolve faster than your code. Major version migrations have repeatedly broken applications built on these libraries. Mitigations:

Pin versions; upgrade deliberately.
Wrap the framework behind your own narrow interface so a swap is isolated.
Don't sprinkle framework primitives throughout your app code; concentrate them at boundaries.

The ones that protect themselves from framework churn move fastest when the next better option appears.

What to Watch

The trend is toward thinner, more focused libraries. Provider SDKs are getting better at the things frameworks used to provide (structured output, tool use, streaming). Specialized libraries (just retrieval, just evals, just agents) are doing better than monoliths. The future probably has fewer kitchen-sink frameworks and more sharp tools you compose yourself.