Best AI Memory for the Vercel AI SDK (2026)
The Vercel AI SDK is a brilliant LLM + tool-calling layer — and it ships with zero persistent memory. You manage the messages array yourself. Six ways to give an AI SDK agent memory that survives sessions, users, and deploys — evaluated honestly.
The quick answer
The Vercel AI SDK gives you a clean generateText / streamText + tools surface, but it does not persist anything — you own the messages array. If you want memory that survives sessions and works per-user without standing up a vector DB: Ricord, called as two API endpoints (or wrapped as tools). If retrieval quality is your edge and you want to own the stack: Mem0 OSS. The matrix below covers the rest.
Why this matters for the AI SDK specifically
The AI SDK is deliberately unopinionated about state. Each call takes a messages array and returns a result; what you keep between calls is entirely up to you. For a single request that's perfect. For an agent that should remember a user across sessions, it means you're on the hook for the entire memory stack: storage, retrieval by meaning, per-user scoping, deduping contradictions, and deletion. That's a quarter of engineering work that most teams only discover three months in.
The six approaches below differ on how much of that you build vs. buy — from stuffing everything into the message window, to rolling your own DB, to a hosted layer you call in a couple of lines.
The decision matrix
Eight criteria, six options. The two "built-in" columns are the AI SDK's actual defaults: keep history in the message window, or persist it to a database you build and own.
| Criterion | Ricord | Mem0 | Letta | Supermemory | DIY (your DB) | Messages array |
|---|---|---|---|---|---|---|
| Persists across sessions / deploys | You build it | |||||
| Persists across users (multi-tenant) | Basic | You build it | ||||
| Semantic recall (by meaning) | You build it | |||||
| Entity extraction + conflict resolution | Manual | Basic | ||||
| Browsable wiki + knowledge graph | Pro only | |||||
| Same memory in Claude Desktop / Cursor / ChatGPT | API only | |||||
| No DB / vector store to run yourself | OSS = self-host | Self-host | ||||
| Cost (smallest tier with memory) | $12/mo annual | $249/mo for graph | Self-host + LLM | $29/mo | DB bill + eng time | $0 (tokens) |
Slot-by-slot — which fits your build
If your agent should remember users without you building memory infra
Ricord is two HTTP calls — one to save, one to recall — that you wrap as AI SDK tools or invoke in your route handler around generateText. No vector store to run, no schema to design; you get semantic recall, per-user scoping, conflict resolution, and a browsable wiki of what the agent learned.
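As a rough sketch of the "wrap as tools" route, the two calls can sit behind a pair of tool-like objects. The `description` + `execute` shape, the names `makeMemoryTools`, `recallMemory`, `saveMemory`, and the `PostJson` transport are all assumptions made for illustration; the exact shape the AI SDK's tool helper expects depends on your SDK version, so treat this as something to adapt rather than paste.

```typescript
// Hypothetical transport: POSTs JSON to a path and resolves with the parsed response.
type PostJson = (path: string, body: unknown) => Promise<unknown>;

// Plain tool-like objects. The shape (description + execute) is an assumption
// for illustration; adapt it to the tool helper of your AI SDK version.
function makeMemoryTools(post: PostJson) {
  return {
    recallMemory: {
      description: "Look up what is already known that is relevant to a query.",
      execute: async (input: { query: string }): Promise<string> => {
        const data = (await post("/memories/search", {
          query: input.query,
          limit: 5,
        })) as { context?: string };
        // Fall back to an empty string when nothing relevant comes back.
        return data.context ?? "";
      },
    },
    saveMemory: {
      description: "Store a fact worth remembering for later sessions.",
      execute: async (input: { content: string }): Promise<string> => {
        await post("/memories/fact", { content: input.content });
        return "saved";
      },
    },
  };
}

// A fetch-based transport using the base URL and headers shown in this article.
function ricordPost(apiKey: string): PostJson {
  return async (path, body) => {
    const res = await fetch(`https://api.ricord.ai/v1${path}`, {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    });
    if (!res.ok) throw new Error(`Memory API returned ${res.status}`);
    return res.json();
  };
}
```

Injecting the transport keeps the tools testable without a network and makes it easy to swap in a different memory backend later.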
If you're only doing short, stateless requests
The messages array is all you need. Pass the recent turns in the request, don't persist anything, ship it. The moment you need to recall something from last week, or scope memory per user, you've outgrown it.
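A minimal sketch of "pass the recent turns", assuming a simplified `ChatMessage` shape (the AI SDK's own message types are richer) and a hypothetical helper named `recentWindow`:

```typescript
// Simplified message shape for this sketch only.
type Role = "system" | "user" | "assistant";
interface ChatMessage {
  role: Role;
  content: string;
}

// Keep every system message plus the last `maxRecent` non-system messages.
function recentWindow(messages: ChatMessage[], maxRecent: number): ChatMessage[] {
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");
  // slice(-0) would return the whole array, so guard the zero case explicitly.
  const tail = maxRecent > 0 ? rest.slice(-maxRecent) : [];
  return [...system, ...tail];
}

const turns: ChatMessage[] = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "My name is Ada." },
  { role: "assistant", content: "Nice to meet you, Ada." },
  { role: "user", content: "I use pnpm." },
  { role: "assistant", content: "Noted." },
  { role: "user", content: "What is my name?" },
];

// Only the system message and the last three turns go into the request.
const windowed = recentWindow(turns, 3);
```

With a window of three, the turn where the user stated their name has already fallen out of the request, which is the failure mode that pushes teams toward a real memory layer.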
If you have the engineers and memory is your moat
DIY on your own DB (Postgres + pgvector, or a vector DB) or Mem0 OSS (Apache 2.0). Full control, and the right call if retrieval quality is the product. Budget real time for the unglamorous parts: multi-tenant isolation, conflict resolution, and hard delete.
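To make those unglamorous parts concrete, here is a toy in-memory sketch. The class `InMemoryFactStore` and its methods are hypothetical, it does no semantic recall at all, and a real build would put this behind Postgres or a vector DB; it only illustrates per-user isolation, a last-write-wins conflict rule, and hard delete.

```typescript
interface StoredFact {
  key: string; // what the fact is about, e.g. "package_manager"
  value: string;
  updatedAt: number;
}

class InMemoryFactStore {
  // One inner map per user, so a lookup for one user never touches another's facts.
  private byUser = new Map<string, Map<string, StoredFact>>();

  // Conflict resolution in this sketch is last-write-wins per (user, key).
  upsert(userId: string, key: string, value: string, now: number = Date.now()): void {
    const facts = this.byUser.get(userId) ?? new Map<string, StoredFact>();
    facts.set(key, { key, value, updatedAt: now });
    this.byUser.set(userId, facts);
  }

  list(userId: string): StoredFact[] {
    const facts = this.byUser.get(userId);
    return facts ? Array.from(facts.values()) : [];
  }

  // Hard delete: drop the user's data entirely instead of flagging it as deleted.
  deleteUser(userId: string): boolean {
    return this.byUser.delete(userId);
  }
}

const store = new InMemoryFactStore();
store.upsert("user-1", "package_manager", "npm");
store.upsert("user-1", "package_manager", "pnpm"); // contradicts and replaces the first fact
store.upsert("user-2", "language", "TypeScript");
```

Last-write-wins is the simplest possible policy; deciding when a newer statement should really override an older one is where most of the DIY effort tends to go.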
If you're shipping a whole agent runtime
Letta bundles an agent runtime with memory. If you've already chosen the AI SDK as your generation layer, though, a memory layer you call (Ricord) composes more cleanly than adopting a second runtime.
The drop-in: two calls around generateText
The AI SDK has no memory, so you add it where you already control the request. These are the real endpoints — wrap them as tools, or call them in your route handler before and after generation:
    import { generateText } from "ai";

    // `model` and `userInput` are assumed to be defined by your route handler.
    const RICORD = "https://api.ricord.ai/v1";
    const headers = {
      "Authorization": `Bearer ${process.env.RICORD_API_KEY}`,
      "Content-Type": "application/json",
    };

    // Recall relevant memory, then feed it to generateText()
    const r = await fetch(`${RICORD}/memories/search`, {
      method: "POST", headers,
      body: JSON.stringify({ query: userInput, limit: 5 }),
    });
    const { context } = await r.json(); // join of the relevant memories

    const { text } = await generateText({
      model,
      system: `You are a helpful assistant.\n\nWhat you know:\n${context}`,
      prompt: userInput,
    });

    // Save anything worth remembering after the turn
    await fetch(`${RICORD}/memories/fact`, {
      method: "POST", headers,
      body: JSON.stringify({ content: "User prefers TypeScript and pnpm" }),
    });

That's the whole integration: no vector store, no schema. The same memory is also reachable from Claude Desktop, Cursor, and Codex via Ricord's MCP server, so an agent you build on the AI SDK shares memory with the tools you code in.
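One design question the two calls leave open is what happens when the memory service is slow or down. The sketch below is one possible answer, not a prescribed pattern: it treats recall and save as best-effort so generation still runs. `TurnDeps` and `runTurn` are hypothetical names, and the three dependencies are injected so the flow can be exercised without a network; in a real route, `generate` would wrap `generateText` and the other two would wrap the endpoints above.

```typescript
// Injected dependencies; all three are stand-ins for the real calls.
interface TurnDeps {
  recall: (query: string) => Promise<string>; // e.g. POST /memories/search
  generate: (system: string, prompt: string) => Promise<string>; // e.g. wraps generateText
  save: (content: string) => Promise<void>; // e.g. POST /memories/fact
}

async function runTurn(
  deps: TurnDeps,
  userInput: string,
  factToSave?: string,
): Promise<string> {
  // A memory outage should degrade the answer, not fail the request.
  let context = "";
  try {
    context = await deps.recall(userInput);
  } catch {
    context = "";
  }

  // Only mention memory in the system prompt when there is something to show.
  const system = context
    ? `You are a helpful assistant.\n\nWhat you know:\n${context}`
    : "You are a helpful assistant.";
  const text = await deps.generate(system, userInput);

  // Saving is best-effort too; the user already has their answer.
  if (factToSave) {
    await deps.save(factToSave).catch(() => undefined);
  }
  return text;
}
```

Whether a failed save should be retried or logged is a product decision; this sketch simply swallows the error to keep the request path short.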
Getting started
Grab a key and drop the two calls into your route. If you'd rather own the stack, start with Mem0 OSS or your own pgvector table — just budget for the production-grade parts.