Best AI Memory for Pydantic AI (2026)
Pydantic AI ships strong type safety, dependency injection via RunContext, and tools — but no memory layer. The framework expects you to bring your own. Six ways to do it, ranked honestly, with a drop-in pattern that keeps the type-safety story intact.
Why this question keeps coming up
Pydantic AI ships a clean, type-safe framework for building agents. It has tools, dependency injection via RunContext, message history per run, and structured outputs. What it deliberately doesn't ship: a memory layer. The team's position is that memory is too domain-specific to bake in, and you should bring your own.
That's the right call for framework purity. The cost is that most production Pydantic AI builds spend a sprint or two building a memory layer on top of a vector store and Postgres before the real agent work starts. This page is about whether that trade is worth making.
The quick answer
If you want a hosted memory layer that drops into Pydantic AI without breaking the type-safety story: Ricord. If you have Python engineering capacity and want to own the layer: Mem0 OSS wrapped as Pydantic tools. If your agent just needs message history within a run (no cross-run memory): Pydantic AI's built-in message history is enough. The full matrix is below.
The decision matrix
Ten criteria, six options. We're including Pydantic AI's built-in message-history primitive as the "do nothing" baseline so the cost of adding memory is honest.
| Criterion | Ricord | DIY | Mem0 | Letta | Cognee | MessageHistory |
|---|---|---|---|---|---|---|
| Works as a Pydantic AI Tool | DIY | Wrap REST | Wrap REST | Built-in primitive | ||
| Type-safe inputs/outputs (Pydantic models) | DIY | Partial | DIY | DIY | ||
| Persists across agent.run() calls | DIY | Within run only | ||||
| Per-user / per-tenant scoping | DIY | DIY | ||||
| Semantic recall (vector or graph) | ||||||
| Entity extraction + conflict resolution | Manual | |||||
| Browsable wiki of what was learned | ||||||
| Hard delete (GDPR) | DIY | DIY | Drop history | |||
| Cross-client (same memory from Claude Desktop / Cursor) | API only | |||||
| Cost (smallest tier with the listed features) | $15/mo annual | Eng time | $249/mo for graph | Self-host + LLM | $0 OSS / self-host | $0 (built in) |
Slot-by-slot — which fits your Pydantic AI build
If your agent only needs context within a run
MessageHistory (the built-in) is enough. Pydantic AI's RunContext holds the message history for the current run; you don't need anything else. Good for one-shot prompts, classifiers, and short-task agents.
If your agent needs cross-run memory and you have engineers
Mem0 OSS as Pydantic tools. Python-first, wraps cleanly. You'll spend real time on the production-grade pieces (conflict resolution, multi-tenant, hard delete) but you'll own them.
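To make the ownership cost concrete, here is a minimal in-memory sketch of those three pieces. Everything in it is a hypothetical illustration rather than Mem0's API: `MemoryStore`, the `key` argument, and last-write-wins as the conflict rule are all assumptions, and a real layer would sit on a vector store and a database.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MemoryStore:
    """Hypothetical in-memory stand-in for a DIY memory layer."""

    # user_id -> {fact key -> fact text}; one dict per user keeps tenants isolated.
    _facts: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def save(self, user_id: str, key: str, content: str) -> None:
        # Conflict resolution (assumed rule): the last write for a key wins.
        self._facts.setdefault(user_id, {})[key] = content

    def recall(self, user_id: str, query: str) -> List[str]:
        # Naive substring match stands in for semantic search.
        q = query.lower()
        return [
            text
            for key, text in self._facts.get(user_id, {}).items()
            if q in key.lower() or q in text.lower()
        ]

    def hard_delete(self, user_id: str) -> int:
        # Remove everything stored for one user; return how many facts were dropped.
        return len(self._facts.pop(user_id, {}))


store = MemoryStore()
store.save("alex", "deploy command", "Deploy with `make deploy`")
store.save("alex", "deploy command", "Deploy with `make release`")  # supersedes the first
store.save("sam", "deploy command", "Deploy with `./ship.sh`")
```

Each method is a few lines here. In production each one becomes its own design problem: what counts as "the same fact", how isolation is enforced in the index, and how a delete reaches backups and embeddings.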
If your agent needs cross-run memory without engineering quarters
Ricord as a dependency injected via RunContext. The type-safety story stays intact — define Pydantic models for the save and recall inputs, pass the client through deps, and call the API from your tools. The drop-in pattern is below.
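The models that paragraph mentions might look like the following. This is a sketch under assumptions: the field names `query`, `k`, `content`, and `hits` mirror the request and response shapes used in the tool code on this page, while the class names, the `format_hits` helper, and the bounds on `k` are invented for illustration and are not part of any Ricord client.

```python
from typing import List

from pydantic import BaseModel, Field, ValidationError


class RecallInput(BaseModel):
    query: str = Field(min_length=1)
    k: int = Field(default=5, ge=1, le=50)  # bounds are an assumption


class SaveInput(BaseModel):
    content: str = Field(min_length=1)


class RecallHit(BaseModel):
    content: str


class RecallResponse(BaseModel):
    hits: List[RecallHit] = Field(default_factory=list)


def format_hits(payload: dict) -> str:
    """Validate a raw recall payload, then render it as text for the model."""
    parsed = RecallResponse.model_validate(payload)
    return "\n".join(hit.content for hit in parsed.hits) or "No memory found."


# An empty query is rejected before any HTTP call would be made.
try:
    RecallInput(query="")
    empty_query_rejected = False
except ValidationError:
    empty_query_rejected = True
```

Validating the response as well as the request means a malformed payload fails at the boundary with a clear error, not later as a `KeyError` inside a tool.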
If your product's value is in the extraction pipeline
Cognee, for forkable extraction, provided the AGPL license is fine for you.
If you're shipping a custom agent runtime
Letta bundles runtime + memory. Most Pydantic AI builders won't switch frameworks for this: Pydantic AI's value is the type safety, which Letta's runtime doesn't replicate.
Ricord as a Pydantic AI dependency — the drop-in
Pydantic AI's dependency injection makes this clean. Define a small dataclass for the deps, inject the Ricord client, and call it from @agent.tool functions. Input and output stay Pydantic-typed:
import asyncio
import os
from dataclasses import dataclass

import httpx
from pydantic_ai import Agent, RunContext


@dataclass
class Deps:
    http: httpx.AsyncClient
    ricord_key: str
    user_id: str


agent = Agent(
    "openai:gpt-4o",
    deps_type=Deps,
    system_prompt="You're a helpful assistant. Use ricord_recall before answering and ricord_save after learning anything new.",
)


@agent.tool
async def ricord_recall(ctx: RunContext[Deps], query: str) -> str:
    """Recall what we know about a topic from persistent memory."""
    r = await ctx.deps.http.get(
        "https://api.ricord.ai/v1/memories/recall",
        params={"user_id": ctx.deps.user_id, "query": query, "k": 5},
        headers={"Authorization": f"Bearer {ctx.deps.ricord_key}"},
    )
    r.raise_for_status()  # surface HTTP errors instead of parsing an error body
    hits = r.json().get("hits", [])
    return "\n".join(h["content"] for h in hits) or "No memory found."


@agent.tool
async def ricord_save(ctx: RunContext[Deps], content: str) -> str:
    """Save a fact for future recall."""
    r = await ctx.deps.http.post(
        "https://api.ricord.ai/v1/memories",
        json={"user_id": ctx.deps.user_id, "content": content},
        headers={"Authorization": f"Bearer {ctx.deps.ricord_key}"},
    )
    r.raise_for_status()  # don't report "Saved." when the write failed
    return "Saved."


# Run it
async def main():
    async with httpx.AsyncClient() as http:
        deps = Deps(http=http, ricord_key=os.environ["RICORD_API_KEY"], user_id="alex")
        result = await agent.run("What deploy command should I use?", deps=deps)
        print(result.output)


if __name__ == "__main__":
    asyncio.run(main())

Why Ricord wins for production Pydantic AI builds
- The type-safety story stays intact. The Ricord client wraps cleanly inside a dataclass Deps. Inputs and outputs are typed; Pydantic AI's validation flows through. DIY-on-Postgres often leaks raw dict types into your agent code.
- Per-user scoping is a parameter, not an architecture. Pass user_id through deps; the layer handles isolation. No namespace-prefix discipline in your tools.
- Conflict resolution at ingest. Pydantic AI's message-history primitive doesn't reason about contradictions across runs. Ricord resolves at write time so recalls return the current truth.
- Cross-client memory. The same memory your Pydantic AI backend writes is reachable from Claude Desktop, Cursor, Codex, Zed, Gemini CLI, Windsurf, and Cline via Ricord's MCP server. Useful when devs debug agents from an IDE while users run them through your backend.
- Browsable wiki view. Your team can see what the agent has learned at ricord.ai/dashboard: a real, organized view of facts and entities, not a vector dump.
- Logfire-friendly. Pydantic AI ships built-in Logfire integration for agent observability; Ricord's HTTP calls show up in your Logfire trace alongside the LLM calls. Full agent trace + memory trace in one view.
When to stick with the built-in
Pydantic AI's message history is the right call when:
- Your agent is one-shot or short-lived (classifier, validator, data extractor)
- You don't need anything to survive past the current run
- You're still in "does this agent shape work" mode — adding a memory layer is premature
The day any of those flips, add Ricord as a dependency. Pydantic AI's deps boundary is the right seam to extend on.
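One way to keep that switch cheap is to type the memory dependency as an interface from day one. The sketch below is a hypothetical illustration: `MemoryBackend`, `NullMemory`, `DictMemory`, and this simplified `Deps` are invented names, and a real deployment would put an HTTP-backed implementation behind the same two methods.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Protocol


class MemoryBackend(Protocol):
    """The seam: tools depend on these two methods, not on a vendor."""

    def recall(self, user_id: str, query: str) -> List[str]: ...
    def save(self, user_id: str, content: str) -> None: ...


class NullMemory:
    """No-op backend for agents that do not need cross-run memory yet."""

    def recall(self, user_id: str, query: str) -> List[str]:
        return []

    def save(self, user_id: str, content: str) -> None:
        return None


@dataclass
class DictMemory:
    """Toy backend standing in for a persistent one."""

    data: Dict[str, List[str]] = field(default_factory=dict)

    def recall(self, user_id: str, query: str) -> List[str]:
        q = query.lower()
        return [c for c in self.data.get(user_id, []) if q in c.lower()]

    def save(self, user_id: str, content: str) -> None:
        self.data.setdefault(user_id, []).append(content)


@dataclass
class Deps:
    memory: MemoryBackend
    user_id: str


def recall_tool(deps: Deps, query: str) -> str:
    """Tool body written once; it does not know which backend was injected."""
    hits = deps.memory.recall(deps.user_id, query)
    return "\n".join(hits) or "No memory found."


mem = DictMemory()
mem.save("alex", "Deploy with make release")
```

Swapping `NullMemory()` for a real backend then changes one line where `Deps` is constructed, and the tool functions stay untouched.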
Getting started
pip install pydantic-ai httpx
# Get an API key at https://ricord.ai/login?signup=true
export RICORD_API_KEY=rc_live_...
# Drop the @agent.tool functions above into your Pydantic AI agent