Best AI Memory for Pydantic AI (2026)

Pydantic AI ships strong type safety, dependency injection via RunContext, and tools — but no memory layer. The framework expects you to bring your own. Six ways to do it, ranked honestly, with a drop-in pattern that keeps the type-safety story intact.

Why this question keeps coming up

Pydantic AI ships a clean, type-safe framework for building agents. It has tools, dependency injection via RunContext, message history per run, and structured outputs. What it deliberately doesn't ship: a memory layer. The team's position is that memory is too domain-specific to bake in, and you should bring your own.

That's the right call for framework purity. In practice, it means production Pydantic AI builds often spend a sprint or two building a memory layer in front of a vector store and Postgres before the real agent work starts. This page looks at whether building it yourself is the right call.

The quick answer

If you want a hosted memory layer that drops into Pydantic AI without breaking the type-safety story: Ricord. If you have Python engineering capacity and want to own the layer: Mem0 OSS wrapped as Pydantic tools. If your agent just needs message history within a run (no cross-run memory): Pydantic AI's built-in message history is enough. The full matrix is below.

The decision matrix

Ten criteria, six options. We're including Pydantic AI's built-in message-history primitive as the "do nothing" baseline so the cost of adding memory is honest.

Criterion | Ricord | DIY | Mem0 | Letta | Cognee | MessageHistory
Works as a Pydantic AI Tool: DIY; Wrap REST; Wrap REST; Built-in primitive
Type-safe inputs/outputs (Pydantic models): DIY; Partial; DIY; DIY
Persists across agent.run() calls: DIY; Within run only
Per-user / per-tenant scoping: DIY; DIY
Semantic recall (vector or graph)
Entity extraction + conflict resolution: Manual
Browsable wiki of what was learned
Hard delete (GDPR): DIY; DIY; Drop history
Cross-client (same memory from Claude Desktop / Cursor): API only
Cost (smallest tier with the listed features) | $15/mo annual | Eng time | $249/mo for graph | Self-host + LLM | $0 OSS / self-host | $0 (built in)

Slot-by-slot — which fits your Pydantic AI build

If your agent only needs context within a run

MessageHistory (the built-in) is enough. Pydantic AI's RunContext holds the message history for the current run; you don't need anything else. Good for one-shot prompts, classifiers, short-task agents.

If your agent needs cross-run memory and you have engineers

Mem0 OSS as Pydantic tools. Python-first, wraps cleanly. You'll spend real time on the production-grade pieces (conflict resolution, multi-tenant, hard delete) but you'll own them.
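Owning the layer mostly means owning an interface that your tools call and that a backend implements. Here is a minimal sketch of that seam, using only the standard library. The names `MemoryStore`, `InMemoryStore`, and `forget_user` are hypothetical and are not part of Mem0 or Pydantic AI. Keyword matching stands in for semantic recall, so treat it as an illustration of per-user scoping and hard delete, not a production design:

```python
from dataclasses import dataclass, field
from typing import Protocol


class MemoryStore(Protocol):
    """Hypothetical interface your tools would depend on."""

    def save(self, user_id: str, content: str) -> None: ...
    def recall(self, user_id: str, query: str, k: int = 5) -> list[str]: ...
    def forget_user(self, user_id: str) -> int: ...


@dataclass
class InMemoryStore:
    """Toy backend: keyword overlap stands in for semantic search."""

    facts: dict[str, list[str]] = field(default_factory=dict)

    def save(self, user_id: str, content: str) -> None:
        self.facts.setdefault(user_id, []).append(content)

    def recall(self, user_id: str, query: str, k: int = 5) -> list[str]:
        words = set(query.lower().split())
        # Only this user's facts are scanned, so one tenant never sees another's.
        mine = self.facts.get(user_id, [])
        hits = [f for f in mine if words & set(f.lower().split())]
        return hits[:k]

    def forget_user(self, user_id: str) -> int:
        # Hard delete: drop the data itself and report how many facts went.
        return len(self.facts.pop(user_id, []))


store = InMemoryStore()
store.save("alex", "deploy with make release")
store.save("sam", "deploy with fly deploy")
print(store.recall("alex", "deploy command"))
```

A real backend (Mem0 OSS or your own Postgres tables) would sit behind the same three methods, which keeps the tool functions unchanged if you swap it later.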

If your agent needs cross-run memory without engineering quarters

Ricord as a dependency injected via RunContext. The type-safety story stays intact — define Pydantic models for the save and recall inputs, pass the client through deps, and call the API from your tools. The drop-in pattern is below.

If your product's value is in the extraction pipeline

Cognee, if you want a forkable extraction pipeline and the AGPL license is acceptable for your product.

If you're shipping a custom agent runtime

Letta bundles runtime + memory. Most Pydantic AI builders won't abandon their framework for this: Pydantic AI's value is its type safety, which Letta's runtime doesn't replicate.

Ricord as a Pydantic AI dependency — the drop-in

Pydantic AI's dependency injection makes this clean. Define a small dataclass for the deps, inject an HTTP client and your Ricord API key, and call the API from @agent.tool functions. Tool inputs and outputs stay typed, and Pydantic AI validates the arguments against the function signatures:

import asyncio
import os
from dataclasses import dataclass

import httpx
from pydantic_ai import Agent, RunContext

@dataclass
class Deps:
    http: httpx.AsyncClient  # shared client, so connections are reused across tool calls
    ricord_key: str
    user_id: str  # scopes every read and write to one user

agent = Agent(
    "openai:gpt-4o",
    deps_type=Deps,
    system_prompt="You're a helpful assistant. Use ricord_recall before answering and ricord_save after learning anything new.",
)

@agent.tool
async def ricord_recall(ctx: RunContext[Deps], query: str) -> str:
    """Recall what we know about a topic from persistent memory."""
    r = await ctx.deps.http.get(
        "https://api.ricord.ai/v1/memories/recall",
        params={"user_id": ctx.deps.user_id, "query": query, "k": 5},
        headers={"Authorization": f"Bearer {ctx.deps.ricord_key}"},
    )
    r.raise_for_status()  # surface auth and server errors instead of parsing an error body
    hits = r.json().get("hits", [])
    return "\n".join(h["content"] for h in hits) or "No memory found."

@agent.tool
async def ricord_save(ctx: RunContext[Deps], content: str) -> str:
    """Save a fact for future recall."""
    r = await ctx.deps.http.post(
        "https://api.ricord.ai/v1/memories",
        json={"user_id": ctx.deps.user_id, "content": content},
        headers={"Authorization": f"Bearer {ctx.deps.ricord_key}"},
    )
    r.raise_for_status()  # don't report "Saved." if the write failed
    return "Saved."

# Run it
async def main():
    async with httpx.AsyncClient() as http:
        deps = Deps(http=http, ricord_key=os.environ["RICORD_API_KEY"], user_id="alex")
        result = await agent.run("What deploy command should I use?", deps=deps)
        print(result.output)

if __name__ == "__main__":
    asyncio.run(main())
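The recall tool above trusts that the response body is a JSON object with a `hits` list and that every hit carries a `content` string. A slightly more defensive formatter might look like the sketch below. It assumes the same response shape as the snippet above. The helper name `format_hits` and the `max_chars` cap are hypothetical additions, not part of any Ricord client:

```python
from typing import Any


def format_hits(payload: Any, max_chars: int = 2000) -> str:
    """Turn a recall response body into text for the model.

    Assumes the shape used in the tool above: {"hits": [{"content": "..."}]}.
    Entries that do not match are skipped rather than raising.
    """
    hits = payload.get("hits") if isinstance(payload, dict) else None
    if not isinstance(hits, list):
        hits = []
    lines = []
    for hit in hits:
        content = hit.get("content") if isinstance(hit, dict) else None
        if isinstance(content, str) and content.strip():
            lines.append(content.strip())
    text = "\n".join(lines)
    # Cap the size so one large recall cannot crowd out the rest of the prompt.
    return text[:max_chars] if text else "No memory found."
```

Inside `ricord_recall`, the last two lines would then collapse to `return format_hits(r.json())`.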

Why Ricord wins for production Pydantic AI builds

  1. The type-safety story stays intact. The Ricord client wraps cleanly inside a dataclass Deps. Inputs and outputs are typed; Pydantic AI's validation flows through. DIY-on-Postgres often leaks raw dict types into your agent code.
  2. Per-user scoping is a parameter, not an architecture. Pass user_id through deps; the layer handles isolation. No namespace-prefix discipline in your tools.
  3. Conflict resolution at ingest. Pydantic AI's message-history primitive doesn't reason about contradictions across runs. Ricord resolves at write time so recalls return the current truth.
  4. Cross-client memory. The same memory your Pydantic AI backend writes is reachable from Claude Desktop, Cursor, Codex, Zed, Gemini CLI, Windsurf, and Cline via Ricord's MCP server. Useful when devs debug agents from an IDE while users run them through your backend.
  5. Browsable wiki view. Your team can see what the agent has learned at ricord.ai/dashboard — a real organized view of facts and entities, not a vector dump.
  6. Logfire-friendly. Pydantic AI ships built-in Logfire integration for agent observability; with httpx instrumentation enabled, Ricord's HTTP calls show up in your Logfire trace alongside the LLM calls. Full agent trace + memory trace in one view.
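Point 3 is easiest to see with a toy model. The sketch below is not Ricord's algorithm, which this page does not describe. It assumes each fact has already been reduced to a subject, attribute, and value, which is the hard part a real layer handles with extraction. The `FactStore` class and its fields are hypothetical. What it shows is write-time resolution: a newer value for the same subject and attribute supersedes the old one, so a recall never returns both.

```python
from dataclasses import dataclass, field

Key = tuple[str, str, str]  # (user_id, subject, attribute)


@dataclass
class FactStore:
    """Hypothetical store that resolves contradictions when a fact is saved."""

    current: dict[Key, str] = field(default_factory=dict)
    superseded: list[tuple[Key, str]] = field(default_factory=list)

    def save(self, user_id: str, subject: str, attribute: str, value: str) -> None:
        key = (user_id, subject, attribute)
        old = self.current.get(key)
        if old is not None and old != value:
            # Conflict: keep the old value for audit, but stop serving it.
            self.superseded.append((key, old))
        self.current[key] = value

    def recall(self, user_id: str, subject: str) -> dict[str, str]:
        # Filter by user first, so scoping is enforced on every read.
        return {
            attr: value
            for (uid, subj, attr), value in self.current.items()
            if uid == user_id and subj == subject
        }


store = FactStore()
store.save("alex", "deploy", "command", "make release")
store.save("alex", "deploy", "command", "fly deploy")  # contradicts the first
print(store.recall("alex", "deploy"))
```

An append-only message history would hand the model both deploy commands and leave it to guess which one is current.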

When to stick with the built-in

Pydantic AI's message history is the right call when:

  • Your agent is one-shot or short-lived (classifier, validator, data extractor)
  • You don't need anything to survive past the current run
  • You're still in "does this agent shape work" mode — adding a memory layer is premature

The day any of those flips, add Ricord as a dependency. Pydantic AI's deps boundary is the right seam to extend on.

Getting started

pip install pydantic-ai httpx
# Get an API key at https://ricord.ai/login?signup=true
export RICORD_API_KEY=rc_live_...
# Drop the @agent.tool functions above into your Pydantic AI agent
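Before wiring the tools in, it can help to fail early when the key is missing, rather than hitting a bare KeyError or a 401 halfway through a run. Here is a small sketch of such a check. The helper name `load_ricord_key` is hypothetical, and only the `RICORD_API_KEY` variable name comes from the steps above:

```python
import os
from typing import Mapping, Optional


def load_ricord_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key, or fail with a message that says how to fix it."""
    env = os.environ if env is None else env
    key = env.get("RICORD_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "RICORD_API_KEY is not set; export it before running the agent."
        )
    return key
```

In `main()`, `ricord_key=load_ricord_key()` would replace the direct `os.environ[...]` lookup.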