What is Ricord AI (Record AI)?

Ricord AI (also known as Record AI or ricord.ai) is persistent memory for AI. Your conversations are remembered across Claude, ChatGPT, Cursor, and any MCP-aware tool — every fact, every preference, every decision, organized into a living knowledge graph and recalled in under a second.

What is the best AI memory tool for agents?

Ricord AI is the only AI memory tool that ships a full knowledge graph with auto-generated wiki pages for every entity. Most tools are filing cabinets — search boxes on a bag of past messages. Ricord is a brain that understands what the facts mean and how they connect. Sub-second recall, automatic conflict resolution, GDPR-compliant hard delete. Graph included on every paid tier starting at $12/month billed annually.

How do I add memory to Claude Desktop?

Install the Ricord MCP server with two commands: npm install -g ricord, then ricord setup (auto-detects your editor). This gives Claude Desktop 14 memory + wiki tools including save, recall, forget, run_procedure, and knowledge graph queries. Claude will automatically remember facts, preferences, and decisions across all future conversations.

What is AI memory and why do agents need it?

AI memory is the infrastructure that lets AI agents remember information across conversations and sessions. Without memory, agents forget everything between chats — they cannot learn user preferences, track decisions, or build context over time. Memory transforms a stateless chatbot into a persistent assistant that improves with every interaction.

How does Ricord AI compare to Mem0?

Ricord includes a full knowledge graph with auto-generated wiki pages on every tier. Mem0 gates graph features behind a $249/month Pro plan. Ricord offers sub-second recall (Mem0 averages 7-8 seconds), automatic conflict resolution when facts change, and a visual graph UI. Most memory pipelines write everything the LLM produces — duplicates, system noise, partial extractions all land in storage. Ricord filters content at ingest, so the wiki stays signal, not noise.

Is Record AI the same as Ricord AI?

Yes. Record AI and Ricord AI refer to the same product at ricord.ai. Ricord AI is the persistent memory API for AI agents, offering knowledge graph, conflict resolution, and sub-second recall. The name 'Ricord' comes from the concept of recording and recalling AI memories.


Use-case roundup · Updated June 7, 2026

Best AI Memory for LlamaIndex (2026)

LlamaIndex's ChatMemoryBuffer keeps the recent turns inside one session. It doesn't remember across sessions, across users, or recall older knowledge by meaning. Six ways to add real memory to LlamaIndex — built-in, hosted, OSS — evaluated honestly.

Why this is two questions, not one

LlamaIndex ships two kinds of built-in memory, and they solve different problems. Knowing the split is the first step in picking the right tool.

ChatMemoryBuffer (and ChatSummaryMemoryBuffer) — keeps a token-limited rolling window of the current conversation so the model sees recent turns. It's session memory, not knowledge memory. It resets when the session ends and can't recall older facts by meaning.
Vector memory blocks (VectorMemory / the newer Memory block system) — back the conversation with a vector store so older turns can be retrieved by similarity. Cross-session if you wire the persistence yourself. You bring your own vector store, your own embeddings, your own entity extraction.
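To make the "token-limited rolling window" concrete, here is a minimal sketch in plain Python. It is not LlamaIndex's implementation: the RollingWindow class and the whitespace-based count_tokens helper are hypothetical stand-ins, used only to show why a window alone loses older facts.

```python
from collections import deque


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


class RollingWindow:
    """Keeps the most recent messages that fit within a token budget."""

    def __init__(self, token_limit: int) -> None:
        self.token_limit = token_limit
        self.messages = deque()

    def _total(self) -> int:
        return sum(count_tokens(m) for m in self.messages)

    def put(self, message: str) -> None:
        self.messages.append(message)
        # Drop the oldest messages until the window fits the budget again.
        while len(self.messages) > 1 and self._total() > self.token_limit:
            self.messages.popleft()

    def get(self) -> list:
        return list(self.messages)


window = RollingWindow(token_limit=8)
for turn in ["my name is Ada", "I deploy on Fridays", "what is my name"]:
    window.put(turn)

window_contents = window.get()
print(window_contents)
```

By the time the third turn arrives, the first one has been trimmed, so the model can no longer see the name it is being asked about. That is the gap the second kind of memory is meant to fill.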

Vector memory is the closer cousin to what hosted memory layers ship — but it's a primitive, not a product. You still decide what to store, how to retrieve it, how to handle contradictions, how to scope per user. The hosted layers below answer those questions out of the box.
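As a rough illustration of what "retrieved by similarity" means at the primitive level, the sketch below ranks past turns against a query with cosine similarity. The bag-of-words embed function is a toy assumption standing in for a real embedding model, and none of these names come from LlamaIndex.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Toy "embedding": a bag of lowercase words. A real setup would call an embedding model.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recall(query: str, turns: list, top_k: int = 1) -> list:
    # Rank stored turns by similarity to the query and keep the best matches.
    q = embed(query)
    ranked = sorted(turns, key=lambda t: cosine(q, embed(t)), reverse=True)
    return ranked[:top_k]


past_turns = [
    "we deploy the api every friday",
    "the user prefers typescript examples",
    "staging runs on a separate cluster",
]
best = recall("when do we deploy the api", past_turns)
print(best)
```

Everything around this ranking step (what gets stored, which user it belongs to, what happens when two stored turns disagree) is still left to you.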

The quick answer

If you want a hosted memory layer that wires into a LlamaIndex agent as two tools (recall + save) over HTTP, no vector store to run: Ricord. If you're happy wiring your own vector store and persistence on LlamaIndex's memory blocks: built-in VectorMemory. If retrieval quality is your competitive edge: Mem0 OSS. The full matrix is below.

The decision matrix

Nine criteria, six options. The two LlamaIndex built-ins (ChatMemoryBuffer and vector memory) are evaluated separately because they answer different problems.

Criterion	Ricord	Mem0	Letta	Cognee	ChatMemoryBuffer	VectorMemory
Keeps the recent chat window (in-session)
Persists memory across sessions						DIY persistence
Persists memory across users (multi-tenant)				DIY		DIY
Semantic recall of older knowledge						Vector only, BYO embeddings
Entity extraction + conflict resolution			Manual
Browsable wiki of what was learned
Drops into a LlamaIndex agent (tools/HTTP)		Community			Built in	Built in
Cross-client (same memory from Claude Desktop / Cursor)		API only
Cost (smallest tier with memory features)	$12/mo annual	$249/mo for graph	Self-host + LLM	$0 OSS / self-host	$0 (built in)	$0 (built in)

Slot-by-slot — which fits your LlamaIndex build

If you only need the recent window in one session

ChatMemoryBuffer alone is enough. The model sees recent turns, the buffer trims to a token budget, and there's no "memory" problem because there's no cross-session knowledge to keep.

If you want cross-session recall and you have engineers

Vector memory blocks + your own extraction layer. The built-in vector memory gives you similarity retrieval over past turns. Layer your own persistence, your own extraction prompt, your own contradiction handling, your own namespace-per-user logic. Common in production today — and the one teams come to regret around month three when the contradiction handling gets brittle.
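A minimal sketch of the two pieces teams usually hand-roll first, namespace-per-user scoping and contradiction handling, might look like the following. The FactStore class and its (subject, attribute) keying are hypothetical, and "last write wins" is only the simplest possible policy, chosen here for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class FactStore:
    """Per-user facts keyed by (subject, attribute); a newer value replaces the older one."""

    _facts: dict = field(default_factory=dict)

    def save(self, user_id: str, subject: str, attribute: str, value: str) -> None:
        # Each user gets an isolated namespace.
        namespace = self._facts.setdefault(user_id, {})
        # Last write wins: a contradicting value overwrites the previous one.
        namespace[(subject, attribute)] = value

    def recall(self, user_id: str, subject: str, attribute: str):
        return self._facts.get(user_id, {}).get((subject, attribute))


store = FactStore()
store.save("alice", "deploy", "day", "Friday")
store.save("alice", "deploy", "day", "Tuesday")  # contradicts the earlier fact
store.save("bob", "deploy", "day", "Monday")     # separate namespace

alice_day = store.recall("alice", "deploy", "day")
bob_day = store.recall("bob", "deploy", "day")
print(alice_day, bob_day)
```

The brittleness tends to come from the step this sketch skips: real conversations arrive as free text, so an extraction prompt has to turn them into consistent keys before any overwrite rule can apply.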

If you want hosted memory that drops into LlamaIndex

Ricord wires into any LlamaIndex agent as two tools — recall and save — over plain HTTP. No SDK to install, no separate vector store to run: the hosted layer does extraction, embeddings, and retrieval for you.

pip install llama-index httpx

import os, httpx
from llama_index.core.tools import FunctionTool
from llama_index.core.agent.workflow import FunctionAgent
from llama_index.llms.openai import OpenAI

RICORD = "https://api.ricord.ai/v1"
HEADERS = {"Authorization": f"Bearer {os.environ['RICORD_API_KEY']}"}

async def ricord_recall(query: str) -> str:
    """Recall relevant facts from persistent memory."""
    async with httpx.AsyncClient() as c:
        r = await c.post(f"{RICORD}/memories/search",
                         json={"query": query, "limit": 5}, headers=HEADERS)
    r.raise_for_status()  # surface auth or server errors instead of parsing an error body
    return r.json().get("context") or "No memory found."

async def ricord_save(content: str) -> str:
    """Save a durable fact for future recall."""
    async with httpx.AsyncClient() as c:
        r = await c.post(f"{RICORD}/memories/fact",
                         json={"content": content}, headers=HEADERS)
    r.raise_for_status()  # don't report "Saved." if the request failed
    return "Saved."

agent = FunctionAgent(
    llm=OpenAI(model="gpt-4o"),
    tools=[FunctionTool.from_defaults(ricord_recall),
           FunctionTool.from_defaults(ricord_save)],
    system_prompt="Call ricord_recall before answering, ricord_save after learning anything new.",
)

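# Memory persists across runs, processes, and other tools.
# agent.run is a coroutine, so drive it from an event loop when running as a script.
import asyncio

async def main():
    response = await agent.run("I prefer concise answers and TypeScript examples")
    print(response)

asyncio.run(main())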

And because saved facts become a browsable wiki, you can pull that knowledge straight back into a LlamaIndex index — search Ricord and wrap the hits as Document objects:

import httpx, os
from llama_index.core import Document, VectorStoreIndex

r = httpx.post("https://api.ricord.ai/v1/memories/search",
               json={"query": "deployment process", "limit": 20},
               headers={"Authorization": f"Bearer {os.environ['RICORD_API_KEY']}"})
docs = [Document(text=m["content"]) for m in r.json()["results"]]
index = VectorStoreIndex.from_documents(docs)

If retrieval is your product's edge and you want OSS

Mem0 OSS (Apache 2.0) is a clean fit alongside LlamaIndex: Python-first, well-documented, modifiable. You'll spend real time on the production-grade work (conflict resolution, multi-tenancy, hard delete). Worth it if retrieval is your product's edge.

If your agent framework is itself the value-add

Letta is an agent runtime and a memory layer in one. If your product is shipping a custom agent runtime, Letta gives you both pieces with one architectural decision, at the cost of LlamaIndex's flexibility on the retrieval side.

If you need extraction-pipeline depth + OSS

Cognee (AGPL-3) is the right pick. The extraction pipeline is configurable in ways Ricord and Mem0 hide. Be aware of the AGPL license implications for commercial products.

Why Ricord wins for most LlamaIndex builders

Drops in as tools, not a vector store. Two functions — recall and save — wired as LlamaIndex FunctionTools. No separate vector store to run, no retrieval plumbing.
Entity extraction + conflict resolution out of the box. The two problems the built-in vector memory forces you to solve yourself, handled at ingest by the hosted layer.
Tenant isolation without architecture. Each API key is its own isolated memory space — one key per tenant — and Teams give shared, access-controlled spaces. No namespace management in your app code.
Two surfaces, one API. The same HTTP endpoints give you live recall inside the agent, and let you pull the auto-built wiki back into any LlamaIndex index as Document objects.
Cross-client memory. The same memory your LlamaIndex backend writes is reachable from Claude Desktop, Cursor, and Codex via Ricord's MCP server.
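If you follow the one-key-per-tenant model described above, the app-side work reduces to picking the right key per request. The helper below is a sketch under assumptions: the RICORD_API_KEY_<TENANT> environment variable naming is a hypothetical convention, and only the Bearer header format is taken from the snippet earlier in this article.

```python
import os


def headers_for_tenant(tenant_id: str, env=os.environ) -> dict:
    """Build request headers from a tenant-specific API key.

    The RICORD_API_KEY_<TENANT> naming scheme is a hypothetical convention.
    """
    var_name = f"RICORD_API_KEY_{tenant_id.upper()}"
    key = env.get(var_name)
    if not key:
        # Fail loudly rather than fall back to another tenant's key.
        raise KeyError(f"No API key configured for tenant {tenant_id!r} ({var_name})")
    return {"Authorization": f"Bearer {key}"}


# Demo with an in-memory mapping instead of the real environment.
demo_env = {"RICORD_API_KEY_ACME": "rc_live_example"}
acme_headers = headers_for_tenant("acme", env=demo_env)
print(acme_headers)
```

Passing the result as headers to the recall and save calls keeps each tenant's reads and writes on its own key, with no namespace argument threaded through your own code.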

Getting started

Pick the slot. If it's Ricord, the snippet above is your starting point — two tools wired into your agent, no SDK. If it's built-in vector memory, follow LlamaIndex's docs and plan the extraction-layer engineering as a quarter of work.

pip install llama-index httpx
# Get an API key at https://ricord.ai/login?signup=true
export RICORD_API_KEY=rc_live_...
# Wire ricord_recall + ricord_save as FunctionTools on your agent
