Open-Source vs Hosted Memory Layers: When Each Wins
Most AI memory layers now ship in two flavors: open-source self-host or hosted SaaS. The right answer depends on five axes, not on ideology. Here's the framework, axis by axis.
Why this question exists now
Two years ago, "AI memory" meant a folder of JSON files and some heuristic about which slice to replay into the prompt. There was nothing to fork, nothing to host. The question of open-source vs hosted didn't apply.
Today most serious memory layers ship in both shapes, or did until recently. Mem0 has an Apache-licensed core and a Pro tier with a managed graph. Letta (formerly MemGPT) ships its agent runtime as MIT and runs Letta Cloud alongside it. Zep maintained an open-source Community Edition for years before deprecating it. Even hosted-first players like Supermemory and Ricord exist alongside a growing pile of indie OSS memory APIs on GitHub.
So you have a real choice now. And like every "build vs buy" decision in infra, the answer is rarely about ideology. It's about which axes you weight, and whether your team will actually do the work the open-source path demands.
The five axes the decision turns on
Most posts that try to answer this question collapse into a feature checklist. That misses the shape. The decision is a weighted sum over five axes — and the weights are different for every team.
1. Cost over time, not at launch
Self-hosted feels free. It is not. A production memory layer in 2026 needs an embedding service, a vector store, a graph store (or a hybrid that does both), a worker queue for ingest, a reranker, and a Postgres for the relational metadata. None of those are free at any meaningful traffic. The honest number, for a team running 1M memories with sub-second recall: $400–$1,200/month in cloud bills, plus the engineering time to keep it running.
Hosted plans land in the $15–$100/month range until you cross real volume. The break-even isn't where you think: once engineering time is counted, it sits much further out than it feels at the start.
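To make the break-even concrete, here is a minimal arithmetic sketch. The infra and hosted ranges are the figures quoted above; the 10 engineer-hours per month and the $100/hour rate are assumptions chosen only for illustration, so substitute your own.

```python
def self_host_monthly_cost(infra_usd: float, eng_hours: float, eng_rate_usd: float) -> float:
    """Cloud bill plus the engineering time spent keeping the stack running."""
    return infra_usd + eng_hours * eng_rate_usd


# Figures from the text: $400-$1,200/month infra, $15-$100/month hosted.
# ASSUMPTION: 10 engineer-hours/month at $100/hour (illustrative only).
low = self_host_monthly_cost(400, eng_hours=10, eng_rate_usd=100)
high = self_host_monthly_cost(1200, eng_hours=10, eng_rate_usd=100)

HOSTED_TOP_OF_RANGE = 100  # upper end of the hosted range quoted above

# How many times the hosted bill must grow before self-hosting is cheaper.
multiple_low = low / HOSTED_TOP_OF_RANGE
multiple_high = high / HOSTED_TOP_OF_RANGE

print(f"self-host: ${low:,.0f}-${high:,.0f}/month")
print(f"break-even at {multiple_low:.0f}x-{multiple_high:.0f}x the top hosted tier")
```

Under these assumptions the hosted bill would have to grow to many times the top of the quoted range before self-hosting comes out ahead, which is the sense in which the break-even is further out than it first looks.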
2. Latency budget
Memory retrieval is on the critical path of every agent turn. If your agent expects an answer in 300ms and your recall takes 900ms, the agent feels slow no matter how fast the LLM is. A hosted vendor sized for thousands of customers has a latency budget shaped like an SLO. A self-hosted single-tenant deployment has whatever latency you wrote into your retrieval code, which tends to drift up as your dataset grows.
This is the axis self-hosting teams most often regret later. You shipped the MVP fast against a vector DB with cosine similarity. By month six you need entity resolution, contradiction detection, and a reranker, each adding 200ms of overhead.
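The drift is easy to see as a budget check. The sketch below assumes the stages run sequentially and assumes a 300 ms baseline for plain vector search (not a figure from this post); the 300 ms budget and the 200 ms per added stage come from the text.

```python
BUDGET_MS = 300  # the agent's expectation from the example above

# ASSUMPTION: 300 ms baseline for plain vector search.
# The 200 ms per added stage is the figure from the text.
stages_ms = {
    "vector_search": 300,
    "entity_resolution": 200,
    "contradiction_detection": 200,
    "reranker": 200,
}


def over_budget(stages: dict[str, int], budget_ms: int) -> int:
    """Milliseconds by which a sequential pipeline exceeds the budget (0 if within)."""
    return max(0, sum(stages.values()) - budget_ms)


total = sum(stages_ms.values())
print(f"total recall latency: {total} ms")
print(f"over budget by: {over_budget(stages_ms, BUDGET_MS)} ms")
```

With those numbers the MVP sits exactly on budget and the month-six pipeline lands at the 900 ms from the example. Running stages in parallel or caching would change the sum, so treat this as a worst-case view.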
3. Privacy and data residency
This is where open-source is actually load-bearing. If you're building for healthcare, defense, finance, or any regulated European customer, hosted-anything starts with a compliance conversation that often ends in "send us the on-prem install." Self-host wins outright here. The cost of running the infra is dwarfed by the cost of failing the audit.
For everyone else: most hosted vendors hold a SOC 2 Type II report and run in US-East, with optional EU residency. That's sufficient for the long tail of SaaS and developer-tool startups.
4. Customization depth
Self-host wins when you need to change the retrieval logic itself — your own scoring function, a domain-specific reranker, a custom embedding model trained on your corpus. The OSS forks of Mem0 and Letta are widely modified for exactly this. If your product's edge depends on memory retrieval being the thing that's different about you, you probably want to own the code.
Most products don't fall here. Most use memory as a substrate, not a differentiator. For those, hosted's default config beats whatever you'd roll yourself in the first quarter.
5. Ops burden
Every line of OSS memory code you fork is a line you now own. Backups, encryption at rest, audit logging, hard-delete verification, graceful failover when the embedding model rate-limits. One ML engineer with strong infra chops can handle it. Most teams of four can't spare that engineer.
This is the silent killer. Teams pick OSS because it's cheap, then spend an engineer's salary on running it. The math only works if the customization or privacy axes are doing real work to justify the burden.
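One way to make the "weighted sum over five axes" tangible is a small scoring sketch. Every score and weight below is hypothetical; the point is only that the same per-axis scores flip the verdict once the weights change.

```python
AXES = ("cost", "latency", "privacy", "customization", "ops")


def weighted_score(weights: dict[str, float], scores: dict[str, float]) -> float:
    """Weighted sum over the five axes, normalized by the total weight."""
    total_weight = sum(weights[a] for a in AXES)
    return sum(weights[a] * scores[a] for a in AXES) / total_weight


# HYPOTHETICAL scores in [-1, 1]: positive favors self-host, negative favors hosted.
scores = {"cost": -0.5, "latency": -0.5, "privacy": 1.0, "customization": 1.0, "ops": -1.0}

# Two hypothetical teams weighting the same scores differently.
regulated_team = {"cost": 1, "latency": 1, "privacy": 5, "customization": 1, "ops": 1}
devtool_team = {"cost": 2, "latency": 2, "privacy": 0.5, "customization": 0.5, "ops": 3}

for name, weights in [("regulated", regulated_team), ("devtool", devtool_team)]:
    s = weighted_score(weights, scores)
    verdict = "self-host" if s > 0 else "hosted"
    print(f"{name}: {s:+.2f} -> {verdict}")
```

The regulated team's heavy privacy weight pulls the sum positive; the devtool team's ops and cost weights pull it negative.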
When open-source wins
Concrete profiles where self-host is the right call:
- Regulated industries. Healthcare, defense, regulated finance. Data can't leave your perimeter. Hosted SaaS is a non-starter; OSS or a vendor's on-prem install is the only path.
- Research teams iterating on retrieval. If your work product is a better retrieval algorithm, you need the code in front of you. Mem0 and Letta both fork well for this.
- Bootstrapped products with deep infra DNA. If the team already runs Postgres, Redis, a vector store, and a worker queue, adding a memory layer is incremental cost. The marginal infra-ops work is small.
- EU residency requirements without a hosted EU option. Some hosted vendors are US-East only. If your customers demand Frankfurt or Amsterdam and your vendor can't serve it, self-host wins by default.
When hosted wins
The mirror profiles:
- Developer tools and SaaS. You're building for Claude Code users, Cursor power users, custom agent builders. Memory is a feature, not the product. Hosted gets you from zero to working in 60 seconds.
- Pre-product-market-fit startups. Engineering hours are the scarce resource. Spending them on memory infra that a vendor has already solved is the wrong trade.
- Teams without prior retrieval-infra experience. The five production-grade properties (graph, conflict resolution, sub-second recall, hard delete, audit) each take weeks to build from scratch. Cumulatively, that's a quarter of engineering time you can spend on the product instead.
- Anyone who wants the wiki view. A few hosted vendors auto-generate browsable knowledge graphs and wiki pages from your stored memories. Building that on top of raw OSS is a research project.
The honest middle: hybrid
The under-discussed pattern is using both. Spin up the OSS version for offline evaluation — measure recall accuracy on your own corpus, benchmark different retrieval strategies, get comfortable with the data model. Then point production at the hosted version, which has the SLA and the ops out of the box.
This works particularly well with Mem0 and Letta, where the OSS and hosted versions share enough of the data model that you can A/B test against your own evals before committing. It's how serious teams de-risk the choice.
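For the offline-evaluation half of this pattern, a recall@k harness over your own labeled queries is usually enough to compare backends. The sketch below is backend-agnostic: `Retriever` is a hypothetical callable you would implement for each system under test, not an API from Mem0, Letta, or any vendor, and `fake_retriever` exists only so the example runs.

```python
from typing import Callable

# (query, k) -> ranked memory ids. HYPOTHETICAL interface; wrap each backend to match it.
Retriever = Callable[[str, int], list[str]]


def recall_at_k(retrieve: Retriever, labeled: dict[str, set[str]], k: int = 5) -> float:
    """Mean fraction of each query's relevant memory ids found in the top-k results."""
    per_query = []
    for query, relevant in labeled.items():
        hits = set(retrieve(query, k)) & relevant
        per_query.append(len(hits) / len(relevant))
    return sum(per_query) / len(per_query)


# Stand-in retriever so the sketch runs; replace with a call into the backend under test.
def fake_retriever(query: str, k: int) -> list[str]:
    index = {
        "where does alice work?": ["m1", "m7", "m3"],
        "bob's timezone": ["m9", "m2"],
    }
    return index.get(query, [])[:k]


# Hand-labeled ground truth: query -> ids of the memories that should come back.
labeled = {
    "where does alice work?": {"m1", "m3"},
    "bob's timezone": {"m2", "m4"},
}

print(recall_at_k(fake_retriever, labeled, k=3))
```

Running the same `labeled` set through one retriever per candidate gives a like-for-like number before you commit. Each labeled query needs at least one relevant id, or the per-query division fails.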
What you actually buy with hosted
Marketing copy never quite spells this out, so:
- A retrieval pipeline you don't maintain. Embedding service, vector index, rerank, graph traversal: all on the vendor's clock.
- Backups + point-in-time restore. Memory loss is worse than data loss because it's silent. Hosted vendors run daily backups by default.
- Hard delete that actually deletes. GDPR compliance for the long tail. The OSS path needs you to wire this end-to-end through every cache and index.
- An admin UI. Browsing what your agent has remembered, deleting bad entries, exporting on demand.
- The wiki, if the vendor offers it. Auto-generated browsable pages per entity, with backlinks. Not every vendor has this, but if you want it, you'll buy it long before you build it.
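On the self-host path, "hard delete that actually deletes" comes down to fanning the delete out to every store and then verifying nothing is left. A minimal sketch of that shape follows; the `Store` protocol and `InMemoryStore` are hypothetical stand-ins for a real vector index, graph, cache, or relational table, not any library's API.

```python
from typing import Protocol


class Store(Protocol):
    """HYPOTHETICAL interface each backing store would be wrapped to satisfy."""
    name: str

    def delete(self, memory_id: str) -> None: ...
    def contains(self, memory_id: str) -> bool: ...


class InMemoryStore:
    """Toy store standing in for a vector index, graph, cache, or relational table."""

    def __init__(self, name: str, ids: set[str]) -> None:
        self.name = name
        self._ids = set(ids)

    def delete(self, memory_id: str) -> None:
        self._ids.discard(memory_id)

    def contains(self, memory_id: str) -> bool:
        return memory_id in self._ids


def hard_delete(memory_id: str, stores: list[Store]) -> list[str]:
    """Delete from every store, then return the names of stores that still hold the id."""
    for store in stores:
        store.delete(memory_id)
    return [s.name for s in stores if s.contains(memory_id)]


stores = [InMemoryStore("vector_index", {"m1", "m2"}), InMemoryStore("cache", {"m1"})]
leftovers = hard_delete("m1", stores)
print("leftovers:", leftovers)
```

The verification pass is the part worth keeping: an empty `leftovers` list is the evidence you would log for an audit, and a non-empty one names the store that still needs wiring.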
Side-by-side: the major options
A quick orientation table. Self-host columns reflect the OSS project; hosted columns reflect the managed plan. Pricing is base-tier monthly and changes — check the vendor before committing.
| Layer | OSS license | Hosted from | Best fit |
|---|---|---|---|
| Mem0 | Apache 2.0 | $29/mo | Vector-first memory; forks well under Apache 2.0 |
| Letta | MIT | Letta Cloud | Agent framework; memory + state |
| Zep | Discontinued (CE) | $24/mo | Bi-temporal graph; hosted only now |
| Supermemory | No OSS | Hosted only | Multi-source ingest (Chrome, audio) |
| Ricord | No OSS | $15/mo annual | Wiki + graph + MCP-native install |
Notable: Zep's OSS Community Edition was discontinued in late 2025 in favor of a hosted-only offering — a useful data point for anyone weighing "the OSS will always be there." It often isn't.
How to decide in 10 minutes
A short triage. Answer in order; the first "yes" wins.
- Does your data have to stay inside your perimeter for legal or compliance reasons? → Self-host.
- Is your retrieval algorithm your competitive edge? → Self-host.
- Do you have an ML engineer with infra chops who has spare cycles to own the layer? → Self-host (probably).
- Do you want a wiki / graph view of your memory without building one? → Hosted.
- Is engineering time scarcer than the ~$15–$100/month a hosted plan costs? → Hosted.
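The triage above is a first-match rule, so it can be written down directly. In this sketch the field names are invented for illustration, and the "undecided" fallback is an assumption, since the list does not say what to do when every answer is "no".

```python
from dataclasses import dataclass


@dataclass
class Team:
    # Field names are illustrative; each maps to one question in the triage list.
    data_must_stay_in_perimeter: bool = False
    retrieval_is_competitive_edge: bool = False
    has_spare_ml_infra_engineer: bool = False
    wants_wiki_or_graph_view: bool = False
    eng_time_scarcer_than_hosted_fee: bool = False


def triage(team: Team) -> str:
    """Walk the questions in order; the first 'yes' decides."""
    questions = [
        (team.data_must_stay_in_perimeter, "self-host"),
        (team.retrieval_is_competitive_edge, "self-host"),
        (team.has_spare_ml_infra_engineer, "self-host (probably)"),
        (team.wants_wiki_or_graph_view, "hosted"),
        (team.eng_time_scarcer_than_hosted_fee, "hosted"),
    ]
    for answer, verdict in questions:
        if answer:
            return verdict
    # ASSUMPTION: the text gives no verdict when every answer is "no".
    return "undecided"


print(triage(Team(wants_wiki_or_graph_view=True)))
print(triage(Team(data_must_stay_in_perimeter=True, wants_wiki_or_graph_view=True)))
```

Because order matters, a compliance requirement overrides a preference for the wiki view, which is what the second call shows.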
Where Ricord fits
Ricord is hosted-only. We made that call deliberately, not as a lock-in move. The five production-grade properties take real engineering work to ship — knowledge graph with automatic entity extraction, conflict resolution at ingest, auto-generated wiki pages per entity, sub-second recall under load, hard delete that actually deletes everywhere. Forking an OSS baseline and reaching that bar is a quarter of engineering time. We'd rather you spend that quarter on your product.
Install is one config block per MCP-compatible client:
bun add -g ricord
ricord login
ricord install   # Claude Code, Claude Desktop, Codex, Cursor
If you're in the "most teams" bucket — building for developers, no compliance gate, no need to fork retrieval code — try the hosted path first. It's the right trade until your weights on the five axes change.
If you're in the OSS bucket, fork Mem0 or Letta and skip us. We'd rather you ship than feel pressured.