NO CONTENT TELEMETRY · v0.3.3 · APACHE-2.0

local·mem

your agent's memory, written to a file you own.

For devs. One local file. Claude Code, Cursor, Cline, Windsurf, Codex CLI, and Claude Desktop read the same memory.

$ curl -fsSL https://localmem.org/install | sh

or npm install -g localmem-mcp · view on npm

@VJ-yadav · public for 1 day, since 5 Jun 2026

Measured, not claimed.

Results are from LongMemEval (a 500-question long-term memory benchmark), with gpt-4o as both answerer and judge, on localmem v0.3.3. The token figure compares the context served per question against feeding the full session history; counts are exact (tiktoken).

75% answered correctly on LongMemEval, a 500-question long-term memory benchmark
98% fewer tokens than feeding your full history: 1,882 served vs ~128K per question
6 AI clients share one memory: Claude Code, Cursor, Cline, Windsurf, Codex, Claude Desktop
0 cloud calls, your memory never leaves your machine
1 file you own: events.jsonl, replayable from day one
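
The 98% figure follows from the two token counts above. A quick check in Python, reading "~128K" as 128,000 tokens (an assumption; only the rounded value is given here):

```python
# Token counts quoted above; 128_000 is an assumed reading of "~128K".
served_tokens = 1_882
full_history_tokens = 128_000

# Fraction of tokens saved by serving the slice instead of the full history.
reduction = 1 - served_tokens / full_history_tokens
print(f"{reduction:.1%} fewer tokens")  # prints "98.5% fewer tokens"
```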

The right context, in the fewest tokens. Without missing anything.

Set up

One command.
Every MCP client.

One static Rust binary, no external database. The installer drops it in ~/.local/bin and runs localmem setup: it fetches the models, wires every AI client it detects, and starts the local core on :7788.

1 · Install + set up
$ curl -fsSL https://localmem.org/install | sh
macOS + Linux · ~66 MB download · then runs localmem setup: fetches the embedder + reranker (~210 MB), wires the clients it detects, starts the core on :7788
2 · Wire another client (optional)
One-command install per client
Claude Desktop · Claude Code · Cursor · Cline · Windsurf · Codex CLI
Each command writes the right config block for that client. Restart the client; the 6 memory_* tools appear.
Or paste the universal MCP JSON
{ "mcpServers": { "localmem": { "command": "npx", "args": ["-y", "localmem-mcp"] } } }
Drop into any client that reads mcpServers. Defaults to talking to localhost:7788.
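
If a client already has other MCP servers configured, the block should be merged in rather than pasted over the file. A minimal Python sketch of that merge, assuming a JSON config with a top-level mcpServers key; the file path and the add_localmem helper are hypothetical, and real config locations differ per client:

```python
import json
import tempfile
from pathlib import Path

# The server entry from the universal MCP JSON above.
LOCALMEM_SERVER = {"command": "npx", "args": ["-y", "localmem-mcp"]}


def add_localmem(config_path: Path) -> dict:
    """Add the localmem entry under mcpServers, preserving other servers."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config.setdefault("mcpServers", {})["localmem"] = LOCALMEM_SERVER
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "mcp.json"  # stand-in path, not a real client location
    path.write_text(json.dumps({"mcpServers": {"other": {"command": "other-server"}}}))
    merged = add_localmem(path)
    print(sorted(merged["mcpServers"]))  # prints ['localmem', 'other']
```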
/ Why localmem

Other memory layers hand your agent a haystack.
localmem hands it the needle.

Most "memory" just feeds your history back into the prompt and makes the model re-derive what it already worked out, on every call. Slow, expensive, and it buries the one fact that matters. localmem does the reasoning when you write, so every read is a precise slice.

without a memory layer

Re-feed the whole history

~128K tokens of transcript per question on LongMemEval. The model re-reads everything and reasons from scratch every time: costly, and easy to miss the needle.

with localmem

Hand over the exact slice

1,882 tokens of the right context: 98% fewer, and 75% answered correctly on the same benchmark. The understanding already happened at write-time, so reads stay cheap.

/ The trust moat

Your memory can't rot, and can't lock you in.

The event log is the only source of truth. Every other store is a cache you can delete and rebuild.

# Write a memory
$ localmem write --kind preference --content "I prefer Rust for systems work"

# Search for it
$ localmem search "what language do I prefer"
[1] I prefer Rust for systems work  score=0.412  id=01K...

# Now blow away every derived store
$ rm -rf ~/.localmem/derived

# Rebuild everything from the event log alone
$ localmem replay

# Search again, same result, fully recomputed from events.jsonl
$ localmem search "what language do I prefer"

~/.localmem/events.jsonl is the single source of truth. DuckDB, LanceDB, and Tantivy are all recomputable caches.
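
To make the idea concrete, here is a minimal Python sketch of an append-only JSONL log with a derived index that can be thrown away and rebuilt. It is not localmem's implementation: the event fields (id, kind, content) and the function names are hypothetical, and the derived store is a plain dict rather than DuckDB, LanceDB, or Tantivy.

```python
import json
import tempfile
from pathlib import Path


def append_event(log_path: Path, event: dict) -> None:
    """Append one event as a JSON line; existing lines are never rewritten."""
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")


def replay(log_path: Path) -> dict:
    """Rebuild a derived index (kind -> list of contents) from the log alone."""
    index = {}
    with log_path.open(encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue  # tolerate blank lines
            event = json.loads(line)
            index.setdefault(event["kind"], []).append(event["content"])
    return index


with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "events.jsonl"
    append_event(log, {"id": 1, "kind": "preference", "content": "I prefer Rust for systems work"})
    append_event(log, {"id": 2, "kind": "decision", "content": "Use JSONL for the log"})

    derived = replay(log)
    derived_before = dict(derived)
    derived.clear()          # throw away the derived store
    derived = replay(log)    # rebuild it from the log alone
    rebuilt_matches = derived == derived_before
    print(rebuilt_matches)   # prints True
```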

/ See it

A viewer for everything localmem holds.

Coverage, your profile, project-scoped search, and a typed knowledge graph of every entity it has understood. Served by the core itself, offline.

[Screenshot: localmem viewer on :7788, knowledge graph view]
/ How it works

Capture. Understand. Recall.

The reasoning happens when you write, so every read is a precise slice, not a re-derivation.

1 · capture

One append-only log

Every prompt, decision, and tool use is appended to ~/.localmem/events.jsonl, never mutated. Ephemeral tool traces auto-expire, so signal never drowns in noise.

2 · understand

Decomposed at write-time

An async worker turns each capture into a summary, typed facts, and entities, and weaves them into a knowledge graph you can explore. Off the hot path, so it never blocks a write.
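
A rough Python sketch of that shape: a write only enqueues and returns, and a background task does the decomposition. The decompose function is a stand-in that just splits on sentences, and every name here is hypothetical rather than localmem's internals.

```python
import asyncio


def decompose(text: str) -> dict:
    """Placeholder for the understanding step: a summary plus naive 'facts'."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    return {"summary": sentences[0] if sentences else "", "facts": sentences}


async def worker(queue: asyncio.Queue, understood: list) -> None:
    """Drain captures in the background, off the write path."""
    while True:
        text = await queue.get()
        understood.append(decompose(text))
        queue.task_done()


async def main() -> list:
    queue = asyncio.Queue()
    understood = []
    task = asyncio.create_task(worker(queue, understood))

    # A write only enqueues; it does not wait for decomposition.
    queue.put_nowait("I prefer Rust for systems work. I use Python for scripts.")

    await queue.join()  # waited on here only so the demo can inspect the result
    task.cancel()
    return understood


results = asyncio.run(main())
print(results[0]["facts"])
```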

3 · recall

The precise slice

Hybrid retrieval (BM25 + vector + the fact graph), reranked by a local cross-encoder, hands back only what matters: 75% on LongMemEval in 1,882 tokens per recall.
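
How the three rankings are combined is not stated here. One common option is reciprocal rank fusion (RRF), sketched below in Python as an assumption rather than localmem's actual method. The document ids and the constant k = 60 are illustrative, and the cross-encoder rerank step is left out.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked id lists: score(d) = sum over lists of 1 / (k + rank of d)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_hits = ["a", "b", "c"]    # hypothetical keyword ranking
vector_hits = ["b", "a", "d"]  # hypothetical embedding ranking
graph_hits = ["b", "d"]        # hypothetical fact-graph ranking

fused = reciprocal_rank_fusion([bm25_hits, vector_hits, graph_hits])
print(fused)  # prints ['b', 'a', 'd', 'c']
```

An item that ranks well in several lists ("b") beats one that tops a single list, which is the point of fusing before the reranker sees the candidates.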

/ How it compares

Built for trust, not lock-in.

What you still have when the cloud, the vendor, or the project goes away.

Trait | localmem | Local-first peers | Cloud SaaS
Where your data lives | Your machine | Your machine | Their cloud
Plaintext leaving your machine | Never | Varies (some default-on telemetry) | Always
forget is auditable | Event in the log | App-level delete | Trust the vendor
Recoverable from a plain-text file | Yes (localmem replay) | No | No
Runtime | Single static Rust binary | Node + framework deps | Cloud SaaS
MCP tool count | 6 (narrow + auditable) | 25–50+ (wide) | Varies
License | Apache-2.0 (non-relicensable) | Mostly Apache-2.0 | Mixed
If the project dies | Your memory works | Your memory works | Your memory is gone
/ Works everywhere MCP works

One memory, every tool.

Write in one client, recall in the next. Five wired by one command; any other MCP client via one config block.

Claude Desktop --client claude
Claude Code --client claude-code
Cursor --client cursor
Cline --client cline
Windsurf --client windsurf
Continue generic
Zed generic
Codex generic
OpenCode generic
Aider generic
/ Use cases

Concrete patterns that solve a real pain on day one.

Not just "AI memory." Specific workflows.

"Stop re-explaining things to my agents"

For two or more AI agents that lose context between sessions: install once, wire both via MCP, and anything one learns is available to the other.

SHARED_MEMORY_FOR_AGENTS →

Per-project memory, no cross-leak

Personal facts in ~/.localmem, per-repo facts in <project>/.localmem. Drop a .mcp.json in any repo and that project's agents see only that project's memory.
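
One way to picture the scoping rule, as a Python sketch: use the nearest .localmem directory in the project tree, and fall back to the home directory otherwise. How localmem actually resolves the store is not stated here, so the walk-up lookup, the resolve_memory_dir name, and the boundary parameter (which only keeps the demo inside a temp directory) are all assumptions.

```python
import tempfile
from pathlib import Path


def resolve_memory_dir(start: Path, home: Path, boundary: Path) -> Path:
    """Return the nearest .localmem at or above `start`, else home/.localmem."""
    for directory in [start, *start.parents]:
        candidate = directory / ".localmem"
        if candidate.is_dir():
            return candidate
        if directory == boundary:
            break  # do not search above the boundary
    return home / ".localmem"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    home = root / "home"
    repo = root / "repo"
    (repo / ".localmem").mkdir(parents=True)  # per-repo store
    (repo / "src").mkdir()
    (root / "scratch").mkdir()                # a directory with no project store
    home.mkdir()

    in_repo_ok = resolve_memory_dir(repo / "src", home, root) == repo / ".localmem"
    outside_ok = resolve_memory_dir(root / "scratch", home, root) == home / ".localmem"
    print(in_repo_ok, outside_ok)  # prints True True
```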

Per-project setup →

Audit what your agent knows

localmem profile <subject> synthesizes a markdown profile. localmem audit <fact_id> walks the event log to show every event that touched a fact. No black boxes.

CLI surface →

Time-travel your memory

Bitemporal facts mean you can ask "what did I believe last Tuesday?" by passing --at-time with an RFC 3339 timestamp. Each fact carries valid_from, valid_to, and retired_at.
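
A small Python sketch of what an as-of query over such facts could look like. The field names come from the text above. The semantics (a fact counts if it is valid at the given time and not yet retired at that time) and the facts_as_of helper are assumptions, not localmem's documented behavior.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Fact:
    text: str
    valid_from: datetime
    valid_to: Optional[datetime] = None    # None = still valid
    retired_at: Optional[datetime] = None  # None = never retired


def facts_as_of(facts: List[Fact], at: datetime) -> List[str]:
    """Facts valid at `at` and not yet retired at `at` (assumed semantics)."""
    return [
        f.text
        for f in facts
        if f.valid_from <= at
        and (f.valid_to is None or at < f.valid_to)
        and (f.retired_at is None or at < f.retired_at)
    ]


def ts(value: str) -> datetime:
    """Parse an RFC 3339 style timestamp with an explicit UTC offset."""
    return datetime.fromisoformat(value)


facts = [
    Fact("prefers Go", ts("2026-01-01T00:00:00+00:00"), valid_to=ts("2026-06-01T00:00:00+00:00")),
    Fact("prefers Rust", ts("2026-06-01T00:00:00+00:00")),
]

before = facts_as_of(facts, ts("2026-05-15T12:00:00+00:00"))
after = facts_as_of(facts, ts("2026-06-10T12:00:00+00:00"))
print(before, after)  # prints ['prefers Go'] ['prefers Rust']
```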

Concepts →
/ License

Apache-2.0.

The Rust core, the MCP server, the event-log schema, and every importer are Apache-2.0. Your data sits in ~/.localmem/. Storing, searching, and recalling never make a network call.

No telemetry, no cloud sync, no account. Your memory is stored and searched entirely on your machine. The optional understanding step runs on a local model, or your own LLM key if you choose.

v0.3.3 status. Prebuilt binaries ship for macOS Apple Silicon and Linux x86_64. Intel Mac and Linux aarch64 build cleanly from source today; cross-compiled binaries for the rest land in a later release.