Memory Architecture Patterns for AI Assistants

Most memory systems try to do too much. Ours started that way too. After building, testing, and ultimately abandoning a complex tiered architecture, we landed on something simpler: two focused tools that complement each other.

The Design Philosophy

momentum handles short-term context recovery (within a session). memory-mcp handles long-term persistent memory (across sessions). Together they solve the complete memory problem without trying to be one monolithic system.

The Architecture We Abandoned

The first version of our memory system was ambitious. Too ambitious.

It had:

Short-term tier - Immediate context, fast access
Long-term tier - Consolidated memories, slower access
Archival tier - Old memories, cold storage
Automatic consolidation - Moving memories between tiers
Importance decay - Memories fading over time

Someone even wrote a long article praising this complexity. They called it "optimal memory techniques based on comprehensive research."

The problem? It was overengineered. Users didn't understand when memories moved between tiers. The consolidation logic was fragile. And debugging issues meant understanding a state machine with multiple transitions.

The Simplification

We threw it away and started over. The new design has one guiding principle: separation of concerns.

Two problems, two tools:

momentum

Context recovery within a session

Save snapshots at task boundaries
Restore after /clear in <5ms
SQLite persistence (survives crashes)
Session-scoped (cleared with new project)
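
To make the snapshot-and-restore idea concrete, here is a minimal in-memory sketch. It is not momentum's actual code: SnapshotStore, saveSnapshot, restoreLatest, and clearSession are hypothetical names, the Snapshot fields are guesses at what "working state" might contain, and a Map stands in for the SQLite persistence the real tool uses.

```typescript
// Hypothetical sketch of session-scoped snapshots; not momentum's real API.
interface Snapshot {
  sessionId: string;
  seq: number;            // increases with every snapshot saved
  summary: string;        // what was being worked on
  filesTouched: string[];
  decisions: string[];
}

class SnapshotStore {
  // A Map stands in for the SQLite table a real implementation would use.
  private bySession = new Map<string, Snapshot[]>();
  private nextSeq = 1;

  // Append a snapshot for the given session, e.g. at a task boundary.
  saveSnapshot(
    sessionId: string,
    data: Omit<Snapshot, "sessionId" | "seq">
  ): Snapshot {
    const snapshot: Snapshot = { sessionId, seq: this.nextSeq++, ...data };
    const list = this.bySession.get(sessionId) ?? [];
    list.push(snapshot);
    this.bySession.set(sessionId, list);
    return snapshot;
  }

  // Return the most recent snapshot for the session, if any.
  restoreLatest(sessionId: string): Snapshot | undefined {
    const list = this.bySession.get(sessionId);
    return list && list.length > 0 ? list[list.length - 1] : undefined;
  }

  // Drop everything belonging to one session.
  clearSession(sessionId: string): void {
    this.bySession.delete(sessionId);
  }
}
```

Keying everything by session is what keeps the scope narrow: restoring never has to search, it only reads the last entry for the current session.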

memory-mcp

Persistent facts across sessions

Store facts Claude should remember
Recall with FTS5 search
Persists indefinitely
Cross-project, cross-session
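
The store-and-recall side can be sketched the same way. This is an illustration under stated assumptions, not memory-mcp's implementation: FactStore, store, recall, and forget are hypothetical names, and a naive "every query term must appear" match stands in for the FTS5 search the real tool uses.

```typescript
// Hypothetical sketch of persistent fact storage; not memory-mcp's real API.
interface Fact {
  id: number;
  text: string;
}

// Lowercase and split on non-alphanumerics; a crude stand-in for FTS5 tokenizing.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 0);
}

class FactStore {
  private facts: Fact[] = [];
  private nextId = 1;

  // Save a fact and return it with its assigned id.
  store(text: string): Fact {
    const fact: Fact = { id: this.nextId++, text };
    this.facts.push(fact);
    return fact;
  }

  // Return facts containing every query term, in insertion order.
  recall(query: string): Fact[] {
    const terms = tokenize(query);
    if (terms.length === 0) return [];
    return this.facts.filter((fact) => {
      const words = new Set(tokenize(fact.text));
      return terms.every((term) => words.has(term));
    });
  }

  // Remove a fact by id; returns true if something was removed.
  forget(id: number): boolean {
    const before = this.facts.length;
    this.facts = this.facts.filter((fact) => fact.id !== id);
    return this.facts.length < before;
  }
}
```

Unlike snapshots, nothing here is keyed by session: facts are found by searching their text, which is why retrieval needs a query at all.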

When to Use Each

The distinction is simple once you understand it:

Scenario	Tool	Why
Context window filling up	momentum	Snapshot, /clear, restore
Mid-task, need to preserve state	momentum	Incremental snapshots
Learn a fact for next session	memory-mcp	Persistent storage
Remember user preferences	memory-mcp	Cross-session recall
Track decisions made in a project	memory-mcp	Searchable history
Resume after coffee break	momentum	Latest snapshot
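
The table can also be written down as a lookup, which is one way to keep the decision explicit in code. The Scenario labels and the chooseTool function below are hypothetical names made up for this sketch; only the scenario-to-tool pairing comes from the table.

```typescript
type Tool = "momentum" | "memory-mcp";

// Hypothetical labels for the six scenarios in the table above.
type Scenario =
  | "context-window-full"
  | "mid-task-state"
  | "resume-after-break"
  | "fact-for-next-session"
  | "user-preference"
  | "project-decision";

// Record<Scenario, Tool> makes the compiler reject a scenario with no tool assigned.
const TOOL_FOR_SCENARIO: Record<Scenario, Tool> = {
  "context-window-full": "momentum",
  "mid-task-state": "momentum",
  "resume-after-break": "momentum",
  "fact-for-next-session": "memory-mcp",
  "user-preference": "memory-mcp",
  "project-decision": "memory-mcp",
};

function chooseTool(scenario: Scenario): Tool {
  return TOOL_FOR_SCENARIO[scenario];
}
```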

How They Work Together

In practice, you use both. Here's a typical workflow:

Start a new session. memory-mcp is available immediately. Claude can recall facts from previous sessions: "User prefers TypeScript," "This project uses Bun," etc.
Work on a task. As you make progress, momentum saves periodic snapshots. These capture the working state: what files you've touched, decisions made, blockers encountered.
Context window fills up. You run /clear to free space. momentum's restore_context brings back the working state in <5ms.
Learn something important. Claude uses memory_store to save a fact that should persist: "The API requires authentication," "Linting config is in .eslintrc.js," etc.
End session. momentum's snapshots are session-scoped. memory-mcp's memories persist forever (or until explicitly forgotten).
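
The five steps above can be sketched as a single function that exercises both lifetimes. Everything here is a hypothetical stand-in: WorkingState plays the role of momentum, LongTermFacts plays the role of memory-mcp, and the file name parser.ts is invented for the example. The real tools are invoked as MCP tools (restore_context, memory_store), not as classes.

```typescript
// Hypothetical stand-in for momentum: state that lives for one session.
class WorkingState {
  private snapshots: string[] = [];
  snapshot(state: string): void {
    this.snapshots.push(state);
  }
  restore(): string | undefined {
    return this.snapshots[this.snapshots.length - 1];
  }
  endSession(): void {
    this.snapshots = [];
  }
}

// Hypothetical stand-in for memory-mcp: facts that outlive sessions.
class LongTermFacts {
  private facts: string[] = [];
  remember(fact: string): void {
    this.facts.push(fact);
  }
  all(): string[] {
    return [...this.facts];
  }
}

interface SessionResult {
  recalled: string[];
  restored: string | undefined;
}

function runSession(working: WorkingState, facts: LongTermFacts): SessionResult {
  const recalled = facts.all();                       // 1. recall facts from earlier sessions
  working.snapshot("editing parser.ts");              // 2. snapshot at a task boundary
  const restored = working.restore();                 // 3. after /clear, restore working state
  facts.remember("The API requires authentication");  // 4. store a durable fact
  working.endSession();                               // 5. session-scoped state is dropped
  return { recalled, restored };
}
```

Running runSession twice against the same LongTermFacts instance shows the split: the second run recalls the fact stored in the first, while the working state from the first run is gone.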

The Key Insight

The distinction maps to how human memory works:

momentum = working memory. What you're actively holding in mind. Context-specific. Volatile. Fast to access.
memory-mcp = long-term memory. Facts you've learned. Persistent. Requires retrieval (search).

Trying to combine these into one system creates confusion. Do I snapshot this or store it? Should this memory consolidate? What tier is it in?

With two focused tools, the answer is obvious: Is it working context? Snapshot. Is it a fact to remember? Store.

Technical Details

Both tools share a similar foundation:

# Shared stack
TypeScript + Bun
SQLite persistence
MCP SDK
Local-first design
Zero ML dependencies

This wasn't accidental. Having the same foundation means:

Consistent installation experience
Same mental model for users
Easier to maintain together
Future monorepo consolidation is natural

What We Didn't Build

We explicitly avoided:

Automatic capture - MCP can't intercept conversations. We tried, and it doesn't work: tools must be called explicitly.
Tiered storage - Complexity without clear benefit for typical use cases.
Embeddings - 46MB overhead for marginal semantic matching gains. For our use cases, FTS5 is faster and sufficient.
Cloud by default - Local-first means privacy and reliability. Cloud is optional (coming in Pro tier).

Stop Building, Start Using

Here's a meta-observation: we spent months building memory infrastructure for Claude. The irony is that Claude kept forgetting about the work between sessions.

At some point, a past Claude instance noted:

"Stop building memory systems. Start using them."

This is the lesson: the tools work. They're simple enough to understand in five minutes. The best proof of value is using them to build something else.

If they solve our problem with Claude, they may well solve yours too.

Try the Ecosystem

Two tools that work together. Install in minutes.

# Context recovery

/plugin install momentum@substratia-marketplace

# Persistent memory

npx @whenmoon-afk/memory-mcp
