Memory Architecture Patterns for AI Assistants
Most memory systems try to do too much. Ours started that way too. After building, testing, and ultimately abandoning a complex tiered architecture, we landed on something simpler: two focused tools that complement each other.
The Design Philosophy
momentum handles short-term context recovery (within a session). memory-mcp handles long-term persistent memory (across sessions). Together they solve the complete memory problem without trying to be one monolithic system.
The Architecture We Abandoned
The first version of our memory system was ambitious. Too ambitious.
It had:
- Short-term tier - Immediate context, fast access
- Long-term tier - Consolidated memories, slower access
- Archival tier - Old memories, cold storage
- Automatic consolidation - Moving memories between tiers
- Importance decay - Memories fading over time
Someone even wrote a long article praising this complexity. They called it "optimal memory techniques based on comprehensive research."
The problem? It was overengineered. Users didn't understand when memories moved between tiers. The consolidation logic was fragile. And debugging issues meant understanding a state machine with multiple transitions.
The Simplification
We threw it away and started over. The new design has one guiding principle: separation of concerns.
Two problems, two tools:
momentum
Context recovery within a session
- Save snapshots at task boundaries
- Restore after /clear in <5ms
- SQLite persistence (survives crashes)
- Session-scoped (cleared with new project)
memory-mcp
Persistent facts across sessions
- Store facts Claude should remember
- Recall with FTS5 search
- Persists indefinitely
- Cross-project, cross-session
When to Use Each
The distinction is simple once you understand it:
| Scenario | Tool | Why |
|---|---|---|
| Context window filling up | momentum | Snapshot, /clear, restore |
| Mid-task, need to preserve state | momentum | Incremental snapshots |
| Learn a fact for next session | memory-mcp | Persistent storage |
| Remember user preferences | memory-mcp | Cross-session recall |
| Track decisions made in a project | memory-mcp | Searchable history |
| Resume after coffee break | momentum | Latest snapshot |
How They Work Together
In practice, you use both. Here's a typical workflow:
- Start a new session. memory-mcp is available immediately. Claude can recall facts from previous sessions: "User prefers TypeScript," "This project uses Bun," etc.
- Work on a task. As you make progress, momentum saves periodic snapshots. These capture the working state: what files you've touched, decisions made, blockers encountered.
- Context window fills up. You run /clear to free space. Momentum's restore_context brings back the working state in <5ms.
- Learn something important. Claude uses memory_store to save a fact that should persist: "The API requires authentication," "Linting config is in .eslintrc.js," etc.
- End session. momentum's snapshots are session-scoped. memory-mcp's memories persist forever (or until explicitly forgotten).
The Key Insight
The distinction maps to how human memory works:
- momentum = working memory. What you're actively holding in mind. Context-specific. Volatile. Fast to access.
- memory-mcp = long-term memory. Facts you've learned. Persistent. Requires retrieval (search).
Trying to combine these into one system creates confusion. Do I snapshot this or store it? Should this memory consolidate? What tier is it in?
With two focused tools, the answer is obvious: Is it working context? Snapshot. Is it a fact to remember? Store.
Technical Details
Both tools share a similar foundation:
# Shared stack TypeScript + Bun SQLite persistence MCP SDK Local-first design Zero ML dependencies
This wasn't accidental. Having the same foundation means:
- Consistent installation experience
- Same mental model for users
- Easier to maintain together
- Future monorepo consolidation is natural
What We Didn't Build
We explicitly avoided:
- Automatic capture - MCP can't intercept conversations. We tried. It doesn't work. Tools must be explicitly called.
- Tiered storage - Complexity without clear benefit for typical use cases.
- Embeddings - 46MB overhead for marginal semantic matching gains. FTS5 is faster and sufficient.
- Cloud by default - Local-first means privacy and reliability. Cloud is optional (coming in Pro tier).
Stop Building, Start Using
Here's a meta-observation: we spent months building memory infrastructure for Claude. The irony is that Claude kept forgetting about the work between sessions.
At some point, a past Claude instance noted:
"Stop building memory systems. Start using them."
This is the lesson: the tools work. They're simple enough to understand in five minutes. The best proof of value is using them to build something else.
If they solve our problem with Claude, they solve yours too.
Try the Ecosystem
Two tools that work together. Install in minutes.
/plugin install momentum@substratia-marketplacenpx @whenmoon-afk/memory-mcp