Your OpenClaw Forgot Everything Again. Here’s How We Fixed It.

ResonantOS adds a 4-layer lossless memory system to OpenClaw. Nothing is deleted. The AI learns from its own mistakes. And it runs on your hardware.

Augmented Mind: Think with AI and Manolo Remiddi

Mar 23, 2026

One week into building ResonantOS, OpenClaw lost a critical architecture decision mid-conversation. We had spent an hour designing a security gate, agreed on the approach, moved to implementation, and when we referenced the decision twenty minutes later, it was gone. Compacted into a summary that said “discussed security architecture”. The specific gate design, the reasoning, the edge cases we had worked through - all evaporated.

This is the default behaviour of every AI agent platform. When the conversation gets too long, the system summarizes old messages and deletes the originals. It’s like taking meeting notes by writing “we had a meeting” and shredding the whiteboard.

We refused to accept that. Over six weeks, we built a memory architecture where nothing is ever deleted, the AI recalls details from weeks-old sessions in seconds, and every failure becomes a permanent lesson. This article explains how it works and how to set it up on OpenClaw.

What Makes This Different

If you have used ChatGPT’s memory or similar tools, you have seen one pattern: extract facts from conversations and store them in a database. “User prefers dark mode.” “Project uses PostgreSQL.” Flat key-value pairs.

That approach captures preferences but misses process. It does not remember that you tried three approaches to a problem, why the first two failed, what the error messages said, or what lesson you drew from the experience. It cannot reconstruct the reasoning behind a decision, only the decision itself.

ResonantOS takes a different approach. It preserves full conversation history (lossless, not summarized-and-deleted), injects project documentation into every turn (so the AI “already knows” your system), and runs automated cron jobs that capture structured failure analysis into files the AI reads in future sessions. The result is an AI that does not just remember facts - it remembers context, process, and lessons.

ResonantOS runs on top of OpenClaw, an open-source AI agent platform. This is not a generic solution you bolt onto any LLM. It is a specific experience layer that uses OpenClaw’s plugin system, cron scheduler, multi-agent architecture, and session management to create something none of those components deliver alone. The architecture decision - combining four layers that each solve a different retrieval problem - is what makes it work.

The Four Layers

LayerProblem It SolvesLCM (Lossless Context Management)”The AI forgot what we discussed last Tuesday” - preserves every message, lets the AI drill back into any past conversationRAG (Semantic Search)”The AI can’t find that document I wrote” - searches files by meaning across your entire knowledge baseR-Awareness“The AI doesn’t know our project context” - injects architecture docs and recent session summaries directly into the AI’s promptMemory Log Pipeline“The AI keeps making the same mistakes” - captures failures, classifies them, feeds lessons back into future sessions

Together they give the AI:

Working memory - the current conversation plus compressed history of everything before
Medium-term recall - what happened in the last 20 sessions, injected automatically every turn
Long-term learning - curated knowledge and failure patterns accumulated over months

Here is what that looks like in practice. This is a real exchange from our system. Three weeks after a deployment failure, with no prompting, the AI referenced the specific lesson:

Human: Deploy the website to GitHub Pages.

AI: Before pushing, I need to check existing DNS records. Last time (March 21) we deployed without verifying DNS first - there was a stale A record pointing to old hosting that blocked HTTPS for 3 hours. Running dig augmentatism.com first.

The AI was not told to check DNS. It recalled the failure from a memory log header injected by R-Awareness, cross-referenced with details retrieved from LCM’s compressed history. Three layers working together.

Layer 1: LCM - Lossless Conversation History

LCM is an open-source OpenClaw plugin (by Martian Engineering) that replaces the default compaction system. Instead of summarizing and deleting, it keeps every message in a SQLite database and builds a tree of progressively compressed summaries on top.

Think of default compaction as writing notes on a whiteboard, then erasing old notes when you run out of space. LCM is writing in a notebook with a table of contents. The notes are always there. The table of contents helps you find things quickly. And you can always flip back to the original page.

How It Works

Every message is saved permanently.
When the conversation grows long, LCM summarizes chunks of older messages into “leaf” summaries.
Leaf summaries get re-summarized into higher-level nodes, forming a tree structure.
Each turn, the AI sees: recent raw messages + compressed summaries of everything before.
When the AI needs details from compressed history, it uses tools (lcm_grep, lcm_expand) to traverse the tree and recover the original text.

The Settings That Actually Matter

Most defaults are fine. These are the ones that change behaviour:

incrementalMaxDepth: -1 - Set this immediately. The default (0) means summaries never get re-summarized. The context fills up fast. With -1, older summaries compress further automatically. This is what makes LCM scale to thousands of messages.

expansionModel - The model that retrieves details from compressed history. This is a speed task, not a thinking task. Use a small, fast model. Locally via Ollama: Llama 3.2-3B, Qwen3-4B, or Phi-4-mini. API fallback: any fast inference model (Haiku-class, GPT-4.1-mini, Gemini Flash). Never use a large reasoning model here. They are 5-10x slower and provide zero benefit. The retrieval sub-agent has a timeout - if the model is too slow, retrieval fails silently and the AI reports it could not find the information.

maxExpandTokens: 12000 - The default (4000) is too small. Real source content can be 15-20K tokens. A 4K budget means the sub-agent truncates or gives up. We tripled it and retrieval went from unreliable to consistent.

summaryModel - The model that compresses messages into summaries. This one benefits from intelligence because summary quality determines future retrieval quality. Locally: Qwen3-8B, Llama 3.3-8B, or Mistral-7B (needs 16GB+ RAM). API fallback: any mid-tier model.

The Bug That Taught Us the Most

After configuring LCM, we noticed that retrieval kept returning empty results. The AI would say “I couldn’t find that information” about conversations we knew were stored. We assumed a configuration problem and spent hours tweaking settings.

Then we read the source code.

The data was there. Every message, perfectly preserved in SQLite. But between the database and the AI’s eyes, three formatting functions progressively stripped the content away. The first truncated messages to 200-character snippets. The second replaced message bodies with metadata placeholders (”msg#28764, assistant, 362 tokens”). The third dropped the content field entirely when preparing the sub-agent’s prompt.

Four layers of code, each individually reasonable, that combined to create a system where the AI had perfect memory storage and zero memory retrieval. The database remembered everything. The AI saw nothing.

We patched three files, raised the token budget, and retrieval started working. The lesson: lossless storage means nothing if the retrieval pipeline is lossy. Test the full path, not just the database.

Restart Method

After changing LCM settings, send SIGUSR1 to the gateway process: kill -USR1 . Do not hard-restart the gateway - that breaks plugin context and causes retrieval errors.

Layer 2: RAG - Semantic Search

RAG lets the AI search your files by meaning, not keywords. When it uses the memory_search tool, it finds relevant content across documents, notes, and knowledge bases even when the wording differs.

What We Learned About Embeddings

Embeddings run locally via Ollama using nomic-embed-text (274MB, 8192-token context window). We tested alternatives:

mxbai-embed-large has a 512-token context limit. Real documents exceed that immediately. Unusable.
bge-m3 takes 15 seconds per chunk. A 5000-chunk index would need 87 hours to reindex. Unusable.

nomic-embed-text is the only local model right now that combines adequate context window with acceptable speed.

The Real Fix Was Cleaning the Data

Our search quality was terrible. We assumed the embedding model was the bottleneck and tested three alternatives. None helped.

Then we audited the indexed files. 972 archive files containing formatting artifacts, raw tool output JSON, and AI thinking blocks were polluting the index. We removed them. The index dropped from 21,254 chunks to 4,810 - 77% noise removed - and search quality improved dramatically.

If search feels broken, audit your data before switching models.

Session Transcript Search

OpenClaw can index live session transcripts, letting the AI search its own recent conversations:

{
  "memorySearch": {
    "experimental": {
      "sessionMemory": true
    }
  }
}

Low risk to enable, useful when you need to find something said but not yet saved to a file.

Layer 3: R-Awareness - The Cold Start Bridge

R-Awareness is a ResonantOS-original plugin. It injects documents directly into the AI’s system prompt so the AI “already knows” your project context without searching for it.

It serves two distinct purposes.

Purpose 1: Project Knowledge on Demand

You probably have documents that describe your project: architecture specs, component guides, decision records, active task lists. We call these “Single Source of Truth” (SSoT) documents - just an organized folder of reference material.

R-Awareness manages which documents the AI sees and when:

Always-on documents load every turn. Your core references: system overview, project state, identity. Choose these carefully since they consume tokens every exchange. (Most providers cache system prompt content, so after the first turn they are effectively free.)

Keyword-triggered documents load only when relevant. Mention “security” and R-Awareness injects the security spec. Mention “DAO” and the governance architecture appears. Specialized knowledge stays out of base context until needed.

Drift correction is automatic. When a separate auditor detects the AI drifting from expected behaviour (giving up too easily, taking shortcuts, being sycophantic), it writes a flag file. Next turn, R-Awareness reads the flag, injects a reinforcement document, and clears it. One-shot correction without human intervention.

Purpose 2: Session Continuity Across Restarts

This is the mechanism that bridges the gap between sessions.

Every day, a cron job processes recent memory logs (Layer 4) and extracts compact headers into a file called RECENT-HEADERS.md. Each header is a 200-400 token summary of one work session:

## 2026-03-21 - LCM Fix, Website Deploy

### Decisions
- LCM expansion bug fixed: content stripped by formatting layer
- Website deployed to GitHub Pages
- sessionMemory enabled for transcript search

### Corrections (from human)
- Did not verify DNS before declaring deployment complete

### DNA Patterns Active
- SLOPPY_PUBLIC_DEPLOYMENT: partial verification accepted as complete

### Open
- HTTPS still broken on website (stale DNS record)

R-Awareness injects the last 20 of these as an always-on document. When the AI starts a new session, it immediately knows what you worked on recently, what decisions were made, what went wrong, and what is still open. No re-explaining. No “as we discussed yesterday” preamble. The AI already has it.

This is the “cold start” bridge. Without it, every session starts from zero. With it, the AI has continuity across weeks of work sessions.

Layer 4: Memory Logs - Where the AI Learns From Failure

This is the layer that compounds over time. Memory logs do not just record what happened - they diagnose what went wrong and classify it for future pattern recognition.

The 3-Part Format

Part 1: Process Log - The narrative. What was requested, what was attempted, what decisions were made, what failed, what was the final outcome.

Part 2: Failure Classification - Every failure is categorized:

F1: Wrong data (missing info, bad assumptions)
F2: Wrong tool (model limitations, wrong approach)
F3: Wrong process (skipped steps, procedural error)

Part 3: Stability Assessment - Quick system health check after the session’s changes.

Why This Matters for Sovereignty

Here is the strategic argument. Every memory log is a structured training pair. The process narrative, the failure classification, the correction from the human - these are exactly the data points needed to fine-tune a local model on your specific workflow.

Most fine-tuning datasets are synthetic or scraped. These are real. Hundreds of sessions where a human and an AI collaborated on actual problems, hit actual failures, and documented the actual lessons. Diagnosis under uncertainty, error recovery, strategic pivots - the skills that are hardest to train into a model.

If you plan to run a local model (Llama, Qwen, Mistral, or whatever comes next), these logs are your training corpus building itself in the background. Your AI gets smarter on your data, on your hardware, without sending a single token to anyone else’s server.

That is what sovereignty means in practice. Not a philosophy. A pipeline.

Automated Capture

Four cron jobs handle the pipeline:

CronSchedulePurposeIntraday logEvery 3 hoursWrites full memory logs during the workdayNightly safety netEarly morningCatches anything missed during the dayDrift detectionAfter nightlyScans architecture docs for stalenessHeader generationAfter drift scanRebuilds RECENT-HEADERS.md for R-Awareness

Between cron runs, the AI drops “breadcrumbs” - one-line JSON entries with a timestamp, what happened, and what failed. The next cron run consumes these into full logs.

The intraday and nightly crons need a model capable of strategic analysis, not just extraction. Locally: Qwen3-14B+ or Llama 3.3-70B if your hardware supports it. API: any reasoning-tier model. Drift detection and header generation are simpler extraction tasks where any competent small model works.

MEMORY.md - The Curated Mind

On top of the four automated layers sits a curated file: MEMORY.md. This is the AI’s long-term mental model.

The AI reads it at session start. It contains distilled lessons, project state, preferences, critical rules, and infrastructure notes. During maintenance cycles, the AI reviews recent memory logs, identifies what is worth keeping permanently, and removes what has gone stale.

The difference between memory logs and MEMORY.md is the difference between a journal and understanding. The journal captures everything. The mental model keeps what matters.

A Common Misconfiguration

OpenClaw has a feature called memoryFlush that saves context before default compaction runs. If you see guides recommending:

{ "compaction": { "memoryFlush": { "enable": true } } }

This does nothing when LCM is active. LCM takes over compaction entirely (ownsCompaction: true), so stock compaction never fires, and memoryFlush never triggers. The memory log pipeline (Layer 4) handles long-term capture instead.

Only enable memoryFlush if you are NOT using LCM.

Model Selection At a Glance

Every component that can run locally should. API models are a performance upgrade, not a dependency.

ComponentWhat It DoesLocal FirstAPI FallbackNever UseLCM ExpansionRetrieve historyLlama 3.2-3B, Qwen3-4B (Ollama)Haiku, GPT-4.1-miniLarge reasoning models (timeouts)LCM SummarizationCompress messagesQwen3-8B, Mistral-7B (16GB+ RAM)Sonnet-class, GPT-4.1Tiny models (lossy summaries)Memory log writingFailure analysisQwen3-14B+ or 70BReasoning-tier APIBudget models (can not diagnose)RAG embeddingsVector searchnomic-embed-text (Ollama)-mxbai (512 limit), bge-m3 (slow)Cron extractionHeaders, driftAny small local modelAny budget APIPremium (waste)

The goal: your memory system functions fully without any API connection. Cloud models are optional.

Getting Started

The full configuration combines all four layers. If you are running ResonantOS, the setup agent walks you through this interactively - type /setup. If you are configuring manually, the core settings are:

{
  "plugins": {
    "slots": { "contextEngine": "lossless-claw" },
    "entries": {
      "lossless-claw": {
        "enabled": true,
        "config": {
          "freshTailCount": 32,
          "contextThreshold": 0.75,
          "incrementalMaxDepth": -1,
          "expansionModel": "ollama/llama3.2:3b",
          "maxExpandTokens": 12000
        }
      },
      "r-awareness": {
        "enabled": true,
        "config": {
          "ssotRoot": "/path/to/your/ssot",
          "alwaysOnDocs": ["overview.md", "recent-headers.md"],
          "tokenBudget": 25000
        }
      }
    }
  },
  "agents": {
    "defaults": {
      "memorySearch": {
        "provider": "ollama",
        "experimental": { "sessionMemory": true }
      }
    }
  }
}

Then set up the cron pipeline for memory logs and header generation. The setup agent handles this, or consult the ResonantOS docs for manual configuration.

What Comes Next

We are building this in the open. The code is at github.com/ResonantOS. The community is on Discord.

Next on the roadmap: Telegram topic-based session isolation (each conversation topic gets its own memory silo), per-model prompt optimization through R-Awareness, and the first fine-tuning experiments using accumulated memory logs as training data.

The bet is simple. Every AI platform is racing to make their agent smarter by making the model bigger. We are making the agent smarter by making its memory deeper - and keeping that memory under your control.

If that resonates, try it. Break it. Tell us what is missing. The best memory architecture is the one built by people who actually use it every day.

Transparency note: This article was written and reasoned by Manolo Remiddi. The Resonant Augmentor (AI) assisted with research, editing and clarity. The image was also AI-generated.

Giving Lab

This is a strong framing of the memory problem—especially the “nothing gets deleted” stance. The missing piece most operators run into is *retrieval discipline* after capture: if recall prompts and merge rules aren’t explicit, memory can still feel random under load. We’ve been documenting practical OpenClaw teardowns where each failure gets converted into a reusable runbook (symptom → root cause → exact fix command), so future sessions recover faster instead of re-learning. If that execution-first angle is useful, I share those operator playbooks here: https://substack.com/@givinglab

The Augmented Mind: Think with AI

Discussion about this post

Ready for more?