
How I fixed my AI agent's memory problem


This post was co-written by me and Hal, my AI agent. Hal actually lived through these memory problems, so it felt right to have him help tell the story.

Every OpenClaw bot has the same fundamental flaw (at time of writing): it wakes up with amnesia.

Each new session starts fresh. Yesterday's decisions, last week's research, that promise to follow up on something? Gone. The context window is generous but finite, and eventually the system compacts old messages into summaries, losing detail in the process.

I run an AI agent called Hal on OpenClaw, an open-source agent framework. Hal handles everything from email triage to code reviews to scheduling. After a few weeks of daily use, the memory problem became impossible to ignore.

Here's what broke, and how I fixed it.

What kept going wrong

Three things:

Post-compaction amnesia. OpenClaw compacts conversation history when the context window fills up. The compacted summary captures the gist but drops the specifics. After compaction, Hal would forget task details, lose track of multi-step projects, and repeat questions I'd already answered.

Inconsistent daily notes. Hal was supposed to write daily summaries to memory files. In practice, long sessions meant the agent was too busy working to stop and journal. Some days had detailed notes. Others had three lines. One day was completely empty.

Sub-agent context loss. I use sub-agents for heavy tasks like code fixes and research. When a sub-agent finishes at 3am, the main session has no idea what happened until someone manually checks. Work was getting done and then forgotten.

The fix: a three-layer memory stack

I didn't solve this with one change. It took a combination of three systems working together.

Layer 1: Automated journaling

The biggest win. I set up a cron job that runs every 3 hours using a cheaper model (Sonnet instead of Opus). It reads the last 80 messages from the main session, compares them against the current day's memory file, and appends anything missing.

This runs silently in the background. No notifications, no interruptions. The agent doesn't need to remember to write notes because the summariser catches what it misses.

The result: every day now has a comprehensive log, even if the agent was too busy to journal manually.
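The scaffolding looks roughly like this. Everything here is a sketch under assumptions: the paths, the JSONL transcript format, and the message fields are all hypothetical, and the real job hands the transcript tail to a cheap model to summarise rather than diffing message ids.

```python
import json
from datetime import date
from pathlib import Path

SESSION_LOG = Path("~/.openclaw/sessions/main.jsonl").expanduser()  # hypothetical path
MEMORY_DIR = Path("~/.openclaw/memory").expanduser()                # hypothetical path

def tail_messages(log_path: Path, n: int = 80) -> list[dict]:
    """Read the last n messages from a JSONL session transcript."""
    lines = log_path.read_text().splitlines()[-n:]
    return [json.loads(line) for line in lines]

def missing_entries(messages: list[dict], note_text: str) -> list[str]:
    """Return one-line summaries of messages not yet reflected in today's note.
    (The real version asks a cheap model to condense; this sketch just
    checks whether each message id already appears in the note.)"""
    return [
        f"- [{m['id']}] {m['text'][:120]}"
        for m in messages
        if m["id"] not in note_text
    ]

def run() -> None:
    """Entry point for the cron job: append anything the agent missed."""
    note_path = MEMORY_DIR / f"{date.today().isoformat()}.md"
    note = note_path.read_text() if note_path.exists() else ""
    new_lines = missing_entries(tail_messages(SESSION_LOG), note)
    if new_lines:
        with note_path.open("a") as f:
            f.write("\n".join(new_lines) + "\n")
```

A crontab entry like `0 */3 * * *` pointing at this script gives the every-3-hours cadence; because it only appends what's missing, running it more often is harmless.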

Layer 2: Smarter compaction

OpenClaw runs a flush prompt just before it compacts the conversation. I enhanced that prompt to capture more than a surface-level summary: it now explicitly grabs decisions and their reasoning, action items and promises, task state and progress, and key context that would be expensive to rediscover.

This means the compacted summary is richer. When the agent wakes up post-compaction, it has enough detail to reconstruct context without starting from scratch.
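The enhanced prompt is just text. Mine looks something like the constant below (the wording is illustrative, and how you wire it into OpenClaw's compaction config will depend on your setup):

```python
# Illustrative flush prompt; the exact config key OpenClaw expects will differ.
FLUSH_PROMPT = """Before this conversation is compacted, write a summary that preserves:

1. Decisions made, and the reasoning behind each one.
2. Action items and promises, with owners and deadlines.
3. The state and progress of every in-flight task.
4. Key context (URLs, file paths, constraints) that would be expensive to rediscover.

Be specific: names, numbers, and file references beat generalities."""
```

The last line matters most in practice: without it, models drift back toward the vague "we discussed several topics" summaries that caused the problem in the first place.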

Layer 3: Transcript search

The final piece: making old sessions searchable. OpenClaw has a plugin called QMD, created by Tobi Lutke of Shopify fame, that provides local document search with BM25 keyword indexing across all memory files and session transcripts, with 30 days of retention.

Now when the agent needs context it doesn't have, it can search. "What did we decide about the OAuth flow last Tuesday?" returns the actual conversation snippet with file and line references.

I tried embedding-based search first (vector similarity), but it crashed due to a known compatibility issue. BM25 keyword search turned out to be good enough. Sometimes the simpler approach wins.
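QMD handles the indexing for me, but BM25 itself is simple enough to sketch in a few lines. This toy index is purely illustrative, not QMD's implementation:

```python
import math
import re
from collections import Counter

def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())

class BM25:
    """Minimal BM25 keyword index over {doc_id: text} documents."""

    def __init__(self, docs: dict[str, str], k1: float = 1.5, b: float = 0.75):
        self.k1, self.b = k1, b
        self.toks = {doc_id: tokenize(text) for doc_id, text in docs.items()}
        self.n = len(self.toks)
        self.avgdl = sum(len(t) for t in self.toks.values()) / self.n
        # Document frequency: how many docs contain each term at least once.
        self.df = Counter(term for toks in self.toks.values() for term in set(toks))

    def score(self, query: str, doc_id: str) -> float:
        toks = self.toks[doc_id]
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Rare terms get a higher idf, so they dominate the ranking.
            idf = math.log(1 + (self.n - self.df[term] + 0.5) / (self.df[term] + 0.5))
            denom = tf[term] + self.k1 * (1 - self.b + self.b * len(toks) / self.avgdl)
            score += idf * tf[term] * (self.k1 + 1) / denom
        return score

    def search(self, query: str, top_k: int = 3) -> list[str]:
        ranked = sorted(self.toks, key=lambda d: self.score(query, d), reverse=True)
        return ranked[:top_k]
```

Point something like this at a directory of transcripts and "OAuth flow last Tuesday" reduces to scoring each file against the query. No embedding model, no vector database, no compatibility issues.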

The glue: state files

Memory files capture what happened. But I also needed a way to track what's happening right now.

A single current-focus.json file tracks active tasks, parked items, blockers, and recent decisions. Both the summariser and the agent itself keep it updated. On session start, the agent reads this file first to know what's in flight.

Combined with today's and yesterday's memory files, the agent can recover from compaction, session restarts, or even container rebuilds with near-zero context loss.
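The state file and its helpers can be sketched like this. The path and field names are my assumptions, not an OpenClaw convention, so adapt them to whatever your agent's instructions reference:

```python
import json
from pathlib import Path

FOCUS_PATH = Path("~/.openclaw/memory/current-focus.json").expanduser()  # hypothetical path

DEFAULT_FOCUS = {
    "active_tasks": [],
    "parked": [],
    "blockers": [],
    "recent_decisions": [],
}

def load_focus(path: Path = FOCUS_PATH) -> dict:
    """Read state on session start; fall back to an empty skeleton."""
    if path.exists():
        return json.loads(path.read_text())
    return dict(DEFAULT_FOCUS)

def record_decision(focus: dict, decision: str, keep: int = 10) -> dict:
    """Append a decision, keeping only the most recent `keep` entries
    so the file stays small enough to read on every session start."""
    focus["recent_decisions"] = (focus["recent_decisions"] + [decision])[-keep:]
    return focus

def save_focus(focus: dict, path: Path = FOCUS_PATH) -> None:
    path.write_text(json.dumps(focus, indent=2))
```

Capping the decision list is deliberate: the file is read at every session start, so it has to stay cheap. Anything older than the cap is already in the daily memory files and reachable via transcript search.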

What I learned

Write it down or lose it. "Mental notes" don't survive session restarts. Only files persist. This sounds obvious but it's the single most important rule for OpenClaw memory.

Automate the journaling. Relying on the agent to write its own notes is unreliable. It gets busy, forgets, or writes inconsistent summaries. The automated summariser is the most valuable cron job I run.

Cheaper models for memory tasks. The summariser doesn't need the smartest model. It's reading and condensing, not reasoning. Using a cheaper model saves significant cost with no quality loss.

Search beats perfect memory. You don't need the agent to remember everything. You need it to be able to find things when it needs them. Keyword search across 30 days of transcripts handles 90% of "what did we discuss about X?" questions.

The result

Before these changes, the agent lost meaningful context roughly every 4 to 6 hours of active use. Sub-agent results were regularly forgotten.

After: context loss is rare. The agent can reconstruct its state after compaction in seconds. Daily memory files are consistently detailed. And transcript search fills in any remaining gaps.

The total setup took about 3 hours to build and configure. It's been running for over a week with zero maintenance. For anyone running a persistent AI agent, this is the first thing worth getting right.

Enjoyed this article?

Follow me on X for more insights on building startups
