agentmemory
Local memory for AI coding agents with fast hybrid recall. agentmemory runs as a single Node process and stores state on disk as JSON—no external DBs.
What is agentmemory?
agentmemory is a local “memory layer” for AI coding agents that captures an agent’s session activity, then provides fast recall for later steps. It runs on your machine as a single Node process and stores state on disk as JSON, with no external databases.
The system captures tool calls and prompts through auto-capture hooks, consolidates raw observations into semantic memories, and serves retrieval through a hybrid pipeline (BM25 + vector + knowledge graph) with on-device reranking.
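The page does not say how the three retrieval streams are combined. As a rough illustration only, the sketch below uses reciprocal rank fusion (RRF), a common way to merge ranked lists, which may or may not be what agentmemory does. The function name, the memory IDs, and the constant k=60 are all assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of memory IDs into one ranking.

    Each ID scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. This is a generic technique, not a
    confirmed description of agentmemory's internals.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, memory_id in enumerate(ranking, start=1):
            scores[memory_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda m: (-scores[m], m))


# Hypothetical results from the three streams (IDs are made up).
bm25_hits = ["m3", "m1", "m7"]
vector_hits = ["m1", "m3", "m9"]
graph_hits = ["m1", "m4"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits, graph_hits])
print(fused)
```

An item that ranks well in several streams (here m1) rises to the top even if no single stream puts it first everywhere. A reranker would then reorder the head of this fused list.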
Key Features
- Local runtime with JSON-on-disk state (zero external databases): Runs as one Node process and persists data to disk as JSON; it does not require Redis, Kafka, Postgres, Qdrant, or Neo4j.
- Auto-capture hooks for agent activity: PreToolUse, PostToolUse, SessionStart, Stop, and additional events feed into the memory pipeline without additional glue code once installed.
- Hybrid retrieval with on-device reranking: Triple-stream recall combines BM25, vector, and knowledge-graph signals and reranks results on-device (described as “P50 under 20ms on a laptop” on the page).
- Auto-consolidation and retention behavior: Hourly sweeps consolidate raw observations into semantic memories, merge duplicates, decay stale rows using retention scoring, and emit batched audit rows when items are deleted.
- MCP server with a defined tool surface: Exposes MCP tools such as memory_save, memory_recall, memory_smart_search, memory_sessions, governance, audit, and export, along with a REST twin for each MCP tool under /agentmemory/*.
- Session replay via JSONL import: Rehydrates a session from a Claude Code JSONL transcript, including observations, tool uses, and timeline, into the store.
- Knowledge-graph compression and querying: Extracts entities and relations during compression; supports graph querying via /agentmemory/graph and visualization in the viewer.
- Federated sync between nodes (authenticated HTTPS): Push/pull memories between agentmemory nodes with bearer-token authentication; the page explicitly notes “no silent syncs.”
- Local viewer and observability output: Provides a live observation stream viewer (port 3113) and logs/traces through an “OTEL observability worker” (OTLP export for tracing backends such as Jaeger/Honeycomb/Tempo mentioned on the page).
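The retention behavior listed above (decay stale rows by retention score, emit batched audit rows on delete) can be pictured with a small sketch. The page gives no formula, so everything here is an assumption: the exponential half-life decay, the access-count boost, the threshold, and the field names are hypothetical and chosen only to show the shape of such a sweep.

```python
import math


def retention_score(importance, age_hours, access_count, half_life_hours=168.0):
    """Hypothetical score: importance halves every half_life_hours and is
    boosted by how often the memory has been recalled."""
    decay = 0.5 ** (age_hours / half_life_hours)
    boost = 1.0 + math.log1p(access_count)
    return importance * decay * boost


def sweep(memories, threshold=0.1):
    """Split memories into kept rows and one batch of audit rows for deletions."""
    kept, audit_batch = [], []
    for m in memories:
        score = retention_score(m["importance"], m["age_hours"], m["access_count"])
        if score >= threshold:
            kept.append(m)
        else:
            # Record the deletion so it can be inspected later.
            audit_batch.append({"id": m["id"], "action": "delete", "score": round(score, 4)})
    return kept, audit_batch


# Made-up rows: one fresh and frequently used, one old and never recalled.
memories = [
    {"id": "a", "importance": 0.9, "age_hours": 24, "access_count": 3},
    {"id": "b", "importance": 0.2, "age_hours": 1000, "access_count": 0},
]
kept, audit_batch = sweep(memories)
print([m["id"] for m in kept], audit_batch)
```

Collecting audit rows into a single batch per sweep, rather than writing one row per delete, mirrors the "batched audit rows" wording on the page.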
How to Use agentmemory
- Install once: Run npm install -g @agentmemory/agentmemory to put agentmemory on your PATH.
- Start the server: Launch agentmemory (server runs on :3111, viewer on :3113).
- Verify locally: Open http://localhost:3113 to view the live observation stream and dashboards.
- Connect an agent via MCP: Configure your agent to use the agentmemory MCP JSON configuration (the site states “one MCP JSON fits almost everything”).
- Optionally import past sessions: Use the provided JSONL session import capability to replay earlier agent runs into the store.
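Once the server is running, the REST twins under /agentmemory/* can in principle be called from any HTTP client. The sketch below only builds a request and does not send it. The route name "recall", the POST method, the payload fields "query" and "limit", and the assumption that REST is served on port 3111 are all hypothetical, since the page names the prefix but not the individual routes or request shapes.

```python
import json
import urllib.request

# Port 3111 is the server port given on the page; serving REST there is assumed.
BASE_URL = "http://localhost:3111"


def build_rest_request(route, payload):
    """Build (but do not send) a JSON POST request for a REST twin."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/agentmemory/{route}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "recall", "query", and "limit" are placeholder names, not documented API.
req = build_rest_request("recall", {"query": "why did the build fail", "limit": 5})
print(req.get_method(), req.full_url)

# To try it against a running server, uncomment:
# with urllib.request.urlopen(req, timeout=5) as resp:
#     print(json.load(resp))
```

Check the project's own documentation for the actual routes and payloads before relying on any of these names.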
Use Cases
- Build continuity across multiple coding sessions: Capture every session’s prompts and tool usage so later agent actions can retrieve relevant past observations quickly.
- Support “why did the agent do that?” auditing: Use the audit rows emitted on deletes, together with the observability and tracing output, to inspect what happened during memory operations and session handling.
- Improve retrieval quality for mixed queries: Use hybrid search (BM25 + vector + knowledge graph) with on-device reranking when queries are partly lexical, partly semantic, or depend on extracted relations.
- Turn transcripts into reusable memory: Import Claude Code JSONL transcripts to rehydrate a full session timeline—useful when you have past runs you want to query later.
- Coordinate between multiple agent machines: Set up peer-to-peer sync between agentmemory nodes using authenticated HTTPS for push/pull memory exchange.
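Before importing a transcript, it can help to check what a JSONL file contains. The sketch below reads JSONL (one JSON object per line) and counts records by kind. It is not agentmemory's importer; the "type" field name and the sample records are assumptions used for illustration.

```python
import json
from collections import Counter


def summarize_transcript(lines):
    """Count JSONL records by their "type" field.

    Blank lines are ignored; lines that are not valid JSON objects are
    counted as skipped instead of aborting the whole pass.
    """
    counts = Counter()
    skipped = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            skipped += 1
            continue
        if not isinstance(record, dict):
            skipped += 1
            continue
        counts[record.get("type", "unknown")] += 1
    return counts, skipped


# Made-up sample lines standing in for a transcript file.
sample = [
    '{"type": "user", "text": "run the tests"}',
    '{"type": "tool_use", "name": "bash"}',
    '{"type": "tool_use", "name": "read_file"}',
    "",
    "not json",
]
counts, skipped = summarize_transcript(sample)
print(dict(counts), skipped)
```

For a real file, pass an open file handle instead of the list, for example `summarize_transcript(open(path, encoding="utf-8"))`, where `path` is your transcript location.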
FAQ
- Does agentmemory require an external database (like Postgres or Qdrant)? No. The page states “ZERO EXTERNAL DATABASES” and describes the system as a single process with state living on disk as JSON.
- How do I access tools for saving and recalling memory? agentmemory exposes an MCP server with tools such as memory_save and memory_recall. The page also notes REST endpoints for each tool under /agentmemory/*.
- Where can I see what the server is capturing? A viewer is auto-started on port 3113, showing the live observation stream, session explorer, memory browser, knowledge graph visualization, and a health dashboard.
- Can I import existing coding transcripts? Yes. The page describes a JSONL session import workflow that ingests a Claude Code JSONL transcript and rehydrates observations, tool uses, and timeline.
- Does agentmemory support moving memory data between machines? The page describes peer-to-peer sync over authenticated HTTPS with bearer-token requirements (and no silent syncs).
Alternatives
- General-purpose vector databases + custom agent memory layer: You can store embeddings and implement retrieval, but you’d be responsible for orchestration, consolidation, hooks, and session/timeline handling—unlike agentmemory’s described auto-capture + MCP/REST surface.
- Local knowledge-base tools for code history (note/graph style systems): Tools that index documents and provide search/graph views can help with recall, but they may not directly capture agent tool calls and session events via the hook pipeline described here.
- RAG frameworks without agent-specific auto-capture: Many RAG stacks provide retrieval and generation-time context assembly, but may require more bespoke integration to capture SessionStart/Stop and tool-use events into a retrievable memory model.
- Agent telemetry/observability-only setups: Tracing tools can help inspect behavior, but they typically do not provide the memory consolidation, retrieval endpoints, and replay import workflow described for agentmemory.
Alternatives
Falconer
Falconer is a self-updating knowledge platform for high-speed teams to write, share, and find reliable internal documentation and code context in one place.
skills-janitor
Audit, track usage, and compare your Claude Code skills with skills-janitor—nine focused slash commands and zero dependencies.
Lasso
Lasso is an AI-first PIM for ecommerce teams that enriches product attributes and descriptions, processes supplier data, and monitors competitors via app or API.
Codex Plugins
Use Codex Plugins to bundle skills, app integrations, and MCP servers into reusable workflows—extending Codex access to tools like Gmail, Drive, and Slack.
Struere
Struere is an AI-native operational system that replaces spreadsheet workflows with structured software—dashboards, alerts, and automations.
garden-md
Turn meeting transcripts into a structured, linked company wiki with local markdown and an HTML browser view. Sync from supported sources.