Mori

from fjwood69

Shared memory layer for AI coding agents with dream pipeline distillation, session grounding, and multi-instance coherence.

Mori provides deterministic boundaries for non-deterministic agents.

Mori (森) is a governed shared memory layer for AI coding agents, because no coding agent is dependably safe: not even the most capable, and not even the same one twice. Across a multi-model, multi-harness stress test, the most capable coding model broke the build every time; the same model did the right thing, then the wrong thing, on identical input; and an agent handed a tool that flagged its own change as build-breaking read the warning and shipped the break anyway. Capability doesn't fix this, and neither does better retrieval. What holds is enforcement: pipelines propose what agents learn, a human promotes it to canon, and a binding gate keeps every agent inside the boundary, regardless of which model, which run, or whether it read the memory at all.

Bring your own agent; the knowledge outlives it. Works with any OpenAI-compatible provider and any agent harness: no homelab, no Anthropic account, no LLM gateway required, though those all work too.

📄 The research behind this (seven model families, the nulls we published, and every retraction): read the whitepaper →

Why use mori?

You're right to be sceptical of "memory systems": most are a vector DB with a retrieval prompt bolted on. So we ran the experiments, published the nulls, and lead with the result that held up.

The failure mode is cross-contamination. Curation decides what to keep; provenance decides where it's valid, and across a team's many repos that's the line between a shared brain and a liability. A memory that's true in one repo, surfaced while you work in another, makes the agent confidently reach for an API that doesn't exist here: retrieval interference. We reproduced both the failure and the fix: with out-of-scope memory in the brief, agents chased phantom APIs in 20/20 runs; with provenance-safe scoping (MORI_BRIEF_SCOPE, on by default), 0/20, across two independent frontier-class models (Fisher p ≈ 0). The memory was deliberately seeded from a prior repo, so this is a stress test of what canon-drift does over months, not a natural-incidence rate, but the mechanism is clean and model-independent. Ungated memory isn't shared memory; it's cross-contamination. (A memory reaches another project's brief only via an explicit scope:global tag; type alone no longer auto-globalises.)
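
The scoping rule in the parenthesis above can be sketched in a few lines. This is an illustration only, not mori's implementation: the `Memory` class, its `repo` field, and the `build_brief` helper are hypothetical names chosen for the example; only the `scope:global` tag comes from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    name: str
    repo: str  # hypothetical field: the repo the memory was learned in
    tags: set[str] = field(default_factory=set)


def in_scope(memory: Memory, current_repo: str) -> bool:
    """A memory enters the brief only if it was learned in this repo
    or carries an explicit scope:global tag."""
    return memory.repo == current_repo or "scope:global" in memory.tags


def build_brief(memories: list[Memory], current_repo: str) -> list[str]:
    """Return the names of the memories allowed into this repo's brief."""
    return [m.name for m in memories if in_scope(m, current_repo)]


memories = [
    Memory("use-retry-helper", repo="billing-api"),
    Memory("auth-contract-v2", repo="web-frontend"),
    Memory("commit-message-style", repo="billing-api", tags={"scope:global"}),
]
brief = build_brief(memories, "web-frontend")
print(brief)  # ['auth-contract-v2', 'commit-message-style']
```

The billing-api memory without the tag never reaches the web-frontend brief, which is the property the 20/20 versus 0/20 comparison exercises.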

The other failure mode is obedience, and it's the one that points where mori is going. Provenance fixes what an agent knows; it doesn't fix what an agent does. In a pre-registered cross-repo benchmark, we gave frontier agents a tool to see downstream impact and a plain-language warning that a change would break a build, and they broke it anyway, 15/15, across four models and three independent harnesses. One read "WILL BREAK" in the tool's output and shipped the change in the same breath. (Blind caution is no fallback either: told only to "be careful," the strongest model refused safe work 70% of the time.) Information without enforcement is fatal: you cannot govern an enterprise in token-space. That result is the thesis behind mori's next layer, governed playbooks: a deterministic, pre-compute gate that checks a repo's lockfiles against human-approved patterns and refuses an unsafe migration before it runs, independent of prompt wording, model obedience, or luck. Let the builder models commoditise the code-edit loop; mori is the institutional memory they pull from, the seatbelt that keeps them in-bounds, and the audit trail that proves what they did. (Scope: pre-registered npm-dependency migrations; the gate is built and benchmarked, not yet a shipped product surface. We lead with what we proved, not what we hope.)
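
To make "a deterministic gate over lockfiles" concrete, here is a minimal sketch under stated assumptions. It is not the benchmarked gate: the `approved_majors` playbook shape, the `gate` function, and `GateRefused` are hypothetical, and the only real format relied on is the `packages` map of an npm `package-lock.json` (lockfile version 2 or 3).

```python
import json


class GateRefused(Exception):
    """Raised when the lockfile falls outside the approved pattern."""


def locked_versions(lockfile_text: str) -> dict[str, str]:
    """Read package -> version from the 'packages' map of a package-lock.json."""
    packages = json.loads(lockfile_text).get("packages", {})
    return {
        path.split("node_modules/")[-1]: meta["version"]
        for path, meta in packages.items()
        if path and "version" in meta  # the "" key is the root project entry
    }


def gate(lockfile_text: str, approved_majors: dict[str, set[int]]) -> None:
    """Refuse before any migration runs if a pinned major version is unapproved.

    approved_majors is a hypothetical playbook shape: package -> allowed majors.
    """
    violations = sorted(
        f"{name}@{version}"
        for name, version in locked_versions(lockfile_text).items()
        if name in approved_majors
        and int(version.split(".")[0]) not in approved_majors[name]
    )
    if violations:
        raise GateRefused("unapproved versions: " + ", ".join(violations))


LOCKFILE = json.dumps({
    "lockfileVersion": 3,
    "packages": {
        "": {"name": "app", "version": "1.0.0"},
        "node_modules/react": {"version": "18.2.0"},
    },
})

gate(LOCKFILE, {"react": {18}})  # inside the approved pattern: passes silently
try:
    gate(LOCKFILE, {"react": {17}})
except GateRefused as exc:
    print(exc)  # unapproved versions: react@18.2.0
```

The point of the design is that the check reads files, not model output, so the result does not depend on how the agent was prompted or whether it complied.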

What we tested, including what failed. We don't headline a speed number, because our own data wouldn't let us. A cold-start discovery-cost task (files an agent reads/greps before its first edit) hinted curated memory cuts re-exploration:

| What the agent starts with | Discovery cost | vs cold start |
| --- | --- | --- |
| Nothing (cold start) | 22.5 | - |
| Auto-extracted memories | ~17–18 | ~22% better |
| Hand-written CLAUDE.md | ~17–18 | ~22% better |
| Human-curated canon (mori) | 11* | ~51% better* |

* One repo, not replicated; the pre-registered, powered follow-up below is a null. Shown for transparency, not as a headline.

But a pre-registered, powered follow-up returned a null: the human gate did not make the compounding curve faster than keeping everything, and on a second repo the curation win flipped negative (the occurrence a blind curator dropped was the answer). So we don't sell mori as a speed-up; we publish the null. The robust result is provenance, above. This is one finding from a memory-research program now past a thousand agent runs; full methodology, every model, the null, and every retraction are in the whitepaper.

So mori doesn't replace your CLAUDE.md. Keep it as your unconditional floor (static facts you hand-edit, always present: commands, conventions, hard rules). Mori is the governed layer above it: the decisions and patterns a human chose to keep, surfaced to every session, and only where they apply. (the full distinction →)

And it's yours: self-hosted (your server, your data), open-source (AGPL-3.0), provider- and agent-neutral. No data leaves your infrastructure; the knowledge outlives whichever agent you're using this year.

Multi-Instance Coherence

If you run AI coding agents across multiple machines, profiles, or in a team (one focused on the API layer, another on the frontend, a third on infrastructure), you already know the problem: each instance is brilliant in isolation, but none of them knows what the others decided.

Instance B doesn't know that Instance A just changed the auth contract. Instance C doesn't know that Instance B's deployment assumptions shifted. They find out the hard way, mid-task, when something breaks.

Mori gives every instance the same shared picture. Every coding agent instance sends its session events to the shared Mori server; the dream pipeline distils those events from all instances into a unified memory store, and /brief surfaces them at the start of any session. But be clear about what that buys: a shared picture is visibility, not coherence you can bank on. Surfacing what Instance A decided does not make Instance B act on it; a coding agent can read another instance's change and proceed against it anyway (we measured exactly that, 15/15). The coherence that holds is the gate's, not the brief's: a binding boundary that applies regardless of whether the agent read the memory at all.

Capabilities

| Capability | What it does | Slash command |
| --- | --- | --- |
| Dream pipeline | Auto-distils session events into structured memories | /dream |
| Session grounding | Loads shared context at session start, not per-query RAG; lightweight delta re-grounding after context compaction | /brief, /brief --post-compact |
| Memory search | Ranked full-text search and browse across the shared store (SQLite FTS5 / Postgres tsvector) | /pensieve |
| Web dashboard | Built-in memory browser served at the mori root URL: search, browse, unfurl | - |
| Universal ingestion | Feed PDFs, images, git, transcripts into the memory store | /ingest |
| Strategic review | LLM guidance with focus areas and auto-injected standards | /consult |
| Requirements tracking | Lightweight project checklist surfaced via /brief | /req |
| Governance | Capability-scoped API keys (read/write/dreamer roles), versioning, trusted dreamers, rollback, attribution; a universal in-transaction write-audit + tier-capability & anatomy enforcement at the store.write chokepoint (flag-gated, audit-mode by default) | - |
| Curation queue | Ingestion's canonical/standard proposals await trusted-dreamer sign-off in a review UI (/review, with source/diff/approve-reject) before becoming canonical: memory that's curated, not just accumulated | - |
| One-click deploy | Stand up your own server on Render / Railway / Fly / Cloud Run (or free managed Postgres + any stateless host) | - |
| NATS messaging | Real-time cross-device awareness | /nats |
| Inter-agent messaging | Send tasks, questions, and decisions across the device network | /msg |
| Skill deployment | Push slash commands to all devices in one step | /update |

Full reference: docs/reference/slash-commands.md

Pairs well with

Mori is your team's earned memory, not a docs cache. It remembers what your agents decided and learned across sessions and devices. It complements tools that supply live external knowledge:

  • Context7: up-to-date, version-specific library and framework documentation injected into the prompt. Where Mori remembers "we chose X, and why", Context7 supplies "here is X's current API." Different layer, complementary purpose.

  • Your platform's own docs: for fast-moving tool and harness behaviour (hook schemas, config formats), consult the current official docs rather than training-data recall. See the "Read the current manual, not your memory" practice in agent-working-practices.

How it works

Dream pipeline: the proposal half of the gate

The dream pipeline is the proposal mechanism, not the product. It runs at high recall: it turns session activity into candidate memories and deliberately over-produces (recall over precision), because nothing it emits reaches canon without a human promoting it (see Governance below). That division of labour (machine proposes, human disposes) is the gate the benchmark measures.

Session events are captured via agent lifecycle hooks (Claude Code, Cursor, Antigravity) and distilled into structured memories by a configurable LLM.

Hook fires → POST /api/events/raw → events table (SQLite/Postgres)
 ↓
PreCompact → POST /api/precompact → dream_run() reads since watermark
 ↓
 LLM distills events → structured memories
 ↓
 memories written to store (with attribution)
 ↓
 watermark advanced
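
The watermark step in the flow above can be sketched as follows. This is a toy in-memory model, not mori's code: `Store`, `distil`, and the event fields are hypothetical, and `distil` is a plain function standing in for the configurable LLM. Only the idea of reading events since a watermark, writing attributed memories, and advancing the watermark comes from the diagram.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    events: list[dict] = field(default_factory=list)
    memories: list[dict] = field(default_factory=list)
    watermark: int = 0  # id of the last event already distilled


def distil(events: list[dict]) -> list[dict]:
    """Stand-in for the LLM call: one candidate memory per client."""
    by_client: dict[str, list[str]] = {}
    for event in events:
        by_client.setdefault(event["client"], []).append(event["text"])
    return [
        {"body": "; ".join(texts), "attributed_to": client}
        for client, texts in by_client.items()
    ]


def dream_run(store: Store) -> int:
    """Distil only events newer than the watermark, then advance it."""
    fresh = [e for e in store.events if e["id"] > store.watermark]
    if not fresh:
        return 0
    store.memories.extend(distil(fresh))
    store.watermark = max(e["id"] for e in fresh)
    return len(fresh)


store = Store()
store.events += [
    {"id": 1, "client": "laptop", "text": "changed auth contract"},
    {"id": 2, "client": "workstation", "text": "pinned deploy target"},
]
first = dream_run(store)   # distils events 1 and 2
second = dream_run(store)  # nothing new since the watermark
```

Because the watermark only moves after memories are written, running the dream twice in a row does not distil the same events twice.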

The PreCompact hook triggers an immediate synchronous dream before context compression, so nothing is lost at the moment it matters most.

Its counterpart works after compression: a SessionStart hook fires when the session resumes post-compaction (source: "compact") and runs /brief --post-compact, a lightweight delta that surfaces only what changed in the shared store since your last brief (new, superseded, and evicted memories), skipping the full base reload and the freshness scan. PreCompact preserves what this session learned; SessionStart re-grounds it on what every other instance changed while it was busy.

What it captures: PostToolUse, PostToolUseFailure, PreCompact, UserPromptSubmit, Stop. Together these cover tool calls, prompts, errors, stop reasons, session ID, hostname, working directory, transcript path, and (on Stop) the assistant's own reasoning: the plans, analysis, and decisions behind each turn.

Governance: the gate

Proposals don't become canon on their own. Both the dream pipeline (your sessions) and the autonomous-agent intake path (other agents) write to a review queue, not to canon. A trusted dreamer (a human) reviews candidates and promotes the load-bearing ones; every promotion is versioned and write_audit-logged. Agents read canon; they never silently write it.

Underneath, every write (the dreamer's included) passes one audited authorization chokepoint: structured provenance lands as a write_audit row in the same transaction as the write, and tier-capability + anatomy enforcement gate who may write what (both ship audit-mode by default, so the policy is measured before it bites).
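
The "same transaction" property is what makes the audit trustworthy: a write and its audit row either both land or both vanish. A minimal sketch with Python's standard `sqlite3` module shows the idea. The table layouts and the `audited_write` helper are assumptions made for illustration; only the `write_audit` name comes from the text, and mori's real schema may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE memories (name TEXT PRIMARY KEY, body TEXT NOT NULL);
    CREATE TABLE write_audit (
        id INTEGER PRIMARY KEY,
        memory_name TEXT NOT NULL,
        actor TEXT NOT NULL,
        role TEXT NOT NULL
    );
""")


def audited_write(conn, name, body, actor, role):
    """Insert the audit row and the memory in one transaction."""
    # `with conn:` commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute(
            "INSERT INTO write_audit (memory_name, actor, role) VALUES (?, ?, ?)",
            (name, actor, role),
        )
        conn.execute(
            "INSERT INTO memories (name, body) VALUES (?, ?)", (name, body)
        )


def row_counts(conn):
    memories = conn.execute("SELECT COUNT(*) FROM memories").fetchone()[0]
    audits = conn.execute("SELECT COUNT(*) FROM write_audit").fetchone()[0]
    return memories, audits


audited_write(conn, "auth-contract-v2", "Tokens expire after 15 minutes.", "alice", "dreamer")
try:
    # Duplicate name: the memory insert fails, so its audit row is rolled back too.
    audited_write(conn, "auth-contract-v2", "conflicting body", "agent-7", "write")
except sqlite3.IntegrityError:
    pass
print(row_counts(conn))  # (1, 1)
```

The failed second write leaves no orphan audit row, so the audit table never claims a write that did not happen.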

To keep that review cheap, mori rolls up near-duplicate candidates so the reviewer disposes of a convention once instead of many times. The proposal side runs at high recall; the gate is what makes high recall affordable instead of exhausting. This is the line the benchmark measures, and the commercial seam too: standards and policy packs enter through the same gate (signed in, versioned, audited), not by trusting the filesystem.

Memory store

Memories live in the store, which is SQLite (memories.db) for solo/sync deployments and Postgres for team/async, with three tiers:

SQLite vs Postgres is a trust boundary, not just a backend toggle. SQLite is the one-human, one-writer mode: the zero-config default for a single user on one machine, where you are the only thing that writes. Postgres is the team mode (many machines, many agents, concurrent writers), and it's mandatory for anything beyond solo: SQLite's file-level locking serialises writes and cannot sustain that load. Capabilities that exist because multiple writers do (the autonomous-agent intake / governance pipeline) are Postgres-only by design. The curation queue still runs on SQLite in degraded single-writer form (one human gating their own agents), so the gate is never unavailable; it just doesn't need concurrency until a team does. Choose Postgres for production/team use.

| Tier | Scope | Lifecycle |
| --- | --- | --- |
| Ephemeral | Session summaries | Auto-expire unless explicitly saved |
| Working | Patterns, decisions, project context | Flagged after 30 days without retrieval |
| Canonical | Explicitly promoted by a trusted dreamer | Indefinite, freshness-checked via /brief |
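
As an illustration of the Working-tier rule in the table, the staleness check might look like the sketch below. The function name, its arguments, and the choice of a strict "more than 30 days" comparison are assumptions; the source states only that working memories are flagged after 30 days without retrieval.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(days=30)  # from the tier table; the boundary handling is assumed


def needs_review_flag(tier: str, last_retrieved: datetime, now: datetime) -> bool:
    """Flag a working-tier memory that has gone more than 30 days unretrieved.

    Ephemeral and canonical memories follow other lifecycles and are never
    flagged by this rule.
    """
    return tier == "working" and (now - last_retrieved) > STALE_AFTER


now = datetime(2025, 6, 30, tzinfo=timezone.utc)
recent = now - timedelta(days=3)
stale = now - timedelta(days=45)

flags = {
    "working, recent": needs_review_flag("working", recent, now),
    "working, stale": needs_review_flag("working", stale, now),
    "canonical, stale": needs_review_flag("canonical", stale, now),
}
```

Note that the rule flags rather than deletes: the memory stays in the store until someone decides what to do with it.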

Versioning, diff, rollback, attribution, and governance built in. See docs/reference/configuration.md.

Universal ingestion

New team members start cold. /ingest bootstraps the memory store from existing source material, applying the same distillation pipeline that powers the dream phase.

# Preview (zero cost, no LLM):
/ingest --source ~/my-project --preview

# Dry-run to validate extraction quality:
/ingest --source ~/my-project --dry-run --focus decisions

# Commit:
/ingest --source ~/my-project --focus all --tier working

Supported: PDF, images/whiteboards (Kimi K2.6 vision), CC transcripts (.jsonl), git history (--since 30d), text and code.

Works with remote servers: /ingest reads files on the client device and sends content over the wire, so no shared filesystem is needed. It works whether mori-advisor is running locally or on GCE.

Cost guard: --max-cost (default $5.00) aborts before spending. Preview is always free. SHA256 dedup prevents re-ingesting the same content.
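
The two guards above compose naturally: dedup first, then estimate cost on what is left. The sketch below shows that ordering with Python's `hashlib`. It is not mori's ingestion code: `plan_ingest`, the flat `cost_per_chunk` estimate, and the in-memory `seen` set are hypothetical; the SHA256 keying and the $5.00 default cap come from the text.

```python
import hashlib


def content_key(data: bytes) -> str:
    """SHA256 hex digest used as the dedup key for a piece of content."""
    return hashlib.sha256(data).hexdigest()


def plan_ingest(chunks, seen, cost_per_chunk, max_cost=5.00):
    """Drop already-ingested chunks, then abort if the estimate exceeds the cap.

    `seen` holds keys of previously ingested content. Nothing is spent here:
    the check runs before any LLM call would be made.
    """
    fresh, keys = [], set(seen)
    for chunk in chunks:
        key = content_key(chunk)
        if key not in keys:
            keys.add(key)
            fresh.append(chunk)
    estimate = len(fresh) * cost_per_chunk
    if estimate > max_cost:
        raise RuntimeError(f"estimated ${estimate:.2f} exceeds cap ${max_cost:.2f}")
    return fresh


seen = {content_key(b"design-notes")}
chunks = [b"adr-001", b"design-notes", b"adr-001"]
planned = plan_ingest(chunks, seen, cost_per_chunk=0.10)
print(planned)  # [b'adr-001']
```

Deduplicating before estimating means re-running an ingest over mostly unchanged material stays well under the cap.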

Strategic consultation (/consult)

Ask a question mid-session and get strategic guidance grounded in your actual project context, not generic advice. When a focus area is specified, relevant team standards are automatically pulled from the memory store and injected alongside your question. The advisor checks against your own baseline, not a textbook.

# Architecture review with file context:
/consult "should we move auth to a separate service?" --focus architecture

# Security review against your team's own baseline:
/consult "review this handler" --focus security --file src/auth.py

# Chain tool output directly into the advisor:
/consult "review this" --focus security --file src/auth.py --file snyk-report.json

Focus areas: general, architecture, security, performance, style

Depth levels: quick (fast scan), balanced (default), deep (thorough)

Standards-aware: set MORI_STANDARDS_DIR to a directory of .md files and Mori imports them as protected memories. /consult --focus security injects your security baseline into the advisor call, so /consult reviews against your rules, not a textbook. (That's the advisor, handling a single, scoped review request, reading your standards; it is not a claim that your coding agent obeys them mid-task. Surfaced standards inform; they don't bind.)

Inter-agent messaging (/msg)

Delegate tasks, ask questions, and share decisions across your Claude Code instances, without a shared session. Messages are typed, reply-threaded, and picked up at the next /brief. The mori-msg daemon receives messages server-side: decision messages are written directly to the memory store without any human session on the receiving end.

# From your laptop, delegate a task to a workstation:
/msg send workstation task "Refactor auth middleware: extract rate limiting into its own module"

# The workstation picks it up at next /brief and acks:
/msg ack a3f9c2b1 "on it"

# Back on your laptop, check the reply:
/msg inbox

# The workstation marks it done when finished:
/msg done a3f9c2b1

Message types: task, decision, question, reply, ack, done, broadcast

Requires the mori-msg daemon running alongside mori-advisor (included in the default pod stack). See docs/reference/msg.md for full reference.

Web dashboard

Not everyone who needs the shared memory runs a Claude Code session. Mori serves a built-in memory browser at its own root URL; just open the server in a browser:

http://<mori-host>:8968/

Enter any valid API key (the same MORI_API_KEYS your clients use) and you can search, browse, and click any card to unfurl its full body and provenance (origin clients, tier, retrieval count, freshness). The page is served same-origin, so it talks to the very mori instance it loaded from, with no base URL to configure and no separate server to run. It's a single dependency-free file (vanilla JS, no build step, no CDN), backed by a small read REST API:

| Route | Returns |
| --- | --- |
| GET /api/memories?query=&type=&tag=&client=&since=&limit= | Ranked full-text (or recency) list; lean shape, no body |
| GET /api/memories/{name} | One memory in full: body + provenance (lazy-loaded on unfurl) |
| GET /api/events?session_id=&client=&since=&limit= | Session event log, newest first |

The dashboard and its routes are read-only and API-key gated (X-Api-Key); write actions (delete, trusted-dreamer review) are deferred until the read surface is validated. The page is also available standalone (dashboard/index.html) if you'd rather host it elsewhere and point it at a mori instance; set MORI_CORS_ORIGINS for that cross-origin case (it's unnecessary for the built-in same-origin serving).
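
For scripted access to the same read API, a request can be built with the standard library alone. The sketch below only constructs the request; it assumes a server on localhost and a placeholder key, and it does not parse a response because the response shape is not documented here. The route, query parameter names, port 8968, and the X-Api-Key header are taken from the text; the `memories_request` helper is hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def memories_request(base_url: str, api_key: str, **filters) -> Request:
    """Build a GET /api/memories request with the API key header set.

    Filters map to the documented query parameters (query, type, tag,
    client, since, limit); unset ones are left out of the URL.
    """
    params = {k: v for k, v in filters.items() if v is not None}
    url = f"{base_url.rstrip('/')}/api/memories"
    if params:
        url += "?" + urlencode(params)
    return Request(url, headers={"X-Api-Key": api_key})


# Assumed host and key, for illustration only.
req = memories_request("http://localhost:8968", "example-key",
                       query="auth contract", limit=10)
print(req.full_url)  # http://localhost:8968/api/memories?query=auth+contract&limit=10
```

Passing `req` to `urllib.request.urlopen` would send it to a running instance.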

Architecture

Building

git clone https://github.com/fjwood69/mori.git
cd mori
podman build -t localhost/mori-advisor:latest .
# Or: docker build -t mori-advisor:latest .

License

AGPL-3.0 (see LICENSE). Commercial licences are available (see COMMERCIAL.md). Contributions require a one-time CLA (see CONTRIBUTING.md).