Selvedge

from masondelan

Change tracking for AI-era codebases: captures the why behind every change as the agent makes it.

selvedge.sh · PyPI · GitHub

Long-term memory for AI-coded codebases. A git blame for AI agents, but for the why, not just which line which model touched. Captured live, by the agent, as the change happens.

Selvedge is a local MCP server. AI coding agents (Claude Code, Cursor, Copilot) call it as they work to log structured change events with reasoning. Your data stays in a SQLite file under .selvedge/ next to your code.

Six months ago, your AI agent added a column called user_tier_v2. You don't know why. git blame points to a commit from claude-code with a generated message that says "Update schema." The session that made the change is long gone, and so is the prompt that produced it.

With Selvedge, you run this instead:

$ selvedge blame user_tier_v2

  user_tier_v2
  Changed    2025-10-14 09:31:02
  Agent      claude-code
  Commit     3e7a991
  Reasoning  User asked to add a grandfathering flag for legacy free-tier
             users during the pricing migration. Stores the original tier
             so we can backfill discounts without touching billing history.

That reasoning was captured by the agent in the moment, written into Selvedge from the same context that produced the change. Not inferred from the diff afterward by a second LLM. Not a hand-typed commit message.

Who Selvedge is for

Selvedge has two audiences. Same tool, same pip install, same SQLite file under .selvedge/. Different scale of pain.

Teams running long-term, AI-coded codebases. The project is big enough that you (or someone else) will touch it again in six months, twelve months, three years, but most of it was written by an agent whose context evaporated the day each PR shipped. git blame tells you what changed. Selvedge tells you why, even after the agent session, the prompt template, the developer who asked for it, and the model version are all long gone. This is the original use case: production codebases, schema decisions, migrations, and dependency changes that need an audit trail that survives turnover.

Solo developers using Claude Code on everyday projects. Side projects, weekend builds, the small internal tool you keep poking at. You don't need enterprise governance; you just need to remember why you (or your agent) did the thing you did yesterday, last week, last sprint. Run selvedge init once. Add four lines to your CLAUDE.md. From then on, selvedge blame is muscle memory: a way to talk to your past self when your past self was an LLM.

If you've ever come back to your own AI-built project and thought "what was this for again?", Selvedge is the missing piece.

The problem

Human-written code leaks intent everywhere: commit messages, PR descriptions, inline comments, the Slack thread that preceded it. AI-written code doesn't. The agent has perfect clarity about why it made each decision, but that context lives in the prompt and evaporates when the conversation ends.

Six months later, your team is debugging a schema decision with no trail. git blame tells you what changed and when. It can't tell you why.

Selvedge captures the why: live, by the agent itself, as the change is made. The diff is git's job. The why is Selvedge's.

What's new in v0.3.9

Agent Trace export: Selvedge is a compatible producer. selvedge export --format agent-trace emits Agent Trace v0.1.0 records (the open AI code-attribution wire format from Cursor + Cognition AI), so your captured history travels to any tool that reads the standard. Selvedge's reasoning and entity-level provenance ride along in each record's metadata under the dev.selvedge namespace. Drop-in upgrade for anyone on 0.3.8.

selvedge export --format agent-trace -o trace.json # one record per event
selvedge export --format agent-trace --ndjson -o trace.ndjson # stream, one per line
selvedge export --format agent-trace --collapse-by-session # merge a session into one record
selvedge import trace.json --format agent-trace # round-trip back in

It's opt-in and additive: nothing about the native model, the 8 MCP tools, or local SQLite changes. Entity-level events (a column, an env var, a dependency) have no line range, so Selvedge marks them metadata.dev.selvedge.range_unknown: true rather than fabricating one, which is an honest fidelity signal. This was planned for v0.4.0; only the exporter moved forward (Postgres + the tool rename remain the v0.4.0 markers; HTTP + auth ships in v0.4.1). Full mapping in docs/agent-trace-interop.md.

What's new in v0.3.8

Active memory v1 (date-based). Selvedge's append-only log learns when its own data is stale. A decision can now carry a revisit date, and the new stale_decisions tool surfaces decisions that have aged out, but only the ones whose entity is still in active use, so an old-but-correct decision nobody touches never nags. Drop-in upgrade for anyone on 0.3.7. This brings the MCP surface to 8 tools.

revisit_after + stale_decisions: decisions with an expiry date

Set revisit_after on an architectural log_change, as an ISO date or a relative offset like 90d:

log_change({
 "entity_path": "deps/stripe", "change_type": "add", "entity_type": "dependency",
 "reasoning": "Pinned Stripe SDK to v11 for the billing launch.",
 "revisit_after": "180d" // revisit this pin in ~6 months
})
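
The two accepted forms of revisit_after can be illustrated with a small Python sketch. This is not Selvedge's implementation: resolve_revisit_after is a hypothetical helper, and it assumes a relative offset is a whole number of days with a d suffix (the only offset form shown above), with anything else treated as an ISO date.

```python
import re
from datetime import date, timedelta


def resolve_revisit_after(value: str, today: date) -> date:
    """Turn '180d' or '2026-04-01' into a concrete revisit date (illustrative only)."""
    text = value.strip()
    # Assumption: relative offsets look like '<digits>d' and count days from today.
    match = re.fullmatch(r"(\d+)d", text)
    if match:
        return today + timedelta(days=int(match.group(1)))
    # Anything else must be an ISO date; date.fromisoformat raises ValueError otherwise.
    return date.fromisoformat(text)


if __name__ == "__main__":
    print(resolve_revisit_after("180d", date(2025, 10, 14)))
    print(resolve_revisit_after("2026-04-01", date(2025, 10, 14)))
```

Resolving the offset to an absolute date at write time would keep later staleness checks a plain date comparison, though whether Selvedge stores the offset or the resolved date is not stated here.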

Later, stale_decisions returns the dated decisions that have come due, and filters out pure age:

stale_decisions({})
// → only decisions past their revisit date whose entity is STILL in use:
[
  {
    "entity_path": "deps/stripe", "change_type": "add",
    "reasoning": "Pinned Stripe SDK to v11 for the billing launch.",
    "revisit_due": "2026-...Z", "days_overdue": 12,
    "active_use_signals": ["queried"],
    "stale_reason": "past its revisit date and still active - the entity was queried (blame/diff/prior_attempts) after the decision."
  }
]

Pure age never surfaces. A decision only comes back if the entity is still live: recently queried (blame / diff / prior_attempts) or its changeset kept moving. That's the noise defense: a dated decision nobody has touched won't nag. Templated and deterministic, with no LLM, ever. The pattern-based half (expires_when grammar, explicit reject/revert change types) lands in v0.3.11; the v0.3.8 migration adds the expires_when column now so that's a no-migration release.
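
The "due and still active" rule can be sketched as a plain filter. The Decision class and filter_stale function below are hypothetical names for illustration, the field names are borrowed from the example output above, and treating a decision as due on its revisit date itself (rather than the day after) is an assumption.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Decision:
    entity_path: str
    revisit_due: date
    # e.g. ["queried"]; an empty list means nobody touched the entity since.
    active_use_signals: list = field(default_factory=list)


def filter_stale(decisions, today):
    """Return (decision, days_overdue) pairs that are both due and still in use."""
    results = []
    for decision in decisions:
        days_overdue = (today - decision.revisit_due).days
        if days_overdue < 0:
            continue  # revisit date has not arrived yet
        if not decision.active_use_signals:
            continue  # pure age: due, but the entity is not in active use
        results.append((decision, days_overdue))
    return results


if __name__ == "__main__":
    today = date(2026, 4, 24)
    sample = [
        Decision("deps/stripe", date(2026, 4, 12), ["queried"]),
        Decision("deps/legacy-sdk", date(2025, 1, 1)),  # old, untouched
        Decision("users.email", date(2026, 12, 1), ["queried"]),  # not due
    ]
    for decision, overdue in filter_stale(sample, today):
        print(decision.entity_path, overdue)
```

No model call is involved: the result depends only on dates and recorded signals, which matches the "templated and deterministic" description.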

CLI parity for the wedge + CLI-awareness

selvedge prior-attempts <entity> lands: the v0.3.7 prior_attempts wedge was the only MCP tool without a CLI command. It's a thin presenter over the same store, so --json emits the identical list the tool returns. New selvedge stale mirrors stale_decisions (with --json for cron / Slack jobs). And the canonical agent-instructions block now names the CLI equivalents alongside the MCP tools, so a shell-having agent is never blocked when the MCP server isn't loaded. Selvedge stays MCP-first; the CLI is the additive second path.

See CHANGELOG.md for the full list, the one-time migration-v3 note (metadata-only ADD COLUMN, fast even on multi-million-event DBs), and the called-out test-budget overage from the bundled CLI + agent-block work.

Where Selvedge fits

AI agents call Selvedge as they work. Selvedge captures the why into a durable, queryable store and emits it back out: as Agent Trace records for cross-tool readers, as observability metadata that links into Sentry/Datadog stack traces, and as compliance artifacts for SOC 2 and EU AI Act audits.

Selvedge does not replace git (line-level what/when), PR review tools (review-time quality), agent observability (LLM call traces), or general-purpose code-host AI features. It sits between them as the provenance-as-first-class-citizen layer that everything else references.

How Selvedge compares

There's a fast-growing "git blame for AI agents" category. Here's where Selvedge fits, and where it deliberately doesn't.

| Tool | Reasoning source | Granularity | Mechanism | Grouping | Storage |
| --- | --- | --- | --- | --- | --- |
| Selvedge | Captured live, by the agent in the same context that produced the change | Entity: DB column, table, env var, dep, API route, function | MCP server: agent calls it as work happens | Changesets: named feature/task slugs across many entities | SQLite, zero deps |
| AgentDiff | Inferred post-hoc by Claude Haiku from the diff at session end | Line | Git pre/post-commit hook | None | JSONL on disk |
| Origin | Captured at commit time | Line | Git hook | None | Local |
| Git AI | Attribution metadata | Line | Git hook + Agent Trace alliance | None | Git notes |
| BlamePrompt | Prompt-only | Line | Git hook | None | Local |

Why "captured live" matters. AgentDiff and Origin generate reasoning after the change is made, by feeding the diff back to a second LLM call. Selvedge's reasoning is the agent's own intent, written from the same context window that produced the change: no inference, no hallucinated explanations, and an empty reasoning field is itself a useful signal (the agent didn't have one).

Why "entity-level" matters. Most tools attribute lines. Selvedge attributes things you actually search for: users.email, env/STRIPE_SECRET_KEY, api/v1/checkout, deps/stripe. The first question after git blame is usually "what's the history of this column", not "what's the history of lines 40–48 of users.py".

Why "changesets" matter. A Stripe billing rollout touches the users table, two new env vars, three new API routes, one dependency, and four functions across the codebase. Tag every event with changeset:add-stripe-billing and you can pull the entire scope back later, even if the original PR was broken into eight smaller ones over a month.
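
As a rough illustration of why a shared slug helps, here is a minimal Python sketch that groups events by changeset. The event dictionaries and the changeset key are assumptions made for the example, not Selvedge's storage schema, and group_by_changeset is a hypothetical helper.

```python
from collections import defaultdict


def group_by_changeset(events):
    """Map each changeset slug to the entity paths it touched, in log order."""
    grouped = defaultdict(list)
    for event in events:
        slug = event.get("changeset")
        if slug:  # events logged without a changeset are left out
            grouped[slug].append(event["entity_path"])
    return dict(grouped)


if __name__ == "__main__":
    events = [
        {"entity_path": "deps/stripe", "changeset": "add-stripe-billing"},
        {"entity_path": "env/STRIPE_SECRET_KEY", "changeset": "add-stripe-billing"},
        {"entity_path": "src/auth.py::login", "changeset": None},
        {"entity_path": "api/v1/checkout", "changeset": "add-stripe-billing"},
    ]
    print(group_by_changeset(events))
```

Because the slug travels with each event rather than with a commit or PR, the grouping survives a rollout that was split across many PRs.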

Selvedge ↔ Agent Trace. Agent Trace (Cursor + Cognition AI, RFC Jan 2026, backed by Cloudflare, Vercel, Google Jules, Amp, OpenCode, and git-ai) is an emerging open standard for AI code attribution traces. Selvedge isn't a competitor to it; it's a compatible producer. As of v0.3.9, selvedge export --format agent-trace emits Agent Trace v0.1.0 records (and selvedge import --format agent-trace reads them back); the mapping is in docs/agent-trace-interop.md. Agent Trace is the wire format. Selvedge is the live capture + query layer that emits it.

How it works

Selvedge runs as an MCP server. AI agents in tools like Claude Code call Selvedge's tools as they work, logging structured change events to a local SQLite database.

Each event records:

  • What changed (entity path, change type, diff)

  • When (timestamp)

  • Who (agent, session ID)

  • Why (reasoning, captured from the agent's context in the moment)

  • Where (git commit, project)

The diff is git's job. The why is Selvedge's.
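
To make the what/when/who/why/where breakdown concrete, here is a minimal Python sketch of an append-only event log in SQLite. The ChangeEvent class, the events table, and every column name are hypothetical stand-ins chosen for the example; Selvedge's real schema may differ.

```python
import sqlite3
from dataclasses import astuple, dataclass


@dataclass
class ChangeEvent:
    entity_path: str   # what
    change_type: str   # what
    diff: str          # what
    timestamp: str     # when (ISO 8601)
    agent: str         # who
    session_id: str    # who
    reasoning: str     # why
    git_commit: str    # where
    project: str       # where


def append_event(conn: sqlite3.Connection, event: ChangeEvent) -> None:
    """Append one event; rows are only ever inserted, never updated."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "entity_path TEXT, change_type TEXT, diff TEXT, timestamp TEXT, "
        "agent TEXT, session_id TEXT, reasoning TEXT, git_commit TEXT, project TEXT)"
    )
    conn.execute("INSERT INTO events VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", astuple(event))
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    append_event(conn, ChangeEvent(
        "users.user_tier_v2", "add", "+ user_tier_v2 TEXT", "2025-10-14T09:31:02+00:00",
        "claude-code", "session-1", "Grandfathering flag for legacy free-tier users.",
        "3e7a991", "demo",
    ))
    print(conn.execute("SELECT entity_path, reasoning FROM events").fetchall())
```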

Entity path conventions

users.email             DB column (table.column)
users                   DB table
src/auth.py::login      Function in a file (path::symbol)
src/auth.py             File
api/v1/users            API route
deps/stripe             Dependency
env/STRIPE_SECRET_KEY   Environment variable

Prefix queries work everywhere: users returns users, users.email, users.created_at, and any other entity under the users. namespace.
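
A prefix query of this kind can be sketched as a boundary-aware match. The text above only confirms the dot case (users matching users.email); treating :: and / as boundaries too is an assumption drawn from the path conventions, and matches_prefix is a hypothetical helper.

```python
def matches_prefix(entity_path: str, prefix: str) -> bool:
    """True if entity_path is the prefix itself or sits under it at a separator."""
    if entity_path == prefix:
        return True
    # Assumption: '.', '::' and '/' all count as namespace boundaries.
    return any(entity_path.startswith(prefix + sep) for sep in (".", "::", "/"))


if __name__ == "__main__":
    paths = ["users", "users.email", "users.created_at", "users_archive", "deps/stripe"]
    print([p for p in paths if matches_prefix(p, "users")])
```

Requiring a separator after the prefix keeps an unrelated entity such as users_archive out of the results for users.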

MCP tools

When connected as an MCP server, Selvedge exposes:

| Tool | Description |
| --- | --- |
| log_change | Record a change event with entity, diff, and reasoning (pass rename_from with change_type="rename" for the dual-event rename pattern) |
| diff | History for an entity or entity prefix |
| blame | Most recent change + context for an exact entity |
| history | Filtered history across all entities |
| changeset | All events grouped under a named feature/task slug |
| search | Full-text search across all events |
| prior_attempts | Prior change attempts on an entity + inferred outcome (was it tried and reverted?); call it before editing |
| stale_decisions | Dated decisions past their revisit_after that are still in active use (pure age never surfaces) |

CLI reference

selvedge init [--path PATH]             Initialize in project
selvedge status                         Recent activity summary
selvedge diff ENTITY [--limit N]        Change history for entity
selvedge blame ENTITY                   Most recent change + context
selvedge history [--since SINCE]        Browse all history
    [--entity ENTITY]
    [--project PROJECT]
    [--changeset CS]
    [--summarize]
    [--limit N]
selvedge changeset [CHANGESET_ID]       Show events in a changeset
    [--list]                            or list all changesets
    [--project NAME]
    [--since SINCE]
selvedge search QUERY [--limit N]       Full-text search
selvedge prior-attempts ENTITY          Prior attempts + inferred outcome
    [--description T]                   (ENTITY xor --description)
    [--all]                             widen recall to proximity_low
    [--window 7d]                       proximity window
selvedge stale [--entity ENTITY]        Dated decisions due for a revisit
    [--project NAME]
    [--agent NAME]
    [--json]
selvedge stats [--since SINCE]          Tool call coverage report (per-tool, per-agent)
selvedge doctor [--json]                Health check: DB path, schema, hook, MCP wiring
selvedge install-hook [--path PATH]     Install git post-commit hook
    [--window MIN]                      (default 60 minutes)
selvedge backfill-commit --hash HASH    Backfill git_commit on recent events
    [--window MIN]                      (default 60 minutes)
selvedge import PATH                    Import migrations (SQL / Alembic) or
    [--format auto|sql|                 an Agent Trace file (agent-trace)
     alembic|agent-trace]
    [--project NAME]
    [--dry-run]
selvedge export [--format json|csv|     Export history (agent-trace =
     agent-trace]                       Agent Trace v0.1.0 records)
    [--since SINCE]
    [--entity ENTITY]
    [--ndjson]                          agent-trace: one record per line
    [--collapse-by-session]             agent-trace: merge a session into one
    [--output FILE]
selvedge log ENTITY CHANGE_TYPE         Manually log a change
    [--diff TEXT]                       CHANGE_TYPE: add, remove, modify,
    [--reasoning TEXT]                  rename, retype, create, delete,
    [--agent NAME]                      index_add, index_remove, migrate
    [--commit HASH]
    [--project NAME]
    [--changeset CS]
    [--revisit-after WHEN]              ISO date or offset (e.g. 90d)
    [--rename-from OLD]                 OLD path when CHANGE_TYPE is 'rename'
selvedge migrate-paths                  Re-canonicalize stored entity paths
    [--apply]                           (dry-run by default; --apply writes)
    [--json]

All read commands support --json for machine-readable output.

Relative time in --since:

  • 15m → last 15 minutes (m = minutes)

  • 24h → last 24 hours

  • 7d → last 7 days

  • 5mo → last 5 months (mo or mon = months)

  • 1y → last year

Unparseable inputs (e.g. --since yesterday) exit with a clear error rather than silently returning empty results. ISO 8601 timestamps are also accepted and normalized to UTC.
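
The rules above can be sketched in a few lines of Python. This is an illustration, not the CLI's actual parser: parse_since is a hypothetical name, a month is approximated as 30 days and a year as 365 days, and a timestamp without an offset is assumed to already be in UTC.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumption: months and years are fixed-length approximations.
_UNITS = {
    "m": timedelta(minutes=1),
    "h": timedelta(hours=1),
    "d": timedelta(days=1),
    "mo": timedelta(days=30),
    "mon": timedelta(days=30),
    "y": timedelta(days=365),
}


def parse_since(value: str, now: datetime) -> datetime:
    """Resolve a --since style value to an aware UTC datetime, or raise ValueError."""
    text = value.strip()
    # Longer unit names come first so '5mo' is not read as '5m' plus junk.
    match = re.fullmatch(r"(\d+)(mon|mo|m|h|d|y)", text)
    if match:
        return now - int(match.group(1)) * _UNITS[match.group(2)]
    try:
        parsed = datetime.fromisoformat(text)
    except ValueError:
        # Fail loudly instead of silently returning an empty result set.
        raise ValueError(f"cannot parse --since value: {value!r}") from None
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return parsed.astimezone(timezone.utc)


if __name__ == "__main__":
    now = datetime(2025, 10, 14, 12, 0, tzinfo=timezone.utc)
    for sample in ("15m", "7d", "5mo", "2025-10-14T09:31:02+02:00"):
        print(sample, parse_since(sample, now).isoformat())
```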

Coverage checking

Wondering how often your agent actually calls log_change? Two ways to check:

# Quick summary in the terminal
selvedge stats

# Cross-reference against git commits
python scripts/coverage_check.py --since 30d

The coverage script compares your git log against Selvedge events and shows which commits have associated change events. Low coverage usually means the system prompt needs strengthening; see docs/fallbacks.md for guidance.

In CI (GitHub Action)

The same check ships as the Selvedge Coverage Check composite Action, so you can track agent coverage on every push, and optionally fail the build when it drops:

# .github/workflows/selvedge-coverage.yml
name: Selvedge coverage
on: [push, pull_request]
jobs:
  coverage:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # full history so commits can be matched
      - uses: masondelan/[email protected]  # pin to a release tag (or @main for latest)
        with:
          since: 30d
          fail-under: "0.5"  # optional: fail below 50% coverage; omit to report only

It writes a coverage summary to the job summary and exposes coverage-ratio, covered, and total as step outputs. The action cross-references your git history against the Selvedge event log, so the runner needs the project's .selvedge/selvedge.db (commit it, or restore it before this step) and full git history (fetch-depth: 0). Inputs: since, window, limit, fail-under, selvedge-version, python-version, working-directory, db-path.
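
The arithmetic behind those outputs can be sketched as follows. This is a guess at the shape of the calculation, not the script's or the action's code: coverage_report is a hypothetical function, commits are matched by exact hash equality, and an empty commit range is reported as 0.0 coverage.

```python
def coverage_report(commit_hashes, event_commits, fail_under=None):
    """Count commits that have at least one logged event and compare to a threshold."""
    total = len(commit_hashes)
    covered = sum(1 for commit in commit_hashes if commit in event_commits)
    ratio = covered / total if total else 0.0  # assumption: no commits means 0.0
    passed = fail_under is None or ratio >= fail_under
    return {"covered": covered, "total": total, "coverage_ratio": ratio, "passed": passed}


if __name__ == "__main__":
    commits = ["3e7a991", "a1b2c3d", "d4e5f6a", "0f9e8d7"]
    logged = {"3e7a991", "d4e5f6a"}
    print(coverage_report(commits, logged, fail_under=0.5))
```

Leaving fail_under unset mirrors the report-only mode: the numbers are still produced, but nothing fails.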

Contributing

git clone https://github.com/masondelan/selvedge
cd selvedge
pip install -e ".[dev]"
pytest

See CLAUDE.md for architecture details and the phase roadmap.

License

MIT. See LICENSE.