Conventions

Rules for building tools that evolve safely under agent modification.

These conventions exist because agents optimize locally — nearest tool, cheapest modification, fastest path to green tests. Without structural guards, every tool becomes a god-object after five iterations. These are the immune system.

1. Scope boundaries

What: The "What this does NOT do" section in SKILL.md defines permanent scope boundaries — what the tool must never become, not just what it currently doesn't do.

Why: In a system where agents create work orders (WOs) and modify tool code, this section is the only structural guard against uncontrolled growth. A weak not-do section means agents will keep expanding the tool until it does everything badly.

Rule: Minimum 3 bullet points. Each must use boundary-defining verbs (manage, store, execute, replace, modify, own). Items must not contradict the Commands section. No vague disclaimers.

Example:

## What this does NOT do

- Does not execute remediation — only reports state
- Does not manage database schemas or migrations
- Does not replace Prometheus — no metric storage or alerting

Validator: not-do-min-items (fail), not-do-specificity (warn), not-do-no-overlap (warn), not-do-boundary-verbs (warn)
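
The two most mechanical checks in this rule (minimum item count and boundary verbs) can be sketched as below. This is a minimal illustration, not the actual validator: the function name check_not_do and the returned dictionary shape are assumptions, and the verb list is copied from the rule above.

```python
import re

# Verbs copied from the rule text; the real validator may use a different list.
BOUNDARY_VERBS = ("manage", "store", "execute", "replace", "modify", "own")


def check_not_do(skill_md: str) -> dict:
    """Hypothetical check of the 'What this does NOT do' section of a SKILL.md text."""
    # Capture everything between the section heading and the next '## ' heading.
    match = re.search(
        r"^## What this does NOT do\s*\n(.*?)(?=^## |\Z)",
        skill_md,
        flags=re.MULTILINE | re.DOTALL,
    )
    body = match.group(1) if match else ""
    items = [line[2:].strip() for line in body.splitlines() if line.startswith("- ")]
    # An item passes if it contains at least one boundary verb as a whole word.
    missing_verb = [
        item
        for item in items
        if not any(re.search(rf"\b{verb}\b", item, re.IGNORECASE) for verb in BOUNDARY_VERBS)
    ]
    return {
        "not-do-min-items": "pass" if len(items) >= 3 else "fail",
        "not-do-boundary-verbs": "pass" if items and not missing_verb else "warn",
        "items": items,
    }
```

The specificity and overlap checks need the Commands section as well and are left out of this sketch.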

2. Extend-vs-new-tool rubric

What: A five-point test for deciding whether a change belongs in the existing tool or requires a new one.

Why: Without this rule, every adjacent feature crawls into the nearest binary. Agents optimize for the cheapest modification, not the architecturally correct move.

Rule: A change belongs in the existing tool ONLY if it: (1) strengthens the same primary job, (2) uses the same trust boundary, (3) uses the same input world, (4) preserves the same output contract, (5) does not weaken "What this does NOT do." Otherwise, create a new tool.

Agent bias cues: Prefer new tool when crossing trust boundary. Prefer handoff over expansion. Prefer narrow outputs over adding optional fields forever. Prefer composition over capability absorption.

Validator: scope-pressure (warn) — flags when command count exceeds 10 or section count exceeds 8.
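
The rubric is an all-or-nothing test, which a short sketch makes explicit. The class ProposedChange and both function names are hypothetical. Only the five criteria and the two thresholds come from the text above.

```python
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class ProposedChange:
    # One flag per rubric point; names are illustrative, not part of the convention.
    strengthens_primary_job: bool
    same_trust_boundary: bool
    same_input_world: bool
    preserves_output_contract: bool
    keeps_not_do_intact: bool


def belongs_in_existing_tool(change: ProposedChange) -> bool:
    """All five points must hold; a single failure means a new tool."""
    return all(astuple(change))


def scope_pressure(command_count: int, section_count: int) -> bool:
    """Mirror the stated thresholds: more than 10 commands or more than 8 sections."""
    return command_count > 10 or section_count > 8
```

Answering the five questions is still a judgment call for the agent or reviewer. The sketch only shows that the answers combine with AND, never with a score or majority vote.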

3. Handoff contracts

What: An optional Handoffs section in SKILL.md declaring what the tool outputs, what kind of tool handles the next step, and what questions the tool refuses to answer.

Why: Tools know what they do but not where they stop. Without explicit handoff contracts, agents blur boundaries because blurred boundaries feel efficient.

Rule: Recommended for tools with 3+ commands. Each handoff declares output type, next tool category (not hardcoded name), and refused questions.

Example:

## Handoffs

- Output: deployment health JSON. Next: diagnostic tool for root cause.
- Output: security scan results. Next: enforcement tool for policy gates.
- Refused questions: why is it broken, should we fix it, is it safe to act.

Validator: handoff-section (warn)

4. Doctor output schema

What: A standard JSON schema for the output of a tool's doctor --format json command.

Why: Every tool invents its own doctor format. An agent running doctor on 5 tools gets 5 schemas. This breaks the cognition pipeline — agents cannot programmatically assess readiness.

Rule: Required fields: status (healthy/degraded/unavailable), checks (array with name/status/message). Recommended: dependencies, capabilities, readiness (0-1).

Provenance fields: Doctor output must also include binary provenance — the link from installed tool back to source. Without this, agents cannot resolve changelogs, verify versions, or reason about tool history from a binary alone. Required: version. Recommended: revision (commit SHA), source.repo (canonical repository URL).

Example:

{
  "status": "healthy",
  "version": "1.5.6",
  "revision": "abc1234",
  "source": {
    "repo": "github.com/ppiankov/enforcement-gate"
  },
  "checks": [
    {"name": "config", "status": "pass", "message": "valid"},
    {"name": "database", "status": "pass", "message": "reachable"}
  ],
  "readiness": 1.0
}

In repo mode, ancc validate reads CHANGELOG.md directly. In binary mode, the agent reads doctor output → resolves source.repo → fetches the changelog for the declared version. If provenance is missing, the agent treats changelog lookup as unknown.

Validator: doctor-output-valid (warn), doctor-provenance (warn)
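
A consuming agent could check the required fields and resolve the changelog target roughly as follows. This is a sketch under assumptions: the function names check_doctor_output and changelog_target are invented for illustration, and only the required fields named above are checked.

```python
import json
from typing import List, Optional, Tuple

VALID_STATUS = {"healthy", "degraded", "unavailable"}


def check_doctor_output(raw: str) -> List[str]:
    """Return a list of problems; an empty list means the required fields are present."""
    doc = json.loads(raw)
    problems = []
    if doc.get("status") not in VALID_STATUS:
        problems.append("status must be healthy, degraded, or unavailable")
    checks = doc.get("checks")
    if not isinstance(checks, list):
        problems.append("checks must be an array")
    else:
        for i, check in enumerate(checks):
            if not isinstance(check, dict):
                problems.append(f"checks[{i}] must be an object")
                continue
            for key in ("name", "status", "message"):
                if key not in check:
                    problems.append(f"checks[{i}] is missing {key}")
    if "version" not in doc:
        problems.append("version is required for provenance")
    return problems


def changelog_target(doc: dict) -> Optional[Tuple[str, str]]:
    """Return (repo, version) for a binary-mode changelog lookup, or None if unknown."""
    source = doc.get("source")
    repo = source.get("repo") if isinstance(source, dict) else None
    version = doc.get("version")
    if not repo or not version:
        return None  # provenance missing: treat changelog lookup as unknown
    return (repo, version)
```

Fetching the changelog itself is left out. The sketch stops at deciding whether the lookup is possible.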

5. Provenance classification

What: Standard provenance values for JSON output fields: observed (live from API), declared (from annotation/config), inferred (computed from other fields), unknown (source unclear or stale).

Why: Agents consuming tool output cannot distinguish a live replica count from a 6-month-old annotation. Acting on stale annotations as if they were live truth causes stupid decisions at industrial scale.

Rule: Use a top-level provenance map keyed by field path. Not fake confidence scores — real data lineage.

Example:

{
  "replicas": 3,
  "owner": "platform-team",
  "health": "degraded",
  "provenance": {
    "replicas": "observed",
    "owner": "declared",
    "health": "inferred"
  }
}

Validator: provenance-documented (warn)
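
On the consuming side, the provenance map lets an agent decide how much weight a field deserves. The sketch below assumes the map keys are field-path strings, as in the example above. The function names and the "act only on observed" policy are hypothetical.

```python
PROVENANCE_VALUES = {"observed", "declared", "inferred", "unknown"}


def provenance_of(output: dict, field_path: str) -> str:
    """Look up a field's provenance; a missing or unrecognized entry counts as 'unknown'."""
    provenance = output.get("provenance")
    if not isinstance(provenance, dict):
        return "unknown"
    value = provenance.get(field_path, "unknown")
    if isinstance(value, str) and value in PROVENANCE_VALUES:
        return value
    return "unknown"


def safe_to_act_on(output: dict, field_path: str) -> bool:
    """Hypothetical policy: act automatically only on live observations."""
    return provenance_of(output, field_path) == "observed"
```

Defaulting to unknown keeps a missing map from being read as live truth.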

6. Deprecation and pruning

What: Lifecycle conventions for removing features, splitting tools, and retiring commands.

Why: The ecosystem only has conventions for growth. Healthy ecosystems also need subtraction. Without pruning rules, features only ever get added, overlapping tools accumulate, and dead code persists.

Rule: SKILL.md supports a Deprecated section with per-command markers. When scope-pressure fires, the convention recommends splitting: identify command clusters by domain, create new SKILL.md for each, update handoff contracts, deprecate old commands.

Validator: deprecated-commands (warn)

7. Temporal contracts

What: Release hygiene conventions that make tool evolution machine-readable. CHANGELOG.md is required, entries must exist for every tagged version, and change classification must be explicit.

Why: ANCC conventions 1–6 describe what a tool does now. But agents operate across time — they upgrade tools, evaluate compatibility, and reason about whether behavior changed. Without temporal contracts, tools evolve silently: behavior changes without trace, agents cannot detect breaking changes, and "what version should I use" becomes a guess.

The pattern: Agents skip changelogs for the same reason they skip scope boundaries — nothing structural enforces them. Tagging is required (the release fails without it). Writing release notes is optional. So agents optimize: tag, ship, skip narrative. Every time.

Rule: CHANGELOG.md must exist at repo root. Version tags must map 1:1 to changelog entries — a release is invalid without a matching entry. Each entry must classify changes and declare the affected contract surface:

- Breaking: output schema changed, commands removed, behavior inverted. Must explicitly declare which commands, flags, or output fields are affected. Agents must re-read SKILL.md.
- Additive: new commands, new flags, new output fields. Existing contracts still hold. No action required by consuming agents.
- Behavioral: same interface, different behavior. Thresholds changed, defaults changed, error handling changed. Must declare what assumptions are no longer valid. Hardest to detect without explicit declaration, and the most dangerous category for agents.
- Internal: refactoring, dependencies, performance. No external contract change.

Forward compatibility: Agents must assume future versions may invalidate current assumptions unless explicitly declared compatible. The changelog is how that declaration is made.

Enforcement: Release workflows must verify that the matching changelog entry exists before creating a GitHub release. Release notes must be extracted from CHANGELOG.md, not auto-generated as empty "Full Changelog" links. CI gates the tag: no entry, no release.

Example:

## [1.5.6] - 2026-03-25

### Fixed (behavioral)

- Fallback relaunches target when child seccomp install fails
- Status pipe replaces exit code 126 as sentinel

### Added (additive)

- eBPF observe mode with 9 tracepoints
- enforcement-gate enforce CLI command

What this enables: Agents can refuse unsafe upgrades deterministically. Agents can choose compatible versions. Agents can reason about tool evolution without reading diffs. History becomes part of the contract.

Validator: changelog-exists (warn), changelog-version-entry (fail)
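
A release gate for this rule can be small. The sketch below assumes the heading layout shown in the example above ("## [version]" entries with the change class in parentheses on "### " subheadings). Both function names are hypothetical, and the real validator may parse more strictly.

```python
import re
from typing import Optional, Set

CHANGE_CLASSES = ("breaking", "additive", "behavioral", "internal")


def changelog_entry(changelog: str, version: str) -> Optional[str]:
    """Return the body of the '## [version]' entry, or None if there is no such entry."""
    pattern = rf"^## \[{re.escape(version)}\][^\n]*\n(.*?)(?=^## \[|\Z)"
    match = re.search(pattern, changelog, flags=re.MULTILINE | re.DOTALL)
    return match.group(1) if match else None


def classifications(entry: str) -> Set[str]:
    """Collect change classes named in parentheses on '### ' subsection headings."""
    found = set()
    for heading in re.findall(r"^### .*$", entry, flags=re.MULTILINE):
        for name in CHANGE_CLASSES:
            if f"({name})" in heading.lower():
                found.add(name)
    return found
```

A CI step could fail the release when changelog_entry returns None for the tag being released, and an upgrading agent could treat any entry whose classes include breaking or behavioral as requiring a re-read of SKILL.md.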

8. Active defense

What: Safety layers must tell the agent when they intervene — not silently fix things behind the agent's back. When a gate redacts, blocks, or modifies agent output, it must inject a structured alert into the agent's context so the agent knows its environment changed.

Why: Silent intervention is invisible corruption. If a proxy silently strips a secret from an outbound request, the agent doesn't know its output was modified. It might retry, assume success, or build on incomplete context. The agent is now operating on information it believes is complete but isn't. That is the same class of failure as a tool with no scope boundaries — silent assumptions that compound.

The principle: Agents must not operate on modified information without knowing it was modified. Every safety layer that intercepts, redacts, blocks, or transforms agent traffic must signal that intervention in-band — through the same channel the agent reads.

Rule: Safety layer alerts must be structured and machine-parseable, not free-form warning strings. The alert must include what was detected, what action was taken, and the severity. This allows agents to route the alert through their escalation chain rather than just printing it.

Example:

{
  "pastewatch_alert": {
    "type": "secret_detected",
    "category": "api_key",
    "action": "redacted",
    "severity": "critical",
    "message": "secret removed from outbound request — rotate immediately"
  }
}

What this prevents: Silent redaction that the agent doesn't know about. Retry loops caused by invisible modifications. Agents building on context they believe is complete but isn't. The difference between passive defense (silently fix) and active defense (fix and tell) is the difference between a system that hides problems and one that surfaces them.
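
On the agent side, consuming an in-band alert might look like the sketch below. Several things here are assumptions rather than part of the convention: that alerts sit under top-level keys ending in "_alert" (generalized from the single pastewatch_alert example), that the fields type, action, and severity carry the three required facts, and the escalation rule itself.

```python
from typing import List

# Assumed mapping of the rule onto the example's field names:
# type = what was detected, action = what was done, severity = how serious.
REQUIRED_ALERT_FIELDS = ("type", "action", "severity")


def extract_alerts(message: dict) -> List[dict]:
    """Return well-formed alert payloads found under top-level '*_alert' keys."""
    alerts = []
    for key, payload in message.items():
        if not key.endswith("_alert") or not isinstance(payload, dict):
            continue
        if all(field in payload for field in REQUIRED_ALERT_FIELDS):
            alerts.append(payload)
    return alerts


def should_escalate(alert: dict) -> bool:
    """Hypothetical routing rule: escalate anything marked critical."""
    return alert["severity"] == "critical"
```

Because the alert is structured, the agent can branch on action and severity instead of pattern-matching a warning string.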

9. The clamp ratchet

What: Conventions for restraint tools that are themselves agent-operable: gates, sandboxes, and scope-clamps that one agent applies to another. Such a tool's agent-facing operations must be one-way: an agent may tighten a restraint, but only a human, over an out-of-band channel, may widen or remove it.

Why: It is tempting to think a restraint tool should not be agent-native: "you don't let the thing being firewalled configure the firewall." But in a multi-agent fleet the restrainer and the restrained are different agents: an orchestrator dispatching a sub-agent onto a sensitive host clamps that sub-agent's reach at dispatch. Agent-nativeness is therefore required when an agent invokes the restraint and optional when a human does, because an agent cannot safely invoke a contract it cannot read.

The principle: Handing an agent the lever creates a new failure mode: a confused or prompt-injected orchestrator could be induced to loosen the restraint on its own sub-agent. So the lever must be a ratchet. This is the same asymmetry as deletion safety (adding is recoverable, removing is not), applied to authority: imposing a restriction is safe to automate; removing one is not.

Rule: A restraint tool's agent-facing contract must split operations by direction. Tightening — adding a protected path, narrowing an allowlist, lowering a budget — is agent-initiated and free. Widening or removing a restraint must route through an out-of-band consent path that the restrained process tree cannot see or forge, where a human answers. The SKILL.md "What this does NOT do" section must state the ratchet explicitly: an agent can clamp; it cannot un-clamp. A restraint tool that quietly does more than restrain — or whose restraint an agent can silently loosen — is the dangerous failure mode, and matters doubly for tools whose entire job is to bound others.

Example: A read-gate dispatched by an orchestrator:

# Orchestrator agent clamps a sub-agent at dispatch — tightening, free:
bulwark run --deny-all --allow '/var/log/**' -- sub-agent ...

# Widening the clamp is NOT an agent operation. It routes to a human
# over the out-of-band consent channel; the sub-agent tree cannot reach it:
bulwark run --consent socket --protect /etc -- sub-agent ...
#   -> operator answers allow/deny out-of-band; the agent only sees the result

What this enables: Least-privilege applied at dispatch by the dispatching agent, the agentic-era version of the principle. An architect can delegate authority to a fleet and clamp each agent's reach in the same gesture, because agents can only tighten the clamp and only a human can loosen it. That is what makes it safe to run many agents on sensitive hosts without one of them widening its own reach.
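
The direction split in the rule can be illustrated with a small in-process model. This is a sketch, not how bulwark or any real restraint tool is implemented. The Clamp class is hypothetical, and the human_consent callback only stands in for the out-of-band consent path. In a real system that path must live outside the restrained process tree, which a callback passed in by the caller does not guarantee.

```python
class Clamp:
    """Hypothetical allowlist whose agent-facing operation can only narrow it."""

    def __init__(self, allowed):
        self._allowed = frozenset(allowed)

    @property
    def allowed(self):
        return self._allowed

    def tighten(self, keep):
        """Agent-initiated: intersect with `keep`, so the set can never grow."""
        self._allowed = self._allowed & frozenset(keep)

    def widen(self, extra, human_consent):
        """Widening requires consent; `human_consent` stands in for the human's answer."""
        requested = frozenset(extra)
        if not human_consent(requested):
            raise PermissionError("widening denied: no human consent")
        self._allowed = self._allowed | requested
```

Because tighten is an intersection, no argument an agent supplies can enlarge the allowlist. Every path that grows it goes through widen and therefore through the consent check.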