The Problem: Decisions That Compound
When you build an AI agent that operates autonomously — fetching data, writing files, sending messages to peer agents, deploying infrastructure — every decision compounds. A quick choice about authentication sequencing today determines what middleware patterns exist six months from now. A “simple” API key implementation creates migration paths that constrain your Solid-OIDC rollout later.
Many agent frameworks lack structured consequence tracing. The model picks whatever seems reasonable, ships it, and you discover the knock-on effects when something breaks downstream.
We built something different: a structured adjudication system that traces the consequences of each option through 10 orders of effects before choosing. Here’s what that looks like in practice.
The Decision
Our psychology agent mesh needed public client authentication. The design document already specified a phased rollout:
- Phase 0 (current): Anonymous, rate-limited by IP
- Phase 1: API keys (bearer tokens, KV-backed)
- Phase 2: Solid-OIDC (DPoP token binding, WebID, pod storage)
- Phase 3: Tiered access combining both
The architecture had already settled what to build. The open question: how to sequence the implementation. Three options emerged:
| Option | Approach |
|---|---|
| A | Phase 1 first — API keys now, Solid-OIDC later |
| B | Phase 2 first — skip API keys, go straight to Solid-OIDC |
| C | Build auth middleware abstraction, implement API keys as first resolver, Solid-OIDC slots in later |
The Knock-On Framework
Each option gets traced through 10 orders of effects:
Order 1-2: Certain (direct, immediate effects)
Order 3: Likely (based on known dependencies)
Order 4-5: Possible (compounding; state assumptions)
Order 6: Speculative (honest about confidence)
Order 7: Structural (ecosystem/precedent effects)
Order 8: Horizon (normative/structural long-term)
Order 9: Emergent (INCOSE emergence — interaction of chains)
Order 10: Theory-revising (Popper — falsifies the
theory that justified the decision)
The confidence labels matter. By order 6, we explicitly label our analysis as speculative. By order 10, we ask whether the decision could invalidate the reasoning that led to it. Most decision frameworks stop at order 2-3 and call it analysis.
What the Analysis Revealed
Option A (API keys first) looked fast and safe, but order 6 surfaced a calcification risk: users build integrations around bearer tokens, creating migration friction when Solid-OIDC arrives. Pass 2 (the second trace) downgraded this risk — API keys remain valid in Phase 3 as a tier, so no migration needed. The risk proved real but not as severe as Pass 1 suggested.
Option B (Solid-OIDC first) looked architecturally pure, but order 5 revealed the gap problem: jumping from “no auth” to “full OIDC + DPoP + WebID + pods” with no stepping stone. Pass 2 upgraded this risk — debugging auth failures across CSS, CF Worker, and client requires visibility into all three systems simultaneously. No fallback mechanism.
Option C (parallel foundation) emerged as the synthesis. Order 7 identified the structural precedent: multi-scheme auth from the start means every future Worker gets the same resolveAuth() interface. The abstraction amounts to one function — not a framework, not overengineered.
The Two-Pass Calibration
The two-pass system catches a specific failure mode: anchoring to first impressions.
Pass 1 generates the initial trace. You notice which effects actually differentiate the options and which ones score the same across all options. Pass 2 re-traces with that calibration: downgrade risks that Pass 1 revealed as manageable, upgrade risks that turned out more serious than initially estimated.
In this case:
- Option A’s calcification risk → downgraded (keys coexist in Phase 3)
- Option B’s complexity gap → upgraded (no fallback, three-system debugging)
- Option C’s abstraction overhead → downgraded (one function, not a framework)
The Comparison Table
After knock-on analysis, build a table using only axes that differentiate the options:
| Delivery speed | Complexity now | Migration cost later | Fallback resilience | |
|---|---|---|---|---|
| A | Fast | Low | Low | High |
| B | Slow | High | None | Low |
| C | Fast-Medium | Low-Medium | None | High |
Option C wins on consensus — it matches A’s delivery speed while architecturally preparing for B. The comparison table makes this visible; the knock-on analysis provides the evidence that the table summarizes.
Resolution Protocol
Two resolution paths:
- Consensus: one option wins on a clear majority of axes → choose it
- Parsimony: no consensus → simplest option that meets requirements wins
Here, consensus applied. Option C provided the strongest composite profile — matching A’s delivery speed while eliminating future migration cost and maintaining fallback resilience.
Implementation
The cognitive architecture that drives this lives in docs/cognitive-triggers.md — a set of mechanical triggers that fire at specific points in the agent’s decision loop:
- T3 (Before Recommending) fires whenever the agent considers recommending an approach. 15 checks run, including domain classification, grounding verification, sycophancy check, parsimony comparison, and a Tier 1 evaluator proxy with adversarial self-framing.
- T14 (Structural Checkpoint) fires at every decision point and scans orders 7-10 from the knock-on framework.
- /adjudicate composes
/knock(single-option tracing) with 2-pass iteration and structured comparison.
The system self-documents: every adjudication writes its full analysis to docs/decisions/YYYY-MM-DD-{slug}.md with YAML frontmatter. The decision record persists independently of the conversation that produced it.
The Meta-Point
The adjudication system exists because AI agents face the same decision quality problems that human organizations face, but compressed in time. A human team might take weeks to discover that their authentication sequencing choice created migration debt. An autonomous agent discovers this in 30 seconds of knock-on tracing — if the tracing infrastructure exists.
The cognitive architecture doesn’t make the agent smarter. It makes the agent’s reasoning auditable, structurally repeatable, and calibrated. The same decision with the same inputs produces structurally comparable analysis — though LLM non-determinism means exact reproduction varies across runs — and the analysis explicitly labels its own confidence at each step.
That captures the value proposition: not better decisions through magic, but better decisions through structure.
Epistemic Flags
- The two-pass calibration claim (reduces anchoring bias) describes design intention, not a measured effect — no controlled experiment has isolated the calibration step’s contribution
- The auth sequencing case study demonstrates the protocol’s mechanics but cannot establish generalizability from a single instance
- LLM non-determinism limits reproducibility claims — the architecture aims for structural consistency, not identical outputs
The psychology agent’s cognitive architecture, including the adjudication protocol and knock-on framework, lives at safety-quotient-lab/psychology-agent under Apache 2.0. The trigger system (docs/cognitive-triggers.md), decision records (docs/decisions/), and skills (.claude/skills/) all transfer — see the cogarch adaptation guide for adoption instructions.