The Big Picture
SGD is best understood as one diagram: everything an AI coding agent needs to make a governed change — the nine governance layers (L0–L8) distilled into four buckets that feed the model's context window, the Cortex memory layer that feeds it across time, and a traceability spine that links every change back to intent and forward to executable tests.
The AI tools still write the code. SGD makes sure they have the context to write the right code — and proves, change by change, that they did.
The four buckets
📄 Docs — what goes into the model's context
The governance layers are machine-readable constraint documents. An agent reads them before it touches code, so its change stays inside the boundaries the organisation already agreed.
| Layer | Pins down |
|---|---|
| Vision | The shared intent — non-goals, success measures — that everything below cites. |
| Capability Model | What exists: every feature maps to a capability. |
| Design System | How we build things here — structure, naming, dependency rules. |
| Feature Specs | Business intent, data model, API contract, and Gherkin acceptance criteria. |
| ADRs | Cross-cutting architectural decisions, with rationale. |
| DAG Manifest | Impact analysis — "this change touches 14 features across 3 repos." |
| Change Protocol | The required sequence: LOCATE → READ → IMPACT → PROPOSE → IMPLEMENT → TEST → UPDATE. |
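The impact analysis that the DAG Manifest row describes can be pictured as a reverse reachability walk over a dependency graph. The sketch below is illustrative only: the manifest shape (each feature mapped to the features it depends on), the feature names, and the `impacted_features` helper are assumptions made for the example, not SGD's actual manifest format or API.

```python
from collections import deque

# Hypothetical manifest: feature -> features it depends on.
MANIFEST = {
    "auth/login": [],
    "billing/invoice": ["auth/login"],
    "billing/refund": ["billing/invoice"],
    "reports/revenue": ["billing/invoice", "billing/refund"],
    "profile/settings": ["auth/login"],
}


def impacted_features(manifest, changed):
    """Return every feature that transitively depends on `changed`."""
    # Invert the edges so we can walk from a dependency to its dependents.
    dependents = {name: [] for name in manifest}
    for feature, deps in manifest.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(feature)

    # Breadth-first walk over the inverted graph.
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for nxt in dependents.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen)


print(impacted_features(MANIFEST, "billing/invoice"))
```

In this toy graph a change to `billing/invoice` reaches `billing/refund` and `reports/revenue`, which is the kind of list an agent would read during the IMPACT step before proposing a change.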
🛠️ Tools — what acts and injects context
- RepoSentry GitHub App — connect repos or Jira, score every PR, run the dashboard.
- Cortex — persistent, evidence-weighted memory so sessions are never amnesiac; proven patterns elevate org-wide.
- MCP server — feeds the relevant spec, ADRs, and coherence rules straight into the agent's context.
- `sgd` plugin skills — `sgd-preflight`, `sgd-implement`, `tdd-workflow`, `pr-review`, `qa-audit`, and more, driving the workflow at the keyboard.
- `sgd-init` — wires CI workflows, enforcement docs, and a pre-push hook into any repo in under 60 seconds.
- Remediation PRs — the platform opens the fix automatically.
See AI Integration →
🎛️ Controls — what enforces the standard
- Coherence Score (0–100) — the headline number a CTO can show the board.
- Drift metrics — spec coverage, built coverage (capabilities with acceptance criteria), journey freshness, orphan rate, ADR cadence, cross-repo contract drift.
- Tiered approval gates — auto-pass routine changes; require human sign-off when a change crosses a capability boundary.
- Predictive trend alerts — catch the slow "boiling frog" drift before it becomes a crisis.
- Graceful degradation — advisory first, blocking when the team is ready, so governance never stalls velocity.
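To make the interplay between tiered approval gates and graceful degradation concrete, here is a minimal sketch of one possible decision rule. The `Change` type, the `gate_decision` function, the `"advisory"` / `"blocking"` mode names, and the returned status strings are all hypothetical; the text above only states that routine changes auto-pass, boundary-crossing changes need human sign-off, and enforcement starts advisory before it becomes blocking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    capabilities_touched: frozenset  # capabilities the PR modifies
    home_capability: str             # capability the PR is filed under


def gate_decision(change, enforcement="advisory"):
    """Return (status, needs_human_sign_off) for a change.

    Assumed modes: "advisory" reports a boundary crossing without failing
    the check; "blocking" fails the check until a human signs off.
    """
    if enforcement not in ("advisory", "blocking"):
        raise ValueError(f"unknown enforcement mode: {enforcement!r}")

    outside = change.capabilities_touched - {change.home_capability}
    if not outside:
        return ("pass", False)   # routine change: auto-pass
    if enforcement == "advisory":
        return ("warn", True)    # flag for review, do not block
    return ("fail", True)        # block until sign-off


routine = Change(frozenset({"billing"}), "billing")
crossing = Change(frozenset({"billing", "auth"}), "billing")

print(gate_decision(routine))
print(gate_decision(crossing))
print(gate_decision(crossing, enforcement="blocking"))
```

Keeping the mode as a single parameter means a team could move from advisory to blocking without changing how boundary crossings are detected.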
🪝 Hooks — where SGD plugs in
- GitHub App webhooks → governance checks on every PR, as native status checks.
- Pre-push git hook → local enforcement before code leaves the laptop.
- CI workflows → enforcement in the pipeline.
- Cortex ingestion → commits, CI results, and PR reviews feed the knowledge graph.
- Daily snapshot job → drift metrics recorded over time for trend analysis.
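One way the daily snapshots could feed a predictive trend alert is a straight-line fit over recent scores, extended forward to estimate when a threshold would be crossed. This is a sketch under stated assumptions: the functions `slope_per_day` and `days_until_breach`, the sample scores, and the threshold of 70 are invented for illustration, and SGD's real alerting logic may differ.

```python
def slope_per_day(scores):
    """Least-squares slope of evenly spaced daily scores."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two snapshots")
    mean_x = (n - 1) / 2
    mean_y = sum(scores) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(scores))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def days_until_breach(scores, threshold):
    """Estimate days until the score reaches `threshold`.

    Extends the fitted slope from the latest snapshot. Returns None when
    the trend is flat or improving, and 0.0 when already at or below it.
    """
    slope = slope_per_day(scores)
    if slope >= 0:
        return None
    latest = scores[-1]
    if latest <= threshold:
        return 0.0
    return (latest - threshold) / -slope


# Hypothetical week of coherence-score snapshots, losing half a point a day.
week = [82.0, 81.5, 81.0, 80.5, 80.0, 79.5, 79.0]
print(slope_per_day(week))
print(days_until_breach(week, threshold=70.0))
```

A drop of half a point a day is easy to miss in any single PR, which is why the estimate works from the recorded history rather than the latest value alone.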
See PR Checks →
The traceability spine — and where the tests come from
Every governed change rides a single, machine-verifiable chain. The crucial middle of that chain is the executable tests, generated from the spec's acceptance criteria — this is what turns a specification from a document nobody reads into a gate the code must pass.
Specs are the source of the tests, not a parallel artefact that drifts from them. This follows the Living Documentation principle (BDD / Cyrille Martraire): the Gherkin scenarios in a feature spec are the acceptance criteria, and they become the executable tests that guard the change.
How SGD keeps that link honest:
- Acceptance criteria are written as Gherkin in every Layer 3 feature spec — `Given / When / Then` scenarios, not prose.
- Those scenarios drive automated tests. The `tdd-workflow` skill turns each scenario into a failing test first, then implements the minimum code to make it pass. Test Generation auto-derives test cases directly from a spec's acceptance criteria so no scenario ships without a test behind it.
- CI runs them on every PR as a governance check — a PR that doesn't satisfy its spec's scenarios fails the check.
- Coverage becomes a drift metric. `built_coverage` tracks the percentage of built capabilities that actually have Gherkin acceptance criteria attached. A capability that shipped without tests shows up as drift — automatically.
- `qa-audit` verifies, after the fact, that the merged PR really meets the linked spec's criteria, and posts the evidence on the PR.
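The coverage metric described in the list above reduces to a simple ratio. The following sketch assumes a hypothetical record shape (a dict with `name`, `built`, and `scenarios` keys) and a hypothetical convention of reporting 100% when nothing is built yet; only the definition itself, built capabilities with Gherkin acceptance criteria divided by all built capabilities, comes from the text.

```python
def built_coverage(capabilities):
    """Percentage of built capabilities with at least one Gherkin scenario."""
    built = [c for c in capabilities if c["built"]]
    if not built:
        return 100.0  # assumed convention: nothing built, nothing drifting
    covered = sum(1 for c in built if c["scenarios"] > 0)
    return 100.0 * covered / len(built)


def drifting(capabilities):
    """Names of built capabilities that have no acceptance criteria."""
    return [c["name"] for c in capabilities if c["built"] and c["scenarios"] == 0]


# Hypothetical capability inventory.
CAPABILITIES = [
    {"name": "login", "built": True, "scenarios": 4},
    {"name": "refunds", "built": True, "scenarios": 0},
    {"name": "invoicing", "built": True, "scenarios": 7},
    {"name": "exports", "built": False, "scenarios": 0},
]

print(round(built_coverage(CAPABILITIES), 1))
print(drifting(CAPABILITIES))
```

Capabilities that are planned but not yet built are left out of the denominator, so the metric only penalises work that shipped without criteria.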
The result: a regulator (or your future self) asking "how do we know this change does what was asked?" gets a one-click answer — the QD-NN decision, the spec, the passing BDD scenarios, and the commit that landed them.
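The chain behind that answer can be thought of as one record per change, with a check that no link is missing. The sketch below is an assumption-laden illustration: the `TraceRecord` type, its field names, the spec path, and the commit hash are invented, and `QD-NN` is kept as the same placeholder the text uses rather than a real decision identifier.

```python
from dataclasses import dataclass, field


@dataclass
class TraceRecord:
    decision_id: str                                # e.g. the QD-NN decision
    spec_path: str                                  # feature spec that states the intent
    scenarios: dict = field(default_factory=dict)   # scenario title -> passed?
    commit_sha: str = ""                            # commit that landed the change


def missing_links(record):
    """List the links of the chain that are absent or not yet satisfied."""
    gaps = []
    if not record.decision_id:
        gaps.append("decision")
    if not record.spec_path:
        gaps.append("spec")
    if not record.scenarios:
        gaps.append("scenarios")
    elif not all(record.scenarios.values()):
        gaps.append("passing scenarios")
    if not record.commit_sha:
        gaps.append("commit")
    return gaps


complete = TraceRecord(
    decision_id="QD-NN",                     # placeholder, as in the text
    spec_path="specs/billing/refund.md",     # hypothetical path
    scenarios={"Refund within 30 days": True},
    commit_sha="0a1b2c3",                    # hypothetical commit
)
incomplete = TraceRecord(
    decision_id="QD-NN",
    spec_path="specs/billing/refund.md",
    scenarios={"Refund within 30 days": False},
)

print(missing_links(complete))
print(missing_links(incomplete))
```

An empty list means every link from decision to commit is present, which is the condition an audit query would look for.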
See Traceability →
Why it helps
For the business — CTOs, compliance, the board
- One coherence score across every repo shows whether the AI-tooling spend is paying off.
- Audit-ready evidence for FCA Consumer Duty, ISO 27001, SOC 2 — the intent → spec → tests → commit chain is recorded as work happens, never reconstructed.
- Drift caught early by predictive alerts, instead of discovered as expensive legacy debt.
- Proven patterns spread org-wide automatically via Cortex, without central coordination.
For developers — and their AI agents
- Every session starts with structured, evidence-weighted context instead of starting cold.
- Specs give the agent scope boundaries, so it stops reinventing auth, tables, and error handling.
- Tests are generated from the spec, so "done" has a concrete, executable definition.
- DAG impact analysis warns you before you break 14 downstream features.
- Graceful degradation means governance guides without gating your velocity.
Next steps
The 6 Governance Layers
The full governance model, from capability model to change protocol.
AI Integration
How specs, ADRs, and coherence rules reach Claude Code and Cursor.
Traceability
The full chain from business intent to deployed, tested code.
Your First Feature Spec
Write the Gherkin acceptance criteria that drive the tests.