Position Paper · v2.2 · March 2026

The Agency Paradox: Governed Autonomy as Infrastructure

Author Stephen Sweeney
Version 2.2
License CC BY 4.0
Audience Engineering Leaders, Platform Architects, AI Infrastructure Teams

This is the current version. It supersedes v1.0 (December 2025), which remains available as the founding manifesto.

Revision notes (v2.1 → v2.2): The composition problem — raised in academic response to v2.1 — is addressed in a new section and reflected in a ninth implementation criterion. The core thesis is unchanged. The architecture is more complete. Thanks to Dr. George Walder for the question that made this version necessary.


The Central Question, Restated

The original framing asked: who is in command when autonomous systems act? That question remains correct. But the answer has sharpened.

The answer is not a more disciplined human. The answer is a better-designed system.

Human discipline is necessary but not sufficient. Discipline alone does not scale reliably, does not enforce boundaries by itself, and does not produce sufficient evidence for audit. Discipline fails when the human is unavailable, distracted, or simply outnumbered by the rate at which autonomous systems act. The question of command is not a behavioral question. It is an architectural one.

The architectural answer is this:

Autonomous agents require a control plane that is separate from the acting system, deterministic in its authority, and continuously observable by a human principal. This control-plane pattern is not new. Distributed software systems have already developed much of it — through declarative state, policy enforcement, reconciliation, and audit. The discipline has not yet been applied to autonomous AI agents. That is the gap.


The Failure Mode of Behavioral Governance

Behavioral governance places the human in the role of manual enforcement. The human defines scope, verifies output, manages risk, and intervenes when something goes wrong. These are the right instincts. But they have a structural problem: they require the human to be faster, more consistent, and more available than the system being governed.

Autonomous systems do not operate at human speed. They propose actions continuously, across multiple contexts, often in parallel. An agent generating hundreds of decisions per hour cannot be governed by a human reviewing each one. An agent running overnight cannot be governed by a human who is asleep. A network of agents cannot be governed by a single operator’s attention.

The failure modes that emerge from behavioral governance are predictable: scope overreach that isn’t caught until damage is done; architectural drift accumulating across sessions; decisions that cannot be reconstructed because no systematic record exists; safety invariants that hold when the human is attentive and collapse when they aren’t.

These are not failures of the human. They are failures of the architecture.

The conclusion is not that human oversight is unimportant. It is that human oversight must be focused, attributed, and supported by systems that enforce constraints independently of any individual’s presence or attention.


What Cloud-Native Systems Already Solved

The discipline required to govern autonomous systems at scale is not theoretical. It has been developed, proven, and deployed — in the domain of distributed software infrastructure.

Cloud-native systems did not eliminate operational disorder, but they did establish the dominant architectural pattern for governing machine-speed software systems: declared intent, bounded reconciliation, policy at the boundary, and auditable state transitions.

The problem they solved was structurally similar: how do you govern complex autonomous systems — controllers, operators, schedulers — that act continuously, at machine speed, across distributed infrastructure, without requiring a human to approve every action?

The answer has four components:

Declarative desired state. Authority over what a system should do is expressed as a declaration, not a command. The system is told what it should be, not what to do. This separates the statement of intent from the execution of intent, and makes authority legible, reviewable, and auditable before any action occurs.

Continuous reconciliation. The system continuously compares its current state against declared desired state and acts to close the gap. Autonomous action is always bounded — the system can only move toward the declared state. Deviation is detected automatically, not by human inspection.

Policy enforcement at the boundary. Before any action reaches the system, it passes through an admission boundary that evaluates it against declared policy. The decision is deterministic. The record is automatic. No action reaches the system without policy evaluation.

Observable audit trail. Every action, every decision, every state change is recorded as a first-class artifact of the system’s operation — queryable, exportable, and sufficient to reconstruct what happened and why.

These four components together produce a governed system: one that acts autonomously within declared constraints, enforces its own boundaries, and produces continuous evidence that governance is occurring. This is the established pattern. It is precedent for what follows — not a solved problem, but a proven discipline.
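The four components are easiest to see in miniature. The Python sketch below shows a toy control plane: a declared desired state, a reconciler that can only move current state toward it, and an audit log that records every transition. All names are illustrative and drawn from no real control plane; this is a sketch of the pattern, not an implementation of it.

```python
from dataclasses import dataclass, field

@dataclass
class DesiredState:
    """Declared intent: what the system should be, not what to do."""
    replicas: int

@dataclass
class ControlPlane:
    desired: DesiredState
    current_replicas: int = 0
    audit_log: list = field(default_factory=list)

    def reconcile(self) -> None:
        """Compare current state against declared state and close the gap.
        Autonomous action is bounded: the loop can only move toward the
        declared state, one step at a time."""
        while self.current_replicas != self.desired.replicas:
            step = 1 if self.current_replicas < self.desired.replicas else -1
            self.current_replicas += step
            # Every state transition is recorded as a first-class artifact.
            self.audit_log.append(("scale", step, self.current_replicas))

cp = ControlPlane(desired=DesiredState(replicas=3))
cp.reconcile()
# current_replicas has converged on the declaration; the audit log is
# sufficient to reconstruct how it got there.
```

The important property is that authority lives in the declaration, not in the loop: the reconciler has no opinion of its own about where the system should end up.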


The Missing Application

Cloud-native governance was designed for software systems. It governs infrastructure. Autonomous AI agents are not merely infrastructure components. They are reasoning actors that generate proposals under uncertainty.

This distinction matters architecturally. A Kubernetes controller is autonomous within a bounded reconciliation loop; an AI agent is autonomous within an open-ended reasoning loop. That difference is why infrastructure governance is a precedent, not a sufficient answer.

Infrastructure controllers reconcile toward declared state. AI agents generate proposals based on reasoning, instruction, and context — proposals whose range is not bounded by a schema, whose implications are not always predictable, and whose failure modes include not just service crashes but misaligned reasoning, scope overreach, and unbounded action in response to open-ended instruction.

The existing cloud-native governance model handles the infrastructure layer. It does not handle the reasoning layer — the moment before an action reaches infrastructure, when the agent has decided to propose something and that proposal must be evaluated against constitutional constraints before it is allowed to proceed.

Between the agent’s reasoning and the infrastructure’s execution, there is a layer that does not yet exist as infrastructure: a constitutional governance layer that evaluates proposed actions against declared authority, produces a deterministic verdict, records the evaluation, and either permits or denies the action — before it reaches the system, before any side effect occurs, independent of the human’s availability.

This layer has requirements that distinguish it from infrastructure policy enforcement:

It must evaluate intent, not just parameters. Infrastructure admission control checks whether a container image is signed or a resource request is within quota. Constitutional governance must evaluate whether a proposed action is within the agent’s authorized scope, consistent with declared constraints, and appropriate given the current operational context.

It must be substrate-independent. An AI agent runs on mobile devices, desktop machines, edge systems, and cloud infrastructure. The constitutional layer must govern the agent’s actions regardless of substrate — the same authority, the same audit trail, the same laws, everywhere.

It must produce compositional evidence. Not just a log of what happened, but a record of how each governance decision was reached — which rules were evaluated, in what order, with what verdict, and why. This is the requirement for replay, for audit, and for demonstrating to a third party that governance actually occurred.

It must be fail-closed. If the constitutional layer is unreachable, ambiguous, or incomplete, the default must be denial. A system that permits action when governance is uncertain is not a governed system.
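A minimal sketch of the fail-closed property, in Python. The function names, the policy shape, and the `allowed_actions` key are all hypothetical; the point is only that every path through uncertainty resolves to denial before any side effect can occur.

```python
from enum import Enum
from typing import Optional

class Verdict(Enum):
    PERMIT = "permit"
    DENY = "deny"

def evaluate(proposal: dict, policy: Optional[dict]) -> Verdict:
    """Fail-closed constitutional check (illustrative sketch).
    Missing policy, incomplete evidence, or an unauthorized action
    all resolve to DENY; only a complete, in-scope proposal passes."""
    if policy is None:                       # governance layer unreachable
        return Verdict.DENY
    try:
        allowed = policy["allowed_actions"]  # declared scope
        action = proposal["action"]
    except KeyError:                         # evidence incomplete
        return Verdict.DENY
    if action not in allowed:                # outside authorized scope
        return Verdict.DENY
    return Verdict.PERMIT
```

The asymmetry the paper names is visible in the structure: a false denial costs a retry, so denial is the only safe default when the evaluation itself is uncertain.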


The Composition Problem

The requirements above describe what governance must do for a single proposed action evaluated in isolation. But AI agents do not act in isolation. They act in sequences — each action creating context for the next, each permitted action narrowing or expanding what the next proposal will be.

This creates a governance problem that individual-action evaluation cannot solve: individually compliant proposals that are collectively problematic.

Action A is permitted. It falls within authorized scope, passes all constitutional checks, and is individually unambiguous. Action B is permitted for the same reasons. Action C likewise. But the sequence of A, B, and C together constitutes scope creep that no single evaluation would have caught — the agent has moved, in three individually governed steps, well outside the intent of the original authorization. The governance layer evaluated each proposal correctly and still failed to govern the session.

This is the forward composition problem. It is a known challenge in formal verification — sequentially valid steps that compose to an invalid state — and it is a genuine weakness in any governance architecture that evaluates proposals without session memory.

The inverse problem is equally important, and in practice surfaces first: multiple rejections that must be recognized and recorded together as a pattern, not as isolated denials.

A single denial is a governance event. An agent proposes an action that violates Law 4; the governance layer denies it; the session continues. But when the same agent produces five denials against the same constraint boundary within a single session, the pattern means something different from any individual denial. The agent may be probing the boundary systematically. The constitutional policy may be miscalibrated for the task at hand. A reasoning loop may have formed that will continue generating non-compliant proposals until the underlying condition changes. Recording five independent denials loses this signal entirely.

The architectural response is composition-aware governance. The governance layer must maintain session-level state — a running record of what has been permitted and denied within the current operating context — against which each new proposal is evaluated. This requires two capabilities that individual-action evaluation does not provide:

Composition tracing. For each governance decision, record not just the verdict but the full evaluation trace: which Laws fired, in what order, with what individual verdict, in the context of what has been permitted and denied before it in this session. The trace is the unit of evidence, not the individual verdict. A sequence of traces is what makes sessions replayable, not a list of per-action decisions.

Pattern detection. Across decisions within a session or window, the governance layer must be capable of detecting composition patterns that individual-action evaluation cannot see: the permit sequence that constitutes incremental scope creep; the denial cluster that signals boundary probing; the deferred-decision pattern that suggests the agent is systematically routing around a constraint. When a pattern crosses a significance threshold, it escalates to the principal — not as a denial, but as a composition signal.

These two capabilities together resolve both directions of the composition problem. Composition tracing makes individual decisions reconstructable in sequence. Pattern detection makes session-level governance behavior visible.
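The two capabilities can be sketched together. In the Python sketch below, a session record accumulates traces, and a detector surfaces a denial cluster against one constraint boundary as a single composition signal. The `Trace` fields, the threshold, and the "Law 4" label are illustrative assumptions, not the paper's concrete schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Trace:
    """One governance decision with its evaluation context (illustrative)."""
    action: str
    verdict: str   # "permit" | "deny"
    law: str       # which constraint produced the verdict

@dataclass
class SessionRecord:
    traces: List[Trace] = field(default_factory=list)
    denial_threshold: int = 3   # escalation threshold (assumed value)

    def record(self, trace: Trace) -> None:
        self.traces.append(trace)

    def composition_signals(self) -> List[str]:
        """Detect denial clusters against the same constraint boundary.
        A single denial is routine; a cluster is a session-level signal
        that escalates to the principal."""
        per_law: Dict[str, int] = {}
        for t in self.traces:
            if t.verdict == "deny":
                per_law[t.law] = per_law.get(t.law, 0) + 1
        return [law for law, n in per_law.items()
                if n >= self.denial_threshold]

session = SessionRecord()
for _ in range(3):
    session.record(Trace("write_config", "deny", "Law 4"))
# Three denials against Law 4 surface as one composition signal,
# not as three unrelated events.
```

Note that the unit of evidence here is the trace, not the verdict: the detector reads the same record that makes the session replayable.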

The critical architectural property: the governance layer must evaluate each proposal against both the constitutional policy and the session-level composition record. A proposal that is individually compliant but compositionally anomalous is not unambiguously permitted. It may be the right next action. It may be the third step of a problematic sequence. The governance layer must be capable of distinguishing between them — and must escalate the second case to the principal rather than resolving it silently.

This does not make governance probabilistic. The constitutional policy remains deterministic. What changes is that the inputs to the deterministic evaluation include session state, not just the isolated proposal. The same action may receive different verdicts at different points in a session — not because the policy changed, but because what has already been permitted and denied is a legitimate input to the governance decision.
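This point is subtle enough to deserve a concrete illustration. In the Python sketch below, the evaluator is a pure function of (proposal, policy, session record); the per-session cap and all names are invented for illustration. The same proposal is permitted three times and then denied, with no change to the policy.

```python
from typing import List

def evaluate(action: str, policy: dict, session: List[str]) -> str:
    """Deterministic verdict whose inputs include session state
    (illustrative sketch). Same inputs, same verdict, always."""
    if action not in policy["allowed"]:
        return "deny"
    # Composition constraint: a per-session cap on this action class.
    if session.count(action) >= policy["session_cap"]:
        return "deny"   # compositionally anomalous, though individually in scope
    return "permit"

policy = {"allowed": {"write_file"}, "session_cap": 3}
session: List[str] = []
verdicts = []
for _ in range(4):
    v = evaluate("write_file", policy, session)
    verdicts.append(v)
    if v == "permit":
        session.append("write_file")
# verdicts: three permits, then a denial. The policy never changed;
# the session record did, and it is a legitimate input.
```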


The Architecture of Governed Autonomy

From these requirements, a reference architecture emerges. It is not tied to any specific implementation. It is a pattern.

┌─────────────────────────────────────────────────────────────────┐
│  PRINCIPAL LAYER                                                │
│  The human authority. Defines constitutional policy.            │
│  Receives observability. Reviews escalated decisions —          │
│  including composition signals. Governs the system that         │
│  governs every action.                                          │
└─────────────────────────────┬───────────────────────────────────┘

┌─────────────────────────────▼───────────────────────────────────┐
│  OPERATOR INTERFACE                                             │
│  Presents governance state to the principal.                    │
│  Surfaces escalations, decision history, and                    │
│  composition signals. Preserves attribution and                 │
│  role separation. Cannot modify constitutional policy.          │
└─────────────────────────────┬───────────────────────────────────┘

┌─────────────────────────────▼───────────────────────────────────┐
│  CONSTITUTIONAL GOVERNANCE LAYER                                │
│  The control plane. Evaluates every proposed action             │
│  against declared authority and session composition             │
│  before execution. Deterministic. Substrate-independent.        │
│  Fail-closed. Produces composition traces. Detects              │
│  patterns. Operates independently of operator interface.        │
└─────────────────────────────┬───────────────────────────────────┘

┌─────────────────────────────▼───────────────────────────────────┐
│  KNOWLEDGE SUBSTRATE                                            │
│  The structured operational context the agent is                │
│  authorized to navigate: goals, constraints, prior              │
│  decisions, and current state relevant to action.               │
│  Updated by experience. Maintained by the principal.            │
└─────────────────────────────┬───────────────────────────────────┘

┌─────────────────────────────▼───────────────────────────────────┐
│  ACTING SYSTEM                                                  │
│  The autonomous agent. Proposes actions through a               │
│  finite, explicit action surface. Executes permitted            │
│  actions. Cannot modify its own constitutional                  │
│  constraints. Operates within declared authority                │
│  or not at all.                                                 │
└─────────────────────────────────────────────────────────────────┘

Each layer has a precisely defined responsibility and a precisely defined boundary. The key structural properties:

The governance layer is separate from the acting system. The agent cannot modify, bypass, or disable governance. It is not a library the agent calls; it is an independent system the agent’s actions pass through. Without this separation, governance is advisory, not authoritative.

The principal governs the system, not every action. The principal defines constitutional policy and reviews escalated decisions — including composition signals. The governance layer handles routine enforcement. The principal handles the boundary cases the governance layer correctly escalates.

The knowledge substrate is structured context, not a document archive. It is the operational map the agent navigates: what the principal intends, what constraints are non-negotiable, what has been tried and why it succeeded or failed. An unmaintained substrate is worse than none — it provides stale context with false confidence.

The audit trail is compositional, not narrative. Every governance decision records which rules were evaluated, in what order, with what individual verdict. Every session produces a sequence of traces from which the full operating history can be reconstructed. Decisions are replayable — individually and in sequence.


The Role of the Principal, Redefined

The earlier framing of this argument described the engineer as steward — governing AI with discipline, verifying output, managing risk. That framing was correct for its moment. It described individual practice in the absence of governance infrastructure.

The role of the principal in a properly governed autonomous system is different. It is not smaller — it is more precisely located.

The principal declares constitutional authority: the constraints that govern the agent’s action, the scope within which it may act autonomously, and the conditions that require escalation to human judgment — including composition signals that no individual-action evaluation would have surfaced.

The principal maintains the knowledge substrate, reviews escalated decisions, interprets operational evidence, and evolves constitutional policy based on what operational history reveals — including what composition patterns have emerged across sessions.

This is not the traditional engineer writing code. It is not the prompt engineer approving output. It is the governor of a governed system — present at the boundaries that matter, not at every execution.

The pilot does not manually operate every flight system. They command the automated systems, monitor their operation, and intervene at the boundaries where human judgment is required. The automation does not reduce the pilot’s authority. It focuses it.


Properties of a Valid Implementation

Any system claiming to implement governed autonomy should satisfy the following properties. These are evaluative criteria, not implementation prescriptions.

Separation of governance from execution. The governance layer must be an independent system. The agent cannot modify, bypass, or disable it. Failure of the operator interface must not weaken governance enforcement.

Determinism. Given the same action proposal, the same constitutional policy, and the same session state, the governance layer must produce the same verdict. Governance that is probabilistic or influenced by the agent’s own reasoning is not governance — it is negotiation. Note that session state is a legitimate input to deterministic evaluation; the same action may receive different verdicts at different points in a session if the session composition record is part of the evaluation.

Fail-closed default. When governance state is uncertain — the layer is unreachable, evidence is incomplete, policy is ambiguous — the default verdict must be denial. The cost of a false denial is delay. The cost of a false permit may be irreversible.

Compositional evidence. The audit trail must record the full evaluation trace for each decision — which rules fired, in what order, with what individual result — not merely the verdict. This is required for replay, for audit, and for demonstrating governance to third parties.

Composition-aware evaluation. The governance layer must maintain session-level state and evaluate each proposal against both the constitutional policy and the accumulated session record. Individually compliant proposals that are compositionally anomalous must be detected and escalated rather than silently permitted. Multiple denials against the same constraint boundary within a session must be recorded as a pattern, not as independent events. This property resolves both directions of the composition problem: the forward problem of sequentially compliant but collectively invalid action sequences, and the inverse problem of denial patterns that constitute governance signals.

Substrate independence. The constitutional layer must apply the same authority model regardless of where the agent runs. The laws do not change because the substrate is a mobile device instead of a cloud instance.

Principal observability. The principal must be able to reconstruct the full operational picture from governance evidence alone — including the sequence of decisions, not just individual verdicts. If they cannot, the governance system is not recording enough.

Knowledge substrate currency. The architecture must include a mechanism for the principal to maintain the knowledge substrate. Staleness is not a minor degradation — it is a governance failure of a different kind.

Bounded execution surface. The governed system must expose a finite, explicit action surface that governance can evaluate. If the action surface is undefined or unbounded, governance becomes interpretive rather than enforceable. The action surface is a contract — proposals outside it are denied by definition.
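The last criterion, the bounded execution surface, has the simplest possible sketch: the surface as a finite enumeration, with membership as the first governance check. The action names below are hypothetical; the property being shown is that anything off the surface is denied by definition, not by interpretation.

```python
from enum import Enum

class Action(Enum):
    """The finite, explicit action surface: a contract between the
    governed agent and the governance layer (names illustrative)."""
    READ_FILE = "read_file"
    WRITE_FILE = "write_file"
    RUN_TESTS = "run_tests"

def within_surface(proposal: str) -> bool:
    """A proposal either names an action on the surface or is invalid
    before any policy evaluation even begins."""
    return proposal in {a.value for a in Action}
```

An unbounded surface would force governance to interpret free-form intent; a bounded one reduces the first check to set membership, which is trivially deterministic and trivially auditable.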


The DevOps Parallel

DevOps did not invent the practices it synthesized. Continuous integration, infrastructure as code, and declarative state management existed before DevOps named them. What DevOps contributed was the synthesis: a coherent philosophy connecting practices that had been developed separately, and the recognition that software delivery and infrastructure operation are the same discipline viewed from different angles.

Cloud-native systems instantiated that philosophy as infrastructure. The control loop is a governance pattern. Admission control is a constitutional boundary. GitOps is audit with rollback. These are governance primitives that happened to be built for containers.

If DevOps unified software creation and software operation under one discipline, governed autonomy seeks to unify AI reasoning and operational authority under one control-plane model. Platform engineering has the governance primitives. AI engineering has the acting systems. The connection between them — a substrate-independent, constitutional control plane for autonomous agents, composition-aware by design — is the discipline waiting to be named.

Cloud-native systems normalized the idea that complex autonomous systems should be governed by declared authority and continuous reconciliation. AI systems require the same discipline. The pattern is proven. The application is the work.


Conclusion

The question of who commands autonomous systems is not answered by human discipline alone. It is answered by architecture.

The architecture is: a constitutional governance layer, separate from the acting system, deterministic in authority, fail-closed by default, composition-aware in its evaluation, producing full traces of every decision and patterns across sequences of decisions, operating independently of human availability, with the principal governing the system and reviewing the escalations it correctly surfaces.

This is not a new idea in its components. It is a new synthesis in its application.

The implementations that derive from this architecture will differ in language, substrate, and scope. They will share the structural properties defined here. They will be evaluable against the criteria defined here. And they will give the principal what behavioral governance alone cannot: authority that scales, evidence that holds across individual actions and sequences of actions, and a governed agent that earns trust not through compliance promises but through demonstrated, auditable, composition-aware operation.

The practical question is no longer whether agents can act. It is whether we will build the systems that make their action governable — not just action by action, but across the sequences that reveal intent.


The agent acts. The system governs. The principal commands.


This document establishes the conceptual foundation for governed autonomy as an infrastructure discipline. Implementations should treat this as their architectural reference — the thesis from which their design derives, not the other way around.