Chapter 2. Reference Architecture for a Safe Agent
1. Why a Reference Diagram Is Useful at All
After the first chapter, you should already have the intuition for why "just a smart agent" quickly becomes fragile. The next step is more grounded: understand which layers the system should contain if you want it to survive longer than one presentation.
That is what a reference architecture is for. Not as dogma, but as a baseline.
2. Top View: What the Platform Consists Of
Below is a diagram that works well as a starting map.
Reference diagram of a safe agent platform
    flowchart TB
        user["User / API / Event"] --> interface["Interface layer"]
        interface --> identity["Identity & session layer"]
        identity --> control["Agent control plane"]
        control --> runtime["Orchestration runtime"]
        runtime --> cognition["Cognition plane"]
        runtime --> memory["Memory & knowledge plane"]
        runtime --> tools["Tool execution plane"]
        runtime --> telemetry["Telemetry & eval plane"]
        tools --> external["External systems / MCP / SaaS"]
        memory --> stores["Vector DB / KB / profile memory"]
        control --> approval["Approval / policy / quotas"]
        telemetry --> audit["Traces / metrics / audit"]

| Layer | Role | Why it is mandatory |
|---|---|---|
| Interface layer | Chat, API, event ingestion, webhooks | Separates user channels from the runtime |
| Identity and session layer | User, service account, thread, tenant, request scope | Needed for IAM, audit, and isolation |
| Agent control plane | Policies, approvals, model policies, tool catalog, quotas | This is where controllability lives |
| Orchestration runtime | Workflow graph, planner, router, subagents, checkpoints | This is where the task is executed |
| Cognition plane | Model router, prompt compiler, structured outputs, validators | The model becomes a component, not the center of the world |
| Memory and knowledge plane | Short-term state, long-term memory, retrieval, summaries | Limits context sprawl |
| Tool execution plane | Sandboxed tools, MCP servers, connectors, side-effect isolation | Reduces blast radius |
| Telemetry and eval plane | Traces, metrics, logs, datasets, graders, regression gates | Makes quality measurable |
3. What Happens at Ingress
An incoming request should not simply "fly into the model." First it has to become a normalized event with context.
The minimally useful set:
- tenant_id;
- principal;
- risk class;
- access policy reference;
- session identifier;
- trace id.
In short: the request should enter the system not as a message, but as a managed execution context.
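A minimal sketch of that normalization step, in Python. The class name `ExecutionContext`, the function `normalize_request`, the field names, and the defaults are all illustrative assumptions, not part of any particular framework; the only idea taken from the text is that the fields listed above travel together with the message.

```python
from dataclasses import dataclass
import uuid


@dataclass(frozen=True)
class ExecutionContext:
    """Normalized ingress event; fields follow the minimal set listed above."""
    tenant_id: str
    principal: str
    risk_class: str
    policy_ref: str
    session_id: str
    trace_id: str
    message: str


def normalize_request(raw: dict) -> ExecutionContext:
    """Turn a raw channel payload into a managed execution context.

    Requests without a tenant or principal are rejected rather than guessed.
    """
    missing = [key for key in ("tenant_id", "principal") if not raw.get(key)]
    if missing:
        raise ValueError(f"missing ingress fields: {', '.join(missing)}")
    return ExecutionContext(
        tenant_id=raw["tenant_id"],
        principal=raw["principal"],
        # Assumed default: an unclassified request is treated as high risk.
        risk_class=raw.get("risk_class", "high"),
        policy_ref=raw.get("policy_ref", "default"),
        session_id=raw.get("session_id") or str(uuid.uuid4()),
        # A fresh trace id per request so every later step can be correlated.
        trace_id=str(uuid.uuid4()),
        message=raw.get("message", ""),
    )
```

The frozen dataclass is a deliberate choice in this sketch: once the context is built at ingress, later layers can read it but cannot quietly change the tenant or the risk class.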
4. Why the Control Plane Matters More Than It Seems
This is the layer most demo architectures are missing.
It governs not how intelligent the system is, but what the system has the right to do:
- which models may be used;
- which tools are available;
- which approvals are required;
- which limits apply;
- which rules are active in dev, staging, and prod.
Example of policy-as-code:
    agent_policy:
      model_access:
        allowed_models: ["gpt-5.4", "gpt-5-mini", "claude-sonnet"]
        deny_if_contains: ["pci_raw", "prod_secrets"]
      tools:
        read_kb:
          approval: none
        jira_create_ticket:
          approval: manager
        prod_db_write:
          approval: security_and_owner
          allowed_environments: ["staging"]
      runtime:
        max_steps: 24
        max_parallel_subagents: 4
        require_checkpoint_every_step: true
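To make the policy less abstract, here is one way such a document could be evaluated for a single tool call. This is a sketch under assumptions: the `tools` section is written as a Python dict to avoid depending on a YAML parser, the names `Decision` and `evaluate_tool` are hypothetical, and the rule that a tool without `allowed_environments` may run anywhere is a guess about the intended semantics.

```python
from dataclasses import dataclass

# The tools section of the policy above, as a plain dict.
POLICY = {
    "tools": {
        "read_kb": {"approval": "none"},
        "jira_create_ticket": {"approval": "manager"},
        "prod_db_write": {
            "approval": "security_and_owner",
            "allowed_environments": ["staging"],
        },
    },
}


@dataclass
class Decision:
    allowed: bool
    requires_approval: bool
    reason: str


def evaluate_tool(policy: dict, tool_name: str, environment: str) -> Decision:
    """Default-deny check of one tool call against the policy."""
    rule = policy["tools"].get(tool_name)
    if rule is None:
        # Anything not in the catalog is denied, never silently allowed.
        return Decision(False, False, f"tool {tool_name!r} is not in the catalog")
    environments = rule.get("allowed_environments")
    # Assumption: no allowed_environments key means "any environment".
    if environments is not None and environment not in environments:
        return Decision(False, False, f"{tool_name!r} is not allowed in {environment!r}")
    approval = rule.get("approval", "none")
    return Decision(True, approval != "none", f"approval: {approval}")
```

With this sketch, `evaluate_tool(POLICY, "prod_db_write", "prod")` is denied outright, while the same call in `"staging"` is allowed but flagged as requiring approval.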
5. Where Execution Lives
The orchestration runtime chooses the execution pattern:
- deterministic workflow for regulated scenarios;
- routed workflow for branch selection;
- plan-and-execute for long tasks;
- planner + subagents for independent subtasks;
- HITL interrupts for high-risk operations.34
The best property of a good runtime is unexpectedly simple: it should be boring.
The more "magic" it contains, the harder it is to predict cost, behavior, and failure.
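What "boring" can look like in code: a loop with a hard step budget and a checkpoint after every step, matching the `max_steps` and `require_checkpoint_every_step` settings from the policy example. The function `run_bounded` and the `Step` signature are hypothetical, chosen only to illustrate the idea.

```python
from typing import Callable, Optional

# Assumed contract: a step takes the current state and returns the next
# state, or None when the task is finished.
Step = Callable[[dict], Optional[dict]]


def run_bounded(step: Step, state: dict, max_steps: int,
                checkpoint: Callable[[int, dict], None]) -> dict:
    """Run `step` until it signals completion or the step budget runs out."""
    for index in range(max_steps):
        next_state = step(state)
        if next_state is None:
            return state
        state = next_state
        # Persist after every completed step so a crash can resume from here.
        checkpoint(index, state)
    # Exceeding the budget is an explicit failure, not a silent stop.
    raise RuntimeError(f"step budget of {max_steps} exhausted")
```

Nothing here is clever, and that is the point: cost is bounded by `max_steps`, and the failure mode is a single, visible exception.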
6. Why the Cognition Plane Is Not the Same as One Model
It is more useful to think in terms of a set of manageable components than in terms of "we have one powerful model":
- planner model;
- executor model;
- classifier/extractor model;
- structured output validator;
- fallback model.
That helps with quality, cost, and graceful degradation.15
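A small sketch of how a validator and a fallback model might fit together. Models are represented as plain callables from prompt to text, which is an assumption made for illustration; `validate_ticket`, `call_with_fallback`, and the expected `title` and `priority` keys are hypothetical as well.

```python
import json
from typing import Callable, Sequence

# Assumed interface: a model is any callable that maps a prompt to raw text.
Model = Callable[[str], str]


def validate_ticket(raw: str) -> dict:
    """Structured-output validator: require a JSON object with the expected keys."""
    data = json.loads(raw)
    if not isinstance(data, dict) or not {"title", "priority"} <= set(data):
        raise ValueError("expected an object with 'title' and 'priority'")
    return data


def call_with_fallback(prompt: str, models: Sequence[Model],
                       validate: Callable[[str], dict]) -> dict:
    """Try each model in order and return the first output that validates."""
    errors = []
    for model in models:
        try:
            return validate(model(prompt))
        except Exception as exc:  # model failure or invalid output
            errors.append(repr(exc))
    raise RuntimeError("all models failed: " + "; ".join(errors))
```

The caller never sees unvalidated text: either a model produced output that passed the validator, or the whole call fails with the collected errors.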
7. Why Memory, Knowledge, and Tools Should Not Be Mixed Together
It is useful to separate at least three independent things:
- short-term state: the current execution state of the flow;
- long-term memory: facts, profiles, episodes;
- retrieval: access to external knowledge.2
And separately, a fourth:
- tool execution: real actions in the outside world.
That separation feels bureaucratic only until the first serious incident.
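One way to keep the four concerns apart in code is to give each its own type, so that a step cannot reach a tool through the memory handle or store execution state in the knowledge base. The names below (`FlowState`, `LongTermMemory`, `Retriever`, `ToolGateway`, `StepDependencies`) and their method signatures are illustrative assumptions, not an existing API.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class FlowState:
    """Short-term state: lives and dies with one execution."""
    step: int = 0
    scratch: dict = field(default_factory=dict)


class LongTermMemory(Protocol):
    """Facts, profiles, episodes that outlive a single run."""
    def recall(self, principal: str) -> list[str]: ...
    def remember(self, principal: str, fact: str) -> None: ...


class Retriever(Protocol):
    """Read-only access to external knowledge."""
    def search(self, query: str, limit: int) -> list[str]: ...


class ToolGateway(Protocol):
    """The only handle that can cause side effects in the outside world."""
    def call(self, tool_name: str, payload: dict) -> dict: ...


@dataclass
class StepDependencies:
    """What one runtime step receives; each plane is a separate handle."""
    state: FlowState
    memory: LongTermMemory
    knowledge: Retriever
    tools: ToolGateway
```

Because each handle has a narrow interface, an audit of "what can write to the outside world" reduces to looking at who holds a `ToolGateway`.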
8. What the Request Path Through the System Looks Like
Below is a more "live" view of how a request moves through the key control points.
Request path through the main control points
    sequenceDiagram
        autonumber
        participant U as User
        participant I as Interface
        participant C as Control plane
        participant R as Runtime
        participant T as Tool gateway
        participant A as Audit
        U->>I: Request
        I->>C: Normalization + principal + tenant + risk
        C->>R: Allowed execution context
        R->>C: Request for model/tool action
        C->>T: Policy check / approval / quotas
        T-->>R: Allowed result
        R->>A: Trace + step metadata
        R-->>U: Response

9. A Minimal Code Principle
If you want to see it in a very compact form, here is a practical template:
    from dataclasses import dataclass


    @dataclass
    class ToolRequest:
        tool_name: str
        actor_id: str
        risk_class: str
        payload: dict


    def execute_tool(request: ToolRequest, policy_engine, approval_service, gateway):
        # The policy layer decides first; the model's proposal carries no authority.
        decision = policy_engine.evaluate(request)
        if not decision.allowed:
            raise PermissionError(decision.reason)
        # Assumed to block or raise until a human has signed off.
        if decision.requires_approval:
            approval_service.require_human_signoff(request, decision)
        # Only the gateway touches the outside world.
        return gateway.call(request.tool_name, request.payload)
The point here is simple: the model may propose an action, but the right to execute it lives in the gateway and policy layer, not in the model.
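To see the template behave, it can be exercised with stubs. The sketch below repeats the two definitions so that it runs on its own; `StubPolicy`, `RecordingGateway`, and the rule "only low-risk calls pass" are invented purely for the demonstration.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class ToolRequest:
    tool_name: str
    actor_id: str
    risk_class: str
    payload: dict


def execute_tool(request, policy_engine, approval_service, gateway):
    # Same flow as the template above, repeated so this block is self-contained.
    decision = policy_engine.evaluate(request)
    if not decision.allowed:
        raise PermissionError(decision.reason)
    if decision.requires_approval:
        approval_service.require_human_signoff(request, decision)
    return gateway.call(request.tool_name, request.payload)


class StubPolicy:
    """Hypothetical rule: low-risk calls pass, everything else is denied."""
    def evaluate(self, request):
        allowed = request.risk_class == "low"
        return SimpleNamespace(allowed=allowed, requires_approval=False,
                               reason="" if allowed else "risk class too high")


class RecordingGateway:
    """Records which tools were actually invoked."""
    def __init__(self):
        self.calls = []

    def call(self, tool_name, payload):
        self.calls.append(tool_name)
        return {"status": "ok"}


gateway = RecordingGateway()
# No approval service is needed here because the stub never requires approval.
ok = execute_tool(ToolRequest("read_kb", "alice", "low", {}), StubPolicy(), None, gateway)

try:
    execute_tool(ToolRequest("prod_db_write", "alice", "high", {}), StubPolicy(), None, gateway)
    denied = False
except PermissionError:
    denied = True
```

After this runs, `gateway.calls` contains only `"read_kb"`: the denied request never reached the gateway, which is the property the template is meant to guarantee.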
10. Practical Takeaway
A good agent platform stands on several boring but very valuable things:
- explicit ingress context;
- a control plane;
- a separate runtime;
- a separate tool gateway;
- traces and evals;
- approvals where they are needed.
Once those are in place, you can move calmly to the more nervous topic: where exactly the security perimeter sits in such a system.