Chapter 2. Reference Architecture for a Safe Agent

1. Why a Reference Diagram Is Useful at All

After the first chapter, you should already have the intuition for why "just a smart agent" quickly becomes fragile. The next step is more grounded: understand which layers the system should contain if you want it to survive longer than one presentation.

That is what a reference architecture is for. Not as dogma, but as a baseline.

2. Top View: What the Platform Consists Of

Below is a diagram that works well as a starting map.

Reference diagram of a safe agent platform

flowchart TB
    user["User / API / Event"] --> interface["Interface layer"]
    interface --> identity["Identity & session layer"]
    identity --> control["Agent control plane"]
    control --> runtime["Orchestration runtime"]
    runtime --> cognition["Cognition plane"]
    runtime --> memory["Memory & knowledge plane"]
    runtime --> tools["Tool execution plane"]
    runtime --> telemetry["Telemetry & eval plane"]
    tools --> external["External systems / MCP / SaaS"]
    memory --> stores["Vector DB / KB / profile memory"]
    control --> approval["Approval / policy / quotas"]
    telemetry --> audit["Traces / metrics / audit"]
| Layer | Role | Why it is mandatory |
| --- | --- | --- |
| Interface layer | Chat, API, event ingestion, webhooks | Separates user channels from the runtime |
| Identity and session layer | User, service account, thread, tenant, request scope | Needed for IAM, audit, and isolation |
| Agent control plane | Policies, approvals, model policies, tool catalog, quotas | This is where controllability lives |
| Orchestration runtime | Workflow graph, planner, router, subagents, checkpoints | This is where the task is executed |
| Cognition plane | Model router, prompt compiler, structured outputs, validators | The model becomes a component, not the center of the world |
| Memory and knowledge plane | Short-term state, long-term memory, retrieval, summaries | Limits context sprawl |
| Tool execution plane | Sandboxed tools, MCP servers, connectors, side-effect isolation | Reduces blast radius |
| Telemetry and eval plane | Traces, metrics, logs, datasets, graders, regression gates | Makes quality measurable |

3. What Happens at Ingress

An incoming request should not simply "fly into the model." First it has to become a normalized event with context.

The minimally useful set:

  • tenant_id;
  • principal;
  • risk class;
  • access policy reference;
  • session identifier;
  • trace id.

In short: the request should enter the system not as a message, but as a managed execution context.
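
A minimal sketch of that normalization step follows. The `ExecutionContext` type, the `normalize_request` function, and the raw payload keys are hypothetical names chosen for illustration; the "treat unknown risk as high" default is likewise an assumption, not a rule from this chapter.

```python
from dataclasses import dataclass
import uuid


@dataclass(frozen=True)
class ExecutionContext:
    """Hypothetical normalized event carrying the fields listed above."""
    tenant_id: str
    principal: str
    risk_class: str
    policy_ref: str
    session_id: str
    trace_id: str
    message: str


def normalize_request(raw: dict) -> ExecutionContext:
    """Turn a raw inbound payload into a managed execution context."""
    # Identity fields must already be established by the ingress channel;
    # fail closed instead of guessing when any of them is absent.
    missing = [k for k in ("tenant_id", "principal", "session_id") if not raw.get(k)]
    if missing:
        raise ValueError(f"missing required fields: {missing}")

    return ExecutionContext(
        tenant_id=raw["tenant_id"],
        principal=raw["principal"],
        # Assumption: an unclassified request is treated as high risk.
        risk_class=raw.get("risk_class", "high"),
        policy_ref=raw.get("policy_ref", "default"),
        session_id=raw["session_id"],
        # Reuse an upstream trace id if present, otherwise mint one.
        trace_id=raw.get("trace_id") or uuid.uuid4().hex,
        message=raw.get("message", ""),
    )
```

The frozen dataclass is a deliberate choice in this sketch: once the context is built at ingress, downstream layers can read it but cannot quietly rewrite the tenant or principal.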

4. Why the Control Plane Matters More Than It Seems

This is the layer most demo architectures are missing.

It is responsible not for intelligence, but for the system's right to act:

  • which models may be used;
  • which tools are available;
  • which approvals are required;
  • which limits apply;
  • which rules are active in dev, staging, and prod.

Example of policy-as-code:

agent_policy:
  model_access:
    allowed_models: ["gpt-5.4", "gpt-5-mini", "claude-sonnet"]
    deny_if_contains: ["pci_raw", "prod_secrets"]
  tools:
    read_kb:
      approval: none
    jira_create_ticket:
      approval: manager
    prod_db_write:
      approval: security_and_owner
      allowed_environments: ["staging"]
  runtime:
    max_steps: 24
    max_parallel_subagents: 4
    require_checkpoint_every_step: true
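
To make the policy less abstract, here is one way the `tools` section could be evaluated. This is a sketch under assumptions: the policy is represented as a plain dict mirroring the YAML above, and `Decision` and `evaluate_tool` are hypothetical names, not part of any particular policy engine.

```python
from dataclasses import dataclass

# The "tools" section of the policy above, mirrored as a plain dict.
POLICY = {
    "tools": {
        "read_kb": {"approval": "none"},
        "jira_create_ticket": {"approval": "manager"},
        "prod_db_write": {
            "approval": "security_and_owner",
            "allowed_environments": ["staging"],
        },
    },
}


@dataclass(frozen=True)
class Decision:
    allowed: bool
    requires_approval: bool
    reason: str


def evaluate_tool(policy: dict, tool_name: str, environment: str) -> Decision:
    """Decide whether a tool may run in the given environment."""
    rule = policy["tools"].get(tool_name)
    if rule is None:
        # Tools outside the catalog are denied by default.
        return Decision(False, False, f"tool {tool_name!r} is not in the catalog")

    envs = rule.get("allowed_environments")
    if envs is not None and environment not in envs:
        return Decision(False, False, f"{tool_name!r} is not allowed in {environment!r}")

    approval = rule.get("approval", "none")
    return Decision(True, approval != "none", f"approval: {approval}")
```

With this policy, `evaluate_tool(POLICY, "prod_db_write", "prod")` is denied outright, while the same tool in `"staging"` is allowed but still requires approval.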

5. Where Execution Lives

The orchestration runtime chooses the execution pattern:

  • deterministic workflow for regulated scenarios;
  • routed workflow for branch selection;
  • plan-and-execute for long tasks;
  • planner + subagents for independent subtasks;
  • HITL interrupts for high-risk operations.34

The best property of a good runtime is unexpectedly simple: it should be boring.

The more "magic" it contains, the harder it is to predict cost, behavior, and failure.
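
What "boring" can look like in code: an explicit graph of named steps, a hard step budget, and a checkpoint after every step. The sketch below is illustrative only; `run`, `Step`, `StepBudgetExceeded`, and the two example steps are hypothetical, and the default budget of 24 simply echoes `max_steps` from the policy example.

```python
from typing import Callable, Optional

# A step takes the current state and returns (new_state, next_step_name or None).
Step = Callable[[dict], tuple[dict, Optional[str]]]


class StepBudgetExceeded(RuntimeError):
    """Raised when a run needs more steps than the budget allows."""


def run(graph: dict[str, Step], start: str, state: dict, max_steps: int = 24):
    """Execute named steps until one returns None as the next step."""
    checkpoints = []
    current: Optional[str] = start
    steps_taken = 0
    while current is not None:
        if steps_taken >= max_steps:
            raise StepBudgetExceeded(f"stopped at {current!r} after {max_steps} steps")
        # Each step works on a copy, so a failed step cannot corrupt prior state.
        state, next_step = graph[current](dict(state))
        checkpoints.append((current, dict(state)))
        current = next_step
        steps_taken += 1
    return state, checkpoints


def classify(state: dict):
    state["kind"] = "refund"
    return state, "respond"


def respond(state: dict):
    state["reply"] = f"handling {state['kind']}"
    return state, None


final_state, checkpoints = run({"classify": classify, "respond": respond}, "classify", {})
```

Nothing here is clever, and that is the point: the cost ceiling is the step budget, the behavior is the graph, and every failure has a checkpoint right before it.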

6. Why the Cognition Plane Is Not the Same as One Model

Instead of thinking "we have one powerful model", it is more useful to think of the cognition plane as a set of manageable components:

  • planner model;
  • executor model;
  • classifier/extractor model;
  • structured output validator;
  • fallback model.

That helps with quality, cost, and graceful degradation.15
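
A small sketch of how a validator and a fallback model fit together. Everything named here is hypothetical: models are stand-in callables that take a prompt and return raw text, and `validate_ticket` checks an invented ticket schema. Real model clients would replace the two stub functions.

```python
import json
from typing import Callable

Model = Callable[[str], str]  # assumption: prompt in, raw text out


def validate_ticket(raw: str) -> dict:
    """Structured output validator: accept only the expected JSON shape."""
    data = json.loads(raw)  # raises a ValueError subclass on malformed JSON
    if not isinstance(data, dict) or not isinstance(data.get("title"), str):
        raise ValueError("expected an object with a string 'title'")
    if data.get("priority") not in {"low", "medium", "high"}:
        raise ValueError("priority must be low, medium, or high")
    return data


def extract_with_fallback(prompt: str, models: list[tuple[str, Model]], validate):
    """Try each model in order; return the first output that passes validation."""
    errors = []
    for name, model in models:
        try:
            return name, validate(model(prompt))
        except (ValueError, TimeoutError) as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


def chatty_primary(prompt: str) -> str:
    # Stub that ignores the requested format.
    return "Sure! Here is the ticket you asked for."


def small_fallback(prompt: str) -> str:
    # Stub that returns well-formed structured output.
    return '{"title": "Reset password", "priority": "low"}'


used, ticket = extract_with_fallback(
    "Extract a ticket from: user cannot log in",
    [("primary", chatty_primary), ("fallback", small_fallback)],
    validate_ticket,
)
```

In this run the primary stub fails validation and the fallback answers, which is graceful degradation in miniature: the caller receives a valid structure and learns which component produced it.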

7. Why Memory, Knowledge, and Tools Should Not Be Mixed Together

It is useful to separate at least three independent things:

  • short-term state: the current execution state of the flow;
  • long-term memory: facts, profiles, episodes;
  • retrieval: access to external knowledge.2

And separately, a fourth:

  • tool execution: real actions in the outside world.

That separation feels bureaucratic only until the first serious incident.
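
One way to keep the four concerns apart is to give each its own narrow type. The classes below are hypothetical in-memory stand-ins, sketched only to show the boundaries; a real system would put a state store, a database, a search index, and a sandboxed gateway behind them.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class FlowState:
    """Short-term state: lives only for the current run."""
    step: int = 0
    scratch: dict = field(default_factory=dict)


class LongTermMemory:
    """Facts, profiles, episodes. Keyed by tenant so reads cannot cross tenants."""

    def __init__(self) -> None:
        self._facts: dict[tuple[str, str], Any] = {}

    def remember(self, tenant_id: str, key: str, value: Any) -> None:
        self._facts[(tenant_id, key)] = value

    def recall(self, tenant_id: str, key: str) -> Optional[Any]:
        return self._facts.get((tenant_id, key))


class Retriever:
    """Read-only access to external knowledge (naive substring search here)."""

    def __init__(self, documents: list[str]) -> None:
        self._documents = list(documents)

    def search(self, query: str) -> list[str]:
        return [d for d in self._documents if query.lower() in d.lower()]


class ToolGateway:
    """The only component allowed to cause side effects; every call is recorded."""

    def __init__(self) -> None:
        self.audit_log: list[tuple[str, dict]] = []

    def call(self, tool_name: str, payload: dict) -> dict:
        self.audit_log.append((tool_name, payload))
        return {"status": "ok"}  # stub result; a real gateway would invoke the tool
```

Because only `ToolGateway` can act on the outside world, an incident review has one place to look for what actually happened, and a memory bug cannot turn into an unintended action.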

8. What the Request Path Through the System Looks Like

Below is a more "live" view of how a request moves through the key control points.

Request path through the main control points

sequenceDiagram
    autonumber
    participant U as User
    participant I as Interface
    participant C as Control plane
    participant R as Runtime
    participant T as Tool gateway
    participant A as Audit

    U->>I: Request
    I->>C: Normalization + principal + tenant + risk
    C->>R: Allowed execution context
    R->>C: Request for model/tool action
    C->>T: Policy check / approval / quotas
    T-->>R: Allowed result
    R->>A: Trace + step metadata
    R-->>U: Response

9. A Minimal Code Principle

If you want to see it in a very compact form, here is a practical template:

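from dataclasses import dataclass
from typing import Any


@dataclass
class ToolRequest:
    tool_name: str
    actor_id: str
    risk_class: str
    payload: dict[str, Any]


def execute_tool(request: ToolRequest, policy_engine, approval_service, gateway):
    # The policy layer decides first; the model's proposal alone is never enough.
    decision = policy_engine.evaluate(request)
    if not decision.allowed:
        raise PermissionError(decision.reason)

    # High-risk actions wait for a human. This call is expected to block or
    # raise until sign-off is granted, so execution never continues without it.
    if decision.requires_approval:
        approval_service.require_human_signoff(request, decision)

    # Only the gateway touches the outside world.
    return gateway.call(request.tool_name, request.payload)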

The point here is simple: the model may propose an action, but the right to execute it belongs to the policy layer and the gateway, not to the model.

10. Practical Takeaway

A good agent platform stands on several boring but very valuable things:

  • explicit ingress context;
  • a control plane;
  • a separate runtime;
  • a separate tool gateway;
  • traces and evals;
  • approvals where they are needed.

Once those are in place, you can move calmly to the more nervous topic: where exactly the security perimeter sits in such a system.