Secure AI Agent Architecture

A modern, practical book for engineers and platform leaders who want to build production systems rather than demo agents: systems that are observable, controllable, and safe.

This book takes Dmitry Vikulin's article on reliable AI agents as a starting point and expands it into a platform architecture with governance, policy enforcement, human approval, observability, evals, and operational controls.


Where to invest first

The interactive chart illustrates a quick rule of thumb: in most real systems, control, safety, and observability deserve attention earlier than maximum autonomy.

What is inside

Architectural patterns: workflow, router, planner, subagents, human-in-the-loop.
Security: IAM, policy-as-code, prompt injection defenses, sandboxing, data boundaries.
Reliability: checkpoints, idempotency, retries, graceful degradation.
Transparency: traces, metrics, evals, regression control.
Platform design: gateways, shared runtime, knowledge plane, tool plane, control plane.
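
As one illustration of the reliability items above, idempotency and retries can be combined in a few lines. This is a minimal sketch, not the book's implementation: the `run_once` helper, the in-memory `_results` store, and the backoff parameters are all hypothetical names chosen for this example, and a production system would persist results in a durable store.

```python
import time
from typing import Callable, Dict

# Hypothetical in-memory idempotency store; a real system would use a durable one.
_results: Dict[str, object] = {}


def run_once(
    key: str,
    call: Callable[[], object],
    attempts: int = 3,
    base_delay: float = 0.0,
) -> object:
    """Run `call` at most once per idempotency key, retrying on failure."""
    # Idempotency: a repeated request with the same key returns the stored result.
    if key in _results:
        return _results[key]

    last_error = None
    for attempt in range(attempts):
        try:
            _results[key] = call()
            return _results[key]
        except Exception as exc:  # broad catch is acceptable for a sketch
            last_error = exc
            # Exponential backoff between attempts, skipped after the last one.
            if attempt < attempts - 1:
                time.sleep(base_delay * (2 ** attempt))

    raise RuntimeError(f"{key} failed after {attempts} attempts") from last_error
```

Keying the stored result on a caller-supplied identifier means a retried or replayed step does not repeat its side effect, which is what makes retries safe to apply automatically.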

The main idea

The most common mistake in agent systems is to start with autonomy instead of controllability. Practice from Anthropic, OpenAI, LangGraph, and enterprise platforms from Google points to a more stable path:

Build a predictable workflow first.
Add autonomy locally and measurably.
Route all risky actions through policy, approval, and tracing.
Maintain quality through evals and telemetry, not through promises about the model.
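
The third step, routing risky actions through policy, approval, and tracing, can be sketched as a single gate that every tool call passes through. Everything here is an assumption made for illustration: the `Gate` class, the `RISKY_ACTIONS` and `BLOCKED_ACTIONS` sets, and the `approver` callback are hypothetical, and a real platform would load rules from policy-as-code and send trace records to a tracing backend.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical risk tiers; a real platform would load these from policy-as-code.
RISKY_ACTIONS = {"send_email", "delete_record"}
BLOCKED_ACTIONS = {"drop_database"}


@dataclass
class Gate:
    # Human-in-the-loop hook (assumed): returns True when a person approves.
    approver: Callable[[str, dict], bool]
    # Stand-in for a tracing backend: one record per attempted action.
    trace: List[Dict[str, str]] = field(default_factory=list)

    def run(self, action: str, args: dict, tool: Callable[..., object]) -> object:
        # Policy: blocked actions are denied, risky ones need approval.
        if action in BLOCKED_ACTIONS:
            decision = "deny"
        elif action in RISKY_ACTIONS:
            decision = "allow" if self.approver(action, args) else "deny"
        else:
            decision = "allow"

        # Tracing: record the decision whether or not the tool runs.
        self.trace.append({"action": action, "decision": decision})

        if decision == "deny":
            raise PermissionError(f"{action} denied by policy")
        return tool(**args)
```

Because the decision is recorded before the tool runs, denied attempts are as visible in the trace as successful calls.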

Why MkDocs was selected

MkDocs + Material for MkDocs remains a pragmatic choice in 2026 for a Python-first documentation book: it is actively maintained, fast to build, and fits naturally with a Markdown workflow and a Python toolchain based on uv.¹²³

If the project later needs richer UI components and MDX-style composition, Astro Starlight is the most likely upgrade path. For the first public version, however, the Python-first stack is simpler and more reliable.⁴

Sources behind this architecture

Original framing for the agent building blocks: vikulin.ai
"Workflow before agents": Anthropic, Building effective agents
Durable execution, memory, and HITL: LangGraph docs
Tracing and agent evals: OpenAI docs
Risk management and security controls: NIST AI RMF, OWASP Prompt Injection Cheat Sheet