Sources¶
Below is the main set of primary sources used by the current version of the book. Access date: April 22, 2026.
How to read this list
It is useful to separate these sources not only by topic, but also by the strength of support they provide:
- Normative frame: NIST, OWASP, CISA, and related documents that define stable governance contours;
- Platform practice: OpenAI, Anthropic, LangGraph, Google Cloud, Microsoft, and similar material showing how teams assemble those contours in production;
- HCI, HITL, and human oversight: sources that show where automation fails and how to keep a human in the loop;
- Research frontier: newer papers on memory, observability, verifier design, and multi-agent reliability.
If you need the strongest base for Parts I, V, and VIII, start with the normative frame and the HCI/HITL layer. If you need current engineering practice, read the platform docs and recent research, but always check publication dates: this layer moves quickly.
Normative Frameworks and Governance Contours¶
- OWASP, LLM Prompt Injection Prevention Cheat Sheet
- NIST, AI RMF 1.0
- NIST, AI RMF: Generative AI Profile
- NIST, SP 800-53 Rev. 5: Security and Privacy Controls for Information Systems and Organizations
- NIST, SP 800-218A: Secure Software Development Practices for Generative AI and Dual-Use Foundation Models
- NIST, Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations
- CISA, Artificial Intelligence
Agent Architecture and Platform Patterns¶
- Dmitry Vikulin, Architecture of Reliable AI Agents
- Anthropic, Building Effective AI Agents
- Anthropic, Harness design for long-running application development
- OpenAI, A practical guide to building agents (PDF)
- OpenAI, Agents SDK
- OpenAI Agents SDK, Sandbox Agents, Sandbox Concepts, Sandbox clients, and Agent memory
- OpenAI, Agent Builder
- LangGraph, Overview
- LangGraph, Durable execution
- LangGraph, Persistence
- LangGraph, Memory overview
- LangChain, Multi-agent
- Google Cloud, Achieve agentic productivity with Vertex AI Agent Builder
- Google Cloud, More ways to build, scale, and govern AI agents with Vertex AI Agent Builder
- Google Cloud, Vertex AI Agent Builder overview
- Google Cloud Architecture Center, Multi-agent AI system in Google Cloud
- Microsoft Azure Architecture Center, AI Agent Orchestration Patterns
- Cloudflare, Build Agents on Cloudflare
- Cloudflare Agents SDK, Store and sync state and Schedule tasks
- Cloudflare Agents SDK, Human in the Loop and WebSockets
- Cloudflare, Build and deploy Remote Model Context Protocol (MCP) servers to Cloudflare
Observability, Evals, and Verifier Design¶
- OpenAI, Agent evals
- OpenAI, Trace grading
- OpenAI, Background mode
- OpenAI, Using tools
- OpenAI, Structured model outputs
- Microsoft Learn, Observability for Generative AI and agentic AI systems
- Google Cloud, Observability and monitoring
- AWS, Introducing stateful MCP client capabilities on Amazon Bedrock AgentCore Runtime
- arXiv, The Art of Building Verifiers for Computer Use Agents
- GitHub, microsoft/fara
HCI, HITL, and Human Oversight¶
- Microsoft Research, Guidelines for Human-AI Interaction
- LangChain Deep Agents, Human-in-the-loop
- LangGraph, Interrupts
- OpenReview, The Illusion of Consensus in Human-Centered Interactive AI
- Microsoft Learn, Agentic AI adoption maturity model
Governance, Security, and Operational Assurance¶
- Google Cloud, How Google secures AI Agents
- Google Cloud, Recommended AI Controls framework
- Google Cloud, Introducing Agent Sandbox
- Google Research, Security Assurance in the Age of Generative AI
- Google Research, Securing the AI Software Supply Chain
- Google Research, An Introduction to Google’s Approach for Secure AI Agents
- Google Research, Identifying and Mitigating the Security Risks of Generative AI
- Anthropic, Claude Code Security
- Anthropic, Agentic Misalignment
- Anthropic, Strengthening Red Teams
- Anthropic, Introducing Bloom
- Anthropic, Findings from a Pilot Anthropic-OpenAI Alignment Evaluation Exercise
- MLCommons, AILuminate v1.0 Release
- Microsoft Learn, Secure autonomous agentic AI systems
- Microsoft Learn, Reduce autonomous agentic AI risk
- Microsoft Learn, Complete production infrastructure inventory
- Microsoft Learn, Agent Registry convergence with Microsoft Agent 365
Incidents and Cases¶
- American Bar Association, BC Tribunal Confirms Companies Remain Liable for Information Provided by AI Chatbot
Research Frontier: Memory, Observability, and Multi-Agent Reliability¶
- OpenReview, EVOLVE-MEM: A Self-Adaptive Hierarchical Memory Architecture for Next-Generation Agentic AI Systems
- OpenReview, MemGen: Weaving Generative Latent Memory for Self-Evolving Agents
- OpenReview, AgentTrace: A Structured Logging Framework for Agent System Observability
- OpenReview, AgentTrace: Causal Graph Tracing for Root Cause Analysis in Deployed Multi-Agent Systems
- OpenReview, Evaluation of Multi-Turn Consistency in LLM Agents: Survival Analysis and Failure-Rationale Taxonomy
- OpenReview, AMA-Bench: Evaluating Long-Horizon Memory for Agentic Applications
- OpenReview, Aegis: Automated Error Generation and Attribution for Multi-Agent Systems
- OpenReview, PALADIN: Self-Correcting Language Model Agents to Cure Tool-Failure Cases
- OpenReview, Why Do Multiagent Systems Fail?
Publishing, Build, and the Book Platform Layer¶
- MkDocs, Official documentation
- Material for MkDocs, Official documentation
- uv, Working on projects
- ty, Official documentation
- Starlight, Official documentation
Rust and the Infrastructure Layer of Agent Runtimes¶
- AWS, AWS SDK for Rust is generally available
- AWS Docs, Code examples for Amazon Bedrock Runtime using AWS SDK for Rust
- docs.rs, aws-sdk-bedrockagentruntime
- Microsoft Learn, Azure SDK for Rust
- Rig, Official documentation
- docs.rs, rig-core
- GitHub, 0xPlaygrounds/rig
How To Use This List¶
If you extend the book further, this reading order is convenient:
- Risk and control framing: NIST, OWASP, CISA.
- Architectural patterns and runtime discipline: Anthropic, OpenAI, LangGraph, Google Cloud, Microsoft.
- Observability, evals, and verifier layers: OpenAI, Microsoft, arXiv, GitHub.
- HCI, HITL, and cases: Microsoft Research, OpenReview, ABA.
- Research frontier: memory, consistency, observability, and multi-agent failure modes.
For reading the book itself, one more split is useful:
- Stable core: normative frameworks, architecture, policy, execution, and observability;
- Fast-moving layer: eval tooling, verifier design, inventory governance, frontier research, and newer cases.