Chapter 17. Policy Layer and Capability Catalog
1. Why a Reference Runtime Is Still Too Naive Without a Policy Layer
Even if you already have a clean runtime loop, that is still not enough. Without an explicit policy layer, the system remains too trusting:
- you cannot reliably distinguish allowed runs from forbidden ones;
- tool calls are hard to control consistently;
- memory writes depend on scattered conventions;
- product-specific restrictions leak into orchestration code very quickly.
That is why the next mandatory layer in the reference implementation is the policy layer.
Its job is not to "slow the system down". Its job is to make decisions about access, risk, and admissibility explicit instead of scattering them across random if branches.
2. A Policy Layer Should Answer Small and Clear Questions
A weak policy layer tries to become "the smart brain of the system". A strong policy layer does the opposite: it solves a limited set of clear questions.
For example:
- can this run start at all;
- can this context be read;
- can this capability be invoked;
- is approval required;
- can this be written into memory;
- can this result be returned outward.
When those questions are explicit, the runtime becomes easier to explain, and guardrail changes stop being chaotic.
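One way to keep those questions explicit is to give each one a name and a small dedicated check. The sketch below is only an illustration under stated assumptions: `PolicyQuestion`, `CHECKS`, `ask`, and the request keys (`tenant`, `principal`, `kind`) are hypothetical names, and denying a question that has no registered check is a design assumption, not something the reference implementation prescribes.

```python
from enum import Enum
from typing import Callable, Mapping


class PolicyQuestion(Enum):
    """One member per question from the list above (names are hypothetical)."""
    START_RUN = "start_run"
    READ_CONTEXT = "read_context"
    INVOKE_CAPABILITY = "invoke_capability"
    REQUIRE_APPROVAL = "require_approval"
    WRITE_MEMORY = "write_memory"
    RETURN_RESULT = "return_result"


Request = Mapping[str, object]
Check = Callable[[Request], str]  # each check returns "allow" or "deny"


def check_start_run(request: Request) -> str:
    # Assumed rule: a run needs both a tenant and a principal.
    if request.get("tenant") and request.get("principal"):
        return "allow"
    return "deny"


def check_write_memory(request: Request) -> str:
    # Assumed rule: only a fixed set of memory kinds may be written.
    allowed_kinds = {"validated_fact", "session_summary"}
    return "allow" if request.get("kind") in allowed_kinds else "deny"


# Each question maps to exactly one small check.
CHECKS: dict[PolicyQuestion, Check] = {
    PolicyQuestion.START_RUN: check_start_run,
    PolicyQuestion.WRITE_MEMORY: check_write_memory,
}


def ask(question: PolicyQuestion, request: Request) -> str:
    check = CHECKS.get(question)
    if check is None:
        # Assumption: a question without a registered check fails closed.
        return "deny"
    return check(request)
```

With this shape, changing a guardrail means editing one named check instead of hunting for conditions inside the orchestration loop.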
3. A Capability Catalog Is Not Just a Registry of Names
It is very easy to slide into a catalog that only stores a list of available tools. A good catalog does more:
- describes the capability contract;
- stores the risk profile;
- declares transport and execution mode;
- captures idempotency expectations;
- records ownership and lifecycle.
So the capability catalog is not "inventory for convenience". It is the central control point for platform capabilities.
Together, the policy layer and the capability catalog form the contract core of the reference implementation:

```mermaid
flowchart LR
    A["Run request"] --> B["Runtime orchestrator"]
    B --> C["Policy layer"]
    B --> D["Capability catalog"]
    C --> E["Allow / deny / approve"]
    D --> F["Capability contract"]
    E --> G["Execution layer"]
    F --> G
```

4. What Is Worth Storing in a Capability Catalog
A practical field set usually looks like this:
- capability name;
- owner;
- mode: read / write / high_risk;
- transport: mcp / gateway / sandboxed_exec;
- input schema;
- output shape;
- approval requirement;
- idempotency requirement;
- timeout and retry defaults.
With that contract, the runtime can already behave predictably instead of adapting ad hoc to every capability.
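To make "behave predictably" concrete, the sketch below derives per-call settings from a catalog entry alone. It is a minimal illustration, not the reference implementation: `CatalogEntry`, `call_options`, the `max_retries` field, and the rule that calls are retried only when they are reads or carry an idempotency key are all assumptions introduced here.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    READ = "read"
    WRITE = "write"
    HIGH_RISK = "high_risk"


class Transport(Enum):
    MCP = "mcp"
    GATEWAY = "gateway"
    SANDBOXED_EXEC = "sandboxed_exec"


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical entry covering the field set listed above."""
    name: str
    owner: str
    mode: Mode
    transport: Transport
    input_schema: dict
    output_shape: dict
    approval: str                   # "none", "manager", or "always"
    idempotency_key_required: bool
    timeout_seconds: int
    max_retries: int                # assumed name for the retry default


def call_options(entry: CatalogEntry) -> dict:
    """Derive execution settings from the contract, with no per-capability code."""
    # Assumption: retrying is only safe for reads or idempotent calls.
    safe_to_retry = entry.mode is Mode.READ or entry.idempotency_key_required
    return {
        "timeout_seconds": entry.timeout_seconds,
        "max_retries": entry.max_retries if safe_to_retry else 0,
        "needs_approval": entry.approval != "none",
        "needs_idempotency_key": entry.idempotency_key_required,
    }
```

The runtime never branches on a capability name here; every setting comes from the entry's declared fields.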
5. A Policy Decision Should Be an Object, Not Just a Bool
A very useful engineering habit: do not reduce policy decisions to True/False.
It is often much more useful to return something like:
- allow;
- deny;
- approval_required;
- sanitize_and_continue;
- escalate.
And additionally:
- reason code;
- policy id;
- risk class;
- optional constraints.
That greatly improves explainability and makes telemetry far more useful.
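A decision carrying those fields could look roughly like the sketch below. The names `Action`, `DetailedDecision`, and `to_event`, as well as the example reason code and policy id in the usage note, are hypothetical and exist only to show the shape.

```python
from dataclasses import asdict, dataclass, field
from enum import Enum


class Action(str, Enum):
    ALLOW = "allow"
    DENY = "deny"
    APPROVAL_REQUIRED = "approval_required"
    SANITIZE_AND_CONTINUE = "sanitize_and_continue"
    ESCALATE = "escalate"


@dataclass(frozen=True)
class DetailedDecision:
    action: Action
    reason_code: str
    policy_id: str
    risk_class: str
    # Optional constraints, e.g. fields to redact before continuing.
    constraints: dict = field(default_factory=dict)

    def to_event(self) -> dict:
        """Flatten the decision into a plain dict suitable for telemetry."""
        event = asdict(self)
        event["action"] = self.action.value  # store the plain string, not the enum
        return event
```

For example, `DetailedDecision(Action.SANITIZE_AND_CONTINUE, "pii_detected", "out_007", "medium", {"redact": ["email"]}).to_event()` yields a flat record that can be logged as-is, so an investigation can see which rule fired and why.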
6. Example Policy Contract
Here is a very simple but practical template:
```yaml
policy:
  run_precheck:
    require_tenant: true
    deny_if_principal_missing: true
  capabilities:
    search_docs:
      decision: allow
    create_ticket:
      decision: approval_required
      approver: manager
    run_shell:
      decision: deny
  memory_write:
    allow_kinds:
      - validated_fact
      - session_summary
```
Its power is not completeness. Its power is explicitness. You can argue about a specific rule and understand where it applies.
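As a rough sketch of how such a contract might be evaluated, the code below mirrors the template as a plain Python dict (so no YAML parser is needed) and answers three questions against it. The function names are hypothetical, and denying capabilities that the policy does not list is an assumption, not a rule stated by the template.

```python
POLICY = {
    "run_precheck": {"require_tenant": True, "deny_if_principal_missing": True},
    "capabilities": {
        "search_docs": {"decision": "allow"},
        "create_ticket": {"decision": "approval_required", "approver": "manager"},
        "run_shell": {"decision": "deny"},
    },
    "memory_write": {"allow_kinds": ["validated_fact", "session_summary"]},
}


def precheck_run(policy: dict, tenant: str, principal: str) -> str:
    """Can this run start at all?"""
    rules = policy["run_precheck"]
    if rules.get("require_tenant") and not tenant:
        return "deny"
    if rules.get("deny_if_principal_missing") and not principal:
        return "deny"
    return "allow"


def capability_decision(policy: dict, name: str) -> str:
    """Can this capability be invoked, and is approval required?"""
    rule = policy["capabilities"].get(name)
    # Assumption: capabilities missing from the policy are denied.
    return rule["decision"] if rule else "deny"


def memory_write_decision(policy: dict, kind: str) -> str:
    """Can a record of this kind be written into memory?"""
    allowed = policy["memory_write"]["allow_kinds"]
    return "allow" if kind in allowed else "deny"
```

Each function reads exactly one section of the contract, which keeps it obvious where a given rule applies.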
7. Example Capability Catalog Contract
It helps to think about the catalog roughly like this:
```yaml
capabilities:
  search_docs:
    owner: knowledge_platform
    mode: read
    transport: mcp
    timeout_seconds: 5
    approval: none
  create_ticket:
    owner: support_platform
    mode: write
    transport: gateway
    timeout_seconds: 15
    approval: manager
    idempotency_key_required: true
  run_shell:
    owner: platform_runtime
    mode: high_risk
    transport: sandboxed_exec
    timeout_seconds: 10
    approval: always
```
That kind of catalog already defines operational semantics, not just names.
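Because the catalog carries operational semantics, it can also be checked for internal consistency before the runtime ever loads it. The sketch below is one possible lint pass; `lint_catalog` is a hypothetical helper, and its rules (every entry has an owner and a positive timeout, `high_risk` entries require approval, `write` entries require an idempotency key) are assumptions chosen to match the example entries, not rules defined by the reference implementation.

```python
CATALOG = {
    "search_docs": {
        "owner": "knowledge_platform", "mode": "read", "transport": "mcp",
        "timeout_seconds": 5, "approval": "none",
    },
    "create_ticket": {
        "owner": "support_platform", "mode": "write", "transport": "gateway",
        "timeout_seconds": 15, "approval": "manager",
        "idempotency_key_required": True,
    },
    "run_shell": {
        "owner": "platform_runtime", "mode": "high_risk",
        "transport": "sandboxed_exec", "timeout_seconds": 10,
        "approval": "always",
    },
}


def lint_catalog(catalog: dict) -> list:
    """Return a list of human-readable problems; an empty list means clean."""
    problems = []
    for name, spec in catalog.items():
        if not spec.get("owner"):
            problems.append(f"{name}: missing owner")
        if spec.get("timeout_seconds", 0) <= 0:
            problems.append(f"{name}: timeout_seconds must be positive")
        # Assumed rule: high-risk capabilities always need an approval step.
        if spec.get("mode") == "high_risk" and spec.get("approval", "none") == "none":
            problems.append(f"{name}: high_risk capability without approval")
        # Assumed rule: writes must be protected by an idempotency key.
        if spec.get("mode") == "write" and not spec.get("idempotency_key_required", False):
            problems.append(f"{name}: write capability without idempotency key")
    return problems
```

Running a check like this in CI is one way to keep an incomplete contract from reaching the runtime.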
8. A Simple Policy Decision Skeleton
The point here is that the runtime receives not only permission, but a structured decision.
```python
from dataclasses import dataclass


@dataclass
class PolicyDecision:
    action: str     # e.g. "allow", "deny", "approval_required"
    reason: str     # machine-readable reason code
    policy_id: str  # identifier of the rule that produced the decision


def evaluate_capability(name: str) -> PolicyDecision:
    if name == "search_docs":
        return PolicyDecision(action="allow", reason="low_risk_read", policy_id="cap_001")
    if name == "create_ticket":
        return PolicyDecision(action="approval_required", reason="write_action", policy_id="cap_014")
    # Anything not explicitly listed is denied.
    return PolicyDecision(action="deny", reason="unsupported_capability", policy_id="cap_999")
```
Even code this small already gives the right shape for telemetry, approval UI flows, and investigations.
9. A Simple Capability Lookup Skeleton
And one more practical piece: the runtime should not know capability details directly; it should fetch them from the catalog.
```python
from dataclasses import dataclass


@dataclass
class CapabilitySpec:
    name: str
    mode: str
    transport: str
    timeout_seconds: int


def get_capability(name: str) -> CapabilitySpec | None:
    # In-memory registry for illustration; unknown names return None.
    registry = {
        "search_docs": CapabilitySpec("search_docs", "read", "mcp", 5),
        "create_ticket": CapabilitySpec("create_ticket", "write", "gateway", 15),
    }
    return registry.get(name)
```
This also looks boring. Good. The catalog layer should be boring, stable, and inspectable.
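To show how the two pieces could meet in the runtime, here is a hedged sketch of a planning step that consults the catalog first and the policy second. `CallPlan` and `plan_call` are hypothetical names; `lookup` and `evaluate` are injected callables standing in for a catalog lookup and a policy evaluation, and they exchange plain dicts here only to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class CallPlan:
    capability: str
    status: str  # "execute", "await_approval", or "rejected"
    reason: str
    transport: Optional[str] = None
    timeout_seconds: Optional[int] = None


def plan_call(
    name: str,
    lookup: Callable[[str], Optional[dict]],
    evaluate: Callable[[str], dict],
) -> CallPlan:
    # 1. The catalog decides whether the capability exists at all.
    spec = lookup(name)
    if spec is None:
        return CallPlan(name, "rejected", "unknown_capability")

    # 2. The policy decides whether it may run now, later, or never.
    decision = evaluate(name)
    if decision["action"] == "allow":
        status = "execute"
    elif decision["action"] == "approval_required":
        status = "await_approval"
    else:
        # Any other action is treated as a rejection in this sketch.
        return CallPlan(name, "rejected", decision["reason"])

    # 3. Execution details come from the catalog, never from the caller.
    return CallPlan(name, status, decision["reason"],
                    spec["transport"], spec["timeout_seconds"])
```

The orchestrator only ever sees a `CallPlan`, so neither approval logic nor transport details need to live in orchestration code.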
10. Where Policy and Catalog Usually Break
These problems are very typical:
- policy rules are scattered across runtime code;
- the capability contract is incomplete;
- capability ownership is unclear;
- approval logic is embedded directly into orchestration;
- memory policy and execution policy behave as if they were unrelated;
- the catalog and real adapters drift apart in behavior.
When that happens, the reference implementation stops being a reference and becomes a bundle of conventions again.
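The last failure mode in the list, drift between the catalog and real adapters, is one that a small automated comparison can catch early. The sketch below assumes each adapter can report its own settings as a dict; `find_drift` and the compared fields (`transport`, `timeout_seconds`) are hypothetical choices for illustration.

```python
def find_drift(catalog: dict, adapters: dict) -> list:
    """Compare declared catalog entries with what the adapters report."""
    findings = []

    # Capabilities declared in the catalog but never implemented.
    for name in sorted(catalog.keys() - adapters.keys()):
        findings.append(f"{name}: in catalog, no adapter")

    # Adapters that exist but bypass the catalog entirely.
    for name in sorted(adapters.keys() - catalog.keys()):
        findings.append(f"{name}: adapter not in catalog")

    # Entries present on both sides whose settings disagree.
    for name in sorted(catalog.keys() & adapters.keys()):
        for key in ("transport", "timeout_seconds"):
            declared = catalog[name].get(key)
            actual = adapters[name].get(key)
            if declared != actual:
                findings.append(
                    f"{name}: {key} differs (catalog={declared!r}, adapter={actual!r})"
                )
    return findings
```

An empty result means the catalog still describes what actually runs; anything else is a concrete item to fix rather than a vague sense that the conventions have slipped.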
11. Practical Checklist
If you want to review this layer quickly, ask:
- Do you have a separate policy layer instead of a pile of if branches?
- Does policy return a structured decision?
- Is there a single capability catalog?
- Do capabilities have owner, transport, and risk semantics?
- Does the runtime use the catalog rather than direct calls?
- Are policy decisions visible in telemetry?
If the answer is "no" several times in a row, the skeleton exists, but the contract core is not assembled yet.
12. What to Read Next
The next logical step in the reference implementation is to assemble a production rollout checklist, so you move from blueprint and contract core into a practical go-live framework.