Nine-point gate, every call.
Allowlist, budget, approval, scope, injection scan, path validation, rate limit, SSRF check, trust state. The order matters: cheap checks fail fast; expensive ones run only when needed.
The provider returns a structured tool_use block with arguments. The pipeline validates the schema before anything else.
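A minimal sketch of that schema step, assuming a per-tool map of required fields and types — the function and field names here are illustrative, not the pipeline's actual API:

```python
# Hypothetical schema check: run before any of the nine gates.
# Returns a list of errors; an empty list means the arguments parse.
def validate_schema(block: dict, schemas: dict) -> list[str]:
    schema = schemas.get(block.get("name"))
    if schema is None:
        return [f"unknown tool: {block.get('name')}"]
    errors = []
    args = block.get("input", {})
    for fname, ftype in schema["required"].items():
        if fname not in args:
            errors.append(f"missing required field: {fname}")
        elif not isinstance(args[fname], ftype):
            errors.append(f"{fname}: expected {ftype.__name__}")
    return errors
```

Only a block that validates cleanly proceeds to the gates.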
Allowlist (is this tool enabled for this agent?), budget (any quota left?), approval (does this need sign-off?), scope (is the path/resource in scope?), injection (does input contain injection signatures?), path (does the file path pass validation?), rate (within rate limit?), SSRF (is the URL safe?), trust (is the tool trusted, or should we pause for review?).
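The fail-fast ordering above can be sketched as a short-circuiting chain — each gate returns a denial reason or passes, and the first failure stops the run so the expensive gates (injection scan, SSRF) never fire needlessly. The structure below is an assumption, not the pipeline's real code:

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class GateResult:
    allowed: bool
    reason: Optional[str] = None
    checks_run: int = 0  # how many gates executed before the verdict

def run_gates(call, checks: list[Callable]) -> GateResult:
    # Checks are ordered cheap-to-expensive; the first denial short-circuits.
    for i, check in enumerate(checks, start=1):
        reason = check(call)
        if reason is not None:
            return GateResult(False, reason, i)
    return GateResult(True, None, len(checks))
```

In use, `checks` holds the nine gates in order: allowlist first, trust state last.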
Up to ten read-tier tools run concurrently. The pipeline waits on the slowest, then assembles results.
Writes and dangerous operations queue. No simultaneous mutations from concurrent agents; the audit trail stays linear at the resource level.
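One way to sketch both halves of that model with asyncio, assuming a global read semaphore and a per-resource write lock — the names and the cap of ten are from the description above, everything else is illustrative:

```python
import asyncio

READ_CONCURRENCY = asyncio.Semaphore(10)    # up to ten read-tier tools at once
_write_locks: dict[str, asyncio.Lock] = {}  # one lock per mutated resource

async def run_read(tool, args):
    async with READ_CONCURRENCY:
        return await tool(**args)

async def run_write(resource: str, tool, args):
    # Serialize mutations per resource so the audit trail stays linear.
    lock = _write_locks.setdefault(resource, asyncio.Lock())
    async with lock:
        return await tool(**args)

async def run_batch(read_calls):
    # gather() waits on the slowest read, then results assemble in call order.
    return await asyncio.gather(*(run_read(t, a) for t, a in read_calls))
```

Reads fan out; writes to the same resource always run one at a time.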
Tool output truncates to a configured cap (so a giant file dump doesn't blow context), then returns to the model. Failures are classified into the trace's failure taxonomy.
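The truncation step is simple in sketch form — the cap value and marker text here are placeholders, since both are deployment configuration:

```python
def truncate_output(text: str, cap: int = 30_000) -> str:
    # Keep the first `cap` characters and note what was dropped,
    # so the model knows the output is partial.
    if len(text) <= cap:
        return text
    omitted = len(text) - cap
    return text[:cap] + f"\n[output truncated: {omitted} chars omitted]"
```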
On approval-bypassing tools, success increments the trust counter. Five clean runs in a row promotes the tool to trusted. A rejection resets the counter to zero.
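The promotion logic reduces to a streak counter. A minimal sketch, assuming only what the rule above states — five consecutive clean runs promote, any rejection resets to zero:

```python
PROMOTION_THRESHOLD = 5  # clean runs in a row required for promotion

class TrustState:
    def __init__(self):
        self.streak = 0
        self.trusted = False

    def record_success(self):
        if self.trusted:
            return  # already promoted; nothing to count
        self.streak += 1
        if self.streak >= PROMOTION_THRESHOLD:
            self.trusted = True

    def record_rejection(self):
        # A rejection resets the counter to zero.
        self.streak = 0
```

Whether a rejection also demotes an already-trusted tool isn't stated above, so this sketch leaves `trusted` untouched on rejection.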