PACKWOLF assumes a trusted local operator, the human who configures the workspace, sets approval policies, and signs off on changes.
Your data. Your keys. Your audit trail.
Three things stay true on your terms: where your data lives, which keys make the calls, and who signs off when the pack acts. Sourced from the production codebase, not a marketing PDF.
What we trust, what we don't, where the line is.
Model output is never trusted on its own. Imported skills and MCP servers are not trusted. Memories retrieved from prior conversations are not trusted as instructions.
The most important boundary is tool execution. Model intent is never enough by itself, every tool request passes through the safety gate before anything happens.
Nine layers, every call.
Every tool invocation passes through gateToolExecution() in lib/tool-safety.ts. Each layer fails closed. The trace records which layer fired and why.
- 01
Tool exists and is enabled
Unknown or disabled tools fail closed before any other check runs. The agent's effective tool list is enforced server-side, not just shown in the UI.
- 02
Agent has the tool assigned
The tool ID must appear in the agent's tools[] config. Agents only invoke what their assignment grants, no implicit access.
- 03
Input schema validation
Required fields, basic type constraints, and shape validation. Malformed tool calls don't reach the handler.
- 04
Budget check
Verifies the agent is within its monthly spend limit. The atomic check-and-debit prevents TOCTOU races between concurrent agents.
- 05
Rate limits
Per-turn caps on consecutive writes and on shell calls (defaults: 25 writes, 3 shell calls), configurable per agent. Loop-signature detection aborts runaways with diagnostic spans regardless of count.
- 06
Approval rules
Configurable approval policies per agent and per tool. Rules can require operator sign-off before specific calls execute.
- 07
Path scope validation
File tools must stay inside the agent's workspace. Blocks .env, .config, .git, .ssh, .aws, providers.json, node_modules, common paths attackers reach for.
- 08
SSRF protection
Blocks localhost, .local, private IPs, link-local. Async DNS rebinding check happens at request time, not at policy time.
- 09
Injection scan
Pattern-based scan on tool input for known injection signatures. Suspicious input fails closed; the trace records why.
Recalled context can't act as instruction.
Memory injection is a real attack class. PACKWOLF defends against it with boundary markers, read-time pattern scanning, cross-agent leakage detection, and content hashing, sourced from lib/memory-manager.ts.
- 01
Boundary markers
Retrieved memory is wrapped in explicit <retrieved-memory>...</retrieved-memory> markers and a safety preamble. The model treats it as data, not instructions; the wrapper itself is a blocked injection pattern, so spoofed markers in memory content fail the scan.
- 02
Injection scanning at read time
isSuspiciousContent() blocks 'ignore previous instructions', 'you are now', 'from now on', 'act as', and the broader behavioral-directive pattern set on retrieval. Logs keep full fidelity; prompts don't see the injection.
- 03
Cross-agent leakage detection
isAgentLeakage() prevents one agent's persona or instructions from being recalled into another agent's memory context. Each agent's recall stays scoped.
- 04
Provenance tracking
Every memory event has a source attribution, which turn wrote it, which agent, which session. The audit log can reconstruct any memory's origin.
- 05
Content hashing for dedup
Every chunk is stored with a SHA-256 content hash. Identical content deduplicates on write; the hash is the join key for cross-agent provenance and the maintenance sweep that compacts duplicates.
- 06
7-day TTL on flush extraction
Auto-archived content from conversation extraction expires after 7 days. Long-term knowledge requires explicit promotion to durable memory, not silent accumulation.
Per-agent isolation, by default.
Filesystem sandbox per agent, opt-in shared workspace, process-isolated shell with microVM staged behind the same interface, sanitized tool results before persistence, three-layer skill scanning. Defense in depth, not single-layer trust.
- 01
Per-agent filesystem sandbox
Each agent's file operations resolve into its own scoped workspace at {workspace}/agents/{agentId}/workspace/. Path validation makes escape impossible to ignore.
- 02
Shared workspace, opt-in
A shared workspace at {workspace}/shared_workspace/ is accessible to all agents via the shared/ path prefix. Agents reach it explicitly, not by accident.
- 03
Shell sandbox
Shell commands run inside a per-call sandbox with scoped permissions, working directory pinned to the agent's workspace, and host paths gated through the same path-scope validator the file tools use. MicroVM-isolated execution (Firecracker/UTM) is staged behind the same interface and routed to when available.
- 04
Tool result sanitization
Before tool output persists to client storage, patterns matching API keys, bearer tokens, and .env values get redacted to [REDACTED]. A defense-in-depth layer against accidental secret persistence.
- 05
Skill scanning
Imported skills pass through three security layers: static pattern analysis, behavioral analysis, and LLM semantic analysis. Hash-verified re-scan on every prompt build.
Every call is a span you can replay.
Memory writes, model calls, tool invocations, approvals - each is a node in the trace. Per-span flame graph, prompt versioning, replay. You can prove what happened, not just believe it.
- 01
Per-span flame graph
Every model call, tool call, memory write, and approval lands as a span with input, output, metadata, and diagnosis. Master-detail layout, full timeline, indexed by run.
- 02
Prompt versioning
Every system prompt is diffable across runs. When an agent's behavior changes, the change in its prompt is recorded alongside the change in its output.
- 03
Replay
Pick a span. Re-run it with the same inputs against the same or a different model. Compare side-by-side.
Cloud and Desktop. One product.
Same product, different data policies. Entry plans coordinate; Pro and Max add selected continuity when cloud should keep working.
Cloud
Hosted runtime, no install.
- Workspace, memory, and traces stored in PACKWOLF's managed infrastructure with region selection
- Per-workspace key isolation; provider keys encrypted at rest (AES-256) and in transit (TLS 1.3)
- Audit log retention configurable per workspace, default 365 days
- BYOK across Claude, OpenAI, our in-house model, MCP servers — your keys, your provider, no markup
- SOC 2 Type I targeted within twelve months of GA
Desktop
Runs on your machine.
- Workspace, memory, and traces stored locally at ~/.packwolf/workspace/{company_slug}/*.db
- BYOK plus local LLMs — point at Ollama or LM Studio and the pipeline never leaves the machine
- Air-gappable when local LLMs are configured and web/MCP tools are disabled; mixed mode is the default
- No telemetry by default; opt-in if you want to share anonymized usage
- Managed cloud execution is opt-in and starts with selected context, not full workspace upload
Sub-processorsWhen you use Cloud without BYOK, prompts and outputs traverse the model provider you select (Anthropic for Claude, OpenAI, or our in-house model). With BYOK, calls go from your workspace directly to your provider account on your contract. Either way, we don't sell, train on, or retain your prompts. Desktop with local LLMs has no provider in the loop.
Found something? Tell us.
- Email security@packwolf.ai with details and a proof-of-concept. We respond within two business days and confirm scope before triage.
- In-scope: PACKWOLF Cloud, the Desktop bundle, the marketing site, and the public agent APIs. Out of scope: third-party model providers, MCP servers we don't ship, and any service outside packwolf.ai or ideius.com.
- We commit to good-faith handling: no legal action for research conducted in good faith inside scope, and public credit on request once a fix ships.
The questions you'll want answered first.
What's PACKWOLF's security boundary, exactly?
Tool execution. The model can want to call a tool; whether the tool runs depends on the nine-point safety gate, the approval rules, the agent's effective capability, and the operator's policy. Model intent isn't authority, execution is gated.
Can a malicious skill or MCP server compromise PACKWOLF?
Imported skills pass three security layers (static, behavioral, LLM semantic) before they can run. Hash verification re-checks on every activation. MCP servers run in process-isolated transport with scoped tool permissions. A malicious server can't reach beyond what its declared scope allows.
How do you prevent prompt injection from recalled memories?
Boundary markers wrap every retrieved chunk so the model treats it as data. isSuspiciousContent() runs at read time on the broader behavioral-directive pattern set ('ignore previous instructions', 'from now on', 'act as', and similar). Daily logs preserve full fidelity for audit; prompts don't see the injection.
What about secrets accidentally landing in tool output?
Tool result sanitization runs before persistence. Patterns matching API keys, bearer tokens, and .env values redact to [REDACTED] before the result lands in client storage. It's a defense layer, not a substitute for not feeding secrets to tools, but it catches the common cases.
Is PACKWOLF SOC 2 / ISO certified?
Not yet. We're targeting SOC 2 Type I within twelve months of GA. The audit infrastructure (audit log, approvals, change versioning, retention policies) is already wired in, the certification is a process layer on top, not a product gap.
Where does my data live?
Your choice, mid-stride. BYOK keeps the work on Desktop with account/license management only. Basic adds cloud command-center features while private context stays on your machine. Pro and Max can run selected memories, checkpoints, and projects through managed cloud execution. For full air-gap, run Desktop with local LLMs and disable web/MCP tools.
Can I export everything and leave?
Yes. Workspace export produces a complete bundle: products, specs, agents, workflows, memory, audit log, run history. The export format is documented and stable. We don't trap data, and we don't make migration painful.