Capability · Execution layer

Native SDK per provider. Switch per agent, per task, or globally.

PACKWOLF talks to model providers through native SDKs, not a single OpenAI-shaped wrapper. Claude keeps prompt caching and extended thinking. OpenAI keeps vision and function calling. Local LLMs go through the priority queue. The in-house model is included on Basic, Pro, and Max plans.

5 · Provider types
Native · SDK per provider
3-failure · Health threshold
60s · Cooldown
packwolf.app · Multi-provider
AI Models settings. Configure providers, set defaults per agent, test connections, watch health.
What it actually does

The parts that make this work.

Claude keeps Claude.

Native @anthropic-ai/sdk. Prompt caching, extended thinking, the full tool surface. Not flattened into a generic chat-completions shape.
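As an illustration of what "not flattened" means, here is the shape of a prompt-cached request in the native Anthropic Messages format. This is a payload sketch only (the model id and prompt text are placeholders); the point is the `cache_control` marker on the system block, a field a generic chat-completions wrapper has nowhere to put:

```typescript
// Sketch of a native Anthropic Messages payload (placeholder model id and text).
// The stable system preamble is marked ephemeral so the provider can cache it
// across calls; an OpenAI-shaped wrapper would drop this field entirely.
const request = {
  model: "claude-sonnet-4-20250514", // placeholder model id
  max_tokens: 1024,
  system: [
    {
      type: "text",
      text: "You are Agent A of the pack...", // long shared preamble, cached
      cache_control: { type: "ephemeral" },   // native prompt-caching marker
    },
  ],
  messages: [{ role: "user", content: "Summarize the latest trace." }],
};
```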

OpenAI keeps OpenAI.

Native openai SDK. Vision, function calling, and structured output stay first-class. Every feature of the API surface maps through directly.

Local LLMs are first-class.

Ollama and LM Studio through an OpenAI-compatible adapter, but with the priority queue and model affinity guards on top. Air-gappable.

MCP servers extend the surface.

Model Context Protocol servers add tools. Some providers (Anthropic) consume MCP directly. PACKWOLF normalizes the rest.

Health monitor + cooldown.

Three failures within the window → a 60-second provider cooldown. Automatic failover to a backup provider if one is configured.
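The failure-window logic above can be sketched in a few lines. This is a minimal illustration of the described behavior, not PACKWOLF's implementation; the class and method names are hypothetical:

```typescript
// Minimal sketch: three failures inside a sliding window trip a 60s cooldown.
// Names (HealthMonitor, recordFailure, healthy) are illustrative only.
class HealthMonitor {
  private failures: number[] = []; // timestamps (ms) of recent failures
  private cooldownUntil = 0;

  constructor(
    private threshold = 3,       // failures that trip the breaker
    private windowMs = 60_000,   // sliding window for counting failures
    private cooldownMs = 60_000, // provider sits out for 60 seconds
  ) {}

  recordFailure(now = Date.now()): void {
    // Keep only failures still inside the window, then add this one.
    this.failures = this.failures.filter((t) => now - t < this.windowMs);
    this.failures.push(now);
    if (this.failures.length >= this.threshold) {
      this.cooldownUntil = now + this.cooldownMs; // trip: start cooldown
      this.failures = [];
    }
  }

  recordSuccess(): void {
    this.failures = []; // a success clears the failure streak
  }

  healthy(now = Date.now()): boolean {
    return now >= this.cooldownUntil;
  }
}
```

Injecting the clock (`now` parameters) keeps the breaker deterministic and testable; real adapters would call it with the wall clock.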

Per-agent provider routing.

Agent A uses Claude; Agent B uses OpenAI; Agent C uses the in-house model with local fallback. The pack mixes providers per role.
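A routing table for that Agent A / B / C example could look like the sketch below. The field names and the `resolveProvider` helper are assumptions for illustration, not PACKWOLF's actual config schema:

```typescript
// Hypothetical per-agent routing table; field names are illustrative.
type ProviderType = "claude" | "openai" | "local" | "in-house" | "mcp";

interface AgentRoute {
  primary: ProviderType;
  fallback?: ProviderType; // used when the primary is in cooldown
}

const routes: Record<string, AgentRoute> = {
  "agent-a": { primary: "claude" },
  "agent-b": { primary: "openai" },
  "agent-c": { primary: "in-house", fallback: "local" },
};

// Resolve which provider a given agent should call right now,
// given a health check (e.g. backed by the cooldown monitor).
function resolveProvider(
  agent: string,
  isHealthy: (p: ProviderType) => boolean,
): ProviderType {
  const route = routes[agent];
  if (isHealthy(route.primary)) return route.primary;
  if (route.fallback && isHealthy(route.fallback)) return route.fallback;
  return route.primary; // nothing healthy: use primary and surface the error
}
```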

How it works

The path through multi-provider.

  1. Provider config picks the SDK.

    Each agent has a provider (or several in priority order). The provider type (claude / openai / local / in-house / mcp) picks which native adapter to load.

  2. Native call goes out.

    The adapter formats the call in that provider's exact shape. No translation layer flattening features. Claude's thinking blocks stay thinking blocks.

  3. Health monitor watches.

    Each adapter records success/failure events. Three failures in the window → cooldown. The trace shows which provider was used and whether it was healthy.

  4. Failover (if configured).

    If the primary is in cooldown, the request routes to the backup provider (say, Claude as the fallback for an unavailable local model). The cost event tags which provider actually ran.

  5. Response normalizes upward.

    Provider-specific outputs (Claude tool_use blocks, OpenAI tool_calls, local content blocks) normalize into PACKWOLF's internal shape for memory writes and audit logs. The provider's distinctive features still surface where they matter.
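The path ends in that normalization step. A minimal sketch of folding Claude `tool_use` blocks and OpenAI `tool_calls` into one internal record follows; the `ToolCall` shape and function names are assumptions for illustration, not PACKWOLF's actual schema. The provider types are real: Claude tool calls arrive as content blocks with a structured `input` object, while OpenAI tool calls carry their arguments as a JSON string.

```typescript
// Assumed internal shape for a normalized tool invocation (illustrative).
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

// Claude responses carry tool_use content blocks with structured input.
type ClaudeBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; name: string; input: Record<string, unknown> };

// OpenAI responses carry tool_calls whose arguments are a JSON string.
interface OpenAIToolCall {
  function: { name: string; arguments: string };
}

function fromClaude(blocks: ClaudeBlock[]): ToolCall[] {
  return blocks
    .filter((b): b is Extract<ClaudeBlock, { type: "tool_use" }> => b.type === "tool_use")
    .map((b) => ({ name: b.name, args: b.input }));
}

function fromOpenAI(calls: OpenAIToolCall[]): ToolCall[] {
  return calls.map((c) => ({
    name: c.function.name,
    args: JSON.parse(c.function.arguments), // decode the JSON-string arguments
  }));
}
```

Either path yields the same `ToolCall[]`, so memory writes and audit logs stay provider-agnostic while the raw response keeps its native shape upstream.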

Common questions

Things engineers actually ask.

Why not one OpenAI-compatible wrapper for everything?

OpenAI-compatible wrappers flatten provider features. Claude's prompt caching and extended thinking, OpenAI's structured output, differences in vision handling: all of it loses fidelity. We pay the cost of native adapters because the lost features cost more downstream.

Source: docs/CLAUDE.md (Provider System)

See it in your workspace.

Closed-beta cohorts are small. Tell us what you'd want this capability to handle for your team.

Request beta access