<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>PACKWOLF Engineering</title>
    <link>https://packwolf.ai/engineering</link>
    <atom:link href="https://packwolf.ai/feed.xml" rel="self" type="application/rss+xml" />
    <description>How we build PACKWOLF. Posts on the agent runtime, observability, and scheduling.</description>
    <language>en-us</language>
    <lastBuildDate>Tue, 21 Apr 2026 00:00:00 GMT</lastBuildDate>
    
    <item>
      <title>Why we wrote our own context-compaction stack</title>
      <link>https://packwolf.ai/engineering/context-compaction</link>
      <guid isPermaLink="true">https://packwolf.ai/engineering/context-compaction</guid>
      <pubDate>Tue, 21 Apr 2026 00:00:00 GMT</pubDate>
      <author>noreply@packwolf.ai (PACKWOLF engineering)</author>
      <category>Context</category>
      <description>Long contexts cost more, run slower, and degrade recall of material in the middle. Compaction keeps prompts focused.</description>
    </item>
    <item>
      <title>Building a flame graph for agent execution</title>
      <link>https://packwolf.ai/engineering/flame-graph-for-agents</link>
      <guid isPermaLink="true">https://packwolf.ai/engineering/flame-graph-for-agents</guid>
      <pubDate>Wed, 18 Mar 2026 00:00:00 GMT</pubDate>
      <author>noreply@packwolf.ai (PACKWOLF engineering)</author>
      <category>Observability</category>
      <description>Agent runtimes have a different failure shape than web apps: the model can emit a tool name with no arguments.</description>
    </item>
    <item>
      <title>A priority queue for shared local LLMs</title>
      <link>https://packwolf.ai/engineering/priority-queue-for-local-llms</link>
      <guid isPermaLink="true">https://packwolf.ai/engineering/priority-queue-for-local-llms</guid>
      <pubDate>Mon, 09 Feb 2026 00:00:00 GMT</pubDate>
      <author>noreply@packwolf.ai (PACKWOLF engineering)</author>
      <category>Models</category>
      <description>Most local inference servers handle one request at a time per model. Concurrent requests cause crashes, model-swap thrashing, or both.</description>
    </item>
  </channel>
</rss>