Capability · Knowledge layer

Four memory layers, retrieved selectively, consolidated nightly.

Useful recall is selective; durable learning is accountable. PACKWOLF separates the two and gives each one its own retrieval rules, write gates, and audit trail.

4 memory layers
Vector-first retrieval
Nightly consolidation
Atomic per-target writes
The memory browser. Inspect every chunk, see its source turn, prune what's stale, restore from history.
What it actually does

The parts that make this work.

Four explicit banks.

Procedural (always-on guidance), working (current session), recent episodic, durable semantic. Each bank has its own budget, its own retrieval rules, its own freshness contract.

Recall is gated before search.

A cheap classifier labels each turn as trivial / followup / recall / substantive. If recall isn't needed, no vector search runs. Most turns skip retrieval entirely.
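As a sketch, the pre-search gate can be as cheap as a heuristic classifier. The cue regexes, length threshold, and `classifyTurn` signature below are illustrative assumptions, not PACKWOLF's actual model:

```typescript
// Hypothetical sketch of the pre-search gate: a cheap classifier that
// decides whether a turn needs memory retrieval at all.
type TurnClass = "trivial" | "followup" | "recall" | "substantive";

const RECALL_CUES = /\b(remember|last time|previously|you said|we discussed)\b/i;
const FOLLOWUP_CUES = /^(and|also|what about|ok but)\b/i;

function classifyTurn(message: string): TurnClass {
  const trimmed = message.trim();
  if (trimmed.length < 12 && !RECALL_CUES.test(trimmed)) return "trivial"; // "thanks", "ok"
  if (RECALL_CUES.test(trimmed)) return "recall";       // explicit memory cue
  if (FOLLOWUP_CUES.test(trimmed)) return "followup";   // continues current thread
  return "substantive";                                 // default: full budget
}
```

Only `recall` and `substantive` turns would pay for a vector search; `trivial` and `followup` return early with little or no retrieval.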

Vector-first, FTS5 fallback.

Semantic search via sqlite-vec runs first because it benchmarks best in this codebase. FTS5 fills gaps. A temporal rerank handles freshness without losing relevance.
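The merge step can be sketched as follows, assuming vector hits and FTS5 hits arrive as scored lists; the `Hit` shape and the 30-day half-life are hypothetical, and an exponential decay stands in for the temporal rerank:

```typescript
// Sketch: vector hits come first, FTS5 hits fill the gaps, then an
// exponential temporal decay nudges fresher chunks up without
// discarding relevance. Names and constants are illustrative.
interface Hit { id: string; score: number; ageDays: number }

function mergeAndRerank(vectorHits: Hit[], ftsHits: Hit[], budget: number): Hit[] {
  const seen = new Set(vectorHits.map(h => h.id));
  const merged = [...vectorHits, ...ftsHits.filter(h => !seen.has(h.id))];
  const halfLifeDays = 30; // assumed decay constant
  return merged
    .map(h => ({ ...h, score: h.score * Math.pow(0.5, h.ageDays / halfLifeDays) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, budget);
}
```

A strong-but-stale chunk decays toward zero, while a fresh FTS5-only hit can outrank it, which is the "freshness without losing relevance" trade described above.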

Write gates protect the bank.

Novelty, quality, dedup, and contradiction handling run at write time, not read time. New chunks compete for room, they don't accumulate forever.
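A minimal sketch of such a write-time gate chain; the `Chunk` shape, the quality threshold, and exact-match dedup are stand-in assumptions (the real gates would also handle near-duplicates and contradictions):

```typescript
// Sketch of write-time gating: a candidate chunk must pass a quality
// floor and a dedup check before it competes for room in the bank.
interface Chunk { text: string; quality: number }

function shouldWrite(candidate: Chunk, bank: Chunk[]): boolean {
  if (candidate.quality < 0.5) return false;                    // quality gate
  const norm = (s: string) => s.toLowerCase().replace(/\s+/g, " ").trim();
  if (bank.some(c => norm(c.text) === norm(candidate.text))) return false; // dedup
  return true;                                                  // novel enough to write
}
```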

Boundary markers stop injection.

Every retrieved memory chunk arrives with provenance markers. Content scanning + summarization gates prevent prompt-injection attacks via recalled text.
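A sketch of provenance framing for recalled text; the marker format, the `untrusted` attribute, and the scan regex are illustrative assumptions, not the shipped scanner:

```typescript
// Sketch: wrap each recalled chunk in boundary markers carrying its
// source, and withhold chunks whose content looks instruction-like.
function frameChunk(text: string, source: string): string {
  const suspicious = /\b(ignore (all |previous )?instructions|system prompt)\b/i;
  const body = suspicious.test(text)
    ? "[chunk withheld: possible prompt injection]"
    : text;
  return `<memory source="${source}" untrusted="true">\n${body}\n</memory>`;
}
```

The markers let the model treat recalled text as data rather than instructions, and the scan drops the obvious injection attempts before they reach the prompt at all.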

Atomic write queue per target.

Memory writes are queued atomically per target so concurrent agents can't corrupt the bank. Failure modes recover instead of dropping turns.
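One common pattern for this kind of per-target serialization is a promise chain keyed by target; this is a minimal sketch of the idea, not PACKWOLF's implementation:

```typescript
// Sketch: writes to the same target serialize through a promise chain,
// so concurrent agents never interleave writes on one bank.
const queues = new Map<string, Promise<unknown>>();

function enqueueWrite<T>(target: string, write: () => Promise<T>): Promise<T> {
  const prev = queues.get(target) ?? Promise.resolve();
  const next = prev.then(write, write);     // run even if the previous write failed
  queues.set(target, next.catch(() => {})); // recover instead of poisoning the chain
  return next;                              // caller still sees this write's failure
}
```

Because a failed write is swallowed only inside the stored chain, the queue keeps draining after an error while each caller still observes its own result.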

How it works

The path through memory.

  1. A turn arrives.

    The pipeline classifies it as trivial, active follow-up, recall, or substantive default. Each class has a different memory budget.

  2. Procedural loads first.

    Stable guidance (operator preferences, agent directives, manager feedback) loads from curated files without ranking. Always small, always present.

  3. Working memory builds.

    Session state plus the checkpoint summary from prior compaction. This is the in-flight conversation's compact form.

  4. Episodic + durable get queried.

    The turn queries the episodic bank for recent chunks; durable semantic memory uses vector-first retrieval, then reranks by recency, entity intent, purpose, and quality.

  5. The model gets a bounded block.

    Selected sections inject into the prompt under a hard budget. A per-turn memory trace records what was retrieved and why, attached to the debug snapshot.

  6. Nightly, consolidation runs.

    Episodic chunks distill into durable knowledge. The bank gets cleaner over time, not noisier.
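The nightly pass above can be sketched as a group-and-distill step; the topic key, the 7-day cutoff, and the join-as-summary stand-in (where a real pipeline would call an LLM) are assumptions for illustration:

```typescript
// Sketch: episodic chunks past a cutoff are grouped by topic, distilled
// into durable entries, and pruned from the episodic bank.
interface Episode { topic: string; text: string; ageDays: number }

function consolidate(episodic: Episode[], cutoffDays = 7) {
  const stale = episodic.filter(e => e.ageDays > cutoffDays);
  const byTopic = new Map<string, string[]>();
  for (const e of stale) {
    byTopic.set(e.topic, [...(byTopic.get(e.topic) ?? []), e.text]);
  }
  const durable = [...byTopic].map(([topic, texts]) => ({
    topic,
    summary: texts.join(" "), // stand-in for an LLM distillation step
  }));
  const remaining = episodic.filter(e => e.ageDays <= cutoffDays);
  return { durable, remaining };
}
```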

Memory retrieval at runtime, selective by design
// lib/memory-retrieval.ts (sketch)
const turnClass = classifyTurn(message);   // trivial | followup | recall | substantive
if (turnClass === "trivial") return null;  // skip retrieval entirely

const plan = buildRetrievalPlan(turnClass, agent);
const procedural = await loadProcedural(agent);
const working    = await buildWorkingMemory(session);
const episodic   = await searchEpisodic(plan.episodicBudget, message);
const durable    = await vectorSearch(plan.durableBudget, message)
                     .then(rerankByRecency)
                     .then(rerankByEntityIntent);

return assembleBlock({ procedural, working, episodic, durable });
Common questions

Things engineers actually ask.

Why four separate banks instead of one store?

Different recall needs have different freshness contracts. Procedural guidance should always load, working memory should reflect the current session, episodic should remember last week, durable should remember last quarter. Mashing them together makes selective retrieval impossible.

Source: docs/AGENT_MEMORY.md
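Those freshness contracts can be made concrete as per-bank retrieval policies; the field names and day horizons below are illustrative, not PACKWOLF's schema:

```typescript
// Sketch: each bank gets its own horizon and ranking behavior.
const bankPolicy = {
  procedural: { horizonDays: Infinity, ranked: false }, // always loads, never ranked
  working:    { horizonDays: 1,        ranked: false }, // current session
  episodic:   { horizonDays: 7,        ranked: true  }, // roughly last week
  durable:    { horizonDays: 90,       ranked: true  }, // roughly last quarter
} as const;

function inHorizon(bank: keyof typeof bankPolicy, ageDays: number): boolean {
  return ageDays <= bankPolicy[bank].horizonDays;
}
```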

See it in your workspace.

Closed-beta cohorts are small. Tell us what you'd want this capability to handle for your team.

Request beta access