Failure modes
Catches.
Not anecdotes. Failure modes.
Nogra started from real operator failures: public incidents, private testing, and the repeated pattern of models sounding finished before the work could actually be checked.
This page keeps the useful part: what failed, what Nogra changes, and the mechanism behind it.
- 01
Evidence-shaped lies.
false evidence
Failure. A model can attach real-looking evidence to work it did not do. A commit hash can exist. A file path can exist. A test name can exist. None of that proves the claim unless the evidence matches the approved work.
Nogra catch. Nogra separates the claim from the check. The brief defines what evidence should look like before the run starts. Verification checks the result against that contract instead of trusting the executor's summary.
brief evidence contract + separate verification
A valid reference is not the same as valid proof.
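A minimal sketch of the idea, not Nogra's actual format: the contract is fixed before the run, and the check compares submitted evidence against it. The class names, field names, file paths, and test names below are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceContract:
    """Hypothetical contract written into the brief before the run starts."""
    required_paths: frozenset[str]  # files the approved work must touch
    required_tests: frozenset[str]  # tests that must be reported as passing


@dataclass(frozen=True)
class SubmittedEvidence:
    """What the executor hands back. Every item can be real and still irrelevant."""
    commit_hash: str
    changed_paths: frozenset[str]
    passed_tests: frozenset[str]


def check_evidence(contract: EvidenceContract, evidence: SubmittedEvidence) -> list[str]:
    """Return the unmet contract terms. An empty list means the evidence matches."""
    gaps = []
    for path in sorted(contract.required_paths - evidence.changed_paths):
        gaps.append(f"commit does not touch {path}")
    for test in sorted(contract.required_tests - evidence.passed_tests):
        gaps.append(f"no passing result for {test}")
    return gaps


contract = EvidenceContract(
    required_paths=frozenset({"src/auth.py"}),
    required_tests=frozenset({"test_expired_token_rejected"}),
)

# A real commit that only touched the README: valid reference, not valid proof.
unrelated = SubmittedEvidence(
    commit_hash="3f2a9c1",
    changed_paths=frozenset({"README.md"}),
    passed_tests=frozenset({"test_expired_token_rejected"}),
)

# The same commit hash, but this time the change set covers the approved work.
matching = SubmittedEvidence(
    commit_hash="3f2a9c1",
    changed_paths=frozenset({"src/auth.py", "README.md"}),
    passed_tests=frozenset({"test_expired_token_rejected"}),
)

unrelated_gaps = check_evidence(contract, unrelated)
matching_gaps = check_evidence(contract, matching)
```

The commit hash is identical in both cases. Only the comparison against the contract tells them apart.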
- 02
Conversation memory that vanishes.
lost continuity
Failure. A session can say it recorded everything and still leave the next session with nothing durable to read. The work becomes a story the model remembers until it does not.
Nogra catch. Nogra keeps continuity in project files: checkpoints, decisions, current tasks, briefs, receipts, and evidence. A new session reads the workspace instead of pretending the old conversation is still alive.
.nogra/ state files + session checkpoint
If future work depends on it, it belongs on disk.
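A rough sketch of what "on disk" can mean in practice. Only the `.nogra/` directory comes from this page; the file name `checkpoint.json`, the JSON layout, and both function names are assumptions made for illustration.

```python
from __future__ import annotations

import json
from pathlib import Path

STATE_DIR = ".nogra"                 # directory named on this page
CHECKPOINT_NAME = "checkpoint.json"  # hypothetical file name


def write_checkpoint(workspace: Path, state: dict) -> Path:
    """Persist session state so the next session reads a file, not a memory."""
    state_dir = workspace / STATE_DIR
    state_dir.mkdir(parents=True, exist_ok=True)
    path = state_dir / CHECKPOINT_NAME
    tmp = path.with_name(CHECKPOINT_NAME + ".tmp")
    tmp.write_text(json.dumps(state, indent=2, sort_keys=True), encoding="utf-8")
    # Write to a temporary file first, then swap it in, so a crash mid-write
    # does not leave a truncated checkpoint behind.
    tmp.replace(path)
    return path


def read_checkpoint(workspace: Path) -> dict | None:
    """Return the last checkpoint, or None if nothing durable was ever written."""
    path = workspace / STATE_DIR / CHECKPOINT_NAME
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))
```

A new session would call `read_checkpoint` first. A `None` result is the honest answer when the previous session only said it recorded everything.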
- 03
The worker grading its own work.
self-review
Failure. The same context that produced the work is bad at judging the work. It knows the intent, the excuses, the partial attempts, and the story it has already told. That makes self-review too easy to soften.
Nogra catch. Nogra makes verification a separate pass. The verifier gets the approved scope, the output, and the available evidence. It does not need the executor's reasoning to decide whether the result is ok, partial, or blocked.
executor/verifier separation
The model that wrote it does not sign off on it.
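One way to make that separation structural, sketched under assumptions: the verifier's input type simply has no field for the executor's reasoning. The type and function names are hypothetical, not Nogra's interface.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutorResult:
    """Everything the executor produced, including its own narrative."""
    output: str
    evidence: tuple[str, ...]
    reasoning: str  # intent, excuses, partial attempts: never forwarded


@dataclass(frozen=True)
class VerifierInput:
    """What the verifier is allowed to see. There is deliberately no reasoning field."""
    approved_scope: str
    output: str
    evidence: tuple[str, ...]


def to_verifier_input(approved_scope: str, result: ExecutorResult) -> VerifierInput:
    """Build the verifier's view from the approved scope and the executor's result."""
    return VerifierInput(
        approved_scope=approved_scope,
        output=result.output,
        evidence=result.evidence,
    )


result = ExecutorResult(
    output="rate limiter added to login handler",
    evidence=("commit 3f2a9c1", "test_login_rate_limit passed"),
    reasoning="ran out of time on the retry path, but it is probably fine",
)
verifier_view = to_verifier_input("add rate limiting to login", result)
```

Because the softening story never reaches `verifier_view`, the verifier cannot inherit it by accident.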
- 04
Shared context that rubber-stamps.
context contamination
Failure. Planner, executor, and verifier can look independent while sharing the same narrative. Once the verifier has inherited the executor's reasoning, it tends to complete the same story.
Nogra catch. Nogra dispatches against an approved brief, then verifies against the brief and evidence. The roles are separated by contract, not just by a new paragraph in the same conversation.
approved brief + fresh execution context + scoped verification
Independence is structural, not a tone.
- 05
Intent quietly turning into permission.
approval drift
Failure. A user asks for an outcome. The model treats the outcome as standing approval to widen scope, chain more work, or keep going because it still feels aligned with the goal.
Nogra catch. Nogra treats intent as a draft until the brief is approved. Dispatch starts only after an explicit GO. If the user wants direct work, direct work stays direct; Nogra does not convert a goal into permanent permission.
reviewed brief + explicit GO before dispatch
A goal is not a green light.
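A small sketch of an approval gate, under stated assumptions. Treating the literal string "GO" as the approval input, and tying approval to a hash of the scope so that a widened scope needs a fresh GO, are illustrative choices here, not a description of how Nogra implements it. All names are hypothetical.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass


class NotApproved(Exception):
    """Raised when dispatch is attempted without a valid approval."""


@dataclass
class Brief:
    scope: str
    approved_digest: str | None = None  # None means the brief is still a draft

    def digest(self) -> str:
        """Fingerprint of the current scope text."""
        return hashlib.sha256(self.scope.encode("utf-8")).hexdigest()

    def approve(self, user_input: str) -> None:
        """Record approval only for an explicit GO (assumed literal)."""
        if user_input.strip() != "GO":
            raise NotApproved("approval requires an explicit GO")
        self.approved_digest = self.digest()

    def is_approved(self) -> bool:
        # Approval covers the scope as it was reviewed, nothing wider.
        return self.approved_digest == self.digest()


def dispatch(brief: Brief) -> str:
    """Start work only for a brief whose reviewed scope carries an approval."""
    if not brief.is_approved():
        raise NotApproved("brief is a draft, or its scope changed after GO")
    return f"dispatched: {brief.scope}"
```

Under this sketch, editing `scope` after approval makes `dispatch` fail until the user approves again, which is one way to keep a goal from becoming standing permission.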
- 06
Substituted evidence.
source drift
Failure. When the primary source is missing, slow, blocked, or inconvenient, a model may replace it with a nearby source and keep moving. The answer can sound researched while the load-bearing evidence never arrived.
Nogra catch. Nogra makes the evidence requirement explicit. If the required evidence is missing, substituted, or contradictory, verification returns partial or blocked instead of turning the gap into a green claim.
stop criteria + evidence-aware verification
Missing evidence is a result.
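A sketch of how an evidence gap could map to a verdict. The three verdict names come from this page; the exact mapping below (any contradiction or no usable evidence is blocked, a mix is partial) is an assumption for illustration, as are the enum and function names.

```python
from __future__ import annotations

from enum import Enum


class Evidence(Enum):
    PRESENT = "present"              # the required source itself arrived
    MISSING = "missing"              # nothing arrived
    SUBSTITUTED = "substituted"      # a nearby source arrived instead
    CONTRADICTORY = "contradictory"  # the source disagrees with the claim


def verdict(required: dict[str, Evidence]) -> str:
    """Map the status of each required evidence item to ok, partial, or blocked."""
    statuses = list(required.values())
    any_present = any(s is Evidence.PRESENT for s in statuses)
    if statuses and all(s is Evidence.PRESENT for s in statuses):
        return "ok"
    # Assumed rule: a contradiction, or no required evidence at all, stops the claim.
    if Evidence.CONTRADICTORY in statuses or not any_present:
        return "blocked"
    # Some required evidence arrived, some did not: report the gap, not a pass.
    return "partial"


all_arrived = verdict({"primary_source": Evidence.PRESENT, "benchmark": Evidence.PRESENT})
nearby_source = verdict({"primary_source": Evidence.SUBSTITUTED, "benchmark": Evidence.PRESENT})
nothing_arrived = verdict({"primary_source": Evidence.MISSING})
```

The point of the sketch is that a substituted source never rounds up to `ok`. The gap is reported as the result.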
- 07
One vague run for mixed work.
shape drift
Failure. A mixed job starts as one big instruction. Design, implementation, verification, cleanup, and release all get compressed into the same run, then nobody can tell which part is actually done.
Nogra catch. Nogra shapes complex work before the brief: topology, lane, role, evidence join, stop boundary, and next owner. Single-run work can stay simple; mixed work gets a plan first.
orchestration plan + lane and phase boundaries
Plan the shape before the run inherits it.
Contribute a catch
Have a failure mode Nogra missed?
File it. It belongs here only if there is a real mechanism that reduces the failure next time. GitHub Issues is the front door.
Catches do not get added because they sound plausible. They get added because someone hit them, and the fix became a file, a contract, or a workflow rule.