Process
Four-stage model
Establish the center
Define acceptable operating posture: evidence, consent, scope, provenance, reciprocity, and human review.
Instrument the boundary
Make scope, authority, confidence, provenance, and memory state visible before action.
Detect drift
Watch for isolation, over-control, aggression, or manipulation when uncertainty increases.
Repair or stop
Ask, quarantine, reduce scope, route to review, or no-op before unsafe escalation.
Decision path
Input
Trust posture check
Evidence check
Scope check
Drift check
Action, clarification, review, or no-op
Example: ambiguous sensitive-data request
| Not blind compliance | The system does not proceed simply because a request was phrased confidently. |
|---|---|
| Not universal refusal | The system does not classify every ambiguous request as hostile. |
| Checks first | It checks authority, scope, provenance, consent, and confidence. |
| Repairs safely | If incomplete, it asks for clarification. If unsafe, it no-ops or escalates to human review. |