Safety

Guardrails, constraints, and prompt-injection-resistant patterns. The two distinct safety primitives explained.

Crux gives you three distinct safety primitives that are easy to confuse:

Guardrails — filter, transform, redact, or warn on inputs and outputs. They block bad I/O.
Constraints — validate output semantics and retry the model with feedback. They improve good I/O on business-rule violations.
Validation retry — recover from schema-parse failures. When the model returns malformed JSON or wrong types, Crux text-repairs first, then re-prompts with the Zod error.

Plus a fourth concern: Security — patterns to make your prompts injection-resistant.

The boundary, plainly

	Guardrail	Constraint	Validation retry
Runs on	Input or output (configurable)	Output only	Output only
What it catches	Forbidden content, PII, formatting	Semantic / business-rule violations	Schema-parse failures, wrong types
What happens on a hit	Block, transform, redact, or attach a warning	Re-call the model with custom feedback	Text-repair first, then re-call with the Zod error
Knows about retries	No — it's a one-pass filter	Yes — that's its whole purpose	Yes — bounded by `maxRetries` and `maxSteps`
Returns	Modified data + warnings	Validated output (or `ConstraintViolationError`)	Validated output (or `ValidationExhaustedError`)
You write	A predicate / transformer	A `constraint({ check })`	Just `validationRetry: { maxRetries }` — no logic needed

Rules of thumb:

Schema parse failed? That's validation retry — built-in, just enable it.
Output parsed but business rule failed? That's a constraint — write a check.
Need to filter / redact / transform? That's a guardrail — no retries, just I/O modification.

You need to redact PII before logging, before sending to a tool, or before showing output
You want to block certain inputs entirely (profanity, prompt-injection markers)
You want to transform outputs (strip leading code-fence markers, normalize whitespace, decode entities)
You want to warn without blocking (attach metadata for downstream code)

The model occasionally returns output that fails a business rule (off-topic, wrong language, missing field)
You're willing to spend an extra round-trip to get correct output
Your output schema validates most outputs but a small percentage need a re-prompt with the validation error

The output is structurally invalid (won't parse) — Zod schema validation already handles that
The check needs to read external state (database, RAG) — use a tool call, not a guardrail
The check is about who is asking, not what — that's auth, not safety

Filter, transform, redact, or warn on inputs and outputs.

Semantic output validation with model retry on business-rule violations.

Auto-recover from schema-parse failures and wrong types.

Injection-resistant prompt patterns.

Quality validates output quality during testing. Guardrails and constraints validate it during execution. judgeConstraint() (@crux/core/scoring) bridges the two by enforcing an LLM-judge score as a constraint.
Tools can implement domain-specific safety checks that need external data.
Devtools trace every guardrail hit and constraint retry.