Safety

Guardrails and constraints authored once, executed through the per-call Safety session.

import {
  // Authoring
  guardrail,
  constraint,
  evaluateGuardrail,
  evaluateConstraint,
  GuardrailBlockedError,
  ConstraintViolationError,
  // Consumption
  createSafety,
  createSafetyPlugin,
  defaultConstraintFeedbackFormatter,
} from '@crux/core/safety'

@crux/core/safety is one deep module. Guardrails filter content; constraints validate generated output and can retry with feedback. Both are authored as frozen objects and executed through a single per-call Safety session that every adapter constructs internally. Scope merging, phase ordering, retries, suspension policy, audits, and observability live in the session, never in adapter code.

Guardrails

guardrail(config) creates a frozen guardrail object. Guardrails run on input, output, or streaming chunks. A guardrail can pass, block, redact, transform, or warn on content, and can hold streaming chunks. An optional category (e.g. 'pii', 'jailbreak') is carried through audits for risk-type aggregation.

import { guardrail } from '@crux/core/safety'

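const piiGuard = guardrail({
  name: 'pii',
  category: 'pii', // carried through audits for risk-type aggregation
  phase: 'output', // run on generated output
  // Naive substring check for illustration: block on a match, pass otherwise.
  validate: async (content) =>
    content.includes('ssn') ? { action: 'block', reason: 'PII detected' } : { action: 'pass' },
})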

GuardrailBlockedError is thrown when a guard blocks content.
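
To make the pass, block, and redact outcomes concrete, here is a minimal, self-contained sketch of a guard chain. It is a model for illustration, not the library's implementation: `runGuards`, `SketchGuard`, and `SketchBlockedError` are hypothetical names, the `content` field on a redact result is an assumed shape, and `validate` is synchronous here for brevity, whereas the real one is async.

```typescript
// Hypothetical model of a guard chain; not the @crux/core API.
// Only the `pass` and `block` result shapes appear in the docs above;
// the `redact` shape (and its `content` field) is an assumption.
type GuardResult =
  | { action: 'pass' }
  | { action: 'block'; reason: string }
  | { action: 'redact'; content: string }

interface SketchGuard {
  name: string
  validate: (content: string) => GuardResult
}

// Stand-in for GuardrailBlockedError in this sketch.
class SketchBlockedError extends Error {}

// Runs guards in order. A block stops the chain by throwing; a redact
// rewrites the content that later guards (and the caller) see.
function runGuards(content: string, guards: SketchGuard[]): string {
  let current = content
  for (const guard of guards) {
    const result = guard.validate(current)
    if (result.action === 'block') {
      throw new SketchBlockedError(`${guard.name}: ${result.reason}`)
    }
    if (result.action === 'redact') current = result.content
  }
  return current
}

const maskEmail: SketchGuard = {
  name: 'mask-email',
  validate: (text) => {
    const masked = text.replace(/\S+@\S+/g, '[email]')
    return masked === text ? { action: 'pass' } : { action: 'redact', content: masked }
  },
}

const blockSsn: SketchGuard = {
  name: 'pii',
  validate: (text) =>
    text.includes('ssn') ? { action: 'block', reason: 'PII detected' } : { action: 'pass' },
}

const cleaned = runGuards('mail bob@example.com today', [maskEmail, blockSsn])
```

In this model `cleaned` has the email address replaced by `[email]`, while any input containing `ssn` makes `runGuards` throw instead of returning.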

Constraints

constraint(config) creates a semantic output check. Constraints run after schema validation. A failed assert constraint produces feedback and triggers a retry; this repeats until the constraint passes or its retries are exhausted.

import { constraint } from '@crux/core/safety'

const citeSources = constraint({
  name: 'cite-sources',
  severity: 'assert',
  maxRetries: 2,
  check: async (output) =>
    output.text.includes('[1]')
      ? { pass: true }
      : { pass: false, feedback: 'Include at least one citation.' },
})

ConstraintViolationError is thrown when assert constraints fail after retries.
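
The retry behaviour of an assert constraint can be modelled in a few lines. This is a sketch under stated assumptions, not the session's actual logic: `runWithConstraint`, `RetryConstraint`, and the `regenerate` callback shape are hypothetical names, a plain `Error` stands in for ConstraintViolationError, and the check is synchronous here, whereas the real `check` is async.

```typescript
// Hypothetical model of an assert constraint's retry loop; not the @crux/core API.
type CheckResult = { pass: true } | { pass: false; feedback: string }

interface RetryConstraint {
  name: string
  maxRetries: number
  check: (output: string) => CheckResult
}

// Assumed shape: the caller re-prompts the model with the feedback text
// and returns the new output.
type Regenerate = (feedback: string) => string

function runWithConstraint(
  initial: string,
  constraint: RetryConstraint,
  regenerate: Regenerate,
): { output: string; attempts: number } {
  let output = initial
  for (let attempt = 0; ; attempt++) {
    const result = constraint.check(output)
    if (result.pass) return { output, attempts: attempt + 1 }
    // The first check is not a retry, so maxRetries: 2 allows three checks.
    if (attempt >= constraint.maxRetries) {
      throw new Error(
        `constraint "${constraint.name}" failed after ${attempt} retries: ${result.feedback}`,
      )
    }
    output = regenerate(result.feedback)
  }
}

const cite: RetryConstraint = {
  name: 'cite-sources',
  maxRetries: 2,
  check: (text) =>
    text.includes('[1]')
      ? { pass: true }
      : { pass: false, feedback: 'Include at least one citation.' },
}

// Canned "model" responses so the sketch runs without a model.
const drafts = ['Still no citation.', 'A sourced claim [1].']
const outcome = runWithConstraint('An unsourced claim.', cite, () => drafts.shift() ?? '')
```

Here the initial output and the first retry both fail, and the second retry passes, so `outcome.attempts` is 3. If every draft failed, the loop would throw after the second retry.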

Constraints bridge into the other predicate surfaces without new concepts: judgeConstraint() turns an LLM judge into a normal constraint (online enforcement of scored quality), and constraintScorer() runs any constraint as a binary scorer in eval suites (offline regression-testing of production policy).
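
The idea behind running a constraint as a binary scorer can be illustrated without the library. The sketch below is an assumption-laden model, not the signature of constraintScorer(): `asBinaryScorer`, `Check`, and `Scorer` are hypothetical names, and the check is synchronous for brevity.

```typescript
// Hypothetical model of "constraint as binary scorer"; not the @crux/core API.
type Verdict = { pass: true } | { pass: false; feedback: string }
type Check = (output: string) => Verdict
type Scorer = (output: string) => { score: 0 | 1; reason?: string }

// A passing check scores 1; a failing check scores 0 and keeps the
// feedback as the reason, so eval reports can show why a sample failed.
function asBinaryScorer(check: Check): Scorer {
  return (output) => {
    const verdict = check(output)
    return verdict.pass ? { score: 1 } : { score: 0, reason: verdict.feedback }
  }
}

const hasCitation: Check = (text) =>
  text.includes('[1]')
    ? { pass: true }
    : { pass: false, feedback: 'Include at least one citation.' }

const scoreCitation = asBinaryScorer(hasCitation)

// A tiny offline eval set: two samples cite a source, two do not.
const samples = ['Claim [1].', 'Claim without a source.', 'Another claim [1].', 'Bare claim.']
const scores = samples.map((sample) => scoreCitation(sample).score)
const passRate = scores.reduce<number>((sum, s) => sum + s, 0) / samples.length
```

Because the same check drives both paths, a regression in the pass rate of an eval suite points directly at a policy that production would have retried on.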

Registration: `createSafetyPlugin()`

One plugin registers global guardrails and constraints. Per-prompt (prompt({ constraints })), per-context, and per-call attachments merge with name-keyed precedence: per-call wins over per-prompt, which wins over global.

import { config } from '@crux/core'
import { createSafetyPlugin } from '@crux/core/safety'

config({
  plugins: [createSafetyPlugin({ guardrails: [piiGuard], constraints: [citeSources] })],
})
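
Name-keyed precedence amounts to a keyed merge in which higher-precedence scopes overwrite lower ones. The following is a minimal sketch of that idea, not the session's merge code: `mergeByName` and `ScopedConstraint` are hypothetical names, and the per-context scope is left out because its position in the ordering is not stated here.

```typescript
// Hypothetical model of name-keyed scope merging; not the @crux/core API.
interface Named {
  name: string
}

// Scopes are passed lowest precedence first. A later scope replaces an
// earlier entry with the same name; entries with unique names are kept.
function mergeByName<T extends Named>(...scopes: T[][]): T[] {
  const merged = new Map<string, T>()
  for (const scope of scopes) {
    for (const item of scope) merged.set(item.name, item)
  }
  return Array.from(merged.values())
}

interface ScopedConstraint extends Named {
  maxRetries: number
}

const globalScope: ScopedConstraint[] = [
  { name: 'cite-sources', maxRetries: 2 },
  { name: 'no-profanity', maxRetries: 1 },
]
const perPrompt: ScopedConstraint[] = [{ name: 'cite-sources', maxRetries: 3 }]
const perCall: ScopedConstraint[] = [{ name: 'cite-sources', maxRetries: 0 }]

// global < per-prompt < per-call
const effective = mergeByName(globalScope, perPrompt, perCall)
```

In this model `effective` holds the per-call `cite-sources` (with `maxRetries: 0`) and the untouched global `no-profanity`. Keying on `name` is what lets a single call tighten or relax one rule without restating the rest.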

Consumption: `createSafety()`

Adapters consume safety through one session per generate()/stream() call. You only need this when building a custom adapter dialect; application code never calls it.

const safety = createSafety({ call: opts, resolved, promptId, model, systemPrompt })

;({ messages } = await safety.guardInput({ messages }))           // input guards, redaction written back
const final = await safety.finalizeOutput(output, regenerate, {   // constraints → output guards
  suspended: finishReason === 'tool_approval_required',           // suspension skips output safety
})
const meta = safety.stamp(traceMeta)                              // audits attached iff non-empty

regenerate(corrective) is the only dialect-specific concern: append the corrective messages, re-call the model, re-validate, return the new output.
formatter (a ConstraintFeedbackFormatter) injects corrective-message phrasing; the default is defaultConstraintFeedbackFormatter.
safety.transcript is a machine-readable protocol trace; the dialect parity suite asserts that both dialects produce identical sequences.
safety.openStream() is the streaming sub-protocol: call feed() with each text delta (it yields emit/hold directives), call finish() at end-of-stream (it runs buffer: 'full' guards plus report-only constraints and returns the seal with any pending tail), or use transform() for a ready-made TransformStream<string, string>.
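
Why a streaming guard needs to hold text at all is easiest to see with a term that straddles two deltas. The sketch below is a simplified model, not the openStream() protocol: `openRedactingStream` is a hypothetical name, `feed()` and `finish()` return plain strings instead of directives and a seal, and the mask is assumed not to overlap with the guarded term.

```typescript
// Hypothetical model of hold/emit streaming; not the @crux/core API.
interface RedactingStream {
  feed(delta: string): string // text that is safe to emit now
  finish(): string // pending tail released at end-of-stream
}

function openRedactingStream(term: string, mask = '[redacted]'): RedactingStream {
  let held = ''

  // Length of the longest proper prefix of `term` that `text` ends with.
  // That many characters might still grow into the term, so they are held.
  const heldLength = (text: string): number => {
    const max = Math.min(term.length - 1, text.length)
    for (let n = max; n > 0; n--) {
      if (text.endsWith(term.slice(0, n))) return n
    }
    return 0
  }

  return {
    feed(delta) {
      // Re-scan the held tail together with the new delta, masking full matches.
      const text = (held + delta).split(term).join(mask)
      const keep = heldLength(text)
      held = text.slice(text.length - keep)
      return text.slice(0, text.length - keep)
    },
    finish() {
      // No more input can arrive, so the held tail can no longer complete the term.
      const tail = held
      held = ''
      return tail
    },
  }
}

const stream = openRedactingStream('ssn')
const emitted = [
  stream.feed('my s'), // the trailing "s" is held: it could start "ssn"
  stream.feed('sn is'), // completes "ssn" across the chunk boundary
  stream.feed(' secret'),
  stream.finish(),
]
const streamed = emitted.join('')
```

Joined together, the emitted pieces read `my [redacted] is secret`, even though no single delta contained the full term. A guard that inspected each delta in isolation would have let it through.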

Choosing The Primitive

| Need | Use |
| --- | --- |
| Block, redact, transform, or warn on raw input/output | `guardrail()` |
| Enforce business rules on generated output and retry | `constraint()` |
| Enforce a minimum LLM-judge score and retry | `judgeConstraint()` (`@crux/core/scoring`) |
| Regression-test a production constraint in eval suites | `constraintScorer()` (`@crux/core/quality`) |
| Repair JSON/schema parse failures | `validationRetry` |

Related

Guide: Safety
Guide: Guardrails
Guide: Constraints
