Mental Model

How Crux turns authored definitions into assembled, observed, and testable model turns.

This page is for the developer who wants to know what actually happens when they call generate().

If you want the values behind the design, read Thinking in Crux first. This page is the mechanics.

The unit Crux cares about

A Crux run is more than one prompt string. It can include system instructions, user input, context, memory, retrieved sources, tools, safety checks, model choice, retries, and output validation.

Crux's job is to make those pieces easier to assemble and inspect. The model can still give different wording from run to run, but the surrounding setup should not be accidental.

The assembly path

Prompts in Crux declare the extra pieces they need through use[].

That one pattern keeps the setup understandable. A prompt can use shared context, memory, retrieval, skills, tools, or custom blocks without every feature inventing its own wiring. Crux can then show which pieces were included, skipped, cached, or dropped for token budget.

The runtime pipeline

Every Crux execution then follows the same runtime pipeline:

define     ->     resolve     ->     adapt     ->     observe
(authoring)       (composition)    (translation)     (instrumentation)

Stage | Where it happens | Output
Define | At authoring time, in your code | Frozen Prompt, Context, Agent, etc. (pure data)
Resolve | When you call an adapter | SDK-agnostic ResolvedPrompt: system text, schema, tools, settings
Adapt | Inside the adapter | Provider-specific API call (OpenAI body, Anthropic messages, etc.)
Observe | Throughout | Events fan out to devtools, OTel, and custom plugins

Crux's portability claim, "define once, run anywhere", holds only because the stages are cleanly separated. Definition produces data, resolution composes it, adaptation translates it, and observation records what happened without interfering.

1. Define

Definitions are pure data structures. They carry no SDK reference and no runtime state.

import { prompt, context } from '@crux/core'
import { z } from 'zod'

const brand = context({
  id: 'brand',
  priority: 30,
  input: z.object({ tone: z.string().optional() }),
  when: ({ input }) => !!input.tone,
  system: ({ input }) => `Write in a ${input.tone} tone.`,
})

const reply = prompt({
  id: 'reply',
  use: [brand],
  input: z.object({ message: z.string() }),
  output: z.object({ text: z.string() }),
  system: 'You are a helpful assistant.',
  prompt: ({ input }) => input.message,
})

The output of prompt() is a frozen Prompt — you can serialize it, list its contexts, inspect its input schema. You cannot execute it directly because it has no idea what model to call.

2. Resolve

When you pass a prompt to an adapter, Crux calls prompt.resolve(options) first. This composes everything into a single SDK-agnostic structure:

// `reply` is the Prompt defined above; `tone` comes from the brand context's input schema
const resolved = reply.resolve({ input: { message: 'hi', tone: 'casual' } })

resolved.system // composed system text from all contexts + prompt's own
resolved.systemBlocks // structured blocks for provider-cache hints
resolved.prompt // user prompt text
resolved.messages // alternative to system+prompt for multi-turn
resolved.schema // output Zod schema (if any)
resolved.tools // merged tool set from contexts + prompt
resolved.settings // merged settings (config < adapt < call-site)

What happens during resolution:

  1. Validate input against the prompt's merged Zod schema (prompt input + context inputs)
  2. Filter contexts — drop ones with when returning false, falsy entries in use, and unmatched match() cases
  3. Sanitize input — auto-escape XML characters in user-provided strings (configurable)
  4. Resolve contexts — call each context's system function in priority order, with caching (cacheTtl) when configured
  5. Build system blocks — concatenate the pieces of context, attaching provider-cache markers per block
  6. Apply token budget — if a budget is set, drop lowest-priority contexts until total fits. The prompt's own system is always kept.
  7. Merge tools and settings — collect tools from active contexts + prompt + call-site; merge settings with later layers winning
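
Step 6 is the easiest one to misread, so here is a minimal sketch of priority-based dropping. It is not Crux's implementation: the SystemBlock shape, the applyBudget name, the characters-per-token estimate, and the assumption that a higher priority number means "more important" are all illustrative.

```typescript
// Hypothetical block shape for illustration; not Crux's internal type.
interface SystemBlock {
  id: string
  priority: number // assumption: higher number = more important
  text: string
}

// Rough estimate (~4 characters per token). A real tokenizer would differ.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4)

function applyBudget(
  ownSystem: string,
  blocks: SystemBlock[],
  budget: number,
): { kept: SystemBlock[]; dropped: SystemBlock[] } {
  // The prompt's own system text is never a drop candidate, so it is counted up front.
  let total =
    estimateTokens(ownSystem) +
    blocks.reduce((sum, block) => sum + estimateTokens(block.text), 0)

  // Walk candidates from lowest priority upward, dropping until the total fits.
  const lowestFirst = [...blocks].sort((a, b) => a.priority - b.priority)
  const dropped: SystemBlock[] = []
  for (const block of lowestFirst) {
    if (total <= budget) break
    total -= estimateTokens(block.text)
    dropped.push(block)
  }

  // Preserve the original ordering of the blocks that survive.
  const droppedIds = new Set(dropped.map((block) => block.id))
  const kept = blocks.filter((block) => !droppedIds.has(block.id))
  return { kept, dropped }
}

const budgetResult = applyBudget(
  'You are a helpful assistant.',
  [
    { id: 'brand', priority: 30, text: 'Write in a casual tone.' },
    { id: 'docs', priority: 10, text: 'x'.repeat(400) },
  ],
  20,
)
console.log(budgetResult.kept.map((block) => block.id), budgetResult.dropped.map((block) => block.id))
```

Returning the dropped blocks alongside the kept ones mirrors the idea that resolution should be able to report what did not make it into the request.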

The result is the same regardless of which adapter you call next. This is the portability layer and the point where Crux can show what made it into the request.
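
The settings merge from step 7 (config < adapt < call-site) can be pictured as a layered merge in which later layers win. The sketch below is an assumption about the shape of that merge, not Crux's code: mergeSettings and the setting names are hypothetical, and skipping undefined values is a design choice made here for illustration.

```typescript
// Hypothetical settings bag; the keys used below are illustrative.
type Settings = Record<string, number | string | undefined>

// Merge layers left to right so that later layers override earlier ones.
function mergeSettings(...layers: Array<Settings | undefined>): Settings {
  const merged: Settings = {}
  for (const layer of layers) {
    for (const [key, value] of Object.entries(layer ?? {})) {
      // Skip undefined so an absent override never erases an earlier value.
      if (value !== undefined) merged[key] = value
    }
  }
  return merged
}

const mergedSettings = mergeSettings(
  { temperature: 0.7, maxTokens: 512 }, // global config
  { temperature: 0.1 }, // adapt override for the chosen model
  { maxTokens: 256 }, // call-site options
)
console.log(mergedSettings)
```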

3. Adapt

Adapters take a ResolvedPrompt and translate it to the SDK's native API call.

// @crux/ai (Vercel AI SDK)
import { generate } from '@crux/ai'
import { openai } from '@ai-sdk/openai'

const input = { message: 'hi', tone: 'casual' }
await generate(reply, { model: openai('gpt-4o'), input })
// Internally: reply.resolve(...) → generateObject({ model, system, prompt, schema, tools, ...settings })

// @crux/anthropic
import Anthropic from '@anthropic-ai/sdk'
import { createAnthropic } from '@crux/anthropic'

const client = new Anthropic() // assumes the official SDK client; reads ANTHROPIC_API_KEY from the environment
const anthropic = createAnthropic(client)
await anthropic.generate(reply, { model: 'claude-haiku-4-5', input })
// Internally: reply.resolve(...) → client.messages.create({ system, messages, tools, ...settings })

Adapters do three things:

  1. Map ResolvedPrompt fields to the provider's request shape
  2. Apply provider-specific cache hints (Anthropic cache_control, Google CachedContent, OpenAI prefix caching)
  3. Map the response back to a normalized GenerateResult { text, object, usage, ... }
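
To make the first two responsibilities concrete, here is a rough sketch of mapping a resolved prompt onto an Anthropic Messages request body, including a cache_control marker on cacheable system blocks. The ResolvedPromptSketch type, the cache flag, the toAnthropicRequest helper, and the 1024-token default are assumptions for illustration; the real adapter's internals are not shown on this page.

```typescript
// Hypothetical, trimmed-down stand-ins for the resolved structure.
interface ResolvedSystemBlock {
  text: string
  cache?: boolean // assumption: a per-block flag that requests a provider cache hint
}
interface ResolvedPromptSketch {
  systemBlocks: ResolvedSystemBlock[]
  prompt: string
  settings: { maxTokens?: number; temperature?: number }
}

// Subset of the Anthropic Messages API request body.
interface AnthropicRequest {
  model: string
  max_tokens: number
  temperature?: number
  system: Array<{ type: 'text'; text: string; cache_control?: { type: 'ephemeral' } }>
  messages: Array<{ role: 'user' | 'assistant'; content: string }>
}

function toAnthropicRequest(resolved: ResolvedPromptSketch, model: string): AnthropicRequest {
  return {
    model,
    // 1024 is an arbitrary default chosen for this sketch.
    max_tokens: resolved.settings.maxTokens ?? 1024,
    ...(resolved.settings.temperature !== undefined
      ? { temperature: resolved.settings.temperature }
      : {}),
    // Each system block becomes a text block; cacheable ones get a cache_control marker.
    system: resolved.systemBlocks.map((block) => ({
      type: 'text' as const,
      text: block.text,
      ...(block.cache ? { cache_control: { type: 'ephemeral' as const } } : {}),
    })),
    messages: [{ role: 'user', content: resolved.prompt }],
  }
}

const anthropicBody = toAnthropicRequest(
  {
    systemBlocks: [
      { text: 'You are a helpful assistant.', cache: true },
      { text: 'Write in a casual tone.' },
    ],
    prompt: 'hi',
    settings: { temperature: 0.1 },
  },
  'claude-haiku-4-5',
)
console.log(JSON.stringify(anthropicBody, null, 2))
```

Because the function only builds a plain object, the same resolved input can be fed to a different mapper for another provider without touching the definition.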

Per-prompt provider overrides live on the Prompt itself via adapt:

prompt({
  // ...
  adapt: {
    'gpt-4o': { settings: { temperature: 0.1 } },
    'claude-*': { system: 'Override for all Claude models' },
  },
})

Keys in adapt are matched by exact model ID first, then by prefix wildcard (such as 'claude-*'), then by the '*' fallback. The adapter merges the matching override into the resolved prompt before translation.
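
The matching order can be sketched as a small lookup function. This is an illustration of the stated precedence, not Crux's code: matchAdapt and the Override type are hypothetical, and preferring the longest prefix when several wildcards match is an assumption the page does not confirm.

```typescript
// Hypothetical override shape, loosely modeled on the adapt example.
type Override = { system?: string; settings?: Record<string, unknown> }

function matchAdapt(adapt: Record<string, Override>, modelId: string): Override | undefined {
  // 1. Exact model ID.
  if (Object.prototype.hasOwnProperty.call(adapt, modelId)) return adapt[modelId]

  // 2. Prefix wildcard such as 'claude-*'. Longest matching prefix wins (assumption).
  const [bestPrefixKey] = Object.keys(adapt)
    .filter((key) => key !== '*' && key.endsWith('*') && modelId.startsWith(key.slice(0, -1)))
    .sort((a, b) => b.length - a.length)
  if (bestPrefixKey !== undefined) return adapt[bestPrefixKey]

  // 3. Catch-all fallback, if one is defined.
  return adapt['*']
}

const adaptTable: Record<string, Override> = {
  'gpt-4o': { settings: { temperature: 0.1 } },
  'claude-*': { system: 'Override for all Claude models' },
  '*': { system: 'Generic fallback' },
}

console.log(matchAdapt(adaptTable, 'gpt-4o'))
console.log(matchAdapt(adaptTable, 'claude-haiku-4-5'))
console.log(matchAdapt(adaptTable, 'some-other-model'))
```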

4. Observe

Every interesting moment in the pipeline can emit an event:

  • onGenerateStart / onGenerateEnd
  • onContextResolve, onContextCacheHit, onContextCacheMiss
  • onToolCall, onToolResult
  • onMemoryRead, onMemoryWrite
  • onCompactStart, onCompactEnd
  • onJudgeResult, onEvalCase
  • onBudgetCheck
  • onBlackboardUpdate, onHandoffPrepare, onDelegateStart, onDelegateComplete
  • onFlowStep, onFlowSuspend, onFlowResume

Plugins install themselves into these hooks via mergeRuntime(). The same event fans out to all installed plugins — OTel spans, custom telemetry, policy hooks — without primitives ever knowing they're being watched. Devtools tracing uses the canonical @crux/core/observability graph transport so the Go backend owns tree and relation assembly for both the web UI and TUI.

import { config } from '@crux/core'
import { withTelemetry } from '@crux/otel'

config({
  plugins: [withTelemetry({ serviceName: 'my-app' })],
  devtools: { serverUrl: process.env.DEVTOOLS_URL }, // local server or tunnel only
})

Devtools are zero-cost when disabled — instrumentation hooks are no-ops if no plugin installs them.
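
The fan-out and the "no-op when nothing is installed" behavior can be pictured with a very small hook point. This is a conceptual sketch only: HookPoint is a hypothetical class, and it does not reproduce mergeRuntime() or the real event payloads.

```typescript
// Hypothetical event payload for illustration.
type GenerateStartEvent = { promptId: string }
type Listener<E> = (event: E) => void

class HookPoint<E> {
  private listeners: Listener<E>[] = []

  // A plugin registers interest in this hook.
  install(listener: Listener<E>): void {
    this.listeners.push(listener)
  }

  // The emitter does not know who is listening, or whether anyone is.
  emit(event: E): void {
    // With no listeners the loop body never runs, so emitting costs almost nothing.
    for (const listener of this.listeners) listener(event)
  }
}

const generateStartHook = new HookPoint<GenerateStartEvent>()
const seenEvents: string[] = []

// No plugins installed yet: this call does nothing.
generateStartHook.emit({ promptId: 'reply' })

// Two independent plugins subscribe to the same hook.
generateStartHook.install((event) => seenEvents.push(`otel:${event.promptId}`))
generateStartHook.install((event) => seenEvents.push(`audit:${event.promptId}`))

// One emit now reaches both of them.
generateStartHook.emit({ promptId: 'reply' })
console.log(seenEvents)
```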

5. Test

Testing is the loop around the runtime pipeline. Quality suites, targets, experiments, cassettes, baselines, and feedback records let you check whether the AI feature still behaves the way you intended.

Today, those quality foundations can test outputs and many execution facts. More direct assertions for things like context inclusion, routing, memory, safety, and fallback behavior are still being expanded.

How everything else fits in

Once you see the four stages, the rest of Crux becomes legible:

  • Memory can add relevant state to the prompt and expose focused tools during a generation.
  • Compaction runs before resolve — it summarizes or trims the message history that you'd then feed into the prompt's messages.
  • Agents are prompts with a defined identity, a model preference, and a list of allowed handoffs. Compositions (pipeline / parallel / consensus / swarm) are higher-order functions over agents.
  • Flows are suspendable/resumable wrappers over multi-step orchestration. A flow records its step results so a resume re-runs only the work after the suspension point.
  • Plans / Tasks are structured documents that persist via CruxStore — read by humans (UI), written by agents (tools).
  • Quality runs suites across targets, optionally scored by LLM judges, and uses the same prompt setup as production.
  • Devtools / OTel are plugins that subscribe to instrumentation hooks.
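
The flow behavior described in the list above (recorded step results let a resume skip completed work) can be illustrated with a step log. This is a simplified, synchronous sketch under stated assumptions: runStep, StepLog, and reviewFlow are hypothetical names, and real flows would be asynchronous and persisted rather than held in memory.

```typescript
// Hypothetical in-memory record of completed steps, keyed by step ID.
type StepLog = Map<string, unknown>

function runStep<T>(log: StepLog, stepId: string, work: () => T): T {
  // A recorded result means the step already ran before the suspension: replay it.
  if (log.has(stepId)) return log.get(stepId) as T
  const value = work()
  log.set(stepId, value)
  return value
}

// Tracks which steps actually executed, for demonstration.
const executedSteps: string[] = []

function reviewFlow(log: StepLog): string {
  const draft = runStep(log, 'draft', () => {
    executedSteps.push('draft')
    return 'v1'
  })
  return runStep(log, 'review', () => {
    executedSteps.push('review')
    return `${draft} approved`
  })
}

// Pretend the flow suspended after "draft": its result is already in the log.
const resumedLog: StepLog = new Map<string, unknown>([['draft', 'v1']])
const flowOutput = reviewFlow(resumedLog)
console.log(flowOutput, executedSteps)
```

On resume, only the review step runs; the draft step's recorded value is reused.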

Every primitive should make the AI feature easier to understand, not harder.

Crux already supports typed prompts, composable context, local devtools, tracing, and quality foundations. Some deeper debugging and test helpers are still being expanded.
