Fallback

Automatically try backup models when the primary model fails.

fallback() wraps multiple models into a single model reference. It tries each model in order until one succeeds.

Use fallback when a provider is down, rate-limited, timing out, or returning another retryable error. Fallback is not a quality gate: if the first model succeeds with a weak answer, fallback is done. Use cascade() for quality escalation.

import { fallback } from '@crux/core/routing'
import { generate } from '@crux/ai'
import { openai } from '@ai-sdk/openai'
import { anthropic } from '@ai-sdk/anthropic'

const model = fallback(
  openai('gpt-4o'),
  anthropic('claude-sonnet-4-20250514'),
  {
    id: 'primary-with-backup',
    timeout: 10_000,
  },
)

const result = await generate(myPrompt, {
  model,
  input: { query: 'Explain quantum computing' },
})

If OpenAI returns a rate limit or server error, Crux automatically retries with Claude. No call-site try/catch is needed for the normal fallback path.
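The try-in-order behavior described above can be pictured as a plain loop. The following is a minimal sketch for illustration only, not Crux's implementation: tryInOrder, Attempt, and the isRetryable predicate are hypothetical names, and each "model" is reduced to an async function.

```typescript
// Hypothetical sketch: none of these names are part of the Crux API.
type Attempt<T> = () => Promise<T>

async function tryInOrder<T>(
  attempts: Attempt<T>[],
  isRetryable: (error: unknown) => boolean,
): Promise<{ value: T; failed: number }> {
  const errors: unknown[] = []
  for (const attempt of attempts) {
    try {
      // The first success wins; later attempts never run.
      return { value: await attempt(), failed: errors.length }
    } catch (error) {
      // Errors that are not retryable propagate immediately.
      if (!isRetryable(error)) throw error
      errors.push(error)
    }
  }
  // Every attempt failed with a retryable error.
  throw new AggregateError(errors, 'All fallback attempts failed')
}
```

Because the loop returns on the first success, a weak but successful answer from the first model ends the chain, which is why fallback is not a quality gate.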

When Do You Need This?

  • Production resilience: your app should keep working during provider outages.
  • Rate limit handling: high-volume apps can hit provider limits.
  • Provider diversity: backup models can come from a different provider.
  • Validation recovery: a stronger model may succeed after structured-output validation retries are exhausted.

If you only use one model provider and are comfortable with errors propagating, you do not need fallback().

Error Classification

Not all errors trigger fallback. Crux classifies errors into categories:

| Category | Triggers | Examples |
| --- | --- | --- |
| rate_limit | HTTP 429 | Provider rate limit exceeded |
| timeout | ETIMEDOUT, AbortError | Model took too long to respond |
| server_error | HTTP 500-599 | Provider internal error, bad gateway |
| connection_error | ECONNREFUSED, ENOTFOUND, fetch failures | DNS failure, provider unreachable |
| auth_error | HTTP 401, 403 | Invalid API key, insufficient permissions |
| validation_exhausted | ValidationExhaustedError | All validation retries exhausted |

Validation errors, content policy violations, and unknown errors do not trigger fallback by default. They are thrown immediately because retrying with a different model usually does not help.
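To make the mapping in the table concrete, here is a rough sketch of a classifier that follows the same rules. It is not the real classifyError(): the classifySketch name, the ErrorLike shape (status, code, name), and the 'unknown' category label are assumptions, and generic fetch failures are not covered.

```typescript
// Hypothetical sketch: mirrors the table above, not Crux's classifyError().
type Category =
  | 'rate_limit'
  | 'timeout'
  | 'server_error'
  | 'connection_error'
  | 'auth_error'
  | 'validation_exhausted'
  | 'unknown'

// Assumed error shape; real provider errors may expose these fields differently.
interface ErrorLike {
  status?: number
  code?: string
  name?: string
}

function classifySketch(error: ErrorLike): Category {
  const { status, code, name } = error
  if (name === 'ValidationExhaustedError') return 'validation_exhausted'
  if (status === 429) return 'rate_limit'
  if (status === 401 || status === 403) return 'auth_error'
  if (status !== undefined && status >= 500 && status <= 599) return 'server_error'
  if (code === 'ETIMEDOUT' || name === 'AbortError') return 'timeout'
  if (code === 'ECONNREFUSED' || code === 'ENOTFOUND') return 'connection_error'
  // Anything unrecognized stays unclassified.
  return 'unknown'
}
```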

Filter Error Categories

By default, classified retryable errors trigger fallback. Use the on option to restrict fallback to specific categories.

const model = fallback(primary, backup, {
  on: ['rate_limit', 'timeout'],
})

Server errors and connection errors would be thrown immediately in this configuration.

Custom Error Predicates

For edge cases beyond the built-in categories, provide a shouldFallback predicate. It takes priority over on.

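// Model references reused in the examples that follow.
const gpt4o = openai('gpt-4o')
const claudeSonnet = anthropic('claude-sonnet-4-20250514')

const model = fallback(gpt4o, claudeSonnet, {
  // Return true to try the next model, false to throw immediately.
  shouldFallback: (error) => {
    if (error.code === 'content_filter') return true
    if (error.status === 429) return true
    return false
  },
})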
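The precedence between the two options can be summarized in a few lines. This is a sketch under assumptions, not library code: decides and DecisionOptions are hypothetical names, and the default category list is passed in as a parameter because this page does not enumerate it.

```typescript
// Hypothetical sketch of the decision order: shouldFallback first, then on, then defaults.
interface DecisionOptions {
  on?: string[]
  shouldFallback?: (error: unknown) => boolean
}

function decides(
  error: unknown,
  category: string,
  options: DecisionOptions,
  defaultCategories: string[], // assumed to be supplied by the library
): boolean {
  // A custom predicate, when present, overrides category filtering entirely.
  if (options.shouldFallback) return options.shouldFallback(error)
  // Otherwise the explicit on list, or the defaults, decide.
  const allowed = options.on ?? defaultCategories
  return allowed.includes(category)
}
```

With on: ['rate_limit', 'timeout'] and no predicate, a server_error yields false here, which matches the "thrown immediately" behavior described in the previous section.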

Inspecting Results

When fallback occurs, the result includes metadata:

const result = await generate(myPrompt, {
  model: fallback(gpt4o, claudeSonnet),
  input,
})

if (result._meta.fallback) {
  console.log(`Took ${result._meta.fallback.attempts} attempts`)
  console.log(`Failed models: ${result._meta.fallback.failedModels.join(', ')}`)
}

The _meta.fallback object contains:

| Field | Type | Description |
| --- | --- | --- |
| attempts | number | Total attempts including the successful one |
| failedModels | string[] | Model IDs that failed |
| details | FallbackAttemptDetail[] | Per-attempt breakdown |

Each detail entry has model, durationMs, status, error, errorCategory, and cost when available.

When the first model succeeds, _meta.fallback is undefined.
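For readers who want to handle this metadata in typed code, the shape described above might look roughly like the following. The field names come from this page, but the exact types are assumptions (in particular status as a free-form string and cost as a number), and FallbackMeta and summarize are hypothetical names.

```typescript
// Assumed shapes based on the field list above; the real types may differ.
interface FallbackAttemptDetail {
  model: string
  durationMs: number
  status: string // assumption: e.g. 'error' or 'success'
  error?: unknown
  errorCategory?: string
  cost?: number
}

interface FallbackMeta {
  attempts: number
  failedModels: string[]
  details: FallbackAttemptDetail[]
}

// Hypothetical helper: turns the metadata into a one-line log message.
function summarize(meta: FallbackMeta | undefined): string {
  // undefined means the first model succeeded and no fallback happened.
  if (!meta) return 'primary model succeeded'
  const totalMs = meta.details.reduce((sum, detail) => sum + detail.durationMs, 0)
  return `${meta.attempts} attempts, failed: ${meta.failedModels.join(', ')}, ${totalMs} ms total`
}
```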

All Models Fail

When every model in the fallback chain fails, Crux throws an AggregateError containing all individual errors.

try {
  await generate(myPrompt, {
    model: fallback(modelA, modelB, modelC),
    input,
  })
} catch (error) {
  if (error instanceof AggregateError) {
    console.log(error.message)
    error.errors.forEach((cause, index) => {
      console.log(`Model ${index + 1}: ${cause.message}`)
    })
  }
}

Observability Callbacks

Use onAttemptError to log or alert when a model fails.

const model = fallback(gpt4o, claudeSonnet, {
  id: 'resilient-model',
  onAttemptError: (error, attempt, model) => {
    logger.warn('Fallback attempt failed', {
      attempt,
      model: model.modelId,
      error: error.message,
    })
  },
})

id is optional for execution, but recommended when the fallback policy is part of your authored architecture. The index records it as routing.fallback with ordered routing.fallback.option children, and runtime fallback.attempt spans include the same routingId.

Streaming

fallback() works with stream() too. Fallback only triggers on start or connection failures. Once chunks start flowing, the stream is committed to that model.

import { stream } from '@crux/ai'

const result = await stream(myPrompt, {
  model: fallback(gpt4o, claudeSonnet),
  input,
})

If the first model fails to connect, Crux retries with the backup. If the first model starts streaming and then errors mid-stream, the error propagates.
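The "committed once chunks flow" rule can be illustrated with async iterators. The sketch below is an assumption about the general technique, not Crux's stream() internals: streamWithFallback and Source are hypothetical names, and each model is modeled as a function returning an async iterable of text chunks.

```typescript
// Hypothetical sketch: fall back only if a source fails before its first chunk.
type Source = () => AsyncIterable<string>

async function* streamWithFallback(sources: Source[]): AsyncGenerator<string> {
  const startErrors: unknown[] = []
  for (const source of sources) {
    const iterator = source()[Symbol.asyncIterator]()
    let first: IteratorResult<string>
    try {
      // A failure here happens before any chunk was delivered: try the next source.
      first = await iterator.next()
    } catch (error) {
      startErrors.push(error)
      continue
    }
    // Committed to this source: later errors propagate to the consumer.
    if (first.done) return
    yield first.value
    while (true) {
      const next = await iterator.next()
      if (next.done) return
      yield next.value
    }
  }
  throw new AggregateError(startErrors, 'No source could start streaming')
}
```

Switching models after chunks were delivered would splice two different responses together, which is why a mid-stream error is surfaced rather than retried.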

Per-Attempt Timeout

Set timeout to abort slow models before trying the next one.

const model = fallback(gpt4o, claudeSonnet, {
  timeout: 10_000,
})

Timed-out attempts are classified as timeout errors. This works with both generate() and stream().
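A per-attempt timeout is commonly built from AbortController and Promise.race. The following sketch shows that general pattern under stated assumptions; withTimeout and TimeoutError are hypothetical names, and how Crux implements timeout internally is not described on this page.

```typescript
// Hypothetical sketch of a per-attempt timeout; not Crux's implementation.
class TimeoutError extends Error {
  constructor(ms: number) {
    super(`Attempt exceeded ${ms} ms`)
    this.name = 'TimeoutError'
  }
}

function withTimeout<T>(
  run: (signal: AbortSignal) => Promise<T>,
  ms: number,
): Promise<T> {
  const controller = new AbortController()
  let timer: ReturnType<typeof setTimeout> | undefined
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      // Signal the in-flight request to stop, then fail this attempt.
      controller.abort()
      reject(new TimeoutError(ms))
    }, ms)
  })
  // Whichever settles first wins; the timer is always cleaned up.
  return Promise.race([run(controller.signal), timeout]).finally(() => clearTimeout(timer))
}
```

A fallback loop could wrap each attempt in a helper like this so that a slow model produces a timeout-style error and the next model gets its own fresh time budget.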

Interaction With Flows

fallback() composes naturally with flow(). Fallback is model-level; flow retries are step-level. They stack.

const pipeline = flow('pipeline', async (flow) => {
  await flow.step('analyze', { retry: 2 }, () =>
    generate(analyzePrompt, {
      model: fallback(gpt4o, claudeSonnet),
      input,
    }),
  )
})

await pipeline.run()
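Because the two mechanisms stack, the worst-case number of model calls multiplies. The arithmetic below is a sketch that assumes retry: 2 means two retries after the initial step attempt and that every step attempt walks the whole fallback chain; both points are assumptions not confirmed on this page, and maxModelCalls is a hypothetical helper.

```typescript
// Hypothetical worst-case estimate: (initial attempt + retries) x models in the chain.
function maxModelCalls(stepRetries: number, modelsInChain: number): number {
  const stepAttempts = 1 + stepRetries // assumption: retries are in addition to the first try
  return stepAttempts * modelsInChain
}

// The example above: retry: 2 with a two-model chain.
const worstCase = maxModelCalls(2, 2) // 6
```

Keeping this product in mind helps when budgeting latency and cost for a step that combines both layers.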

API Reference

See fallback(), isFallback(), and classifyError().
