Fallback
Automatically try backup models when the primary model fails.
fallback() wraps multiple models into a single model reference. It tries each model in order until one succeeds.
Use fallback when a provider is down, rate-limited, timing out, or returning another retryable error. Fallback is not a quality gate: if the first model succeeds with a weak answer, fallback is done. Use cascade() for quality escalation.
import { fallback } from '@crux/core/routing'
import { generate } from '@crux/ai'
import { openai } from '@ai-sdk/openai'
import { anthropic } from '@ai-sdk/anthropic'
const model = fallback(
openai('gpt-4o'),
anthropic('claude-sonnet-4-20250514'),
{
id: 'primary-with-backup',
timeout: 10_000,
},
)
const result = await generate(myPrompt, {
model,
input: { query: 'Explain quantum computing' },
})
If OpenAI returns a rate limit or server error, Crux automatically retries with Claude. No call-site try/catch is needed for the normal fallback path.
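To make the "try each model in order" behavior concrete, here is a minimal, library-free sketch of the control flow. It is not Crux's implementation: `ModelCall`, `tryInOrder`, `isRateLimit`, and the two stand-in models are hypothetical names used only for illustration.

```typescript
// Hypothetical stand-in for a model call; not part of the Crux API.
type ModelCall<T> = () => Promise<T>

// Try each candidate in order and return the first successful result.
// `isRetryable` decides whether a failure may move on to the next candidate.
async function tryInOrder<T>(
  candidates: ModelCall<T>[],
  isRetryable: (error: unknown) => boolean,
): Promise<{ value: T; attempts: number }> {
  let lastError: unknown = new Error('no candidates provided')
  let attempts = 0
  for (const call of candidates) {
    attempts += 1
    try {
      return { value: await call(), attempts }
    } catch (error) {
      lastError = error
      // Non-retryable errors are thrown immediately, skipping the remaining candidates.
      if (!isRetryable(error)) throw error
    }
  }
  // Simplification: rethrow the last error. Crux is documented to throw an AggregateError instead.
  throw lastError
}

// Illustrative stand-ins: the primary is rate-limited, the backup answers.
const primary: ModelCall<string> = async () => {
  throw Object.assign(new Error('rate limited'), { status: 429 })
}
const backup: ModelCall<string> = async () => 'answer from backup'
const isRateLimit = (error: unknown): boolean =>
  (error as { status?: number }).status === 429
```

Calling `tryInOrder([primary, backup], isRateLimit)` in this sketch resolves with the backup's answer on the second attempt, which mirrors the OpenAI-then-Claude example above.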
When Do You Need This?
- Production resilience: your app should keep working during provider outages.
- Rate limit handling: high-volume apps can hit provider limits.
- Provider diversity: backup models can come from a different provider.
- Validation recovery: a stronger model may succeed after structured-output validation retries are exhausted.
If you only use one model provider and are comfortable with errors propagating, you do not need fallback().
Error Classification
Not all errors trigger fallback. Crux classifies errors into categories:
| Category | Triggers | Examples |
|---|---|---|
| rate_limit | HTTP 429 | Provider rate limit exceeded |
| timeout | ETIMEDOUT, AbortError | Model took too long to respond |
| server_error | HTTP 500-599 | Provider internal error, bad gateway |
| connection_error | ECONNREFUSED, ENOTFOUND, fetch failures | DNS failure, provider unreachable |
| auth_error | HTTP 401, 403 | Invalid API key, insufficient permissions |
| validation_exhausted | ValidationExhaustedError | All validation retries exhausted |
Validation errors, content policy violations, and unknown errors do not trigger fallback by default. They are thrown immediately because retrying with a different model usually does not help.
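The table above can be read as a lookup from error properties to a category. The following sketch shows one way such a lookup might be written. It is an assumption-laden illustration, not the behavior of Crux's classifyError(): the `ErrorLike` shape, the `classifySketch` name, the `unknown` category label, and the order of the checks are all hypothetical, and generic fetch failures are not covered.

```typescript
type ErrorCategory =
  | 'rate_limit'
  | 'timeout'
  | 'server_error'
  | 'connection_error'
  | 'auth_error'
  | 'validation_exhausted'
  | 'unknown' // assumed label for anything the table does not cover

// Fields this sketch inspects; real provider errors vary in shape.
interface ErrorLike {
  status?: number
  code?: string
  name?: string
}

function classifySketch(error: ErrorLike): ErrorCategory {
  if (error.name === 'ValidationExhaustedError') return 'validation_exhausted'
  if (error.status === 429) return 'rate_limit'
  if (error.status === 401 || error.status === 403) return 'auth_error'
  if (error.status !== undefined && error.status >= 500 && error.status <= 599) {
    return 'server_error'
  }
  if (error.code === 'ETIMEDOUT' || error.name === 'AbortError') return 'timeout'
  if (error.code === 'ECONNREFUSED' || error.code === 'ENOTFOUND') {
    return 'connection_error'
  }
  return 'unknown'
}
```

In this sketch, an HTTP 400 response falls through every check and lands in `unknown`, which corresponds to the "thrown immediately" case described above.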
Filter Error Categories
By default, classified retryable errors trigger fallback. Use the on option to restrict fallback to specific categories.
const model = fallback(primary, backup, {
on: ['rate_limit', 'timeout'],
})
Server errors and connection errors would be thrown immediately in this configuration.
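The effect of the on list can be pictured as a simple membership check on the classified category. The sketch below assumes that reading; `Category` and `triggersFallback` are hypothetical names, and the real option may apply additional rules.

```typescript
// Category names taken from the Error Classification table.
type Category =
  | 'rate_limit'
  | 'timeout'
  | 'server_error'
  | 'connection_error'
  | 'auth_error'
  | 'validation_exhausted'

// Returns true when a failure in `category` should move on to the next model,
// given an explicit `on` list like the one in the configuration above.
function triggersFallback(category: Category, on: Category[]): boolean {
  return on.includes(category)
}

// Same list as the example configuration.
const onlyRateLimitAndTimeout: Category[] = ['rate_limit', 'timeout']
```

With `onlyRateLimitAndTimeout`, a `timeout` failure moves on to the backup, while a `server_error` failure does not and is thrown to the caller.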
Custom Error Predicates
For edge cases beyond the built-in categories, provide a shouldFallback predicate. It takes priority over on.
const model = fallback(gpt4o, claudeSonnet, {
shouldFallback: (error) => {
if (error.code === 'content_filter') return true
if (error.status === 429) return true
return false
},
})
Inspecting Results
When fallback occurs, the result includes metadata:
const result = await generate(myPrompt, {
model: fallback(gpt4o, claudeSonnet),
input,
})
if (result._meta.fallback) {
console.log(`Took ${result._meta.fallback.attempts} attempts`)
console.log(`Failed models: ${result._meta.fallback.failedModels.join(', ')}`)
}
The _meta.fallback object contains:
| Field | Type | Description |
|---|---|---|
| attempts | number | Total attempts including the successful one |
| failedModels | string[] | Model IDs that failed |
| details | FallbackAttemptDetail[] | Per-attempt breakdown |
Each detail entry has model, durationMs, status, error, errorCategory, and cost when available.
When the first model succeeds, _meta.fallback is undefined.
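For logging or dashboards, it can help to give the metadata a local type and reduce it to a one-line summary. The sketch below mirrors the fields listed above, but the field types inside `FallbackAttemptDetail`, the `'failed'` and `'succeeded'` status values, and the `describeFallback` helper are assumptions rather than Crux's published types.

```typescript
// Assumed shapes based on the field names above; check the real exported types.
interface FallbackAttemptDetail {
  model: string
  durationMs: number
  status: string // assumed values: 'failed' | 'succeeded'
  error?: string
  errorCategory?: string
  cost?: number
}

interface FallbackMeta {
  attempts: number
  failedModels: string[]
  details: FallbackAttemptDetail[]
}

// Summarize fallback metadata; `undefined` means the first model succeeded.
function describeFallback(meta?: FallbackMeta): string {
  if (!meta) return 'no fallback: first model succeeded'
  const totalMs = meta.details.reduce((sum, d) => sum + d.durationMs, 0)
  return `${meta.attempts} attempts in ${totalMs}ms; failed: ${meta.failedModels.join(', ')}`
}

// Illustrative data, not real output.
const exampleMeta: FallbackMeta = {
  attempts: 2,
  failedModels: ['gpt-4o'],
  details: [
    { model: 'gpt-4o', durationMs: 1200, status: 'failed', errorCategory: 'rate_limit' },
    { model: 'claude-sonnet-4-20250514', durationMs: 800, status: 'succeeded' },
  ],
}
```

Here `describeFallback(exampleMeta)` returns `2 attempts in 2000ms; failed: gpt-4o`, and `describeFallback(undefined)` covers the case where no fallback happened.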
All Models Fail
When every model in the fallback chain fails, Crux throws an AggregateError containing all individual errors.
try {
await generate(myPrompt, {
model: fallback(modelA, modelB, modelC),
input,
})
} catch (error) {
if (error instanceof AggregateError) {
console.log(error.message)
error.errors.forEach((cause, index) => {
console.log(`Model ${index + 1}: ${cause.message}`)
})
}
}
Observability Callbacks
Use onAttemptError to log or alert when a model fails.
const model = fallback(gpt4o, claudeSonnet, {
id: 'resilient-model',
onAttemptError: (error, attempt, model) => {
logger.warn('Fallback attempt failed', {
attempt,
model: model.modelId,
error: error.message,
})
},
})
id is optional for execution, but recommended when the fallback policy is part of your authored architecture. The index records it as routing.fallback with ordered routing.fallback.option children, and runtime fallback.attempt spans include the same routingId.
Streaming
fallback() works with stream() too. Fallback only triggers on start or connection failures. Once chunks start flowing, the stream is committed to that model.
import { stream } from '@crux/ai'
const result = await stream(myPrompt, {
model: fallback(gpt4o, claudeSonnet),
input,
})
If the first model fails to connect, Crux retries with the backup. If the first model starts streaming and then errors mid-stream, the error propagates.
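The "committed once chunks flow" rule can be illustrated with plain async generators. This is a conceptual sketch, not how Crux implements stream(): `ChunkSource`, `streamWithFallback`, and the three sample sources are hypothetical, and a start failure is modeled as an error on the first read.

```typescript
// Hypothetical: a function that opens a stream of text chunks.
type ChunkSource = () => AsyncGenerator<string>

async function* streamWithFallback(sources: ChunkSource[]): AsyncGenerator<string> {
  let lastError: unknown = new Error('no sources provided')
  for (const open of sources) {
    const iterator = open()
    let first: IteratorResult<string>
    try {
      // Start or connection failures surface here, before anything is emitted.
      first = await iterator.next()
    } catch (error) {
      lastError = error
      continue // nothing emitted yet, so switching sources is safe
    }
    if (first.done) return
    yield first.value
    // Committed: any error from here on propagates to the consumer.
    yield* iterator
    return
  }
  throw lastError
}

// Illustrative sources.
async function* failsOnStart(): AsyncGenerator<string> {
  throw new Error('connection refused')
}
async function* healthy(): AsyncGenerator<string> {
  yield 'Hello'
  yield ' world'
}
async function* diesMidStream(): AsyncGenerator<string> {
  yield 'partial'
  throw new Error('mid-stream failure')
}
```

In this sketch, `[failsOnStart, healthy]` yields the healthy source's chunks, while `[diesMidStream, healthy]` yields `partial` and then throws without ever opening the second source.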
Per-Attempt Timeout
Set timeout to abort slow models before trying the next one.
const model = fallback(gpt4o, claudeSonnet, {
timeout: 10_000,
})
Timed-out attempts are classified as timeout errors. This works with both generate() and stream().
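A per-attempt timeout is commonly built from an AbortController plus a timer raced against the work. The sketch below shows that general pattern under the assumption that a timed-out attempt surfaces as an error named `AbortError`; `withTimeout` and `slowModel` are hypothetical and do not describe Crux internals.

```typescript
// Reject with an AbortError-style error if `work` does not settle within `ms`.
// The signal is passed to `work` so it can cancel its own I/O when aborted.
async function withTimeout<T>(
  work: (signal: AbortSignal) => Promise<T>,
  ms: number,
): Promise<T> {
  const controller = new AbortController()
  let timer: ReturnType<typeof setTimeout> | undefined
  const timedOut = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      controller.abort()
      const error = new Error(`attempt exceeded ${ms}ms`)
      error.name = 'AbortError'
      reject(error)
    }, ms)
  })
  try {
    return await Promise.race([work(controller.signal), timedOut])
  } finally {
    // Always clear the timer so a fast attempt does not keep the process alive.
    clearTimeout(timer)
  }
}

// Illustrative slow attempt that takes about 50ms to answer.
const slowModel = (): Promise<string> =>
  new Promise((resolve) => setTimeout(() => resolve('slow answer'), 50))
```

Calling `withTimeout(slowModel, 10)` rejects before the slow attempt finishes, at which point a fallback loop could classify the failure as a timeout and move on to the next model.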
Interaction With Flows
fallback() composes naturally with flow(). Fallback is model-level; flow retries are step-level. They stack.
const pipeline = flow('pipeline', async (flow) => {
await flow.step('analyze', { retry: 2 }, () =>
generate(analyzePrompt, {
model: fallback(gpt4o, claudeSonnet),
input,
}),
)
})
await pipeline.run()
API Reference
See fallback(), isFallback(), and classifyError().