Comparison

How Crux compares to SDKs, prompt strings, frameworks, tracing tools, and eval tools.

Crux is not trying to replace your SDK, provider, app framework, agent framework, RAG stack, tracing pipeline, or eval tool. It helps with the surrounding AI code that is easy to scatter across an app: prompts, context, memory, tools, safety, routing, quality checks, and debugging.

The key distinction is where Crux sits. Many tools help you inspect or score what happened after the model responded. Crux also helps you prepare what the model sees before the call and test whether that setup keeps working.

Use this page as a fit check. For small scripts, raw SDK calls may be enough. For shared context, memory, retrieval, safety, routing, or repeatable quality checks, Crux starts to help.

vs. Vercel AI SDK (alone)

The Vercel AI SDK is an excellent execution and UI toolkit. Crux uses it as a primary adapter via @crux/ai. The comparison here covers what you gain by organizing prompt setup around that execution.

| | AI SDK alone | AI SDK + Crux |
| --- | --- | --- |
| Prompt definition | Inline system/prompt strings per call | Typed, reusable prompt() with schemas |
| Context composition | Manual string concatenation | Declarative context() with priority-based merging |
| Request setup | Owned by application code | Prompt and context setup is reusable and inspectable |
| Multi-provider | Provider-specific model imports | Same prompt definition, adapter at execution boundary |
| Quality | Not included | Suites, targets, experiments, cassettes, and baselines |
| Memory | Not included | Working, episodic, semantic, with asContext() / asTools() |
| Compaction | Not included | Sliding window, budget tracking, key facts extraction |
| Debugging | Trace the model call | Inspect what was assembled around the call |

Bottom line: AI SDK handles execution and UI. Crux helps organize, inspect, and test the prompt setup around that execution. They are complementary.
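
To make "priority-based merging" and "budget tracking" more concrete, here is a minimal sketch in plain TypeScript. It does not use Crux's API: ContextFragment, ComposedContext, composeContext, and the character-based budget are hypothetical names and simplifications chosen for illustration, and the real context() may behave differently.

```typescript
// Hypothetical illustration only; none of these names are Crux APIs.
interface ContextFragment {
  id: string;
  priority: number; // higher numbers are kept first
  text: string;
}

interface ComposedContext {
  text: string;
  droppedIds: string[]; // fragments left out because the budget ran out
}

// Order fragments by priority, then keep each one that still fits the budget.
// The budget is counted in characters of fragment text to keep the sketch
// simple; a real implementation would more likely count tokens.
function composeContext(fragments: ContextFragment[], maxChars: number): ComposedContext {
  const ordered = fragments.slice().sort((a, b) => b.priority - a.priority);
  const kept: string[] = [];
  const droppedIds: string[] = [];
  let used = 0;
  for (const fragment of ordered) {
    if (used + fragment.text.length > maxChars) {
      droppedIds.push(fragment.id);
      continue;
    }
    kept.push(fragment.text);
    used += fragment.text.length;
  }
  return { text: kept.join("\n"), droppedIds };
}

const composed = composeContext(
  [
    { id: "history", priority: 1, text: "Summary of earlier conversation turns." },
    { id: "policy", priority: 10, text: "Never reveal internal pricing." },
    { id: "profile", priority: 5, text: "User is on the Pro plan." },
  ],
  60,
);

console.log(composed.text);
console.log(composed.droppedIds); // [ 'history' ]
```

Compared with manual string concatenation, the ordering and the dropped fragments are explicit data, which is what makes the setup inspectable.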

vs. Raw SDK calls

Raw SDK calls (OpenAI SDK, Anthropic SDK, Google GenAI) give you maximum control and minimal ceremony. Crux sits around the call: it composes context, validates schemas, adds safety, quality, and observability, then delegates execution to the SDK. You keep full SDK access through adapt.

| | Raw SDK | Crux + adapter |
| --- | --- | --- |
| Type safety | Manual JSON.parse + casting | Zod schemas, fully inferred types |
| Context reuse | Copy-paste strings between prompts | Composable context() fragments with priority |
| Provider switch | Rewrite every call site | Change one adapter import |
| Quality | Build your own tests | quality() suites, targets, variants, and comparisons |
| Observability | Request logs and custom telemetry | Devtools show assembled requests and related details |
| Token management | Count tokens manually | Budget tracking + automatic context dropping |

Bottom line: Use raw SDKs for one-off scripts or when you want to own every concern manually. Use Crux when shared context, memory, safety, routing, or quality needs a durable home.
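
The "change one adapter import" row describes an adapter boundary: the prompt is defined once and only the adapter knows about a provider. The sketch below illustrates that pattern in plain TypeScript. PromptDefinition, ModelRequest, Adapter, buildRequest, run, and the two fake providers are hypothetical names, not Crux's API, and the fake adapters stand in for real SDK calls so the example runs offline.

```typescript
// Hypothetical illustration only; none of these names are Crux APIs.
interface PromptDefinition {
  system: string;
  render(input: { question: string }): string;
}

interface ModelRequest {
  system: string;
  user: string;
}

// The adapter is the only piece that knows about a specific provider.
interface Adapter {
  provider: string;
  execute(request: ModelRequest): Promise<string>;
}

// Defined once, with no provider-specific imports.
const supportPrompt: PromptDefinition = {
  system: "You are a concise support assistant.",
  render: ({ question }) => `Customer question: ${question}`,
};

// Stand-ins for real SDK calls so the sketch runs offline.
const fakeProviderA: Adapter = {
  provider: "provider-a",
  execute: async (request) => `[provider-a] ${request.user}`,
};
const fakeProviderB: Adapter = {
  provider: "provider-b",
  execute: async (request) => `[provider-b] ${request.user}`,
};

// Assemble the request from the prompt definition, independent of provider.
function buildRequest(prompt: PromptDefinition, question: string): ModelRequest {
  return { system: prompt.system, user: prompt.render({ question }) };
}

// Delegate execution to whichever adapter is passed in.
function run(prompt: PromptDefinition, adapter: Adapter, question: string): Promise<string> {
  return adapter.execute(buildRequest(prompt, question));
}

// Switching providers changes one argument, not the prompt definition.
run(supportPrompt, fakeProviderA, "How do I reset my password?").then((out) => console.log(out));
run(supportPrompt, fakeProviderB, "How do I reset my password?").then((out) => console.log(out));
```

With raw SDK calls, the equivalent of buildRequest tends to be repeated at every call site in each provider's request shape, which is why switching providers means rewriting those sites.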

vs. Prompt strings in code

If your app has one or two simple prompts, raw template strings are fine. Crux adds value when complexity grows. Here's where the line is.

| | Template strings | Crux |
| --- | --- | --- |
| 1–2 prompts | Simple and sufficient | Unnecessary overhead |
| Shared context | Copy-paste between files | Define once, compose with use: [...] |
| Structured output | JSON.parse + manual validation | Zod schema → typed result.object |
| Multiple models | Separate code paths per provider | Same prompt, different adapter |
| Quality assurance | Manual testing in playground | Automated eval across model matrix |
| Team collaboration | Hard to discover, hard to review | Registry, tags, devtools, introspection |

Bottom line: Start with strings. Adopt Crux when you have shared context, need structured output, or want automated evaluation.
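
The "structured output" row is easiest to see side by side. The sketch below contrasts a bare JSON.parse cast with explicit validation. It is dependency-free on purpose: the comparison above names Zod, but this example uses a hand-written check instead, and Ticket, ParseResult, parseUnsafe, and parseTicket are hypothetical names rather than anything from Crux.

```typescript
// Hypothetical illustration only; a schema library such as Zod would
// replace the hand-written checks below.
interface Ticket {
  title: string;
  priority: "low" | "high";
}

type ParseResult =
  | { kind: "ok"; value: Ticket }
  | { kind: "error"; error: string };

// Unsafe: the cast satisfies the compiler but checks nothing at runtime.
function parseUnsafe(raw: string): Ticket {
  return JSON.parse(raw) as Ticket;
}

// Safer: validate the shape before treating the data as a Ticket.
function parseTicket(raw: string): ParseResult {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return { kind: "error", error: "invalid JSON" };
  }
  if (typeof data !== "object" || data === null) {
    return { kind: "error", error: "expected an object" };
  }
  const record = data as Record<string, unknown>;
  const title = record["title"];
  const priority = record["priority"];
  if (typeof title !== "string") {
    return { kind: "error", error: "title must be a string" };
  }
  if (priority !== "low" && priority !== "high") {
    return { kind: "error", error: "priority must be 'low' or 'high'" };
  }
  return { kind: "ok", value: { title, priority } };
}

// A model reply with an unexpected enum value.
const reply = '{"title":"Login fails","priority":"urgent"}';

// The cast accepts it silently; the bad value surfaces later, far from the call.
console.log(parseUnsafe(reply).priority);

// The validated path reports the problem at the boundary.
const parsed = parseTicket(reply);
console.log(parsed.kind === "ok" ? parsed.value : parsed.error);
```

This is the point where template strings plus JSON.parse start to cost more than they save: every prompt that returns structured data needs its own version of these checks.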

vs. tracing and observability tools

Tracing tools are useful and often necessary. They show requests, spans, latency, cost, errors, and production behavior. Crux can emit OpenTelemetry spans into those pipelines.

The difference is that Crux also knows how the request was assembled. A runtime-only trace can show what was sent; Crux can connect the sent request back to authored prompts, contexts, memory, retrieval, routing, safety, and quality records.

| | Tracing tool | Crux |
| --- | --- | --- |
| Primary lens | What happened after execution | What was assembled before execution and what happened after |
| Source link | Usually runtime span metadata | Crux can connect runtime behavior to authored definitions |
| Setup details | Custom attributes if you emit them | Prompt/context/memory/tool setup can be shown by Crux |
| Quality connection | External or custom | Quality suites and baselines use the same Crux definitions |
| Production pipeline | Strong export and analysis workflows | Crux can emit compatible spans into those systems |

Bottom line: Keep your tracing stack. Use Crux when you also need to understand how the request was assembled from code.
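
One way to picture the link between a sent request and its authored definitions is to carry definition identifiers alongside the request and flatten them into span attributes. The sketch below assumes exactly that and nothing more: AssembledRequest, toSpanAttributes, and the app.* attribute keys are hypothetical, it does not call the OpenTelemetry API, and it is not a description of what Crux emits.

```typescript
// Hypothetical illustration only; the attribute keys are invented for
// this example and are not names that Crux or OpenTelemetry define.
interface AssembledRequest {
  promptId: string; // which authored prompt produced this request
  contextIds: string[]; // which context fragments were merged in
  messages: { role: "system" | "user"; content: string }[];
}

// Flatten the assembly record into key/value pairs that could be attached
// to a span by whatever tracing library the application already uses.
function toSpanAttributes(request: AssembledRequest): Record<string, string | number> {
  return {
    "app.prompt.id": request.promptId,
    "app.context.ids": request.contextIds.join(","),
    "app.message.count": request.messages.length,
  };
}

const assembled: AssembledRequest = {
  promptId: "support/answer",
  contextIds: ["refund-policy", "user-profile"],
  messages: [
    { role: "system", content: "You are a concise support assistant." },
    { role: "user", content: "Can I still return my order?" },
  ],
};

const attributes = toSpanAttributes(assembled);
console.log(attributes);
```

A runtime-only trace would contain the two messages but not the identifiers, so answering "which definition produced this text?" would mean searching the codebase for matching strings.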

vs. eval tools

Eval tools judge behavior. Crux's quality system also cares about the setup that produced the behavior.

| | Eval tool | Crux quality |
| --- | --- | --- |
| Common assertion | Did the final output pass? | Did the output and the setup meet expectations? |
| Inputs under test | Prompt, model, fixture, scorer | Prompt, context, memory, retrieval, routing, safety, fallback |
| Baselines | Often score snapshots | Recorded quality expectations tied to source and runtime facts |
| System checks | Custom or tool-specific | Crux is expanding first-class checks for setup behavior |

Bottom line: If an eval tool already fits your team, Crux can complement it. Crux matters when you need to test how the answer was produced, not only the text that came back.
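
The difference between checking the output and checking the setup can be sketched as a set of assertions over a recorded run. Everything in this example is hypothetical (RunRecord, CheckResult, checkRun, and the specific checks); it illustrates the idea and is not Crux's quality() API.

```typescript
// Hypothetical illustration only; none of these names are Crux APIs.
interface RunRecord {
  output: string; // the text that came back
  contextIds: string[]; // context fragments that were included
  usedFallback: boolean; // whether a fallback model answered
}

interface CheckResult {
  name: string;
  passed: boolean;
}

// One output check plus two setup checks over the same recorded run.
function checkRun(record: RunRecord): CheckResult[] {
  return [
    {
      name: "output mentions the refund window",
      passed: record.output.indexOf("30 days") !== -1,
    },
    {
      name: "refund policy context was included",
      passed: record.contextIds.indexOf("refund-policy") !== -1,
    },
    {
      name: "primary model answered without fallback",
      passed: !record.usedFallback,
    },
  ];
}

// The answer looks right, but the policy context was never included,
// so the model produced the number without the source that justifies it.
const results = checkRun({
  output: "You can return your order within 30 days.",
  contextIds: ["user-profile"],
  usedFallback: false,
});

for (const result of results) {
  console.log(`${result.passed ? "PASS" : "FAIL"} ${result.name}`);
}
```

An output-only eval would mark this run as passing. The setup check is what flags that the answer was produced without the policy context.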

vs. LangChain / LlamaIndex

LangChain and LlamaIndex are broad orchestration and retrieval frameworks. They can own chains, agents, retrieval, vector stores, and execution end-to-end. Crux is the lightweight alternative for TypeScript teams that want typed AI building blocks around their existing SDK call.

| | LangChain / LlamaIndex | Crux |
| --- | --- | --- |
| Scope | Full orchestration (chains, agents, RAG, vector stores) | Building blocks around the SDK call (prompts, memory, retrieval, safety, evals, observability) |
| Execution | Own runtime with own abstractions | Delegates to AI SDK / OpenAI / Google / Anthropic directly |
| TypeScript | Ported from Python, partial type coverage | TypeScript-first, full inference from Zod schemas |
| API surface | Large: many concepts, many ways to do things | Small: ~10 core functions, consistent interfaces |
| Integration | Use their patterns or fight the framework | Compose into any existing architecture |
| Memory | Built-in vector store + retriever abstractions | Block-based memory + pluggable stores |

Bottom line: Choose a framework when you want it to own the workflow. Choose Crux when you want reusable AI building blocks, debugging, and quality checks without moving your app into a new runtime.
