Retrieval & RAG

The full Crux path for loading documents, indexing them, querying them, and injecting evidence into prompts.

Retrieval is the part of Crux that turns external knowledge into prompt-ready evidence.

The useful mental model is two paths:

Write path:
source -> loader -> corpus.sync() -> indexer() -> records in DataStore + vectors in VectorStore
                                      ^
                                      |
                                embedding()

Read path:
question -> retriever() -> hits -> prompt context or tools
               ^
               |
         embedding() + DataStore + VectorStore

Optional read layers:
retrievalPipeline(retriever) -> better selected hits
grounding({ retriever })     -> controlled evidence + validated citations

Each primitive owns one job:

| Primitive | Job | Start here |
| --- | --- | --- |
| embedding() | Turn text into dense or sparse vectors. | Embeddings |
| @crux/ingest | Load text, files, folders, and URLs into documents. | Ingestion |
| indexer() | Chunk documents, embed chunks, and write records. | Indexing |
| corpus() | Keep repeated indexing jobs incremental and safe. | Corpus sync |
| retriever() | Query the corpus directly, as prompt context, or as tools. | Querying |
| retrievalPipeline() | Improve query-time retrieval with planning, expansion, compression, diversity, and recency. | Pipelines |
| grounding() | Inject evidence and validate citation contracts. | Grounding |

The Normal Product Path

Most apps start with this shape: define embeddings once, index a corpus, then use a retriever directly in a prompt.

import { prompt } from '@crux/core'
import { embedding } from '@crux/openai'
import { corpus, indexer } from '@crux/core/indexing'
import { retriever } from '@crux/core/retrieval'
import { inMemoryDataStore, inMemoryVectorStore } from '@crux/core/storage'
import { filesSource } from '@crux/ingest/files'
import { z } from 'zod'

const dense = embedding({
  name: 'docs',
  model: 'text-embedding-3-small',
})
const data = inMemoryDataStore()
const vectors = inMemoryVectorStore()

const docsIndexer = indexer({
  id: 'docs',
  namespace: 'product-docs',
  data,
  vectors,
  dense,
})

const docsCorpus = corpus({
  id: 'docs',
  namespace: 'product-docs',
  data,
  indexer: docsIndexer,
})

await docsCorpus.sync(
  filesSource({ directory: './docs', recursive: true }, { namespace: 'product-docs' }).load(),
  { sourceSet: 'complete', stale: 'delete' },
)

const docs = retriever({
  id: 'docs',
  namespace: 'product-docs',
  data,
  vectors,
  dense,
  context: {
    query: ({ question }) => question,
    limit: 6,
  },
})

export const answerDocs = prompt({
  id: 'answer-docs',
  use: [docs],
  input: z.object({ question: z.string() }),
  system: 'Answer from the retrieved product docs.',
  prompt: ({ input }) => input.question,
})
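The sync options above suggest that a complete source set lets the corpus treat missing documents as stale. A minimal sketch of that idea follows, not Crux's implementation: `planSync`, `fingerprint`, the `SourceDoc` and `SyncPlan` types, and the `'partial'` and `'keep'` option values are all hypothetical names introduced for illustration.

```typescript
// Hypothetical types for illustration; none of these are Crux APIs.
interface SourceDoc { id: string; text: string }
interface SyncPlan { toIndex: string[]; unchanged: string[]; toDelete: string[] }
interface SyncOptions { sourceSet: 'complete' | 'partial'; stale: 'delete' | 'keep' }

// Cheap deterministic content fingerprint (djb2-style).
// A production system would more likely use a cryptographic hash.
function fingerprint(text: string): string {
  let hash = 5381
  for (let i = 0; i < text.length; i++) {
    hash = ((hash << 5) + hash + text.charCodeAt(i)) | 0
  }
  return (hash >>> 0).toString(16)
}

// Compare the fingerprints stored by the last sync with the documents loaded now.
function planSync(
  previous: Map<string, string>, // document id -> fingerprint from the last sync
  current: SourceDoc[],
  options: SyncOptions,
): SyncPlan {
  const plan: SyncPlan = { toIndex: [], unchanged: [], toDelete: [] }
  const seen = new Set<string>()
  for (const doc of current) {
    seen.add(doc.id)
    if (previous.get(doc.id) === fingerprint(doc.text)) plan.unchanged.push(doc.id)
    else plan.toIndex.push(doc.id) // new or changed content
  }
  // Only a complete source set shows that a missing document was really removed.
  if (options.sourceSet === 'complete' && options.stale === 'delete') {
    previous.forEach((_hash, id) => {
      if (!seen.has(id)) plan.toDelete.push(id)
    })
  }
  return plan
}
```

Under this reading, re-running a sync over unchanged files produces an empty `toIndex` list, which is what keeps repeated jobs incremental.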

You can add sparse embeddings and switch the retriever to hybrid mode without changing the shape:

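// `sparse` is assumed to be a sparse embedding() defined next to `dense`; its definition is not shown here.
const docsIndexer = indexer({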
  id: 'docs',
  namespace: 'product-docs',
  data,
  vectors,
  dense,
  sparse,
})

const docs = retriever({
  id: 'docs',
  namespace: 'product-docs',
  data,
  vectors,
  dense,
  sparse,
  search: { mode: 'hybrid', fusion: 'dbsf', limit: 8 },
})

Hybrid is a retrieval mode: it combines the results of dense and sparse search at query time. It is not a kind of embedding.
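To make the fusion step concrete, here is a small sketch that assumes `'dbsf'` refers to distribution-based score fusion as the term is used by some vector databases: each result list is normalized using its mean plus or minus three standard deviations, then scores are summed per document. The `Scored` type and both function names are hypothetical, and Crux's own fusion may differ in detail.

```typescript
// Hypothetical shape for a scored search result; not a Crux type.
interface Scored { id: string; score: number }

// Normalize one result list to roughly [0, 1], using mean - 3*std and mean + 3*std as the bounds.
function normalizeByDistribution(results: Scored[]): Scored[] {
  if (results.length === 0) return []
  const mean = results.reduce((sum, r) => sum + r.score, 0) / results.length
  const variance = results.reduce((sum, r) => sum + (r.score - mean) ** 2, 0) / results.length
  const std = Math.sqrt(variance)
  // With zero spread no result stands out, so map every score to the midpoint.
  if (std === 0) return results.map((r) => ({ id: r.id, score: 0.5 }))
  const low = mean - 3 * std
  const high = mean + 3 * std
  return results.map((r) => ({ id: r.id, score: (r.score - low) / (high - low) }))
}

// Sum the normalized scores per id across both lists, then return the best `limit` results.
function fuseByDistribution(dense: Scored[], sparse: Scored[], limit: number): Scored[] {
  const totals = new Map<string, number>()
  for (const r of [...normalizeByDistribution(dense), ...normalizeByDistribution(sparse)]) {
    totals.set(r.id, (totals.get(r.id) ?? 0) + r.score)
  }
  return Array.from(totals.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
}
```

Normalizing before summing matters because dense similarity scores and sparse lexical scores live on different scales; without it, the larger-valued list would dominate the ranking.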

Which Page Should I Read?

If you are new to RAG, read the pages in sidebar order. They follow the real lifecycle: vectors, sources, indexing, querying, advanced read-side behavior, citation contracts, then production sync.

If you already know what you need, jump directly to the page named in the "Start here" column of the table above.

What To Use When

Use memory() for agent state and learned user/project knowledge:

const assistantMemory = memory({
  id: 'assistant',
  blocks: [recentMessages(), facts({ embed: dense })],
})

Use retrieval for documents, help centers, source files, tickets, PDFs, spreadsheets, or anything that should be searched as an external corpus:

await docsCorpus.sync(loader.load())
const hits = await docs.retrieve('enterprise SSO setup')

Use retrievalPipeline() when the retriever finds candidates but needs better query planning, broader recall, parent context, compression, diversity, or recency scoring:

// `generate`, `model`, and `store` are assumed to be defined elsewhere in your app.
const advancedDocs = retrievalPipeline(docs, [
  multiQuery({ generate, model, count: 4 }),
  parentExpand({ store }),
  diversify({ strategy: 'mmr', limit: 8 }),
])
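The `'mmr'` strategy name suggests maximal marginal relevance, which trades relevance against similarity to hits that were already selected. The sketch below shows the standard greedy form of that technique; the `Candidate` type, the `lambda` parameter, and the function names are assumptions made for illustration, not the Crux `diversify()` implementation.

```typescript
// Hypothetical candidate shape: a relevance score plus the vector used for similarity.
interface Candidate { id: string; relevance: number; vector: number[] }

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB)
}

// Greedy MMR: repeatedly pick the candidate with the best balance of relevance
// and low similarity to everything selected so far.
// lambda = 1 ranks by relevance only; lower values favor diversity.
function selectByMmr(candidates: Candidate[], limit: number, lambda = 0.5): Candidate[] {
  const remaining = [...candidates]
  const selected: Candidate[] = []
  while (selected.length < limit && remaining.length > 0) {
    let bestIndex = 0
    let bestScore = -Infinity
    remaining.forEach((candidate, index) => {
      const redundancy = selected.length === 0
        ? 0
        : Math.max(...selected.map((s) => cosineSimilarity(candidate.vector, s.vector)))
      const score = lambda * candidate.relevance - (1 - lambda) * redundancy
      if (score > bestScore) {
        bestScore = score
        bestIndex = index
      }
    })
    selected.push(remaining.splice(bestIndex, 1)[0])
  }
  return selected
}
```

With near-duplicate chunks in the candidate set, a lower `lambda` pushes the second duplicate down the list so the prompt context covers more distinct material.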

Use grounding() when the answer must cite retrieved evidence and Crux should validate the citation contract:

const groundedDocs = grounding({
  id: 'docs',
  retriever: advancedDocs,
  query: ({ input }) => input.question,
  citations: { required: true, quotes: 'required' },
})
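As a rough picture of what validating a citation contract can involve, the sketch below checks that every citation points at a retrieved hit and that each quote appears in that hit's text. All names here (`EvidenceHit`, `Citation`, `validateCitations`) and the `'optional'` quote setting are hypothetical; the actual rules `grounding()` enforces are not described on this page.

```typescript
// Hypothetical shapes for illustration; not Crux types.
interface EvidenceHit { id: string; text: string }
interface Citation { hitId: string; quote?: string }
interface CitationRules { required: boolean; quotes: 'required' | 'optional' }
interface CitationReport { valid: boolean; problems: string[] }

// Compare text loosely: collapse whitespace and ignore case.
function squash(text: string): string {
  return text.replace(/\s+/g, ' ').trim().toLowerCase()
}

function validateCitations(
  hits: EvidenceHit[],
  citations: Citation[],
  rules: CitationRules,
): CitationReport {
  const problems: string[] = []
  const hitsById = new Map<string, EvidenceHit>()
  for (const hit of hits) hitsById.set(hit.id, hit)

  if (rules.required && citations.length === 0) problems.push('answer has no citations')

  for (const citation of citations) {
    const hit = hitsById.get(citation.hitId)
    if (!hit) {
      problems.push(`citation points at unknown hit: ${citation.hitId}`)
      continue
    }
    if (citation.quote === undefined) {
      if (rules.quotes === 'required') problems.push(`missing quote for ${citation.hitId}`)
      continue
    }
    // A quote only counts as evidence if it actually occurs in the cited hit.
    if (squash(hit.text).indexOf(squash(citation.quote)) === -1) {
      problems.push(`quote not found in ${citation.hitId}`)
    }
  }
  return { valid: problems.length === 0, problems }
}
```

Returning the list of problems rather than a bare boolean makes it easier to decide whether to retry the model call or surface the failure.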

Use a Quality suite to verify that the setup still works after you change chunking, retrieval, grounding, or prompts:

import { expect, quality, suite, target } from '@crux/core/quality'

await quality({ id: 'docs-rag' }).evaluate({
  suite: suite<{ question: string }>('docs', (test) => {
    test('enterprise SSO', {
      input: { question: 'How do I configure SSO?' },
      expect: (ctx) => expect.retrieval(ctx).toContainHit({ sourceId: 'sso.md' }),
    })
  }),
  target: target.retriever(advancedDocs, {
    query: ({ question }) => question,
  }),
})

For the full quality loop, see RAG Quality.
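A containment check like the one in the suite above can be summarized across many questions as a hit rate. The sketch below is an illustrative, framework-free version of that metric; `RetrievalCase`, `RankedHit`, and `hitRateAtK` are hypothetical names, and the retrieve function is synchronous only to keep the example short.

```typescript
// Hypothetical shapes for illustration; not part of @crux/core/quality.
interface RetrievalCase { question: string; expectedSourceId: string }
interface RankedHit { sourceId: string }

// Fraction of cases whose expected source appears among the top k hits.
function hitRateAtK(
  cases: RetrievalCase[],
  retrieve: (question: string) => RankedHit[],
  k: number,
): number {
  if (cases.length === 0) return 0
  const passed = cases.filter((testCase) =>
    retrieve(testCase.question)
      .slice(0, k)
      .some((hit) => hit.sourceId === testCase.expectedSourceId),
  )
  return passed.length / cases.length
}
```

Tracking this number before and after a chunking or pipeline change gives a quick signal of whether retrieval regressed.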

Primitive Boundaries

The boundaries are there so users can swap one layer without rewriting the rest:

// Vector generation policy
embedding({ preprocess, truncate, retry, cache })

// Source loading and parsing
filesSource({ directory: './docs' }).load()

// Write-time document preparation
indexer({ pipeline: indexingPipeline({ chunker: chunker.structured() }) })

// Read-time search defaults
retriever({ search: { mode: 'hybrid', limit: 8 } })

// Query-time result shaping
retrievalPipeline(docs, [multiQuery(...), compress(...)])

// Evidence injection and citation validation
grounding({ retriever: docs, citations: { required: true } })

If your backend already has its own search API, use a custom retriever({ async retrieve() { ... } }). Crux does not force advanced users through a vector-store abstraction.
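The adapter logic behind such a custom retriever might look like the sketch below. It deliberately does not call Crux: the `Hit` shape, the `BackendResult` shape, and the `backendRetriever` helper are assumptions, since the hit type that `retriever()` expects is not shown on this page.

```typescript
// Hypothetical shapes; check the Crux retriever docs for the real hit type.
interface Hit { id: string; text: string; score: number }
interface BackendResult { docId: string; snippet: string; rank: number }

// Any existing search client can be passed in, which also makes the adapter easy to test.
type BackendSearch = (query: string, size: number) => Promise<BackendResult[]>

function backendRetriever(search: BackendSearch, limit = 6) {
  return {
    async retrieve(query: string): Promise<Hit[]> {
      const results = await search(query, limit)
      // Turn 1-based ranks into descending scores so later sorting keeps the backend's order.
      return results.slice(0, limit).map((result) => ({
        id: result.docId,
        text: result.snippet,
        score: 1 / result.rank,
      }))
    },
  }
}
```

Injecting the search function keeps network details out of the adapter, so the same mapping works against a real API client or an in-memory stub.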
