Retrieval & RAG

The full Crux path for loading documents, indexing them, querying them, and injecting evidence into prompts.

Retrieval is the part of Crux that turns external knowledge into prompt-ready evidence.

The useful mental model is two paths:

Write path:
source -> loader -> corpus.sync() -> indexer() -> records in DataStore + vectors in VectorStore
                                      ^
                                      |
                                embedding()

Read path:
question -> retriever() -> hits -> prompt context or tools
               ^
               |
         embedding() + DataStore + VectorStore

Optional read layers:
retrievalPipeline(retriever) -> better selected hits
grounding({ retriever })     -> controlled evidence + validated citations

Each primitive owns one job:

Primitive	Job	Start here
`embedding()`	Turn text into dense or sparse vectors.	Embeddings
`@crux/ingest`	Load text, files, folders, and URLs into documents.	Ingestion
`indexer()`	Chunk documents, embed chunks, and write records.	Indexing
`corpus()`	Keep repeated indexing jobs incremental and safe.	Corpus sync
`retriever()`	Query the corpus directly, as prompt context, or as tools.	Querying
`retrievalPipeline()`	Improve query-time retrieval with planning, expansion, compression, diversity, and recency.	Pipelines
`grounding()`	Inject evidence and validate citation contracts.	Grounding

The Normal Product Path

Most apps start with this shape: define embeddings once, index a corpus, then use a retriever directly in a prompt.

import { prompt } from '@crux/core'
import { embedding } from '@crux/openai'
import { corpus, indexer } from '@crux/core/indexing'
import { retriever } from '@crux/core/retrieval'
import { inMemoryDataStore, inMemoryVectorStore } from '@crux/core/storage'
import { filesSource } from '@crux/ingest/files'
import { z } from 'zod'

const dense = embedding({
  name: 'docs',
  model: 'text-embedding-3-small',
})
const data = inMemoryDataStore()
const vectors = inMemoryVectorStore()

const docsIndexer = indexer({
  id: 'docs',
  namespace: 'product-docs',
  data,
  vectors,
  dense,
})

const docsCorpus = corpus({
  id: 'docs',
  namespace: 'product-docs',
  data,
  indexer: docsIndexer,
})

await docsCorpus.sync(
  filesSource({ directory: './docs', recursive: true }, { namespace: 'product-docs' }).load(),
  { sourceSet: 'complete', stale: 'delete' },
)

const docs = retriever({
  id: 'docs',
  namespace: 'product-docs',
  data,
  vectors,
  dense,
  context: {
    query: ({ question }) => question,
    limit: 6,
  },
})

export const answerDocs = prompt({
  id: 'answer-docs',
  use: [docs],
  input: z.object({ question: z.string() }),
  system: 'Answer from the retrieved product docs.',
  prompt: ({ input }) => input.question,
})

You can add sparse embeddings and switch the retriever to hybrid mode without changing the shape:

const docsIndexer = indexer({
  id: 'docs',
  namespace: 'product-docs',
  data,
  vectors,
  dense,
  sparse,
})

const docs = retriever({
  id: 'docs',
  namespace: 'product-docs',
  data,
  vectors,
  dense,
  sparse,
  search: { mode: 'hybrid', fusion: 'dbsf', limit: 8 },
})

Hybrid is retrieval composition over dense and sparse vectors. It is not an embedding kind.

Which Page Should I Read?

If you are new to RAG, read the pages in sidebar order. They follow the real lifecycle: vectors, sources, indexing, querying, advanced read-side behavior, citation contracts, then production sync.

If you already know what you need, jump directly:

Embeddings

Dense, sparse, hybrid, preprocessing, truncation, retries, rate limits, and cache.

Ingestion

Text, files, URLs, built-in parsers, OCR hooks, and custom loaders.

Indexing

Pipelines, transforms, stage cache, and dry runs for the write side of retrieval.

Chunkers

Structured, plain text, parent-child, semantic, and custom chunking strategies.

Querying

Dense, sparse, hybrid, prompt `use`, tools, custom retrievers, and rerankers.

Pipelines

Query planning, multi-query, parent expansion, compression, diversity, decay, and custom stages.

Grounding

When retrieval must become cited, validated evidence.

Corpus Sync

Repeated indexing jobs, stale deletion, dry runs, source ledgers, and cache control.

What To Use When

Use memory() for agent state and learned user/project knowledge:

const assistantMemory = memory({
  id: 'assistant',
  blocks: [recentMessages(), facts({ embed: dense })],
})

Use retrieval for documents, help centers, source files, tickets, PDFs, spreadsheets, or anything that should be searched as an external corpus:

await docsCorpus.sync(loader.load())
const hits = await docs.retrieve('enterprise SSO setup')

Use retrievalPipeline() when the retriever finds candidates but needs better query planning, broader recall, parent context, compression, diversity, or recency scoring:

const advancedDocs = retrievalPipeline(docs, [
  multiQuery({ generate, model, count: 4 }),
  parentExpand({ store }),
  diversify({ strategy: 'mmr', limit: 8 }),
])

Use grounding() when the answer must cite retrieved evidence and Crux should validate the citation contract:

const groundedDocs = grounding({
  id: 'docs',
  retriever: advancedDocs,
  query: ({ input }) => input.question,
  citations: { required: true, quotes: 'required' },
})

Use a Quality suite when you want to prove the setup still works after changing chunking, retrieval, grounding, or prompts:

import { expect, quality, suite, target } from '@crux/core/quality'

await quality({ id: 'docs-rag' }).evaluate({
  suite: suite<{ question: string }>('docs', (test) => {
    test('enterprise SSO', {
      input: { question: 'How do I configure SSO?' },
      expect: (ctx) => expect.retrieval(ctx).toContainHit({ sourceId: 'sso.md' }),
    })
  }),
  target: target.retriever(advancedDocs, {
    query: ({ question }) => question,
  }),
})

For the full quality loop, see RAG Quality.

Primitive Boundaries

The boundaries are there so users can swap one layer without rewriting the rest:

// Vector generation policy
embedding({ preprocess, truncate, retry, cache })

// Source loading and parsing
filesSource({ directory: './docs' }).load()

// Write-time document preparation
indexer({ pipeline: indexingPipeline({ chunker: chunker.structured() }) })

// Read-time search defaults
retriever({ search: { mode: 'hybrid', limit: 8 } })

// Query-time result shaping
retrievalPipeline(docs, [multiQuery(...), compress(...)])

// Evidence injection and citation validation
grounding({ retriever: docs, citations: { required: true } })

If your backend already has its own search API, use a custom retriever({ async retrieve() { ... } }). Crux does not force advanced users through a vector-store abstraction.