Extract key facts
Pull typed, structured data out of a conversation with a Zod schema.
Where summarization produces prose, extraction produces typed objects — decisions, action items, open questions, named entities, or any shape you define. It's a one-shot, stateless wrapper around generateObject tuned for "what's the structured signal in this messy chat?"
What problem does this solve?
When a conversation gets compacted, prose summaries lose structure. "We decided to ship by Friday and Sarah owns the migration" is fine for the model to read but useless for tooling — you can't filter, count, or programmatically act on a sentence.
Extraction produces typed objects that survive compaction: structured decisions, structured action items, structured open questions. Pair it with sliding window — extract structure before the window evicts and summarizes, so the structure persists in your application layer while prose gets compressed.
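To make "filter, count, or programmatically act" concrete, here is a minimal sketch of what tooling can do once the facts are typed. The `Facts` interface, the sample data, and the helpers `tasksFor` and `countByAssignee` are illustrative assumptions, not part of the library; the field names simply mirror the Quick start schema further down.

```typescript
// Hypothetical shape of one extraction result. The field names mirror the
// Quick start schema; the library does not prescribe them.
interface ActionItem {
  task: string
  assignee: string
}

interface Facts {
  decisions: string[]
  actionItems: ActionItem[]
  openQuestions: string[]
}

// Sample data standing in for what an extraction call might return.
const facts: Facts = {
  decisions: ['Ship by Friday'],
  actionItems: [
    { task: 'Run the database migration', assignee: 'Sarah' },
    { task: 'Update the changelog', assignee: 'Dev' },
  ],
  openQuestions: ['Who reviews the rollback plan?'],
}

// Filter: every task a given person owns.
function tasksFor(f: Facts, assignee: string): string[] {
  return f.actionItems
    .filter((item) => item.assignee === assignee)
    .map((item) => item.task)
}

// Count: how many action items each assignee holds.
function countByAssignee(f: Facts): Record<string, number> {
  const counts: Record<string, number> = {}
  for (const item of f.actionItems) {
    counts[item.assignee] = (counts[item.assignee] ?? 0) + 1
  }
  return counts
}

const sarahsTasks = tasksFor(facts, 'Sarah')
const counts = countByAssignee(facts)
```

Neither helper is possible against the sentence "Sarah owns the migration" without parsing prose again, which is the gap extraction is meant to close.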
When should I use it?
- You want typed output, not prose, from a conversation
- You'll act on the data programmatically — filter, count, render in a UI
- The data has a fixed shape that a Zod schema can describe
When should I NOT use it?
- You want a narrative summary for the next LLM call — use summarize messages
- The data is unbounded in shape (anything could be relevant), so no schema will fit it
- You only need a single field: extraction is overkill, so just summarize and parse the result
Quick start
import { z } from 'zod'
import { generateObjectFn } from '@crux/ai'
import { extractKeyFacts } from '@crux/core/compaction'

// `conversation` (the messages to extract from) and `myModel` are assumed to be defined elsewhere.
const facts = await extractKeyFacts({
  messages: conversation,
  generate: generateObjectFn,
  model: myModel,
  schema: z.object({
    decisions: z.array(z.string()),
    actionItems: z.array(
      z.object({ task: z.string(), assignee: z.string() })
    ),
    openQuestions: z.array(z.string()),
  }),
})
// facts is fully typed:
// { decisions: string[]; actionItems: { task: string; assignee: string }[]; openQuestions: string[] }

Designing the schema
Three patterns work well:
Decisions and questions, to capture what was agreed and what's unresolved:
schema: z.object({
  decisions: z.array(z.string()),
  openQuestions: z.array(z.string()),
})
Action items with ownership, for project conversations:
schema: z.object({
  actions: z.array(
    z.object({
      task: z.string(),
      assignee: z.string(),
      deadline: z.string().optional(),
      blockers: z.array(z.string()).default([]),
    })
  ),
})
Named entities and bindings, for agent conversations where the LLM needs to remember "the project = Atlas":
schema: z.object({
  bindings: z.array(
    z.object({
      term: z.string(), // the user's shorthand
      refers_to: z.string(), // canonical name
    })
  ),
})

Combining with sliding window
Pair extraction with sliding window so structured facts persist alongside compaction. The window owns its own eviction; you observe the prose summary via getMessages() and run extraction periodically against the messages you've collected.
// Assumes createSlidingWindow, extractKeyFacts, generateObjectFn, and the Message type are already imported.
// cheapModel, actionSchema, and persistFacts are placeholders for your own model, schema, and storage code.
const window = createSlidingWindow({ id: 'chat', windowSize: 20, /* ... */ })

// Track recent turns yourself for extraction
const recentTurns: Message[] = []

// For each incoming turn:
await window.push(newTurn)
recentTurns.push(newTurn)

// Every 5 turns, extract structure from the buffer, then clear it
if (recentTurns.length >= 5) {
  const facts = await extractKeyFacts({
    messages: recentTurns,
    generate: generateObjectFn,
    model: cheapModel,
    schema: actionSchema,
  })
  await persistFacts(facts)
  recentTurns.length = 0
}

The sliding window keeps its own running summary internally. Extraction runs on whatever message buffer you control, on whatever cadence you pick.
This way the prose summary keeps narrative context for the LLM, and the structured facts persist in your application database for filtering, querying, and UI rendering.
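The persistFacts call above is left to the application. As one possible sketch, assuming facts follow the "action items with ownership" shape, an in-memory store can merge successive extraction batches and answer simple queries. The `FactStore` class, its methods, and the sample data are hypothetical stand-ins for your own database layer, not part of the library.

```typescript
// Hypothetical record shape, matching the "action items with ownership" pattern.
interface Action {
  task: string
  assignee: string
  deadline?: string
  blockers: string[]
}

interface ActionFacts {
  actions: Action[]
}

// In-memory stand-in for an application database.
class FactStore {
  private byKey = new Map<string, Action>()

  // Merge one extraction batch. A later batch overwrites an earlier record
  // with the same assignee and task, so updated deadlines or blockers win.
  persist(facts: ActionFacts): void {
    for (const action of facts.actions) {
      const key = `${action.assignee.toLowerCase()}::${action.task.toLowerCase()}`
      this.byKey.set(key, action)
    }
  }

  // All stored actions, in first-seen order.
  all(): Action[] {
    return Array.from(this.byKey.values())
  }

  // Actions that still list at least one blocker.
  blocked(): Action[] {
    return this.all().filter((a) => a.blockers.length > 0)
  }
}

const store = new FactStore()

// First batch: the migration is blocked.
store.persist({
  actions: [
    { task: 'Run the migration', assignee: 'Sarah', blockers: ['Waiting on staging access'] },
  ],
})

// Second batch: the blocker is cleared, a deadline appears, and a new task shows up.
store.persist({
  actions: [
    { task: 'Run the migration', assignee: 'Sarah', deadline: 'Friday', blockers: [] },
    { task: 'Write release notes', assignee: 'Dev', blockers: [] },
  ],
})
```

Exact-match keys are a simplification: a model may phrase the same task differently across batches, so a real store would likely need a more tolerant way to match records.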