Extraction · scalar · returns json

EXTRACT_STRUCTURED

Extract structured fields from text per a user-supplied schema

Per-row — runs once for each row.

extraction · llm · text

Syntax

EXTRACT_STRUCTURED({{ text }}, {{ schema }})

Arguments

name    type     description
text    VARCHAR  The text document to extract from
schema  VARCHAR  JSON schema of the desired output (free-form shape)

About

Schema-driven structured extraction from unstructured text. The user supplies a text column and a JSON schema describing the fields they want to pull out. The cell runs an LLM call with prompt-engineered JSON output, retries on parse failure up to 3 times, and returns the extracted object as JSON.

v1 relies on the LLM's native JSON-mode output rather than true constrained decoding (xgrammar / outlines). That means very complex schemas may occasionally produce invalid JSON and need a retry, but for typical "extract N fields from a document" shapes it works reliably with any modern chat model.

Schema format is free-form: the user describes the fields in any shape they want (property types, enums, descriptions, nested objects). The LLM interprets the schema and emits matching JSON.

Usage:

SELECT
  ticket_id,
  extract_structured(
    body,
    '{"product":"string","urgency":"integer 1-5","blocking":"boolean"}'
  ) AS fields
FROM support_tickets;

-- Access fields via JSON operators:
SELECT
  ticket_id,
  fields->>'product' AS product,
  (fields->>'urgency')::INTEGER AS urgency,
  (fields->>'blocking')::BOOLEAN AS blocking
FROM extracted_tickets;

For tables where you want typed columns directly, compose with DuckDB's JSON accessors in a CTE. A CREATE EXTRACTION DDL that compiles the schema into a typed `returns_columns` scalar is a natural v2.
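The CTE composition mentioned above might look like the following. This is a minimal sketch, assuming a support_tickets table with ticket_id and body columns as in the usage example. The CTE name, the output column names, and the use of TRY_CAST (so that a value the model returns in an unexpected form becomes NULL instead of failing the query) are illustrative choices, not part of the function's contract.

```sql
-- Sketch only: assumes support_tickets(ticket_id, body) exists.
WITH extracted AS (
    -- Extract into a single JSON column named fields.
    SELECT
        ticket_id,
        extract_structured(
            body,
            '{"product":"string","urgency":"integer 1-5","blocking":"boolean"}'
        ) AS fields
    FROM support_tickets
)
SELECT
    ticket_id,
    fields->>'product'                       AS product,
    -- TRY_CAST yields NULL when the extracted text is not a valid value of the target type.
    TRY_CAST(fields->>'urgency' AS INTEGER)  AS urgency,
    TRY_CAST(fields->>'blocking' AS BOOLEAN) AS blocking
FROM extracted;
```

Wrapping the final SELECT in CREATE TABLE ... AS would be one way to persist the typed columns, so that later queries do not need to repeat the extraction.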

Examples

Basic field extraction with mixed types

SELECT
  extract_structured(
    'Product: Acme Widget. Shipped 2024-05-12 for $49.99 to customer Jane Doe.',
    '{"product_name":"string","ship_date":"date","total":"decimal","customer":"string"}'
  );
