Extraction · scalar · returns varchar

PARSE

Parse, validate, or transform patterned strings using plain-English instructions

Per-row — runs once for each row.

extraction · llm · text · pipeline-composable

Syntax

{{ value }} PARSE {{ task }}
PARSE_VALUE({{ value }}, {{ task }})

Arguments

| name | type | description |
| --- | --- | --- |
| value | VARCHAR | The string value to parse or transform |
| task | VARCHAR | What to extract, validate, or how to transform |

About

General-purpose parser for patterned string data: phone numbers, dates, addresses, IDs, codes, currencies, anything with a recognizable shape. Describe what you want in plain English and the operator figures out how to extract, validate, or transform it.

Common jobs it handles:

- Extracting parts of formatted data (area code from a phone, year from a date, country from an address)
- Validating that a value matches an expected format
- Transforming between formats (US date → ISO, digits → formatted phone, currency strings → numeric)
- Cleaning data (digits only, stripped special characters)

Scales well on uniform columns: the first value in a given format pattern calls the model to generate a SQL parsing expression, then every subsequent value sharing that format hits the cache and runs at native SQL speed. A column of 100k similar strings typically pays for one or two model calls and runs at DuckDB speed after that.
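To estimate how well a column will cache, it can help to count how many distinct format patterns it holds. The query below is a rough sketch in plain DuckDB SQL, not the operator's actual cache key: it assumes a format pattern can be approximated by replacing every digit with `9` and every ASCII letter with `a`. The table `contacts` and column `phone` are hypothetical.

```sql
-- Approximate each value's "shape": digits -> 9, letters -> a.
-- This fingerprint is an assumption for illustration only;
-- the operator's real cache key may be computed differently.
SELECT
  regexp_replace(
    regexp_replace(phone, '[0-9]', '9', 'g'),
    '[A-Za-z]', 'a', 'g'
  ) AS shape,
  count(*) AS n_rows
FROM contacts          -- hypothetical table
GROUP BY shape
ORDER BY n_rows DESC;
```

Under that assumption, a handful of shapes covering most rows suggests only a few model calls, while a long tail of one-off shapes suggests the column is less uniform than it looks.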

Examples

Extracts a numeric value from free-form currency text

SELECT
  parse_value(
    'approximately $500 million',
    'extract the numeric amount'
  );
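
The other jobs listed under About can be written the same way, using either form from the Syntax section. The sketch below assumes a hypothetical `contacts` table with `phone` and `signup_date` columns. The task strings are illustrative wording rather than a fixed vocabulary, and because results depend on the model, no outputs are shown.

```sql
-- Hypothetical table and columns, for illustration only.
SELECT
  -- Infix form: extract part of a formatted value
  phone PARSE 'extract the area code' AS area_code,
  -- Function form: transform between formats
  parse_value(signup_date, 'convert this US date to ISO format') AS signup_iso,
  -- Function form: clean a value
  parse_value(phone, 'digits only') AS phone_digits
FROM contacts;
```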
