Extraction · scalar · returns json

PARSE_ADDRESS

Parse US address into JSON with fields: street_number, street_name, unit, city, state, zip, country, formatted

Per-row — runs once for each row.

extraction · llm · text

Syntax

PARSE_ADDRESS({{ address }})

Arguments

name     type     description
address  VARCHAR

About

Parse US address string into structured components. Returns JSON with street_number, street_name, unit, city, state, zip, country, formatted.

Use ->> to extract fields:

  PARSE_ADDRESS(addr) ->> 'city'
  PARSE_ADDRESS(addr) ->> 'state'

Backend: deterministic Python via `usaddress`, a pure-Python probabilistic parser trained on millions of US addresses. It handles:

- Street number + name + type ("123 Main St")
- Directional prefixes/suffixes ("N Broadway", "Main St NW")
- Unit designators ("Apt 4B", "Suite 200")
- City + state + ZIP (plus ZIP+4)
- PO Boxes

Limitations:

- US addresses only. International input will often parse but may assign components to incorrect fields.
- The underlying CRF model is probabilistic, so unusual addresses may get misclassified.

For international addresses or LLM-assisted parsing with world knowledge, use PARSE_ADDRESS_LLM; see parse_address_single_llm.cascade.yaml.
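
To give a feel for how token-level labels from a parser like `usaddress` could be folded into the fields listed above, here is a minimal Python sketch. It is not the function's actual backend code: the `LABEL_TO_FIELD` mapping and the `to_record` helper are hypothetical, the sample tokens are hand-written rather than real parser output, and the `country` and `formatted` fields are left out because the page does not say how they are derived.

```python
import json

# Hypothetical mapping from usaddress-style labels to this function's
# output fields. The label names follow the usaddress tag set; the
# grouping into fields is an assumption made for illustration.
LABEL_TO_FIELD = {
    "AddressNumber": "street_number",
    "StreetNamePreDirectional": "street_name",
    "StreetName": "street_name",
    "StreetNamePostType": "street_name",
    "StreetNamePostDirectional": "street_name",
    "OccupancyType": "unit",
    "OccupancyIdentifier": "unit",
    "PlaceName": "city",
    "StateName": "state",
    "ZipCode": "zip",
}

FIELDS = ("street_number", "street_name", "unit", "city", "state", "zip")


def to_record(tokens):
    """Fold (token, label) pairs into a dict of address fields.

    Tokens whose label is not in LABEL_TO_FIELD are ignored.
    Fields with no tokens come back as None.
    """
    parts = {field: [] for field in FIELDS}
    for token, label in tokens:
        field = LABEL_TO_FIELD.get(label)
        if field is not None:
            # Parser tokens can carry trailing commas ("Springfield,").
            parts[field].append(token.strip(" ,"))
    return {field: " ".join(vals) or None for field, vals in parts.items()}


# Hand-written sample shaped like a list of (token, label) pairs.
# With usaddress installed, the list would come from usaddress.parse(address).
tokens = [
    ("123", "AddressNumber"),
    ("Main", "StreetName"),
    ("St,", "StreetNamePostType"),
    ("Springfield,", "PlaceName"),
    ("IL", "StateName"),
    ("62701", "ZipCode"),
]

record = to_record(tokens)
print(json.dumps(record, indent=2))
```

Because the labels come from a probabilistic model, a sketch like this inherits its mistakes: a mislabeled token simply lands in the wrong field, which matches the limitation noted above for unusual or international input.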

Examples

US address parsed into city/state/zip

SELECT
  parse_address('123 Main St, Springfield, IL 62701')
