Statistical imputation for missing values (distribution-aware)
Per-row — runs once for each row.
IMPUTE({{ value }}, '{{ context }}')
IMPUTE({{ value }}, '{{ context }}', '{{ field_name }}')
{{ value }} IMPUTED FROM '{{ context }}'

| name | type | description |
|---|---|---|
| value | VARCHAR | — |
| context | VARCHAR | Statistical context: group values, similar records, distribution info |
| field_name (optional) | VARCHAR | Field name to impute (e.g., first_name, age) |
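
The context parameter is described as carrying group values or distribution information, so one plausible way to supply it is to aggregate the known values of a group into a string. The following is a minimal sketch under stated assumptions, not the documented usage: the `customers(id, city, age)` table is hypothetical, `string_agg` and the `||` concatenation operator are assumed to exist in the host SQL dialect, and IMPUTE is assumed to return a VARCHAR.

```sql
-- Hypothetical table: customers(id, city, age). Not part of the original docs.
WITH city_ages AS (
  SELECT
    city,
    -- Assumption: string_agg is available in the host SQL dialect.
    string_agg(CAST(age AS VARCHAR), ', ') AS known_ages
  FROM customers
  WHERE age IS NOT NULL
  GROUP BY city
)
SELECT
  c.id,
  CASE
    -- Call IMPUTE only for rows where the value is actually missing.
    WHEN c.age IS NULL THEN
      IMPUTE(
        'unknown',
        'city: ' || c.city || '; known ages in city: ' || COALESCE(a.known_ages, 'none'),
        'age'
      )
    -- Assumption: IMPUTE returns VARCHAR, so the known value is cast to match.
    ELSE CAST(c.age AS VARCHAR)
  END AS age_filled
FROM customers AS c
LEFT JOIN city_ages AS a ON a.city = c.city;
```

Because the function runs once for each row, restricting the call to rows with a missing value, as the CASE expression does here, avoids invoking it where nothing needs to be imputed.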
First name imputed from context
SELECT
  impute('unknown', 'name: John Smith', 'first_name')

Remove or mask personally identifiable information from text
Type-cast messy real-world values that trip up standard CAST
Return the canonical/official form of a value (auto-detects entity type)
Extracts 4-digit year from messy text, returns -1 if undetermined
LLM-backed year extraction (escape hatch for CLEAN_YEAR)
Pick the best non-null value from a group (quality-aware COALESCE)