Cleaning · scalar · returns varchar

CANONICAL

Return the canonical/official form of a value (auto-detects entity type)

Per-row — runs once for each row.

cleaning · llm · text

Syntax

CANONICAL({{ value }})
{{ value }} CANONICAL

Arguments

name    type      description
value   VARCHAR

About

Return the canonical, standard form of a value, whatever the input actually refers to. CANONICAL auto-detects the type of entity (company, city, person name, brand, product) and returns the widely used official representation. This is useful for normalizing messy data before grouping, joining, or reporting. For example, "Microsoft Corp.", "MSFT", "microsoft corporation", and "Microsoft Inc." all collapse to "Microsoft" when passed through CANONICAL.

Prefer NORMALIZE when you already know the type (`NORMALIZE(value, 'email')`) and don't want the operator to spend a call detecting it.
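
As a rough illustration of normalizing before grouping, the sketch below resolves each raw name once in a CTE and then aggregates on the result. The table `crm_accounts` and its columns `account_name` and `revenue` are hypothetical, invented for this example; only the `CANONICAL(value)` call comes from the syntax above. What each name resolves to depends on the operator's detection, so no output is shown.

-- Sketch only: crm_accounts, account_name, and revenue are assumed names.
WITH resolved AS (
  SELECT
    CANONICAL(account_name) AS company,  -- per-row call, one per input row
    revenue
  FROM crm_accounts
)
SELECT
  company,
  COUNT(*) AS row_count,
  SUM(revenue) AS total_revenue
FROM resolved
GROUP BY company
ORDER BY total_revenue DESC;

Computing the canonical form in the CTE keeps the call in one place, so the grouping key and the selected column are the same resolved value.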

Examples

Company name canonicalized

SELECT
  CANONICAL('MICROSOFT CORPORATION')

City abbreviation expanded

SELECT
  CANONICAL('NYC')
