Classification · scalar · returns boolean

LOOKS_LIKE

Check if a value looks like a given type (fuzzy match)

Per-row — runs once for each row.

classification · nli · specialist-zoo · text

Syntax

LOOKS_LIKE({{ value }}, '{{ expected_type }}')
{{ value }} LOOKS_LIKE '{{ expected_type }}'

Arguments

name            type      description
value           VARCHAR
expected_type   VARCHAR   Type to check: email, phone, address, name, date, url, ssn, etc.

About

Checks whether a value looks like a specific type. LOOKS_LIKE is more forgiving than VALID: it accepts values that are malformed but still recognizable.

Backend: hybrid, with deterministic validators first and a specialist zoo fallback.

For the common structured types (email, phone, url, date, address, ip, number, name, ssn, zip, uuid), the function uses pure-Python validators from the deterministic parsing block of pyproject: `email-validator`, `phonenumbers`, `python-dateutil`, `urllib.parse`, `usaddress`, `nameparser`, etc. Partial-match heuristics let these validators handle the malformed-but-recognizable cases that the LLM version was catching, such as "john@gmail", "call 555-1234", and "123 main street boston ma".

For any type outside that closed set, the function falls back to specialist zoo zero-shot classification with a binary hypothesis: "looks like a {type}" vs. "does not look like a {type}".

For LLM-style fuzzy judgment with contextual interpretation, use LOOKS_LIKE_LLM (see looks_like_single_llm.cascade.yaml).
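The two-tier dispatch described here can be pictured with a minimal Python sketch. It is not the actual backend: the function and table names are hypothetical, the standard-library checks are simplified stand-ins for the third-party validators listed above, and the zero-shot classifier is represented by a caller-supplied callable that receives the two hypothesis strings.

```python
import ipaddress
import re
import uuid
from typing import Callable, Dict, Optional
from urllib.parse import urlparse


# Hypothetical stand-ins for the real validators (stdlib only, for illustration).
def _email(value: str) -> bool:
    # Forgiving on purpose: "john@gmail" passes even though it has no TLD.
    return re.fullmatch(r"[^@\s]+@[^@\s]+", value.strip()) is not None


def _url(value: str) -> bool:
    parts = urlparse(value.strip())
    return parts.scheme in ("http", "https", "ftp") and bool(parts.netloc)


def _phone(value: str) -> bool:
    # Partial-match heuristic: ignore surrounding words, count the digits.
    digits = re.sub(r"\D", "", value)
    return 7 <= len(digits) <= 15


def _ip(value: str) -> bool:
    try:
        ipaddress.ip_address(value.strip())
        return True
    except ValueError:
        return False


def _uuid(value: str) -> bool:
    try:
        uuid.UUID(value.strip())
        return True
    except ValueError:
        return False


# Closed set of types handled deterministically (a subset, for brevity).
DETERMINISTIC: Dict[str, Callable[[str], bool]] = {
    "email": _email, "url": _url, "phone": _phone, "ip": _ip, "uuid": _uuid,
}

# Assumed fallback shape: (value, positive_hypothesis, negative_hypothesis) -> bool
Fallback = Callable[[str, str, str], bool]


def looks_like(value: str, expected_type: str,
               fallback: Optional[Fallback] = None) -> bool:
    key = expected_type.strip().lower()
    check = DETERMINISTIC.get(key)
    if check is not None:
        return check(value)  # deterministic path, no model call
    if fallback is None:
        raise NotImplementedError(f"no validator or fallback for {key!r}")
    # Binary hypothesis pair handed to the zero-shot classifier.
    return fallback(value, f"looks like a {key}", f"does not look like a {key}")
```

Under these assumptions, `looks_like("john@gmail", "email")` is answered by the deterministic table alone, while a type such as `"car model"` is routed to the fallback callable together with both hypothesis strings.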

Examples

Malformed but recognizable email

SELECT
  looks_like ('john@gmail', 'email')

Street address recognized

SELECT
  looks_like ('542 Oak Avenue, Boston MA', 'address')

Phone number recognized

SELECT
  looks_like ('(555) 123-4567', 'phone')

URL recognized

SELECT
  looks_like ('https://example.com', 'url')

Gibberish is not an email

SELECT
  looks_like ('asdfghjkl', 'email')

ISO date recognized

SELECT
  looks_like ('2024-03-15', 'date')
