Extraction · scalar · returns json

PARSE_EMAIL

Parse raw email into structured JSON (from, to, cc, date, subject, body)

Per-row — runs once for each row.

extraction · llm · email

Syntax

PARSE_EMAIL({{ raw_email }})

Arguments

| name | type | description |
| --- | --- | --- |
| raw_email | VARCHAR | Raw email text with headers |

About

Parse raw email text into structured fields. Extracts from, to, cc, date, subject, body, and any standard MIME headers.

Backend: deterministic Python via `email.parser` (stdlib) + `email-validator` for address syntax validation.

Handles:

- Standard RFC 5322 header parsing
- Multi-line (folded) header continuation
- Address list parsing for to/cc/bcc (RFC 5322 address lists)
- Plain-text body extraction via `.get_payload()`
- Date parsing via `email.utils.parsedate_to_datetime`

MIME multipart message handling: if the message is multipart, the first text/plain part is returned as the body. For full multipart / attachment handling, use an explicit email library at the ingestion layer; this cascade is for "extract the key fields", not "full mailbox processing."

For LLM-style parsing (heavily corrupted or non-RFC-compliant emails), use PARSE_EMAIL_LLM; see parse_email_llm.cascade.yaml.
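The behavior described here can be approximated with the standard library alone. The following is a minimal sketch, not the actual backend: the function name `parse_email_sketch`, the output key names, and the choice of `policy.default` (which unfolds continued headers) are assumptions made for illustration, and the `email-validator` step is omitted.

```python
import json
from email import policy
from email.parser import Parser
from email.utils import getaddresses, parseaddr, parsedate_to_datetime


def parse_email_sketch(raw_email: str) -> dict:
    """Hypothetical stand-in for PARSE_EMAIL; the real output shape may differ."""
    # policy.default unfolds multi-line (folded) headers while parsing.
    msg = Parser(policy=policy.default).parsestr(raw_email)

    def address_list(header: str) -> list[str]:
        # getaddresses splits RFC 5322 address lists into (name, addr) pairs.
        values = [str(v) for v in msg.get_all(header, [])]
        return [addr for _name, addr in getaddresses(values) if addr]

    # Single sender address; None when the header is missing or empty.
    sender = parseaddr(str(msg.get("From", "")))[1] or None

    # Normalize the Date header to ISO 8601; None when missing or unparseable.
    date_header = msg.get("Date")
    try:
        date = parsedate_to_datetime(str(date_header)).isoformat() if date_header else None
    except (TypeError, ValueError):
        date = None

    # For multipart messages, take the first text/plain part as the body.
    if msg.is_multipart():
        body = None
        for part in msg.walk():
            if part.get_content_type() == "text/plain":
                body = part.get_payload()
                break
    else:
        body = msg.get_payload()

    subject = msg.get("Subject")
    return {
        "from": sender,
        "to": address_list("To"),
        "cc": address_list("Cc"),
        "date": date,
        "subject": str(subject) if subject is not None else None,
        "body": body,
    }


if __name__ == "__main__":
    raw = "From: john@example.com\nTo: jane@example.com\nSubject: Hello\n\nThis is the body."
    print(json.dumps(parse_email_sketch(raw), indent=2))
```

Returning the sender as a single string and the recipients as lists is a design choice of this sketch; it says nothing about how PARSE_EMAIL itself shapes its JSON.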

Examples

Extracts sender details from a raw email

SELECT
  parse_email (
    'From: john@example.com
To: jane@example.com
Subject: Hello

This is the body.'
  )
