Extraction · scalar · returns table

WEB_EXTRACT

Extract structured data from multiple pages using LLM-powered extraction

Per-row — runs once for each row.

extraction · external-api · url

Arguments

name	type	description
urls_json	VARCHAR	JSON array of URL patterns, e.g. '["https://example.com/*"]'
prompt	VARCHAR	Extraction instructions describing what data to extract
schema_json	VARCHAR	JSON Schema for output columns, e.g. '{"name":"string","price":"number"}'
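
Since all three arguments are passed as VARCHAR, the two JSON ones have to be serialized before the call. The following is a minimal sketch of preparing them on the client side. The helper name `build_args` and its validation rules are assumptions for illustration; the source does not say which inputs WEB_EXTRACT itself rejects.

```python
import json


def build_args(urls, prompt, schema):
    """Serialize Python values into the three VARCHAR arguments.

    Hypothetical helper: the checks below are illustrative assumptions,
    not documented WEB_EXTRACT behavior.
    """
    # urls_json is described as a JSON array of URL patterns.
    if not urls or not all(isinstance(u, str) for u in urls):
        raise ValueError("urls must be a non-empty list of strings")
    # schema_json is shown as a JSON object of column name -> type name.
    if not isinstance(schema, dict) or not schema:
        raise ValueError("schema must be a non-empty dict")
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return json.dumps(list(urls)), prompt, json.dumps(schema)


urls_json, prompt, schema_json = build_args(
    ["https://example.com/*"],
    "Extract product names and prices",
    {"name": "string", "price": "number"},
)
```

Serializing with `json.dumps` rather than hand-writing the strings avoids malformed JSON when a URL pattern or column name contains quotes or backslashes.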

About

Extract structured data from multiple pages using Firecrawl /extract. Supports wildcard URL patterns for cross-page aggregation.

SQL Usage:

    SELECT * FROM web_extract(
      '["https://docs.example.com/*"]',
      'Extract all API endpoints',
      '{"endpoint": "string", "method": "string", "description": "string"}'
    )
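
When the call is assembled from application code, the three arguments end up inside SQL string literals, so any single quote in the prompt has to be escaped. The sketch below shows one way to do that by doubling single quotes, which is standard SQL string escaping. The function names `sql_literal` and `web_extract_query` are hypothetical, and if the database client supports bound parameters, those are usually the safer choice.

```python
def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def web_extract_query(urls_json: str, prompt: str, schema_json: str) -> str:
    """Build a web_extract call from already-serialized arguments.

    Hypothetical helper: it only composes the query text and does not
    contact any database or external API.
    """
    args = ", ".join(sql_literal(a) for a in (urls_json, prompt, schema_json))
    return f"SELECT * FROM web_extract({args})"


query = web_extract_query(
    '["https://docs.example.com/*"]',
    "Extract the page's API endpoints",
    '{"endpoint": "string", "method": "string"}',
)
print(query)
```

The apostrophe in the prompt is emitted as `''`, so the literal stays intact when the statement is parsed.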

Nearby rabbit holes

same domain

scalar

CRAWL_BATCH

Crawl a website and extract structured data from each page (via Firecrawl)

extraction · external-api · url

scalar

EXTRACT

Extract specific information from unstructured text (zero-shot NER)

extraction · nli · specialist-zoo · text

scalar

EXTRACT_LLM

LLM-backed extraction (escape hatch for EXTRACTS)

extraction · llm · llm-escape-hatch · text

scalar

EXTRACT_STRUCTURED

Extract structured fields from text per a user-supplied schema

extraction · llm · text

aggregate

MERGE_TIMELINES

Merge multiple timelines into a unified chronological sequence

extraction · llm · json

scalar

PARSE

Extract information from text using natural-language instructions

extraction · llm · text
