Extraction · scalar · returns table

WEB_EXTRACT

Extract structured data from multiple pages using LLM-powered extraction

Per-row — runs once for each row.

extraction · external-api · url

Arguments

name         type     description
urls_json    VARCHAR  JSON array of URL patterns, e.g. '["https://example.com/*"]'
prompt       VARCHAR  Extraction instructions describing what data to extract
schema_json  VARCHAR  JSON Schema for output columns, e.g. '{"name":"string","price":"number"}'

About

Extract structured data from multiple pages using Firecrawl /extract. Supports wildcard URL patterns for cross-page aggregation.

SQL Usage:

    SELECT * FROM web_extract(
      '["https://docs.example.com/*"]',
      'Extract all API endpoints',
      '{"endpoint": "string", "method": "string", "description": "string"}'
    )
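Because urls_json and schema_json are JSON carried inside SQL string literals, hand-written calls are easy to break with a stray quote. The following is a minimal Python sketch of one way to assemble the call from native values. The helper names `sql_literal` and `build_web_extract_query` are hypothetical and not part of this function's API, and the sketch assumes the engine uses standard SQL string literals, where an embedded single quote is escaped by doubling it.

```python
import json


def sql_literal(text: str) -> str:
    """Quote text as a SQL string literal, doubling embedded single quotes.

    Assumption: the target engine follows standard SQL literal rules.
    """
    return "'" + text.replace("'", "''") + "'"


def build_web_extract_query(urls, prompt, schema):
    """Return a SELECT statement that calls web_extract.

    urls   -> serialized to the urls_json argument (a JSON array)
    prompt -> passed through as the prompt argument
    schema -> serialized to the schema_json argument (a JSON object)
    """
    if not urls:
        raise ValueError("urls must contain at least one URL pattern")
    urls_json = json.dumps(list(urls))
    schema_json = json.dumps(schema)
    args = ", ".join(sql_literal(a) for a in (urls_json, prompt, schema_json))
    return f"SELECT * FROM web_extract({args})"


query = build_web_extract_query(
    ["https://docs.example.com/*"],
    "Extract each endpoint's path and method",
    {"endpoint": "string", "method": "string"},
)
print(query)
```

Serializing with `json.dumps` keeps both JSON arguments well formed, and the apostrophe in the prompt is doubled so it cannot end the literal early. If the SQL client in use supports bound parameters, passing the three strings as parameters is usually the safer choice than building the statement text.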
