Re-crawl a stored web-table using its saved configuration
Per-row — runs once for each row.
| name | type | description |
|---|---|---|
| table_name | VARCHAR | Name of the table to refresh (must have been created with LARS CRAWL) |
Refresh reuses the crawl settings stored in the crawl registry, even when those settings were seeded outside the current UDF session.
SELECT
  json_extract_string(
    skill_json(
      'python_data',
      json_object(
        'code',
        'from lars.db_adapter import get_db_adapter; db = get_db_adapter(); db.execute("""CREATE OR REPLACE TABLE _lars_crawl_registry AS SELECT * FROM (VALUES (''test_refresh_crawl'', ''https://example.com'', ''{\"title\":\"string\"}'', ''{\"limit\":1,\"prompt\":\"Extract the page title\"}'')) AS t(table_name, url, schema_json, options_json)"""); result = "ok"'
      )
    ),
    '$[0].result'
  );

SELECT
  *
FROM
  refresh_crawl('test_refresh_crawl')
LIMIT 1;

Crawl a website and extract structured data from each page (via Firecrawl)
Extract specific information from unstructured text (zero-shot NER)
LLM-backed extraction (escape hatch for EXTRACTS)
Extract structured fields from text per a user-supplied schema
Merge multiple timelines into unified chronological sequence
Extract information from text using natural-language instructions
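
The registry lookup that the example above relies on can be sketched in a few lines. This is only an illustration of the idea, not how LARS implements refresh_crawl: it uses Python's built-in sqlite3 as a stand-in for the real database adapter, and load_crawl_config is a hypothetical helper name. The table and column names (_lars_crawl_registry, table_name, url, schema_json, options_json) are taken from the example above. Everything else is an assumption.

```python
import json
import sqlite3


def load_crawl_config(conn, table_name):
    """Hypothetical helper: fetch the saved crawl settings for one table."""
    row = conn.execute(
        "SELECT url, schema_json, options_json "
        "FROM _lars_crawl_registry WHERE table_name = ?",
        (table_name,),
    ).fetchone()
    if row is None:
        # No registry entry: the table was never registered by a crawl.
        raise KeyError(f"no crawl registry entry for {table_name!r}")
    url, schema_json, options_json = row
    return {
        "url": url,
        "schema": json.loads(schema_json),
        "options": json.loads(options_json),
    }


# Seed an in-memory registry with the same row the example above uses.
# sqlite3 is an assumption here, standing in for the real adapter.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE _lars_crawl_registry "
    "(table_name TEXT, url TEXT, schema_json TEXT, options_json TEXT)"
)
conn.execute(
    "INSERT INTO _lars_crawl_registry VALUES (?, ?, ?, ?)",
    (
        "test_refresh_crawl",
        "https://example.com",
        '{"title":"string"}',
        '{"limit":1,"prompt":"Extract the page title"}',
    ),
)

config = load_crawl_config(conn, "test_refresh_crawl")
print(config["url"], config["options"]["limit"])
```

Because the settings live in a table and not in session state, any session that can read the registry can recover them. That is the property the example above exercises by seeding the registry directly before calling refresh_crawl.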