Extraction · scalar · returns table

REFRESH_CRAWL

Re-crawl a stored web table using its saved configuration.

Per-row — runs once for each row.

extraction · external-api · url

Arguments

name        type     description
table_name  VARCHAR  Name of the table to refresh (must have been created with LARS CRAWL)

About

Re-crawls a previously created web table using its stored configuration. This is the backend cascade for the LARS REFRESH syntax: it reads the crawl config for the named table from _lars_crawl_registry, then re-executes the crawl with the original parameters.
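
To see what a refresh would reuse, the stored configuration can be inspected directly. This is a sketch, not a documented interface: it assumes the registry has the four columns used in the seeding example on this page (table_name, url, schema_json, options_json) and that json_extract_string accepts those columns as JSON text. The real registry may carry other columns.

```sql
-- Assumed column layout, taken from the seeding example on this page;
-- the actual _lars_crawl_registry may differ.
SELECT
  table_name,
  url,
  schema_json,
  json_extract_string (options_json, '$.prompt') AS prompt,
  json_extract_string (options_json, '$.limit') AS crawl_limit
FROM
  _lars_crawl_registry
WHERE
  table_name = 'test_refresh_crawl';
```

An empty result suggests no configuration is stored under that name, in which case refresh_crawl would have nothing to reuse.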

Examples

Refresh reuses the crawl settings stored in the crawl registry, even when the registry row was seeded outside the current UDF session:

SELECT
  json_extract_string (
    skill_json (
      'python_data',
      json_object(
        'code',
        'from lars.db_adapter import get_db_adapter; db = get_db_adapter(); db.execute("""CREATE OR REPLACE TABLE _lars_crawl_registry AS SELECT * FROM (VALUES (''test_refresh_crawl'', ''https://example.com'', ''{\"title\":\"string\"}'', ''{\"limit\":1,\"prompt\":\"Extract the page title\"}'')) AS t(table_name, url, schema_json, options_json)"""); result = "ok"'
      )
    ),
    '$[0].result'
  );

SELECT
  *
FROM
  refresh_crawl ('test_refresh_crawl')
LIMIT
  1;
