Embedding · scalar · returns varchar

EMBED_BATCH_ELASTIC

Batch embed rows and store in Elasticsearch for hybrid search

Per-row — runs once for each row.

embedding · embedding-model · specialist-zoo · text

Arguments

name                    type     description
table_name              VARCHAR  Source table name (for tracking)
column_name             VARCHAR  Column name (for metadata)
rows_json               VARCHAR  JSON array of {id, text} objects
batch_size (optional)   INTEGER  Batch size (default 50)
index_name (optional)   VARCHAR  Elasticsearch index name
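
The rows_json argument is a single string holding a JSON array of {id, text} objects. As a rough sketch of that shape, assuming the rows are available client-side as (id, text) pairs, the payload could be built as follows. The helper name build_rows_json is hypothetical and not part of this function's API; ids are cast to strings only to mirror the CAST(id AS VARCHAR) in the SQL usage example.

```python
import json


def build_rows_json(rows):
    """Serialize (id, text) pairs into a JSON array of {id, text} objects.

    Hypothetical helper for illustration only. Ids are converted to
    strings to mirror CAST(id AS VARCHAR) in the SQL usage example.
    """
    return json.dumps(
        [{"id": str(row_id), "text": text} for row_id, text in rows]
    )


# Example payload for two rows (sample data, not from the docs).
payload = build_rows_json([(1, "red kettle"), (2, "blue teapot")])
print(payload)
```

The resulting string would be passed as the third argument, in place of the subquery used in the SQL examples.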

About

Batch embed rows from a table and store the results in Elasticsearch. The function uses Elasticsearch's dense_vector field for hybrid search (vector + keyword). Prefer it over ClickHouse when you want BM25 keyword matching combined with semantic similarity.

Accepts a JSON array of {id, text} objects. API calls are batched (50 texts per call by default) and the results are bulk indexed to Elasticsearch.

SQL Usage:

    -- Embed from any table (DuckDB or ClickHouse)
    SELECT embed_batch_elastic(
        'products',
        'description',
        (SELECT to_json(list({'id': CAST(id AS VARCHAR), 'text': description})) FROM products)
    );

    -- With custom batch size and index name
    SELECT embed_batch_elastic('products', 'description', (SELECT ...), 100, 'my_index');

Returns JSON stats:

    {
        "rows_embedded": 850,
        "batches": 17,
        "model": "fastembed/nomic-embed-text-v1.5",
        "duration_seconds": 45.2,
        "backend": "elasticsearch"
    }
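
The example stats are consistent with the default batch size: 850 rows at 50 texts per call gives 17 batches. The sketch below illustrates only that arithmetic, assuming the payload is split into consecutive chunks of at most batch_size rows. How the function actually batches internally is not documented here, and split_into_batches is a hypothetical name.

```python
import json


def split_into_batches(rows_json, batch_size=50):
    """Parse a rows_json payload and split it into consecutive chunks.

    Hypothetical illustration of the batching arithmetic; each chunk
    holds at most batch_size rows, and the last one may be smaller.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    rows = json.loads(rows_json)
    return [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]


# 850 synthetic rows, matching "rows_embedded": 850 in the example stats.
rows_json = json.dumps([{"id": str(i), "text": f"row {i}"} for i in range(850)])

batches = split_into_batches(rows_json)
print(len(batches))  # 17, matching "batches": 17 in the example stats
```

With a batch_size of 100, as in the second SQL example, the same 850 rows would split into 9 chunks, the last holding 50 rows.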
