Similarity · scalar · returns double

IMAGE_SIMILARITY

Cross-modal cosine similarity between an image and a text query

Per-row — runs once for each row.

similarity · embedding-model · specialist-zoo · image

Syntax

IMAGE_SIMILARITY({{ image }}, '{{ query }}')

Arguments

| name  | type    | description                                 |
| ----- | ------- | ------------------------------------------- |
| image | VARCHAR | the image to score                          |
| query | VARCHAR | the text query to compare the image against |

About

Cross-modal image–text similarity score. Returns the raw cosine similarity in [-1, 1] (typically [0, 0.5] for SigLIP 2 in practice) between an image and a text query, computed in a single gateway round trip. Unlike IMAGE_MATCHES, which returns a boolean against a threshold, IMAGE_SIMILARITY returns the raw score so you can rank, aggregate, or threshold it yourself in SQL.

```sql
-- Rank products by how well their photos match a description
SELECT product_id, name,
       IMAGE_SIMILARITY(photo, 'minimalist white furniture') AS score
FROM catalog
ORDER BY score DESC
LIMIT 20;

-- Average similarity per brand
SELECT brand,
       AVG(IMAGE_SIMILARITY(photo, 'luxury packaging')) AS avg_lux
FROM catalog
GROUP BY brand
ORDER BY avg_lux DESC;

-- Cross-modal filtering with a custom threshold
SELECT *
FROM session_screenshots
WHERE IMAGE_SIMILARITY(screenshot, 'dark mode dashboard') > 0.25;
```

SigLIP 2 is trained with a sigmoid loss, so absolute scores are typically lower than CLIP-trained cosine scores (roughly 0.15–0.35 for real matches vs 0.3–0.7 for CLIP). The scores are still well-ordered within a dataset — prefer relative ranking (ORDER BY, top-K) over absolute thresholds when possible.
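Under the hood the score is a plain cosine similarity between the image embedding and the text embedding. A minimal Python sketch of that math and of score-based ranking (the embedding vectors and product IDs below are made-up stand-ins, not real SigLIP 2 outputs):

```python
import math

def cosine_similarity(a, b):
    # dot(a, b) / (|a| * |b|), always in [-1, 1] for nonzero vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical image embeddings keyed by product_id (illustrative values only)
image_embeddings = {
    "p1": [0.9, 0.1, 0.2],
    "p2": [0.1, 0.8, 0.3],
    "p3": [0.7, 0.3, 0.1],
}
# Hypothetical embedding of the text query 'minimalist white furniture'
query_embedding = [1.0, 0.0, 0.1]

# Rank by raw score, mirroring the SQL ORDER BY score DESC example
ranked = sorted(
    image_embeddings.items(),
    key=lambda kv: cosine_similarity(kv[1], query_embedding),
    reverse=True,
)
print([pid for pid, _ in ranked])
```

Because the ranking only depends on the relative order of the raw scores, it is robust to the lower absolute score range that sigmoid-loss models like SigLIP 2 produce.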
