Classification · aggregate · returns VARCHAR

CLASSIFY_COLLECTION_LLM

LLM-backed collection classification (escape hatch for CLASSIFY)

Per-group — reads the whole group in one call.

classification · llm · llm-escape-hatch · scales-large · json

Syntax

CLASSIFY_LLM({{ texts }}, '{{ categories }}')
CLASSIFY_LLM({{ texts }}, '{{ categories }}', '{{ prompt }}')

Arguments

name               type     description
texts              JSON
categories         VARCHAR
prompt (optional)  VARCHAR

About

LLM-backed escape hatch for collection classification. Use it when the canonical embedding path (CLASSIFY) is insufficient because the classification requires:

- Custom prompt criteria that the embedding centroid can't express
- Fine-grained reasoning about the overall theme
- Small collections where prompt engineering matters more than speed

The LLM version packs every text into a single prompt and asks the model to pick ONE category. That makes it context-window-bound: only use it for small collections (hundreds of texts, not thousands).

For scalable collection classification, prefer CLASSIFY. It uses embedding majority voting and handles 100K+ texts in seconds.
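The three-argument form from the Syntax section is the one to reach for when custom prompt criteria matter. The following is a minimal sketch of that form; the `tickets` data, the category list, the `collection_label` alias, and the prompt wording are all hypothetical, made up for illustration rather than taken from this page.

```sql
-- Sketch only: table contents, categories, and prompt text are illustrative assumptions.
WITH
  tickets AS (
    SELECT
      *
    FROM
      (
        VALUES
          ('The app crashes when I upload a photo'),
          ('Checkout fails with a timeout error'),
          ('Login page shows a blank screen')
      ) AS t (text)
  )
SELECT
  -- Third argument is the optional prompt; every text in the group goes into one LLM call.
  CLASSIFY_LLM (
    text,
    'bug reports, feature requests, billing questions',
    'Pick the category that best describes the overall theme of these support messages.'
  ) AS collection_label
FROM
  tickets
```

Because every row ends up in a single prompt, a collection this small is the intended use; for thousands of rows the embedding-based CLASSIFY is the better fit.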

Examples

LLM escape hatch classifies pet-related texts

WITH
  test_data AS (
    SELECT
      *
    FROM
      (
        VALUES
          ('The cat sat on the mat'),
          ('Dogs love to play fetch'),
          ('My parrot talks all day'),
          ('Fish swim in the tank')
      ) AS t (text)
  )
SELECT
  CLASSIFY_LLM (text, 'animal stories, food recipes, sports news')
FROM
  test_data
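
Since the function is described as a per-group aggregate, it should presumably compose with GROUP BY to label several collections in one query. The sketch below assumes it behaves like an ordinary aggregate under GROUP BY, which this page does not show explicitly; the `source` column, its values, and the `label` alias are hypothetical.

```sql
-- Sketch only: assumes standard aggregate semantics under GROUP BY.
WITH
  test_data AS (
    SELECT
      *
    FROM
      (
        VALUES
          ('blog_a', 'The cat sat on the mat'),
          ('blog_a', 'Dogs love to play fetch'),
          ('blog_b', 'Whisk the eggs before adding flour'),
          ('blog_b', 'Simmer the sauce for ten minutes')
      ) AS t (source, text)
  )
SELECT
  source,
  -- One LLM call per group: each source's texts are classified as a single collection.
  CLASSIFY_LLM (text, 'animal stories, food recipes, sports news') AS label
FROM
  test_data
GROUP BY
  source
```

Each group triggers its own LLM call, so the context-window limit applies per group rather than to the whole table.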
