Classification dimension · returns VARCHAR

TOXICITY

Assess toxicity level for grouping (toxic-bert)

Per-row classifier — stable across GROUP BY.

Tags: classification · hybrid · specialist-zoo · text

Arguments

name        type      description
text        VARCHAR   the text to assess for toxicity
focus       VARCHAR   optional toxic-bert head to bucket on (default: toxic)
num_levels  INTEGER   number of discrete toxicity buckets (default: 5)

About

Toxicity/civility analyzer — buckets text into toxicity levels for GROUP BY analysis. Backend: specialist zoo `unitary/toxic-bert` via /toxicity.

toxic-bert is a purpose-built multi-label classifier trained on the Jigsaw Unintended Bias dataset with 6 independent heads: toxic, severe_toxic, obscene, threat, insult, identity_hate.

The cascade takes the `toxic` head's probability for each text and maps it into 5 discrete buckets for grouping:

0.00-0.10 → Very Low Toxicity
0.10-0.30 → Low Toxicity
0.30-0.50 → Moderate Toxicity
0.50-0.75 → High Toxicity
0.75-1.00 → Very High Toxicity

The optional `focus` argument (harassment, hate_speech, obscene, threat, insult, identity_hate) picks a different head to bucket on; the default is the general `toxic` head. For LLM-style assessment with nuanced civility rules, use TOXICITY_LLM — see toxicity_dimension_llm.cascade.yaml.
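Since the dimension is stable across GROUP BY, it can be used directly as a grouping key. A minimal sketch of profiling a table by toxicity bucket — the table and column names (`comments`, `body`) are hypothetical:

```sql
-- Count rows per toxicity bucket using the default `toxic` head
SELECT
  toxicity(body) AS toxicity_bucket,
  COUNT(*) AS n
FROM comments
GROUP BY toxicity_bucket
ORDER BY n DESC;
```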

Examples

Polite message gets low toxicity bucket

SELECT toxicity('Thank you for your help!');

Hostile message gets high toxicity bucket

SELECT toxicity('you are all garbage and I hate you');
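The `focus` argument can narrow the assessment to a single toxic-bert head. A hedged sketch, assuming `focus` is the second positional argument as listed in the Arguments table:

```sql
-- Bucket on the `insult` head instead of the general `toxic` head
SELECT toxicity('you are all garbage and I hate you', 'insult');
```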
