surfaceanalysiskaplan_meier
Analysisscalar · returns json

KAPLAN_MEIER

Kaplan-Meier survival curve — time-to-event estimator with 95% CI

Per-row — runs once for each row.

analysisdeterministicjson

Arguments

nametypedescription
durationsJSONArray of time-to-event values (use ARRAY_AGG in caller)
eventsJSONArray of booleans, true if event happened, false if censored

About

Kaplan-Meier survival estimator as a SQL aggregate. Takes a duration column and an event column, returns the full survival curve as a JSON struct. The ONLY time-to-event primitive anybody has ever exposed as a standard SQL function — and the math is 15 lines of lifelines. Use this when your data is structured as (cohort, duration, event): • cohort — what group does this row belong to (optional; use GROUP BY) • duration — how long until the event, OR how long we observed if censored • event — 1/true if the event happened, 0/false if censored Works for: • Customer churn (duration = tenure, event = churned?) • Equipment failure (duration = uptime, event = failed?) • Time-to-first-purchase (duration = days since signup, event = bought?) • Patient outcomes (duration = follow-up days, event = recurrence?) Returns a JSON struct: { "timeline": [0, 1, 2, ...], # time points "survival": [1.00, 0.98, 0.95, ...], # S(t) at each point "ci_upper": [1.00, 0.99, 0.97, ...], # 95% CI upper "ci_lower": [1.00, 0.96, 0.93, ...], # 95% CI lower "median_survival_time": 47, # duration at which S(t) crosses 0.5 "n_observed": 500, "n_events": 234 } With a GROUP BY, you get one survival curve per stratum (use ARRAY_AGG to collect the per-stratum columns into the scalar call): SELECT plan_tier, kaplan_meier(ARRAY_AGG(tenure_days), ARRAY_AGG(churned)) AS curve FROM customers GROUP BY plan_tier;

Examples

Basic survival curve fit

WITH
  cohort AS (
    SELECT
      *
    FROM
      (
        VALUES
          (10.0, TRUE),
          (12.0, TRUE),
          (15.0, FALSE),
          (18.0, TRUE),
          (22.0, TRUE),
          (25.0, TRUE),
          (30.0, FALSE),
          (35.0, TRUE),
          (40.0, TRUE),
          (45.0, FALSE),
          (50.0, TRUE),
          (60.0, TRUE)
      ) AS t (duration, event)
  )
SELECT
  kaplan_meier (ARRAY_AGG(duration), ARRAY_AGG(event)) AS curve
FROM
  cohort;

Nearby rabbit holes

same domain
Climb back to The Looking Glass