SQL for Active Cohorts
October 24th, 2018
Typical query for tracing out what percent of organizations are still active (members are still using the product) for X day. Key here is the
generate_series, which generates separate rows for each value, with all other columns duplicated. This query is written for Postgres, but should be do-able in most databases.
Assumes you have an
events table, and any event with a user_id is an “active” event. Find and replace ‘week’ with whatever interval you want.
-- first 4 weeks with org_weeks as ( select id, created, generate_series(created, least(now()::timestamp, created + interval '4 weeks'), interval '1 week') as cohort_date from organizations where created = :daterange ), org_events as ( select distinct org_id, created from events e join organization_members om on e.user_id=om.user_id ), active_weeks as ( select ow.id, extract(epoch from ow.cohort_date - ow.created) / 3600 / 24 / 7 as week, count(oe.org_id) as active from org_weeks ow left join org_events oe on ow.id=oe.org_id and oe.created between ow.cohort_date and ow.cohort_date + interval '1 week' group by 1, 2) select week, sum(case when active > 0 then 1 end) as actives, count(1) as total, sum(case when active > 0 then 1 end)::float / count(1) as percent from active_weeks group by 1 ;