Skip to main content
blog.philz.dev

Handy window function trick to pick out the biggest/smallest/first

Use Common Table Expressions (WITH clauses; like a let for SQL) combined with the row_number() window function to pick out the biggest (or first or smallest) row within a grouping. In the following example, we want to pick the student with the top grade per class:

$sqlite3
sqlite> create table t(class string, name string, grade int);
sqlite> insert into t values ("Math", "Alice", 99),
  ("Math", "Bob", 87),
  ("Science", "Charlie", 88),
  ("Science", "David", 93);
sqlite> select * from t;
class       name        grade
----------  ----------  ----------
Math        Alice       99
Math        Bob         87        <-- we want this one
Science     Charlie     88
Science     David       93        <-- and this one

sqlite> WITH ranked AS (
   SELECT
       *,
       row_number() OVER (PARTITION BY class
                          ORDER BY grade DESC) AS r
   FROM t)
SELECT * FROM ranked WHERE r=1;

class       name        grade       r
----------  ----------  ----------  ----------
Math        Alice       99          1
Science     David       93          1

This creates a temporary that's just like our original data, but has an extra column called "r" which is the row_number, partitioned by class, and ordered by grade (descending). Picking out the student with the top grade is picking out the rows where r=1.

This trick can be applied to get the slowest request per user, sample one request per day, and so forth.

Before I knew about this trick, I'd join against an aggregation, but that had a tendency to pick out extra rows for duplicates and was (more) painful to write.

If you're using DuckDB, there's a QUALIFY clause that makes it even easier:

$duckdb
v0.8.1 6536a77232
D create table t(class string, name string, grade int);
D insert into t values ('Math', 'Alice', 99);
D insert into t values ('Math', 'Bob', 87);
D insert into t values ('Science', 'Charlie', 88);
D insert into t values ('Science', 'David', 93);
D SELECT * FROM t QUALIFY row_number() OVER (PARTITION BY class ORDER BY grade DESC);
┌─────────┬─────────┬───────┐
│  class  │  name   │ grade │
│ varcharvarchar │ int32 │
├─────────┼─────────┼───────┤
│ Science │ David   │    93 │
│ Science │ Charlie │    88 │
│ Math    │ Alice   │    99 │
│ Math    │ Bob     │    87 │
└─────────┴─────────┴───────┘
D SELECT * FROM t QUALIFY row_number() OVER (PARTITION BY class ORDER BY grade DESC) = 1;
┌─────────┬─────────┬───────┐
│  class  │  name   │ grade │
│ varcharvarchar │ int32 │
├─────────┼─────────┼───────┤
│ Science │ David   │    93 │
│ Math    │ Alice   │    99 │
└─────────┴─────────┴───────┘

Originally posted at https://collectednotes.com/philz/handy-window-function-trick.