SQL Window Functions and Ranking
SQL window functions and ranking are powerful tools for performing calculations across sets of rows that relate to the current row. Let me break this down into digestible concepts with practical examples.
What are Window Functions?
Window functions perform calculations across a set of table rows related to the current row, but unlike aggregate functions, they don’t collapse rows into a single result. They “look through a window” at related rows while preserving the original row structure.
Basic Syntax:
function_name() OVER (
[PARTITION BY column1, column2, ...]
[ORDER BY column1, column2, ...]
[ROWS/RANGE specification]
)Key Components
PARTITION BY: Divides the result set into partitions (like GROUP BY, but doesn’t collapse rows) ORDER BY: Defines the order within each partition Window Frame: Specifies which rows to include in the calculation (ROWS/RANGE)
Ranking Functions
1. ROW_NUMBER()
Assigns a unique sequential integer to each row within a partition.
SELECT
employee_name,
department,
salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) as row_num
FROM employees;2. RANK()
Assigns ranks with gaps when there are ties. If two rows tie for rank 2, the next rank is 4.
SELECT
student_name,
score,
RANK() OVER (ORDER BY score DESC) as rank
FROM exam_results;3. DENSE_RANK()
Assigns ranks without gaps. If two rows tie for rank 2, the next rank is 3.
SELECT
product_name,
sales_amount,
DENSE_RANK() OVER (ORDER BY sales_amount DESC) as dense_rank
FROM product_sales;4. NTILE(n)
Divides rows into n roughly equal buckets and assigns bucket numbers.
SELECT
customer_name,
total_purchases,
NTILE(4) OVER (ORDER BY total_purchases DESC) as quartile
FROM customers;Practical Examples
Finding top N per group:
-- Top 3 highest-paid employees per department
SELECT * FROM (
SELECT
employee_name,
department,
salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) as rn
FROM employees
) ranked
WHERE rn <= 3;Comparing with previous/next values:
SELECT
month,
revenue,
LAG(revenue) OVER (ORDER BY month) as prev_month_revenue,
LEAD(revenue) OVER (ORDER BY month) as next_month_revenue,
revenue - LAG(revenue) OVER (ORDER BY month) as month_over_month_change
FROM monthly_sales;Running totals:
SELECT
order_date,
order_amount,
SUM(order_amount) OVER (ORDER BY order_date ROWS UNBOUNDED PRECEDING) as running_total
FROM orders;Common Use Cases
Pagination: Use ROW_NUMBER() with OFFSET and LIMIT Percentiles: Use NTILE() to create percentile buckets
Duplicate removal: Use ROW_NUMBER() to identify and remove duplicates Time series analysis: Use LAG/LEAD for period-over-period comparisons Top-N queries: Use ranking functions with WHERE clauses
Window Frame Specifications
-- Different frame options
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW -- From start to current
ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING -- 5-row moving window
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING -- Current to endThe key insight is that window functions let you perform complex analytical calculations while preserving the detail level of your data, making them essential for reporting, analytics, and data analysis tasks.
