MySQL Window functions are a powerful tool for data analysis and querying data from MySQL databases. They allow you to perform complex calculations on a set of data and return a result set with additional computed columns that are derived from the original data.
To explain window functions, let’s take a sample dataset of a table “orders” with columns “order_id”, “customer_id”, “order_date”, “order_amount”.
order_id
customer_id
order_date
order_amount
1
101
2021-01-01
100
2
102
2021-01-02
200
3
103
2021-01-03
300
4
101
2021-01-04
150
5
102
2021-01-05
250
6
103
2021-01-06
350
7
101
2021-01-07
200
8
102
2021-01-08
300
9
103
2021-01-09
400
Now let’s see some examples of MySQL window functions using this table:
RANK() – This function assigns a rank to each row based on a specific column’s value. In the below example, we are ranking the orders based on the order_amount.
SELECT order_id, customer_id, order_amount, RANK() OVER (ORDER BY order_amount DESC) as order_rank
FROM orders;
order_id
customer_id
order_amount
order_rank
9
103
400
1
6
103
350
2
5
102
250
3
8
102
300
3
3
103
300
3
7
101
200
6
2
102
200
6
4
101
150
8
1
101
100
9
ROW_NUMBER() – This function assigns a unique sequential number to each row within a result set.
SELECT order_id, customer_id, order_amount, ROW_NUMBER() OVER (ORDER BY order_date) as row_number
FROM orders;
order_id
customer_id
order_amount
row_number
1
101
100
1
2
102
200
2
3
103
300
3
4
101
150
4
5
102
250
5
6
103
350
6
7
101
200
7
8
102
300
8
DENSE_RANK() – This function assigns a rank to each row based on a specific column’s value but skips rank numbers for duplicate values.
SELECT order_id, customer_id, order_amount, DENSE_RANK() OVER (ORDER BY order_amount DESC) as order_rank
FROM orders;
order_id
customer_id
order_amount
order_rank
9
103
400
1
6
103
350
2
5
102
250
3
8
102
300
4
3
103
300
4
7
101
200
5
2
102
200
5
4
101
150
6
1
101
100
7
SUM() – This function calculates the sum of a column’s values within a window.
SELECT order_id, customer_id, order_amount, SUM(order_amount) OVER (ORDER BY order_date) as total_amount
FROM orders;
order_id
customer_id
order_amount
total_amount
1
101
100
100
2
102
200
300
3
103
300
600
4
101
150
750
5
102
250
1000
6
103
350
1350
7
101
200
1550
8
102
300
1850
9
103
400
2250
LAG() – This function retrieves the value of a specific column from the previous row within a result set. In the below example, we are retrieving the order_amount from the previous row for each order.
SELECT order_id, customer_id, order_amount, LAG(order_amount, 1, 0) OVER (ORDER BY order_date) as previous_order_amount
FROM orders;
order_id
customer_id
order_amount
previous_order_amount
1
101
100
0
2
102
200
100
3
103
300
200
4
101
150
300
5
102
250
150
6
103
350
250
7
101
200
350
8
102
300
200
9
103
400
300
NTILE() – This function divides a result set into a specified number of groups and assigns a group number to each row. In the below example, we are dividing the orders into 3 groups based on the order_amount.
SELECT order_id, customer_id, order_amount, NTILE(3) OVER (ORDER BY order_amount) as order_group
FROM orders;