Photo by Kaitlyn Baker on Unsplash
Exploring SQL Ranking Functions: A Comparative Guide
What's the difference between ROW_NUMBER(), RANK() and DENSE_RANK()?
These functions are used to assign numbering to rows based on specified criteria. They create new columns with the rank, so they are used in the SELECT
function.
The syntax for these functions is as follows;
--Rows ranked using different ranking functions
SELECT col1, col2,
ROW_NUMBER() OVER (ORDER BY col2 DESC) AS rank_row_number,
RANK() OVER (ORDER BY col2 DESC) AS rank_rank,
DENSE_RANK() OVER (ORDER BY col2 DESC) AS rank_dense_rank
FROM table1;
ROW_NUMBER()
This assigns a unique number to every row, whether the column used in the
OVER()
function has duplicates or not. This is like a basic numbering system, don't overthink it.RANK()
This differs from
ROW_NUMBER()
in that when it finds duplicate rows, it assigns them the same rank, then skips the rank by the number of duplicate rows there were. Don't fret if you can't process it yet, hold on to this for now.DENSE_RANK()
This is the same as
RANK()
except in case of duplicates, it does not skip a rank.
The table below shows the differences clearly. After carefully analyzing the results, go back to the explanations above. It will all make sense.
Do you see it? the difference between rank and dense rank?
In the column ranked using RANK()
, 25 students was a tie among 3 units, they were ranked 2
then after that it skipped 3, 4
. The next rank was 5
.
It's not the same case with DENSE_RANK()
which doesn't skip rank even when there are duplicates.
Point to note:
Inside the OVER() function, the column used to order the rows has to be specified when using RANK
or DENSE_RANK
functions, otherwise it will rank every row "1".
When using ROW_NUMBER
, if the order is not specified, it will use a default order. In the figure below it uses the order in which the rows were entered
I hope it makes sense now! Happy querying.