SQL Row Number Over Partition By
What is SQL ROW NUMBER OVER PARTITION BY?
SQL ROW_NUMBER() OVER PARTITION BY
is a window function that assigns a unique sequential integer to each row within a specific partition of a result set. It is a variation of the ROW_NUMBER
function that allows you to create row numbers within partitions based on one or more columns, providing more control over ranking and numbering within subsets of data.
When you would use it
You would use SQL ROW_NUMBER() OVER PARTITION BY
when you want to:
-
Rank rows within groups: Assign unique row numbers to each row within distinct groups or partitions based on specific column values. For example, ranking products by sales within different product categories.
-
Implement advanced pagination: Generate row numbers within partitions to enable more sophisticated pagination, such as displaying the top N items for each group.
-
Identify and remove duplicates: Assign row numbers to rows within partitions to identify and remove duplicate rows within those groups.
-
Compute cumulative totals: Calculate cumulative sums or other aggregations within partitions for data analysis.
Syntax
The basic syntax for using SQL ROW_NUMBER() OVER PARTITION BY
is as follows:
SELECT
column1,
column2,
...,
ROW_NUMBER() OVER (PARTITION BY partition_column1, partition_column2, ...) AS row_num
FROM
your_table;
column1, column2, ...
: The columns you want to retrieve.ROW_NUMBER()
: The window function that generates the row number.PARTITION BY partition_column1, partition_column2, ...
: The column or columns used to define partitions for numbering.row_num
: The alias for the row number column.
Parameter values
column1, column2, ...
: The columns you want to include in the result set.PARTITION BY partition_column1, partition_column2, ...
: The column or columns that define the partitions for numbering.
Example query
Suppose you have a table "sales" with columns "product_category," "sales_date," and "revenue," and you want to rank products by their sales within each product category. You can use the ROW_NUMBER() OVER PARTITION BY
function like this:
SELECT
product_category,
sales_date,
revenue,
ROW_NUMBER() OVER (PARTITION BY product_category ORDER BY revenue DESC) AS rank
FROM
sales;
Example table response
Assuming the "sales" table contains data like this:
| product_category | sales_date | revenue |
| --------------- | ----------- | ------- |
| Electronics | 2022-05-10 | 1200 |
| Clothing | 2022-06-15 | 800 |
| Electronics | 2023-02-20 | 1500 |
| Clothing | 2023-03-05 | 1000 |
The query mentioned earlier would return the following result:
| product_category | sales_date | revenue | rank |
| --------------- | ----------- | ------- | ---- |
| Electronics | 2022-05-10 | 1200 | 2 |
| Clothing | 2022-06-15 | 800 | 2 |
| Electronics | 2023-02-20 | 1500 | 1 |
| Clothing | 2023-03-05 | 1000 | 1 |
This result shows the rank of products by sales within each product category.
Use cases
- Ranking rows within specific partitions or groups.
- Creating advanced pagination for grouped data.
- Identifying and removing duplicate rows within partitions.
- Calculating cumulative totals or other aggregations within partitions.
SQL languages this is available for
SQL ROW_NUMBER() OVER PARTITION BY
is a window function available in most modern relational database management systems (RDBMS), including but not limited to:
- SQL Server
- PostgreSQL
- Oracle Database
- MySQL
- SQLite
- IBM Db2
- MariaDB
The syntax and behavior of this function are generally consistent across these database systems, with minor variations in specific implementations.