# Identify Four Ranking Mechanisms in SQL
### SQL Ranking Functions: Assigning Ranks and Buckets to Rows
In the realm of SQL, there are four essential window functions that help in assigning rankings or numbers to rows within a partition or over the entire result set based on ordering criteria: `RANK()`, `DENSE_RANK()`, `ROW_NUMBER()`, and `NTILE()`. Each of these functions handles ties and numbering schemes differently, offering flexibility when sorting and analysing data.
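The examples in this section query an `employee` table with `Name`, `Department`, and `Salary` columns. The schema and sample rows below are a hypothetical sketch (not taken from any real system) so the queries can be run as written:

```sql
-- Hypothetical schema and sample data assumed by the examples below.
CREATE TABLE employee (
    Name       VARCHAR(50),
    Department VARCHAR(50),
    Salary     DECIMAL(10, 2)
);

INSERT INTO employee (Name, Department, Salary) VALUES
    ('Alice', 'Sales',       90000),
    ('Bob',   'Sales',       90000),  -- tie with Alice
    ('Carol', 'Sales',       70000),
    ('Dave',  'Engineering', 120000),
    ('Erin',  'Engineering', 110000),
    ('Frank', 'Engineering', 110000), -- tie with Erin
    ('Grace', 'Engineering', 95000);
```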
#### 1. **RANK()**
`RANK()` assigns a rank to each row within the partition, giving rows with identical values the same rank. However, the numbering skips ranks after a tie: if two rows share rank 1, the next rank assigned is 3.
```sql
RANK() OVER (
    [PARTITION BY column1, column2, ...]
    ORDER BY column_to_order DESC|ASC
)
```
In the example below, employees in each department are ranked by salary in descending order, with gaps in the rank after ties.
```sql
SELECT Name,
       Department,
       Salary,
       RANK() OVER (PARTITION BY Department ORDER BY Salary DESC) AS emp_rank
FROM employee;
```
#### 2. **DENSE_RANK()**
Similar to `RANK()`, `DENSE_RANK()` assigns ranks to rows within the partition. However, it does not skip rank numbers after ties, so ranks are consecutive. If two rows share rank 1, the next rank will be 2.
```sql
DENSE_RANK() OVER (
    [PARTITION BY column1, column2, ...]
    ORDER BY column_to_order DESC|ASC
)
```
In the example below, ranks are given to employees by salary like `RANK()`, but subsequent ranks continue consecutively without gaps.
```sql
SELECT Name,
       Department,
       Salary,
       DENSE_RANK() OVER (PARTITION BY Department ORDER BY Salary DESC) AS emp_dense_rank
FROM employee;
```
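To see the difference on tied values directly, both functions can be computed in one query. This is an illustrative sketch against the hypothetical `employee` data above:

```sql
-- For the tied 90000 salaries in Sales, emp_rank yields 1, 1, 3
-- while emp_dense_rank yields 1, 1, 2.
SELECT Name,
       Department,
       Salary,
       RANK()       OVER (PARTITION BY Department ORDER BY Salary DESC) AS emp_rank,
       DENSE_RANK() OVER (PARTITION BY Department ORDER BY Salary DESC) AS emp_dense_rank
FROM employee;
```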
#### 3. **ROW_NUMBER()**
`ROW_NUMBER()` assigns a unique sequential integer to rows within a partition, regardless of ties. Each row gets a distinct number.
```sql
ROW_NUMBER() OVER (
    [PARTITION BY column1, column2, ...]
    ORDER BY column_to_order DESC|ASC
)
```
In the example below, employees are numbered uniquely by salary within each department, even if salaries are the same.
```sql
SELECT Name,
       Department,
       Salary,
       ROW_NUMBER() OVER (PARTITION BY Department ORDER BY Salary DESC) AS row_num
FROM employee;
```
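Because `ROW_NUMBER()` is always unique within a partition, a common use is selecting the top N rows per group. The sketch below, again assuming the hypothetical `employee` table, returns the two highest-paid employees in each department:

```sql
-- Top 2 earners per department; rows with equal salaries are
-- ordered arbitrarily by ROW_NUMBER().
SELECT Name, Department, Salary
FROM (
    SELECT Name,
           Department,
           Salary,
           ROW_NUMBER() OVER (PARTITION BY Department ORDER BY Salary DESC) AS row_num
    FROM employee
) ranked
WHERE row_num <= 2;
```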
#### 4. **NTILE(n)**
`NTILE(n)` divides the rows within a partition into n approximately equal groups (tiles or buckets) and assigns the bucket number to each row. Common choices are quartiles (n = 4), deciles (n = 10), and percentiles (n = 100).
```sql
NTILE(n) OVER (
    [PARTITION BY column1, column2, ...]
    ORDER BY column_to_order DESC|ASC
)
```
In the example below, employees are divided into 4 groups based on salary ranking. Each row is assigned a number from 1 to 4 representing the quartile.
```sql
SELECT Name,
       Salary,
       NTILE(4) OVER (ORDER BY Salary DESC) AS quartile
FROM employee;
```
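When the row count does not divide evenly by n, the earlier buckets receive one extra row. `NTILE()` can also be combined with `PARTITION BY` to bucket each group separately; a sketch using the hypothetical table:

```sql
-- Split each department's employees into 2 salary bands;
-- where a department has an odd number of rows, band 1 gets the extra row.
SELECT Name,
       Department,
       Salary,
       NTILE(2) OVER (PARTITION BY Department ORDER BY Salary DESC) AS salary_band
FROM employee;
```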
The following table summarises the differences between these ranking functions:
| Function | Handles Ties | Ranking Style | Unique Numbers? | Purpose |
|----------|--------------|---------------|-----------------|---------|
| **RANK()** | Same rank for ties; following rank(s) skipped | Ranks with gaps (e.g., 1, 1, 3) | No | Rank rows, leaving gaps after ties |
| **DENSE_RANK()** | Same rank for ties; no gaps | Consecutive ranks (e.g., 1, 1, 2) | No | Rank rows consecutively without gaps |
| **ROW_NUMBER()** | Ties broken arbitrarily; every row numbered | Unique sequential numbers | Yes | Assign a unique number to each row |
| **NTILE(n)** | Ties may fall into different buckets | Bucket numbers from 1 to n | No | Divide rows into n roughly equal groups |
These ranking functions are all used with the `OVER()` clause, which can include an optional `PARTITION BY` to reset ranks per group, and a mandatory `ORDER BY` that defines the ranking order.
In summary, each of these ranking functions (`RANK()`, `DENSE_RANK()`, `ROW_NUMBER()`, and `NTILE(n)`) has its own tie-handling and numbering behaviour, and the right choice depends on the specific requirements of the analysis.