# Windowed Functions in SQL Server

**Pre-requisite: you are expected to be familiar with SQL, including GROUP BY and Aggregates. There is a CTE.**

## Part 1: Over and Over

We all know that this will fail, and why...

```
Select top 3
VehicleCode,
COUNT(*) as [COUNT]
from Reporting.Vehicles
```

ErrorColumn 'Reporting.Vehicles.VehicleCode' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

You can't mix aggregates and non aggregates, unless you use GROUP BY. Or can you!?

Watch what happens when we include an `OVER()`

clause...

```
Select top 3
VehicleCode,
COUNT(*) OVER() as [COUNT]
from Reporting.Vehicles
```

VehicleCode | COUNT |
---|---|

ABC_VH011 | 8214 |

ABC_VH012 | 8214 |

ABC_VH013 | 8214 |

Often the only place we're familiar with `OVER()`

from is with the `ROW_NUMBER()`

function. Honestly, most of us treat `ROW_NUMBER()`

as a little bit of magic text that we copy and paste from stackoverflow each time we want to use it.

Let's follow the same process, but instead of the count function we'll start with a `ROW_NUMBER()`

...

```
Select top 3
VehicleCode,
ROW_NUMBER() as ROW_NUMBER
from Reporting.Vehicles
```

ErrorThe function 'ROW_NUMBER' must have an OVER clause.

Now we're told that `The function 'ROW_NUMBER' must have an OVER clause.`

So let's include an OVER clause...

```
Select top 3
VehicleCode,
ROW_NUMBER() OVER() as ROW_NUMBER
from Reporting.Vehicles
```

ErrorThe function 'ROW_NUMBER' must have an OVER clause with ORDER BY.

Now we're heckled with "The function 'ROW_NUMBER' must have an OVER clause with ORDER BY" and I don't know they didn't say that in the first place?

(And you noticed that the `COUNT`

function above did not require an `ORDER BY`

clause. That's because ROW_NUMBER and COUNT operate under different rules.)

So we'll add that in...

```
Select top 3
VehicleCode,
ROW_NUMBER() OVER(order by VehicleCode) as ROW_NUMBER
from Reporting.Vehicles
```

VehicleCode | ROW_NUMBER |
---|---|

ABC_VH011 | 1 |

ABC_VH012 | 2 |

ABC_VH013 | 3 |

And voila! We have our `ROW_NUMBER`

... but let's learn more about what's happening here.

First we need to see exactly what the rules are.

Can you guess what will happen if we order by `DESC`

?

```
Select top 3
VehicleCode,
ROW_NUMBER() OVER(ORDER BY VehicleCode desc) as ROW_NUMBER
from Reporting.Vehicles
```

Will we now get 3,2,1 ?

VehicleCode | ROW_NUMBER |
---|---|

YAX_VH011 | 1 |

WYX_VH012 | 2 |

WNX_VH013 | 3 |

No, we still got 1,2,3 -- but the `top 3`

Vehicles we were shown were from the *end* of the list of Vehicles, in descending order.

This takes a little bit of thinking. Since there's no sort order on the outer query, the ordering created by the rest of the query is apparent. And it shows us a little bit about the way in which the ROW_NUMBER was applied. They started applying from the back of the list, i.e. starting at 1.

What if we order the outer query by VehicleCode ascending... will we *now* get 3,2,1 ?

```
Select top 3
VehicleCode
,ROW_NUMBER() OVER(ORDER BY VehicleCode desc) as ROW_NUMBER
from Reporting.Vehicles
order by VehicleCode asc
```

VehicleCode | ROW_NUMBER |
---|---|

ABC_VH011 | 8214 |

ABC_VH012 | 8213 |

ABC_VH013 | 8212 |

Can we order by an integer?

```
Select top 3
VehicleCode
,ROW_NUMBER() OVER(ORDER BY 1 asc) as ROW_NUMBER
from Reporting.Vehicles
order by 1 asc
```

ErrorWindowed functions and NEXT VALUE FOR functions do not support integer indices as ORDER BY clause expressions.

Can we order by a Windowed Function?

```
Select top 3
VehicleCode,
ROW_NUMBER() OVER(ORDER BY VehicleCode asc) as ROW_NUMBER
from Reporting.Vehicles
order by ROW_NUMBER() OVER(ORDER BY VehicleCode desc) asc
```

VehicleCode | ROW_NUMBER |
---|---|

YAX_VH001 | 8214 |

WYX_VH001 | 8213 |

WNX_VH001 | 8212 |

Can we put a windowed function into the OVER clause???

```
Select top 3
VehicleCode,
ROW_NUMBER() OVER(ORDER BY
ROW_NUMBER() OVER(ORDER BY VehicleCode desc) asc
) as ROW_NUMBER
from Reporting.Vehicles
```

ErrorWindowed functions cannot be used in the context of another windowed function or aggregate.

Can we put a windowed function in a WHERE clause?

```
Select top 3
VehicleCode,
ROW_NUMBER() OVER(ORDER BY VehicleCode asc) as ROW_NUMBER
from Reporting.Vehicles
where ROW_NUMBER() OVER(ORDER BY VehicleCode asc) > 4
```

ErrorWindowed functions can only appear in the SELECT or ORDER BY clauses.

At this stage we've learnt all the rules.

We can follow this old skating maxim:

FIRST: Learn all the rules

NEXT: Break all the rules

Here's how we **CAN** put a windowed-function into a WHERE clause...

```
;with TopVehicles AS
(Select
VehicleCode,
ROW_NUMBER() OVER(ORDER BY VehicleCode asc) as ROWNUM
from Reporting.Vehicles)
select top 3 * from TopVehicles
where ROWNUM > 4
```

We use a Common-Table-Expression (CTE) to get one extra level of indirection... now we can filter by our windowed-function.

VehicleCode | ROWNUM |
---|---|

ABC_VH001 | 5 |

ABC_VH002 | 6 |

ABC_VH002R | 7 |

**In fact this is a handy general technique.**

You can also apply a windowed function column to a windowed function, if you **first wrap it in a CTE**.

This technique creates a kind of CTE game of Pass-the-Parcel. You may need to wrap up many levels of CTE to get the result you need.

## PART 2: Other Functions

We've finished exploring `COUNT`

and `Row_Number`

functions for now.

A quick look at other functions.

I said above that "ROW_NUMBER and COUNT operate under different rules" -- that's because they are different types of functions.

Y'see, with `over()`

there are **two types of functions** you can use:

- ALL of the aggregates (e.g. COUNT, SUM, AVG) which you know from GROUPing
- "Ranking functions"

### What's an example of a Ranking function?

An obvious example would be the `RANK`

function itself, which we'll get to in a moment.

But `ROW_NUMBER`

is also a ranking function. It's just a very *unfair* ranking

```
Select top 5
VehicleCode,
ReliabilityFactor,
RANK() OVER(ORDER BY ReliabilityFactor desc) as RANK_ReliabilityFactor,
ROW_NUMBER() OVER(ORDER BY ReliabilityFactor desc) as ROW_NUM
from Reporting.Vehicles
where ReliabilityFactor is not null and ReliabilityFactor <= 3.50
```

VehicleCode | ReliabilityFactor | RANK_ReliabilityFactor | ROW_NUM |
---|---|---|---|

ARQ_VH129 | 3.5 | 1 | 1 |

MJQ_VH166 | 3.5 | 1 | 2 |

IBV_VH167 | 3.43525004386902 | 3 | 3 |

JMX_VH149 | 3.40000009536743 | 4 | 4 |

MJQ_VH006 | 3.40000009536743 | 4 | 5 |

The different ranking functions each use a different 'strategy' for numbering.

In the plain old `RANK`

strategy, if two rows are "tied" then they get the same number. And a subsequent number is skipped.

Rank is like the olympics: two people can share a gold medal, but then no one gets the silver.

A different strategy is `DENSE_RANK`

:

```
Select top 5
VehicleCode,
ReliabilityFactor,
COUNT(*) over () as [Count],
RANK() OVER(ORDER BY ReliabilityFactor desc) as RANK_ReliabilityFactor,
DENSE_RANK() OVER(ORDER BY ReliabilityFactor desc) as DENSE_RANK_ReliabilityFactor,
ROW_NUMBER() OVER(ORDER BY ReliabilityFactor desc) as ROW_NUMBER_ReliabilityFactor
from Reporting.Vehicles
where ReliabilityFactor is not null and ReliabilityFactor <= 3.50
```

VehicleCode | ReliabilityFactor | Count | RANK_ReliabilityFactor | DENSE_RANK_ReliabilityFactor | ROW_NUMBER_ReliabilityFactor |
---|---|---|---|---|---|

ARQ_VH129 | 3.5 | 3085 | 1 | 1 | 1 |

MJQ_VH166 | 3.5 | 3085 | 1 | 1 | 2 |

IBV_VH167 | 3.43525004386902 | 3085 | 3 | 2 | 3 |

JMX_VH149 | 3.40000009536743 | 3085 | 4 | 3 | 4 |

MJQ_VH006 | 3.40000009536743 | 3085 | 4 | 3 | 5 |

With dense rank, if two people tie for first they both get a gold medal. And the next person gets a silver.

This way you know every type of number will be handed out, at least once, even though ties are allowed.

So it's more fair than 'row number' but has different numeric properties. So it can be useful depending on *how* you want to join it to other things.

## PART 3: Aggregates

The famous `MAX`

and `MIN`

```
Select top 5
VehicleCode,
ReliabilityFactor,
ROW_NUMBER() OVER(ORDER BY ReliabilityFactor desc) as ROW_NUMBER_ReliabilityFactor,
MAX(ReliabilityFactor) OVER() as MAX_RF,
MIN(ReliabilityFactor) OVER() as MIN_RF
from Reporting.Vehicles
where ReliabilityFactor is not null and ReliabilityFactor <= 3.50
```

VehicleCode | ReliabilityFactor | ROW_NUMBER_ReliabilityFactor | MAX_RF | MIN_RF |
---|---|---|---|---|

ARQ_VH129 | 3.5 | 1 | 3.5 | 0 |

MJQ_VH166 | 3.5 | 2 | 3.5 | 0 |

IBV_VH167 | 3.43525004386902 | 3 | 3.5 | 0 |

JMX_VH149 | 3.40000009536743 | 4 | 3.5 | 0 |

MJQ_VH006 | 3.40000009536743 | 5 | 3.5 | 0 |

What if we specify an order by!?

```
Select top 5
VehicleCode,
ReliabilityFactor,
ROW_NUMBER() OVER(ORDER BY ReliabilityFactor desc) as ROW_NUMBER_ReliabilityFactor,
--MAX(ReliabilityFactor) OVER(ORDER BY ReliabilityFactor desc) as MAX_RF,
MIN(ReliabilityFactor) OVER(ORDER BY ReliabilityFactor desc) as MIN_RF
from Reporting.Vehicles
where ReliabilityFactor is not null and ReliabilityFactor <= 3.50
```

VehicleCode | ReliabilityFactor | ROW_NUMBER_ReliabilityFactor | MIN_RF |
---|---|---|---|

ARQ_VH129 | 3.5 | 1 | 3.5 |

MJQ_VH166 | 3.5 | 2 | 3.5 |

IBV_VH167 | 3.43525004386902 | 3 | 3.43525004386902 |

JMX_VH149 | 3.40000009536743 | 4 | 3.40000009536743 |

MJQ_VH006 | 3.40000009536743 | 5 | 3.40000009536743 |

## PART 4: RANGES/PRECEDING/FOLLOWING

```
Select top 7
VehicleCode,
ReliabilityFactor,
-- A:
COUNT(*) OVER() as [COUNT_OVER],
-- B:
COUNT(*) OVER(ORDER BY ReliabilityFactor desc ) as [COUNT_ORDER_BY_RF],
-- C:
COUNT(*) OVER(ORDER BY ReliabilityFactor desc RANGE
UNBOUNDED PRECEDING) as [COUNT_RF_RANGE],
-- D:
COUNT(*) OVER(ORDER BY ReliabilityFactor desc RANGE
BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW) as [COUNT_RF_RANGE_BETWEEN],
-- E:
COUNT(*) OVER(ORDER BY ReliabilityFactor desc ROWS
BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW) as [COUNT_RF_ROWS]
from Reporting.Vehicles
where ReliabilityFactor is not null and ReliabilityFactor <= 3.50
```

VehicleCode | ReliabilityFactor | COUNT_OVER | COUNT_ORDER_BY_RF | COUNT_RF_RANGE | COUNT_RF_RANGE_BETWEEN | COUNT_RF_ROWS |
---|---|---|---|---|---|---|

ARQ_VH129 | 3.5 | 3085 | 2 | 2 | 2 | 1 |

MJQ_VH166 | 3.5 | 3085 | 2 | 2 | 2 | 2 |

IBV_VH167 | 3.43525004386902 | 3085 | 3 | 3 | 3 | 3 |

JMX_VH149 | 3.40000009536743 | 3085 | 7 | 7 | 7 | 4 |

MJQ_VH006 | 3.40000009536743 | 3085 | 7 | 7 | 7 | 5 |

MJQ_VH174 | 3.40000009536743 | 3085 | 7 | 7 | 7 | 6 |

MJQ_VH204 | 3.40000009536743 | 3085 | 7 | 7 | 7 | 7 |

Here is a fairly exhaustive set of clauses...

```
Select top 7
VehicleCode,
ReliabilityFactor,
--A:
MIN(ReliabilityFactor) OVER(ORDER BY ReliabilityFactor desc) as MIN_RF,
--B:
MIN(ReliabilityFactor) OVER(ORDER BY ReliabilityFactor desc RANGE
UNBOUNDED PRECEDING) as MIN_RF,
--B1:
MIN(ReliabilityFactor) OVER(ORDER BY ReliabilityFactor desc RANGE
BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW) as MIN_RF,
--C:
MIN(ReliabilityFactor) OVER(ORDER BY ReliabilityFactor desc ROWS
BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING) as MIN_RF_UNBOUNDED,
--D:
MIN(ReliabilityFactor) OVER(ORDER BY ReliabilityFactor desc RANGE
BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING) as MIN_RF_RANGE_UNBOUNDED,
--E:
MIN(ReliabilityFactor) OVER(ORDER BY ReliabilityFactor desc RANGE
UNBOUNDED PRECEDING) as MIN_RF_RANGE_UNBOUNDED_PREC,
--F:
Max(ReliabilityFactor) OVER(ORDER BY ReliabilityFactor desc
rows BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) as Max_RF_rows_AHEAD,
--G:
Max(ReliabilityFactor) OVER(ORDER BY ReliabilityFactor desc
rows BETWEEN 2 PRECEDING AND 2 FOLLOWING) as Max_RF_rows_neighbours
from Reporting.Vehicles
where ReliabilityFactor is not null and ReliabilityFactor <= 3.50
order by ReliabilityFactor desc
```

VehicleCode | ReliabilityFactor | MIN_RF | MIN_RF1 | MIN_RF2 | MIN_RF_UNBOUNDED | MIN_RF_RANGE_UNBOUNDED | MIN_RF_RANGE_UNBOUNDED_PREC | Max_RF_rows_AHEAD | Max_RF_rows_neighbours |
---|---|---|---|---|---|---|---|---|---|

ARQ_VH129 | 3.5 | 3.5 | 3.5 | 3.5 | 0 | 0 | 3.5 | 3.5 | 3.5 |

MJQ_VH166 | 3.5 | 3.5 | 3.5 | 3.5 | 0 | 0 | 3.5 | 3.5 | 3.5 |

IBV_VH167 | 3.43525004386902 | 3.43525004386902 | 3.43525004386902 | 3.43525004386902 | 0 | 0 | 3.43525004386902 | 3.43525004386902 | 3.5 |

JMX_VH149 | 3.40000009536743 | 3.40000009536743 | 3.40000009536743 | 3.40000009536743 | 0 | 0 | 3.40000009536743 | 3.40000009536743 | 3.5 |

MJQ_VH006 | 3.40000009536743 | 3.40000009536743 | 3.40000009536743 | 3.40000009536743 | 0 | 0 | 3.40000009536743 | 3.40000009536743 | 3.43525004386902 |

MJQ_VH174 | 3.40000009536743 | 3.40000009536743 | 3.40000009536743 | 3.40000009536743 | 0 | 0 | 3.40000009536743 | 3.40000009536743 | 3.40000009536743 |

MJQ_VH204 | 3.40000009536743 | 3.40000009536743 | 3.40000009536743 | 3.40000009536743 | 0 | 0 | 3.40000009536743 | 3.40000009536743 | 3.40000009536743 |

## PART 5: PARTITIONS

We will use `COUNT`

to demonstrate partitions.

First, a `COUNT`

with no partition...

The `COUNT`

is the total count...

```
Select top 9
VehicleCode,
Fleet,
COUNT(*) OVER() as [COUNT]
from Reporting.Vehicles
where not Fleet is null
order by Fleet, Vehiclecode
```

VehicleCode | Fleet | COUNT |
---|---|---|

PLF_VH112 | 096FQ389 | 8191 |

CHP_VH151 | 0H3A82P2 | 8191 |

CHP_VH159 | 0H3A82P2 | 8191 |

PEF_VH007 | 0C1FT1H2 | 8191 |

BXR_VH188 | 1Y0PY47X | 8191 |

BXR_VH190 | 1Y0PY47X | 8191 |

BXR_VH199 | 1Y0PY47X | 8191 |

BXR_VH200 | 1Y0PY47X | 8191 |

BXR_VH201 | 1Y0PY47X | 8191 |

Let's add some partition... This time count will show us how many rows in each Fleet:

```
Select top 9
VehicleCode,
Fleet,
ROW_NUMBER() OVER(PARTITION BY Fleet order by VehicleCode) as [ROW_IN_Fleet],
COUNT(*) OVER(PARTITION BY Fleet) as [COUNT_IN_Fleet],
-- Friendly Name...
'Vehicle ' + Cast(ROW_NUMBER() OVER(PARTITION BY Fleet order by VehicleCode) as Varchar(10)) + ' of ' + Cast(COUNT(*) OVER(PARTITION BY Fleet) as Varchar(10)) + ' on ' + Fleet as [Friendly_Name]
from Reporting.Vehicles
where not Fleet is null
order by Fleet, Vehiclecode
```

VehicleCode | Fleet | ROW_IN_Fleet | COUNT_IN_Fleet | Friendly_Name |
---|---|---|---|---|

PLF_VH112 | 096FQ389 | 1 | 1 | Vehicle 1 of 1 on 096FQ389 |

CHP_VH151 | 0H3A82P2 | 1 | 2 | Vehicle 1 of 2 on 0H3A82P2 |

CHP_VH159 | 0H3A82P2 | 2 | 2 | Vehicle 2 of 2 on 0H3A82P2 |

PEF_VH007 | 0C1FT1H2 | 1 | 1 | Vehicle 1 of 1 on 0C1FT1H2 |

BXR_VH188 | 1Y0PY47X | 1 | 5 | Vehicle 1 of 5 on 1Y0PY47X |

BXR_VH190 | 1Y0PY47X | 2 | 5 | Vehicle 2 of 5 on 1Y0PY47X |

BXR_VH199 | 1Y0PY47X | 3 | 5 | Vehicle 3 of 5 on 1Y0PY47X |

BXR_VH200 | 1Y0PY47X | 4 | 5 | Vehicle 4 of 5 on 1Y0PY47X |

BXR_VH201 | 1Y0PY47X | 5 | 5 | Vehicle 5 of 5 on 1Y0PY47X |

## PART 6: LEAD/LAG

```
Select top 9
VehicleCode,
COUNT(*) OVER(PARTITION BY Fleet) as [COUNT_IN_Fleet],
Fleet,
LEAD(Fleet) OVER(order by Fleet, VehicleCode) as [NEXT_Fleet],
LAG(Fleet) OVER(order by Fleet, VehicleCode) as [PREVIOUS_Fleet],
LEAD(Fleet,2) OVER(order by Fleet, VehicleCode) as [NEXT_NEXT_Fleet],
LAG(Fleet,2) OVER(order by Fleet, VehicleCode) as [PREVIOUS_PREVIOUS_Fleet],
LEAD(Fleet,8172) OVER(order by Fleet, VehicleCode) as [NEXT_Fleet_8172],
FIRST_VALUE(VehicleCode) OVER(partition by Fleet order by Vehiclecode) as First_Vehicle_in_Fleet,
LAST_VALUE(VehicleCode) OVER(partition by Fleet order by Vehiclecode
RANGE BETWEEN
CURRENT ROW AND UNBOUNDED FOLLOWING) as Last_Vehicle_in_Fleet
from Reporting.Vehicles
where not Fleet is null
order by Fleet, Vehiclecode
```

VehicleCode | COUNT_IN_Fleet | Fleet | NEXT_Fleet | PREVIOUS_Fleet | NEXT_NEXT_Fleet | PREVIOUS_PREVIOUS_Fleet | NEXT_Fleet_8172 | First_Vehicle_in_Fleet | Last_Vehicle_in_Fleet |
---|---|---|---|---|---|---|---|---|---|

PLF_VH112 | 1 | 096FQ389 | 0H3A82P2 | 0H3A82P2 | APYX07 | PLF_VH112 | PLF_VH112 | ||

CHP_VH151 | 2 | 0H3A82P2 | 0H3A82P2 | 096FQ389 | 0C1FT1H2 | APYX07 | CHP_VH151 | CHP_VH159 | |

CHP_VH159 | 2 | 0H3A82P2 | 0C1FT1H2 | 0H3A82P2 | 1Y0PY47X | 096FQ389 | APYX07 | CHP_VH151 | CHP_VH159 |

PEF_VH007 | 1 | 0C1FT1H2 | 1Y0PY47X | 0H3A82P2 | 1Y0PY47X | 0H3A82P2 | APYX07 | PEF_VH007 | PEF_VH007 |

BXR_VH188 | 5 | 1Y0PY47X | 1Y0PY47X | 0C1FT1H2 | 1Y0PY47X | 0H3A82P2 | APYX07 | BXR_VH188 | BXR_VH201 |

BXR_VH190 | 5 | 1Y0PY47X | 1Y0PY47X | 1Y0PY47X | 1Y0PY47X | 0C1FT1H2 | APYX07 | BXR_VH188 | BXR_VH201 |

BXR_VH199 | 5 | 1Y0PY47X | 1Y0PY47X | 1Y0PY47X | 1Y0PY47X | 1Y0PY47X | APYX07 | BXR_VH188 | BXR_VH201 |

BXR_VH200 | 5 | 1Y0PY47X | 1Y0PY47X | 1Y0PY47X | 101DY473 | 1Y0PY47X | ARG447 | BXR_VH188 | BXR_VH201 |

BXR_VH201 | 5 | 1Y0PY47X | 101DY473 | 1Y0PY47X | 101DY473 | 1Y0PY47X | ARG447 | BXR_VH188 | BXR_VH201 |

## Background reading

## See also

- Delete duplicate rows (uses
`ROW_NUMBER()`

and`PARTITION`

)