Eight window-function tricks beyond LAG and ROW_NUMBER

The first post in this series got more reader response than I expected. The comments were full of practical follow-up questions and a few good corrections, and one thing that kept coming up was a request to go deeper on window functions specifically. Pattern 6 in the previous post leaned heavily on them. People wanted to see what else lives in that toolbox.

This is that.

LAG and ROW_NUMBER are usually the first two window functions analysts learn. They are the right starting point. But after a year or two of writing SQL you start running into shapes those two can’t quite hit.

Eight patterns below. Same disclaimer as the previous post: I work as a data analyst on a program-integrity team. Examples use generic transaction tables and synthetic scenarios. Nothing here reflects actual systems, procedures, or data from any specific employer.


1. QUALIFY

QUALIFY lets you filter directly on window-function output without wrapping the whole query in another CTE or subquery.

SELECT
  cardholder_id,
  timestamp,
  count(*) OVER (
    PARTITION BY cardholder_id ORDER BY timestamp
    RANGE BETWEEN INTERVAL '5 minutes' PRECEDING AND CURRENT ROW
  ) AS tx_5min
FROM transactions
QUALIFY tx_5min >= 5;

Without QUALIFY this usually turns into an extra query layer just to filter on the computed column. Not a big deal in tiny examples, but once fraud rules start stacking together it gets messy fast.

Support still varies by engine. Snowflake, BigQuery, DuckDB, Databricks, ClickHouse and a few others have it. Postgres, MySQL and SQLite still don’t.

If you’re on Postgres, the fallback is usually just a CTE and an outer WHERE clause.
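
The first example, restructured that way:

WITH velocity AS (
  SELECT
    cardholder_id,
    timestamp,
    count(*) OVER (
      PARTITION BY cardholder_id ORDER BY timestamp
      RANGE BETWEEN INTERVAL '5 minutes' PRECEDING AND CURRENT ROW
    ) AS tx_5min
  FROM transactions
)
SELECT * FROM velocity WHERE tx_5min >= 5;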


2. Frame specs (ROWS vs RANGE)

A lot of people write:

OVER (PARTITION BY x ORDER BY y)

and never think much about the frame underneath it. Usually fine until sliding windows start behaving strangely.

The two big ones:

ROWS counts physical rows. RANGE works off the ORDER BY value itself and treats rows with equal values as peers.

The difference starts mattering once duplicate timestamps show up: ROWS takes exactly the N rows you asked for and breaks ties arbitrarily, while RANGE pulls in every row that shares the boundary value.

For fraud velocity checks, RANGE on timestamps is usually the right fit because you’re actually asking “how many transactions happened within this time window,” not “what happened in the previous 5 rows.”

For rolling hourly baselines or trailing-week comparisons, ROWS is often cleaner because each row already represents a fixed bucket.
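
Side by side on the same table, the two frames only agree while timestamps are unique:

SELECT
  cardholder_id,
  timestamp,
  count(*) OVER (
    PARTITION BY cardholder_id ORDER BY timestamp
    ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
  ) AS last_5_rows,     -- exactly 5 rows, ties broken arbitrarily
  count(*) OVER (
    PARTITION BY cardholder_id ORDER BY timestamp
    RANGE BETWEEN INTERVAL '5 minutes' PRECEDING AND CURRENT ROW
  ) AS last_5_minutes   -- every row in the window, peers included
FROM transactions;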

One related thing that came up in the comments on the previous post and is worth surfacing here: the timestamp you key off matters more than people realize. Authorization time and settlement time can be days apart. If you use settlement time on a velocity or impossible-travel check, charges from days ago suddenly look like they happened “now” in the warehouse, and the windows blow up with false positives. Use auth time. Most card systems have both columns; the home-grown ones don’t always make the distinction obvious.


3. FILTER for conditional aggregation

Most people start with CASE WHEN inside SUM or COUNT:

sum(CASE WHEN amount > 100 THEN 1 ELSE 0 END)

FILTER does the same thing with less noise:

SELECT
  merchant_id,
  count(*) FILTER (WHERE amount > 100) AS high_value_count,
  count(*) FILTER (WHERE timestamp::time BETWEEN '00:00' AND '06:00') AS off_hours_count
FROM transactions
GROUP BY merchant_id;

Works inside window functions too:

sum(amount) FILTER (WHERE merchant_category = 'gas_station')
  OVER (
    PARTITION BY cardholder_id ORDER BY timestamp
    ROWS BETWEEN 100 PRECEDING AND CURRENT ROW
  ) AS rolling_gas_spend

Mostly just easier to read once queries get large.

Support varies though. Postgres, SQLite and DuckDB take the standard FILTER clause; MySQL still doesn’t. BigQuery, Snowflake and ClickHouse have conditional aggregates (COUNTIF, COUNT_IF, countIf) that cover most of the same ground.
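
The rewrites are mechanical. The high-value count from above, in both fallback idioms:

-- CASE inside the aggregate (works everywhere)
sum(CASE WHEN amount > 100 THEN 1 ELSE 0 END) AS high_value_count

-- BigQuery
countif(amount > 100) AS high_value_count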


4. FIRST_VALUE / LAST_VALUE / NTH_VALUE

These are useful anytime you need one row’s value carried across the whole partition.

Example: tagging every transaction with the first merchant a cardholder ever used.

SELECT
  cardholder_id,
  timestamp,
  merchant_id,
  first_value(merchant_id) OVER (
    PARTITION BY cardholder_id ORDER BY timestamp
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  ) AS first_merchant_ever,
  last_value(merchant_id) OVER (
    PARTITION BY cardholder_id ORDER BY timestamp
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  ) AS most_recent_merchant
FROM transactions;

The LAST_VALUE behavior trips people up constantly.

Without an explicit frame, the default is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so LAST_VALUE returns the current row’s value instead of the actual last row in the partition.

NTH_VALUE works the same way conceptually. Useful for questions like “what was the third transaction this cardholder ever made?”
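
Same frame caveat as LAST_VALUE, so spell it out:

nth_value(merchant_id, 3) OVER (
  PARTITION BY cardholder_id ORDER BY timestamp
  ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS third_merchant_ever
-- NULL for cardholders with fewer than three transactions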


5. LEAD chains for sequence detection

LAG gets most of the attention because analysts usually look backwards more than forwards.

LEAD becomes useful once you’re looking for sequences instead of isolated events.

SELECT
  cardholder_id,
  timestamp,
  merchant_id,
  lead(merchant_id, 1) OVER w AS next_merchant_1,
  lead(merchant_id, 2) OVER w AS next_merchant_2,
  lead(merchant_id, 3) OVER w AS next_merchant_3,
  lead(timestamp, 3) OVER w - timestamp AS time_to_4th_tx
FROM transactions
WINDOW w AS (PARTITION BY cardholder_id ORDER BY timestamp);

Then you can filter on the sequence itself. The aliases aren’t visible in the same query’s WHERE clause, so this goes in an outer query (or a QUALIFY, where available):

WHERE next_merchant_1 <> merchant_id
  AND next_merchant_2 <> merchant_id
  AND next_merchant_3 <> merchant_id
  AND time_to_4th_tx < INTERVAL '90 seconds'

A lot of card-testing behavior starts showing up in shapes like this.


6. Gap-and-island

Gap-and-island is basically the standard SQL trick for finding streaks or runs of related activity.

One common version looks like this. Window functions can’t be nested, so the change flag and the running sum have to live in separate layers:

WITH flagged AS (
  SELECT
    cardholder_id,
    timestamp,
    merchant_id,
    amount,
    CASE
      WHEN merchant_id <> lag(merchant_id) OVER w
      THEN 1
      ELSE 0
    END AS merchant_changed
  FROM transactions
  WINDOW w AS (PARTITION BY cardholder_id ORDER BY timestamp)
)
SELECT
  *,
  sum(merchant_changed) OVER (
    PARTITION BY cardholder_id ORDER BY timestamp
  ) AS island_id
FROM flagged;

Every time the merchant changes, the island ID increments. Depending on the engine, this is also one of those patterns where the query plan can get a lot more expensive than you expected.

Once you have that grouping value, streak analysis gets much easier:

WITH islands AS (
  -- query above
)
SELECT
  cardholder_id,
  merchant_id,
  island_id,
  count(*) AS streak_length
FROM islands
GROUP BY 1, 2, 3
ORDER BY streak_length DESC;

This kind of thing is useful for spotting behavior shifts. Somebody with a year-long coffee-shop routine suddenly spending every day at gas stations is at least worth a second look.


7. Conditional running totals

CASE WHEN inside SUM OVER ends up doing a huge amount of work in fraud analysis.

Simple example:

SELECT
  cardholder_id,
  timestamp,
  approved,
  sum(CASE WHEN NOT approved THEN 1 ELSE 0 END) OVER (
    PARTITION BY cardholder_id
    ORDER BY timestamp
    RANGE BETWEEN INTERVAL '24 hours' PRECEDING AND CURRENT ROW
  ) AS declines_last_24h
FROM transactions;

Decline clusters are usually worth paying attention to. A surprising amount of card testing shows up as repeated declines followed by a successful approval somewhere later in the sequence.

The reset-on-condition versions get more complicated and usually end up combining multiple windows together. Useful pattern, but also one of the easier places to write something unreadable by accident.
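
For illustration, a minimal sketch of one reset flavor: a decline count that starts over at every approval. The running count of approvals doubles as the group key, the same two-layer trick as the gap-and-island pattern above:

WITH runs AS (
  SELECT
    cardholder_id,
    timestamp,
    approved,
    -- increments at each approval, so rows between approvals share a run_id
    sum(CASE WHEN approved THEN 1 ELSE 0 END) OVER (
      PARTITION BY cardholder_id ORDER BY timestamp
    ) AS run_id
  FROM transactions
)
SELECT
  cardholder_id,
  timestamp,
  approved,
  -- declines so far within the current run, i.e. since the last approval
  sum(CASE WHEN NOT approved THEN 1 ELSE 0 END) OVER (
    PARTITION BY cardholder_id, run_id ORDER BY timestamp
  ) AS declines_since_last_approval
FROM runs;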


8. PERCENT_RANK and NTILE

These are useful anytime fixed thresholds stop making sense globally.

WITH monthly_spend AS (
  SELECT cardholder_id, sum(amount) AS month_spend
  FROM transactions
  WHERE timestamp >= current_date - INTERVAL '30 days'
  GROUP BY 1
),
buckets AS (
  SELECT *, ntile(10) OVER (ORDER BY month_spend) AS spend_decile
  FROM monthly_spend
)
SELECT * FROM buckets WHERE spend_decile = 10;

A top-spend cardholder usually needs very different fraud thresholds than somebody barely using the card.

NTILE gives you buckets. PERCENT_RANK gives you the actual percentile position instead.
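
Swapping PERCENT_RANK into the same shape, reusing the monthly_spend CTE from above:

SELECT
  cardholder_id,
  month_spend,
  percent_rank() OVER (ORDER BY month_spend) AS spend_pctile
FROM monthly_spend;

spend_pctile runs from 0 (lowest spender) to 1 (highest), so “top decile” becomes spend_pctile >= 0.9 instead of spend_decile = 10.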


Putting it together

The reason these patterns matter is that they compose well.

Most individual fraud rules are actually pretty simple. The complexity comes from layering multiple weak signals together without turning the query into procedural spaghetti.

Window functions let you keep a surprising amount of this logic in straight SQL before things spill over into procedural code.
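
As a toy version of that layering, here are two weak signals off one named window, same synthetic transactions table and boolean approved column as the earlier examples:

SELECT
  cardholder_id,
  timestamp,
  count(*) OVER day_w AS tx_last_24h,
  sum(CASE WHEN NOT approved THEN 1 ELSE 0 END) OVER day_w AS declines_last_24h,
  avg(amount) OVER day_w AS avg_amount_last_24h
FROM transactions
WINDOW day_w AS (
  PARTITION BY cardholder_id ORDER BY timestamp
  RANGE BETWEEN INTERVAL '24 hours' PRECEDING AND CURRENT ROW
);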

Most of the examples here are standard SQL. QUALIFY is still warehouse-specific, and FILTER support depends on the engine. If your warehouse doesn’t support QUALIFY, the fallback is usually just another CTE layer.

Next post is probably fraud-ring detection. Same general idea, just applied to graph-shaped problems instead of straight transaction streams.

If you’ve got a specific window-function shape that gave you trouble or a fraud pattern you’d like to see written up, the comments on the previous post are still active. Happy to take requests.