Commit d0d3322

Address Marcos' comments 2
1 parent 5148199 commit d0d3322

1 file changed: +7 -7 lines changed

doc/developer/design/20230110_window_functions.md

Lines changed: 7 additions & 7 deletions
@@ -116,9 +116,9 @@ The current way of executing window functions is to put entire window partitions
 We'll use several approaches to solve the many cases mentioned in [Goals](#goals):
 
 1. We'll use [DD's prefix_sum](https://github.com/TimelyDataflow/differential-dataflow/blob/master/src/algorithms/prefix_sum.rs) with some tricky sum functions and some generalizations.
-2. We'll use a special-purpose rendering for LAG/LEAD of offset 1 with no IGNORE NULLS, which will be simpler and more efficient than Prefix Sum.
+2. We'll use a [special-purpose rendering](#Special-rendering-for-LAG-and-LEAD) for LAG/LEAD of offset 1 with no IGNORE NULLS, which will be simpler and more efficient than Prefix Sum.
 3. As an extension of 1., we'll use a generalization of DD's prefix sum to arbitrary intervals (i.e., not just prefixes).
-4. We'll transform away window functions in some special cases (e.g., to TopK, or a simple grouped aggregation + self-join)
+4. We'll transform away window functions in some special cases (e.g., to TopK, or a simple grouped aggregation + self-join).
 5. Initially, we will resort to the old window function implementation in some cases, but this should become less and less over time. I think it will be possible to eventually implement all window function usage with the above 1.-4. approaches, but it will take time to get there.
 
 ### Getting window functions from SQL to the rendering
@@ -141,7 +141,7 @@ from cities;
 
 To avoid creating a new enum variant in MirRelationExpr, we will recognize the above pattern during the MIR-to-LIR lowering, and create a new LIR enum variant for window functions. I estimate this pattern recognition to need about 15-20 if/match statements. It can happen that this pattern recognition approach turns out to be too brittle: we might accidentally leave out cases when the pattern is slightly different due to unrelated MIR transforms, plus we might break it from time to time with unrelated MIR transform changes. If this happens, then we might reconsider creating a new MIR enum variant later. (Which would be easier after the optimizer refactoring/cleanup.) For an extended discussion on alternative representations in HIR/MIR/LIR, see the [Representing window functions in each of the IRs](#Representing-window-functions-in-each-of-the-IRs) section.
 
-Also, we will want to entirely transform away certain window function patterns; most notable is the ROW_NUMBER-to-TopK transform. For this, we need to canonicalize scalar expressions, which I think we usually do in MIR. This means that transforming away these window function patterns should happen on MIR. This will start by again recognizing the above general windowing pattern, and then performing pattern recognition of the TopK pattern.
+Also, we will want to entirely transform away certain window function patterns; most notable is the ROW_NUMBER-to-TopK transform. For this, we need to canonicalize scalar expressions, which I think we usually do in MIR. This means that transforming away these window function patterns should happen on MIR. This will start by, again, recognizing the above general windowing pattern, and then performing pattern recognition of the TopK-expressed-with-ROW_NUMBER pattern.
 
 ### Prefix Sum

@@ -317,11 +317,11 @@ SELECT state, name, pop,
 FROM cities;
 ```
 
-These also operate based on a **frame**, similarly to window aggregations. (The above example query doesn't specify a frame, therefore it uses the default frame: from the beginning of the partition to the current row) They can be similarly implemented to window aggregations, i.e., we could “sum” up the relevant interval (that is not necessarily a prefix) with an appropriate sum function.
+These also operate based on a **frame**, similarly to window aggregations. (The above example query doesn't specify a frame, therefore it uses the default frame: from the beginning of the partition to the current row.)
 
-Alternatively, we could make these a bit faster (except for NTH_VALUE) if we just find the index of the relevant end of the interval (i.e., left end for FIRST_VALUE), and then self-join.
+These could be implemented similarly to window aggregations, i.e., we could “sum” up the relevant interval (that is not necessarily a prefix) with an appropriate sum function. However, we will use a faster way to implement them (except for NTH_VALUE): we just find the index of the relevant end of the frame interval (i.e., left end for FIRST_VALUE), and then self-join. (This will happen in the MIR-to-LIR lowering, since finding the end of the interval is not expressible in MIR, as it is the same operation as finding the ends of frames for window aggregations.)
 
-(And there are some special cases when we can transform away the window function usage: FIRST_VALUE with UNBOUNDED PRECEDING and LAST_VALUE with UNBOUNDED FOLLOWING should be transformed to just a (non-windowed) grouped aggregation + self-join instead of prefix sum trickery. Also, similarly for the case when there is no ORDER BY.)
+There are also some special cases where we can transform away the window function usage: FIRST_VALUE with UNBOUNDED PRECEDING and LAST_VALUE with UNBOUNDED FOLLOWING should be transformed to just a Top1 on the PARTITION BY key + a self-join on the same key instead of prefix sum trickery. This approach also works for the case when there is no ORDER BY, since in this case an entire partition is a single peer group.
 
 ----------------------
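
The Top1 + self-join rewrite for FIRST_VALUE with UNBOUNDED PRECEDING can be sketched outside of any dataflow engine. The following Python is only an illustration under assumptions: rows are plain dicts, the function name `first_value_via_top1` and its parameters are hypothetical, and ORDER BY keys are assumed to be distinct within a partition so that the first row is unambiguous. It is not how the MIR transform itself would be written.

```python
def first_value_via_top1(rows, partition_key, order_key, value_key):
    """Emulate FIRST_VALUE(value) OVER (PARTITION BY p ORDER BY o
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) without a window."""
    # Step 1: Top1 on the PARTITION BY key. Keep, per partition, the row
    # with the smallest ORDER BY key (assumed distinct within a partition).
    top1 = {}
    for row in rows:
        p = row[partition_key]
        if p not in top1 or row[order_key] < top1[p][order_key]:
            top1[p] = row

    # Step 2: self-join on the same key. Every input row picks up the
    # value of its partition's Top1 row.
    return [
        dict(row, first_value=top1[row[partition_key]][value_key])
        for row in rows
    ]


# Hypothetical sample data shaped like the `cities` table in the examples.
cities = [
    {"state": "CA", "name": "A", "pop": 3},
    {"state": "CA", "name": "B", "pop": 1},
    {"state": "NY", "name": "C", "pop": 2},
]
result = first_value_via_top1(cities, "state", "pop", "name")
```

Because the frame always starts at the beginning of the partition, every row of a partition sees the same first row, which is why a single Top1 per partition followed by a join is enough.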
@@ -392,7 +392,7 @@ How many bits we should chop off in one step involves a similar trade-off as a h
 
 This will reduce the time overhead of `aggregate`. It will also reduce the memory overhead of `aggregate` by reducing the memory need of the internal operations, but it won't reduce the total output size of `aggregate`.
 
-#### Special rendering for LAG/LEAD
+#### Special rendering for LAG and LEAD
 
 Instead of prefix sum, we will have a special rendering for LAG/LEAD: A similar iteration to `aggregate` will chop off 6 bits of the indexes in each step, but the `reduce` logic will simply perform the LAG/LEAD on those elements that went into one invocation of the logic (instead of summing intervals). It can perform the LAG on all but the first element of the list of elements that go into a single invocation of the logic. The first element it will just send onwards to later steps. Therefore, the output will include two kinds of values: one will be final LAG values, and the other will be values that are still waiting for their LAG results. These special values will be met up with the last elements of the input list of the `reduce` logic of the next step.
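
One possible reading of the iteration described above, as a single-threaded Python sketch. This is an interpretation under assumptions, not the actual rendering: the name `lag_by_chunks` and the run representation are hypothetical, indexes are assumed to be distinct non-negative integers within one partition, and the loop runs until one run is left, whereas a real dataflow would use a fixed number of steps.

```python
from collections import defaultdict

BITS_PER_STEP = 6  # the text chops off 6 bits of the index in each step


def lag_by_chunks(indexed_values):
    """Compute LAG(value, 1) for (index, value) pairs of one partition.

    Returns {index: previous value, or None for the first row}.
    """
    final = {}
    # A "run" is (first_index, last_index, last_value). Its first element
    # is still waiting for a LAG result; its last element is what the
    # waiting element of the next run needs to meet.
    runs = [(i, i, v) for i, v in indexed_values]
    shift = 0
    while len(runs) > 1:
        shift += BITS_PER_STEP
        groups = defaultdict(list)
        for run in runs:
            # Group by the index with the low bits chopped off.
            groups[run[0] >> shift].append(run)
        runs = []
        for group in groups.values():
            group.sort(key=lambda r: r[0])
            # One invocation of the reduce logic: every run but the first
            # gets its final LAG value from the last element of the
            # preceding run.
            for prev, cur in zip(group, group[1:]):
                final[cur[0]] = prev[2]
            # The first element is sent onwards, together with the last
            # element of the whole group.
            runs.append((group[0][0], group[-1][1], group[-1][2]))
    # Whatever is still waiting at the end is the first row of the partition.
    for first_index, _, _ in runs:
        final[first_index] = None
    return final


lags = lag_by_chunks([(0, "a"), (1, "b"), (2, "c"), (64, "d"), (65, "e"), (5000, "f")])
```

In this sketch each group emits at most one waiting element and one last element per step, so the amount of data carried into later steps shrinks quickly even when the indexes are sparse.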