Aggregate combinators examples #3446

Open
wants to merge 30 commits into
base: main
946f78d
add combinator skeletons
Blargian Mar 4, 2025
aecedfa
add aggThrow and histogram combinators
Blargian Mar 4, 2025
4ca296c
improvements to structure
Blargian Mar 4, 2025
6f5ba54
add analysisOfVariance
Blargian Mar 4, 2025
eba0caf
backtick combinator names
Blargian Mar 4, 2025
bfc3aa5
add any
Blargian Mar 4, 2025
8fc222b
add anyHeavy
Blargian Mar 4, 2025
5d9f84b
add anyLast
Blargian Mar 4, 2025
e7a73ff
add approx_top_k
Blargian Mar 4, 2025
ff33ca4
do only most commonly used ones
Blargian Mar 4, 2025
a6d636f
remove autogenerated aggregate combinators exhaustive list progress i…
Blargian Mar 4, 2025
07c199e
add a few common aggregate combinators
Blargian Mar 4, 2025
60d825d
add more combinators
Blargian Mar 6, 2025
b9d12b9
update all with descriptions, add quantilesTimingIf
Blargian Mar 6, 2025
9b9be24
add argMinIf
Blargian Mar 6, 2025
791d562
add argMaxIf
Blargian Mar 6, 2025
0302e93
add sumArray
Blargian Mar 6, 2025
81306ae
contrast sumArray with sum(arraySum(arr)) in example and adhere to 80…
Blargian Mar 6, 2025
080337c
add quantilesTimingArrayIf (example not working)
Blargian Mar 6, 2025
fe3fd28
improve example for quantilesTimingArrayIf
Blargian Mar 6, 2025
8801b74
add minMap, maxMap, sumMap, avgMap
Blargian Mar 6, 2025
9635bac
add explicit headers
Blargian Mar 6, 2025
b178de6
Merge branch 'main' of https://github.com/ClickHouse/clickhouse-docs …
Blargian Mar 17, 2025
da8c096
update combinator examples, add SimpleState
Blargian Mar 17, 2025
a11be27
Merge branch 'main' of https://github.com/ClickHouse/clickhouse-docs …
Blargian Apr 21, 2025
4773a6b
add more examples
Blargian Apr 21, 2025
a426db7
add aggregate combinator function names to exceptions
Blargian Apr 21, 2025
4e953fe
Add missing code block language
Blargian Apr 21, 2025
d5c2ca0
add titles to frontmatter
Blargian Apr 22, 2025
d4c8987
add aggregate combinator names to aspell-dict.txt
Blargian Apr 22, 2025
61 changes: 61 additions & 0 deletions docs/guides/examples/aggregate_function_combinators/anyIf.md
@@ -0,0 +1,61 @@
---
slug: '/examples/aggregate-function-combinators/anyIf'
title: 'anyIf'
description: 'Example of using the anyIf combinator'
keywords: ['any', 'if', 'combinator', 'examples', 'anyIf']
sidebar_label: 'anyIf'
---

# anyIf {#anyif}

## Description {#description}

The [`If`](/sql-reference/aggregate-functions/combinators#-if) combinator can be applied to the [`any`](/sql-reference/aggregate-functions/reference/any)
aggregate function to select the first encountered element from a given column
that matches the given condition.

## Example Usage {#example-usage}

In this example, we'll create a table that stores sales data with success flags,
and we'll use `anyIf` to select the first encountered `transaction_id` with an
amount below 200 and the first encountered `transaction_id` with an amount above 200.

We first create a table and insert data into it:

```sql title="Query"
CREATE TABLE sales(
transaction_id UInt32,
amount Decimal(10,2),
is_successful UInt8
)
ENGINE = MergeTree()
ORDER BY tuple();

INSERT INTO sales VALUES
(1, 100.00, 1),
(2, 150.00, 1),
(3, 155.00, 0),
(4, 300.00, 1),
(5, 250.50, 0),
(6, 175.25, 1);
```

```sql
SELECT
anyIf(transaction_id, amount < 200) as tid_lt_200,
anyIf(transaction_id, amount > 200) as tid_gt_200
FROM sales;
```

The `anyIf` function returns the first encountered `transaction_id` among the rows that satisfy the condition.
In this case, transaction 1 (amount 100.00) is the first row with `amount < 200`, and transaction 4 (amount 300.00) is the first row with `amount > 200`.

```response title="Response"
┌─tid_lt_200─┬─tid_gt_200─┐
│ 1 │ 4 │
└────────────┴────────────┘
```
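
As a rough cross-check, each `anyIf` column should pick from the same set of rows that a plain `any` would see after filtering with `WHERE`. The sketch below assumes the `sales` table defined above. Because `any` returns whichever matching value is encountered first, the exact `transaction_id` returned is not guaranteed to be the same on every run or on larger tables.

```sql
-- Each query filters first and then applies plain `any`,
-- so it considers the same rows as the corresponding `anyIf` column.
SELECT any(transaction_id) AS tid_lt_200
FROM sales
WHERE amount < 200;

SELECT any(transaction_id) AS tid_gt_200
FROM sales
WHERE amount > 200;
```

The `If` form is convenient here because both conditions are evaluated in a single pass over the table.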

## See also {#see-also}
- [`any`](/sql-reference/aggregate-functions/reference/any)
- [`If combinator`](/sql-reference/aggregate-functions/combinators#-if)
61 changes: 61 additions & 0 deletions docs/guides/examples/aggregate_function_combinators/argMaxIf.md
@@ -0,0 +1,61 @@
---
slug: '/examples/aggregate-function-combinators/argMaxIf'
title: 'argMaxIf'
description: 'Example of using the argMaxIf combinator'
keywords: ['argMax', 'if', 'combinator', 'examples', 'argMaxIf']
sidebar_label: 'argMaxIf'
---

# argMaxIf {#argmaxif}

## Description {#description}

The [`If`](/sql-reference/aggregate-functions/combinators#-if) combinator can be applied to the [`argMax`](/sql-reference/aggregate-functions/reference/argmax)
function to find the value of `arg` that corresponds to the maximum value of `val` for rows where the condition is true,
using the `argMaxIf` aggregate combinator function.

The `argMaxIf` function is useful when you need to find the value associated with
the maximum value in a dataset, but only for rows that satisfy a specific
condition.

## Example Usage {#example-usage}

In this example, we'll use a sample dataset of product sales to demonstrate how
`argMaxIf` works. We'll find the product name that has the highest price, but
only for products that have been sold at least 10 times.

```sql title="Query"
CREATE TABLE product_sales
(
product_name String,
price Decimal32(2),
sales_count UInt32
) ENGINE = Memory;

INSERT INTO product_sales VALUES
('Laptop', 999.99, 10),
('Phone', 499.99, 15),
('Tablet', 299.99, 0),
('Watch', 199.99, 5),
('Headphones', 79.99, 20);

SELECT argMaxIf(product_name, price, sales_count >= 10) as most_expensive_popular_product
FROM product_sales;
```

The `argMaxIf` function will return the product name that has the highest price
among all products that have been sold at least 10 times (`sales_count >= 10`).
In this case, it will return 'Laptop' since it has the highest price (999.99)
among the popular products.

```response title="Response"
┌─most_expensi⋯lar_product─┐
1. │ Laptop │
└──────────────────────────┘
```
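
For comparison, a similar result can be obtained without the combinator by filtering, sorting, and taking the first row. This is only a sketch against the `product_sales` table above, and ties on `price` may be resolved differently by the two forms.

```sql
-- Keep only the popular products, then take the most expensive one.
SELECT product_name AS most_expensive_popular_product
FROM product_sales
WHERE sales_count >= 10
ORDER BY price DESC
LIMIT 1;
```

Unlike this form, `argMaxIf` can sit alongside other aggregates in the same `SELECT` and can be evaluated per group with `GROUP BY`.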

## See also {#see-also}
- [`argMax`](/sql-reference/aggregate-functions/reference/argmax)
- [`argMin`](/sql-reference/aggregate-functions/reference/argmin)
- [`argMinIf`](/examples/aggregate-function-combinators/argMinIf)
- [`If combinator`](/sql-reference/aggregate-functions/combinators#-if)
65 changes: 65 additions & 0 deletions docs/guides/examples/aggregate_function_combinators/argMinIf.md
@@ -0,0 +1,65 @@
---
slug: '/examples/aggregate-function-combinators/argMinIf'
title: 'argMinIf'
description: 'Example of using the argMinIf combinator'
keywords: ['argMin', 'if', 'combinator', 'examples', 'argMinIf']
sidebar_label: 'argMinIf'
---

# argMinIf {#argminif}

## Description {#description}

The [`If`](/sql-reference/aggregate-functions/combinators#-if) combinator can be applied to the [`argMin`](/sql-reference/aggregate-functions/reference/argmin)
function to find the value of `arg` that corresponds to the minimum value of `val` for rows where the condition is true,
using the `argMinIf` aggregate combinator function.

The `argMinIf` function is useful when you need to find the value associated
with the minimum value in a dataset, but only for rows that satisfy a specific
condition.

## Example Usage {#example-usage}

In this example, we'll create a table that stores product prices and their timestamps,
and we'll use `argMinIf` to find, for each product, the price recorded at the earliest timestamp at which it was in stock.

```sql title="Query"
CREATE TABLE product_prices(
product_id UInt32,
price Decimal(10,2),
timestamp DateTime,
in_stock UInt8
) ENGINE = Log;

INSERT INTO product_prices VALUES
(1, 10.99, '2024-01-01 10:00:00', 1),
(1, 9.99, '2024-01-01 10:05:00', 1),
(1, 11.99, '2024-01-01 10:10:00', 0),
(2, 20.99, '2024-01-01 11:00:00', 1),
(2, 19.99, '2024-01-01 11:05:00', 1),
(2, 21.99, '2024-01-01 11:10:00', 1);

SELECT
product_id,
argMinIf(price, timestamp, in_stock = 1) as lowest_price_when_in_stock
FROM product_prices
GROUP BY product_id;
```

The `argMinIf` function will find the price that corresponds to the earliest timestamp for each product,
but only considering rows where `in_stock = 1`. Despite the column alias, this is the earliest in-stock price, not the numerically lowest one. For example:
- Product 1: Among in-stock rows, 10.99 has the earliest timestamp (10:00:00), even though 9.99 is the lower in-stock price
- Product 2: Among in-stock rows, 20.99 has the earliest timestamp (11:00:00)

```response title="Response"
┌─product_id─┬─lowest_price_when_in_stock─┐
1. │ 1 │ 10.99 │
2. │ 2 │ 20.99 │
└────────────┴────────────────────────────┘
```
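
If the goal is the numerically lowest in-stock price rather than the earliest one, `minIf` is a closer fit. A minimal sketch, assuming the `product_prices` table above (the alias `min_price_when_in_stock` is chosen here purely for illustration):

```sql
-- minIf compares the prices themselves and ignores the timestamp column.
SELECT
    product_id,
    minIf(price, in_stock = 1) AS min_price_when_in_stock
FROM product_prices
GROUP BY product_id
ORDER BY product_id;
```

With the sample data this should give 9.99 for product 1 and 19.99 for product 2, whereas `argMinIf(price, timestamp, in_stock = 1)` returns the price recorded at the earliest in-stock timestamp.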

## See also {#see-also}
- [`argMin`](/sql-reference/aggregate-functions/reference/argmin)
- [`argMax`](/sql-reference/aggregate-functions/reference/argmax)
- [`argMaxIf`](/examples/aggregate-function-combinators/argMaxIf)
- [`If combinator`](/sql-reference/aggregate-functions/combinators#-if)
53 changes: 53 additions & 0 deletions docs/guides/examples/aggregate_function_combinators/avgIf.md
@@ -0,0 +1,53 @@
---
slug: '/examples/aggregate-function-combinators/avgIf'
title: 'avgIf'
description: 'Example of using the avgIf combinator'
keywords: ['avg', 'if', 'combinator', 'examples', 'avgIf']
sidebar_label: 'avgIf'
---

# avgIf {#avgif}

## Description {#description}

The [`If`](/sql-reference/aggregate-functions/combinators#-if) combinator can be applied to the [`avg`](/sql-reference/aggregate-functions/reference/avg)
function to calculate the arithmetic mean of values for rows where the condition is true,
using the `avgIf` aggregate combinator function.

## Example Usage {#example-usage}

In this example, we'll create a table that stores sales data with success flags,
and we'll use `avgIf` to calculate the average sale amount for successful transactions.

```sql title="Query"
CREATE TABLE sales(
transaction_id UInt32,
amount Decimal(10,2),
is_successful UInt8
) ENGINE = Log;

INSERT INTO sales VALUES
(1, 100.50, 1),
(2, 200.75, 1),
(3, 150.25, 0),
(4, 300.00, 1),
(5, 250.50, 0),
(6, 175.25, 1);

SELECT
avgIf(amount, is_successful = 1) as avg_successful_sale
FROM sales;
```

The `avgIf` function will calculate the average amount only for rows where `is_successful = 1`.
In this case, it will average the amounts: 100.50, 200.75, 300.00, and 175.25.

```response title="Response"
┌─avg_successful_sale─┐
1. │ 194.125 │
└─────────────────────┘
```
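
One reason to prefer `avgIf` over a `WHERE` clause is that several conditional averages can be computed in the same query. The sketch below assumes the `sales` table above. The aliases `avg_failed_sale` and `avg_all_sales` are illustrative names, not part of the original example.

```sql
-- Three averages over the same table in one scan:
-- successful rows, failed rows, and all rows.
SELECT
    avgIf(amount, is_successful = 1) AS avg_successful_sale,
    avgIf(amount, is_successful = 0) AS avg_failed_sale,
    avg(amount) AS avg_all_sales
FROM sales;
```

A `WHERE is_successful = 1` filter would restrict every aggregate in the query, so the failed and overall averages would need separate queries.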

## See also {#see-also}
- [`avg`](/sql-reference/aggregate-functions/reference/avg)
- [`If combinator`](/sql-reference/aggregate-functions/combinators#-if)
65 changes: 65 additions & 0 deletions docs/guides/examples/aggregate_function_combinators/avgMap.md
@@ -0,0 +1,65 @@
---
slug: '/examples/aggregate-function-combinators/avgMap'
title: 'avgMap'
description: 'Example of using the avgMap combinator'
keywords: ['avg', 'map', 'combinator', 'examples', 'avgMap']
sidebar_label: 'avgMap'
---

# avgMap {#avgmap}

## Description {#description}

The [`Map`](/sql-reference/aggregate-functions/combinators#-map) combinator can be applied to the [`avg`](/sql-reference/aggregate-functions/reference/avg)
function to calculate the arithmetic mean of values in a Map according to each key, using the `avgMap`
aggregate combinator function.

## Example Usage {#example-usage}

In this example, we'll create a table that stores status codes and their counts for different timeslots,
where each row contains a Map of status codes to their corresponding counts. We'll use
`avgMap` to calculate the average count for each status code within each timeslot.

```sql title="Query"
CREATE TABLE metrics(
date Date,
timeslot DateTime,
status Map(String, UInt64)
) ENGINE = Log;

INSERT INTO metrics VALUES
('2000-01-01', '2000-01-01 00:00:00', (['a', 'b', 'c'], [15, 25, 35])),
('2000-01-01', '2000-01-01 00:00:00', (['c', 'd', 'e'], [45, 55, 65])),
('2000-01-01', '2000-01-01 00:01:00', (['d', 'e', 'f'], [75, 85, 95])),
('2000-01-01', '2000-01-01 00:01:00', (['f', 'g', 'g'], [105, 115, 125]));

SELECT
timeslot,
avgMap(status)
FROM metrics
GROUP BY timeslot;
```

The `avgMap` function will calculate the average count for each status code within each timeslot. For example:
- In timeslot '2000-01-01 00:00:00':
  - Status 'a': 15
  - Status 'b': 25
  - Status 'c': (35 + 45) / 2 = 40
  - Status 'd': 55
  - Status 'e': 65
- In timeslot '2000-01-01 00:01:00':
  - Status 'd': 75
  - Status 'e': 85
  - Status 'f': (95 + 105) / 2 = 100
  - Status 'g': (115 + 125) / 2 = 120

```response title="Response"
┌────────────timeslot─┬─avgMap(status)───────────────────────┐
1. │ 2000-01-01 00:01:00 │ {'d':75,'e':85,'f':100,'g':120} │
2. │ 2000-01-01 00:00:00 │ {'a':15,'b':25,'c':40,'d':55,'e':65} │
└─────────────────────┴──────────────────────────────────────┘
```
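
To see what `avgMap` is doing per key, the maps can be unrolled into one row per key/value pair and averaged with a plain `avg`. This is a sketch that assumes the `metrics` table above. The aliases `status_key`, `status_value`, and `avg_count` are illustrative, and the result comes back as one row per key rather than as a single Map per timeslot.

```sql
-- Unroll each Map into (key, value) rows, then average per timeslot and key.
SELECT
    timeslot,
    status_key,
    avg(status_value) AS avg_count
FROM metrics
ARRAY JOIN
    mapKeys(status) AS status_key,
    mapValues(status) AS status_value
GROUP BY timeslot, status_key
ORDER BY timeslot, status_key;
```

`avgMap` is the more compact choice when the per-key averages should stay together in a single Map column.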

## See also {#see-also}
- [`avg`](/sql-reference/aggregate-functions/reference/avg)
- [`Map combinator`](/sql-reference/aggregate-functions/combinators#-map)
26 changes: 26 additions & 0 deletions docs/guides/examples/aggregate_function_combinators/avgMerge.md
@@ -0,0 +1,26 @@
---
slug: '/examples/aggregate-function-combinators/avgMerge'
title: 'avgMerge'
description: 'Example of using the avgMerge combinator'
keywords: ['avg', 'merge', 'combinator', 'examples', 'avgMerge']
sidebar_label: 'avgMerge'
---

# avgMerge {#avgmerge}

## Description {#description}

The [`Merge`](/sql-reference/aggregate-functions/combinators#-merge) combinator
can be applied to the [`avg`](/sql-reference/aggregate-functions/reference/avg)
function to produce a final result by combining partial aggregate states.

## Example Usage {#example-usage}

The `Merge` combinator is closely related to the `State` combinator. Refer to
["avgState example usage"](/examples/aggregate-function-combinators/avgState/#example-usage)
for an example of both `avgMerge` and `avgState`.
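
As a quick illustration that needs no table, the sketch below builds two partial states with `avgState` and combines them with `avgMerge`. It uses the built-in `numbers` table function, and the alias `partial_state` is a name chosen here for illustration.

```sql
-- Inner query: one partial avg state per group (even and odd numbers 0..9).
-- Outer query: avgMerge combines the partial states into a final average.
SELECT avgMerge(partial_state) AS overall_avg
FROM
(
    SELECT avgState(number) AS partial_state
    FROM numbers(10)
    GROUP BY number % 2
);
```

Because the merged states together cover the numbers 0 through 9, the result should be 4.5, the same as `avg(number)` over the whole range.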

## See also {#see-also}
- [`avg`](/sql-reference/aggregate-functions/reference/avg)
- [`Merge`](/sql-reference/aggregate-functions/combinators#-merge)
- [`MergeState`](/sql-reference/aggregate-functions/combinators#-mergestate)