Commit 7df73cb

Dedicated pages of aggregations (#89)
1 parent 1af35cb commit 7df73cb

26 files changed, +3608 -760 lines changed

apl/aggregation-function/avg.mdx

+152
@@ -0,0 +1,152 @@
---
title: avg
description: 'This page explains how to use the avg aggregation function in APL.'
---

The `avg` aggregation in APL calculates the average value of a numeric field across a set of records. You can use this aggregation when you need to determine the mean value of numerical data, such as request durations, response times, or other performance metrics. It is useful in scenarios such as performance analysis, trend identification, and general statistical analysis.

When to use `avg`:

- When you want to analyze the average of numeric values over a specific time range or set of data.
- For comparing trends, like average request duration or latency across HTTP requests.
- To provide insight into system or user performance, such as the average duration of transactions in a service.

## For users of other query languages

If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.

<AccordionGroup>
<Accordion title="Splunk SPL users">

In Splunk SPL, the `avg` function works similarly, but the syntax differs slightly. Here’s how to write the equivalent query in APL.

<CodeGroup>
```sql Splunk example
| stats avg(req_duration_ms) by status
```

```kusto APL equivalent
['sample-http-logs']
| summarize avg(req_duration_ms) by status
```
</CodeGroup>

</Accordion>
<Accordion title="ANSI SQL users">

In ANSI SQL, the `avg` aggregation is used similarly, but APL has a different syntax for structuring the query.

<CodeGroup>
```sql SQL example
SELECT status, AVG(req_duration_ms)
FROM sample_http_logs
GROUP BY status
```

```kusto APL equivalent
['sample-http-logs']
| summarize avg(req_duration_ms) by status
```
</CodeGroup>

</Accordion>
</AccordionGroup>

## Usage

### Syntax

```kusto
summarize avg(ColumnName) [by GroupingColumn]
```

### Parameters

- **ColumnName**: The numeric field you want to calculate the average of.
- **GroupingColumn** (optional): A column to group the results by. If not specified, the average is calculated over all records.

### Returns

- A table with the average value for the specified field, optionally grouped by another column.
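
To make the grouping behavior described above concrete, here is a minimal Python sketch of what `summarize avg(ColumnName) [by GroupingColumn]` computes. It is an illustration of the semantics only, not how APL is implemented: the `summarize_avg` helper and the sample records are hypothetical, and skipping records that lack a value is an assumption rather than documented behavior.

```python
from collections import defaultdict


def summarize_avg(records, column, by=None):
    """Return {group: mean of `column`}; a single None key when `by` is omitted."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for record in records:
        value = record.get(column)
        if value is None:
            # Assumption: records without a value do not contribute to the mean.
            continue
        key = record.get(by) if by is not None else None
        sums[key] += value
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Hypothetical sample records, shaped like rows of an HTTP log dataset.
logs = [
    {"status": "200", "req_duration_ms": 300.0},
    {"status": "200", "req_duration_ms": 400.0},
    {"status": "404", "req_duration_ms": 150.0},
]

# Grouped: one average per status value.
print(summarize_avg(logs, "req_duration_ms", by="status"))
# prints {'200': 350.0, '404': 150.0}

# Ungrouped: a single average over all records.
print(summarize_avg(logs, "req_duration_ms"))
```

With a grouping column, each distinct value gets its own row; without one, the result collapses to a single row covering every record.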

## Use case examples

<Tabs>
<Tab title="Log analysis">

This example calculates the average request duration for HTTP requests, grouped by status.

**Query**

```kusto
['sample-http-logs']
| summarize avg(req_duration_ms) by status
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/explorer?initForm=%7B%22apl%22%3A%22%5B'sample-http-logs'%5D%5Cn%7C%20summarize%20avg(req_duration_ms)%20by%20status%22%7D)

**Output**

| status | avg_req_duration_ms |
|--------|---------------------|
| 200    | 350.4               |
| 404    | 150.2               |

This query calculates the average request duration (in milliseconds) for each HTTP status code.

</Tab>
<Tab title="OpenTelemetry traces">

This example calculates the average span duration for each service to analyze performance across services.

**Query**

```kusto
['otel-demo-traces']
| summarize avg(duration) by ['service.name']
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/explorer?initForm=%7B%22apl%22%3A%22%5B'otel-demo-traces'%5D%5Cn%7C%20summarize%20avg(duration)%20by%20%5B'service.name'%5D%22%7D)

**Output**

| service.name | avg_duration |
|--------------|--------------|
| frontend     | 500ms        |
| cartservice  | 250ms        |

This query calculates the average duration of spans for each service.

</Tab>
<Tab title="Security logs">

In security logs, you can calculate the average request duration by country to analyze regional performance trends.

**Query**

```kusto
['sample-http-logs']
| summarize avg(req_duration_ms) by ['geo.country']
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/explorer?initForm=%7B%22apl%22%3A%22%5B'sample-http-logs'%5D%5Cn%7C%20summarize%20avg(req_duration_ms)%20by%20%5B'geo.country'%5D%22%7D)

**Output**

| geo.country | avg_req_duration_ms |
|-------------|---------------------|
| US          | 400.5               |
| DE          | 250.3               |

This query calculates the average request duration for each country where requests originated.

</Tab>
</Tabs>

## List of related aggregations

- [**sum**](/apl/aggregation-function/sum): Use `sum` to calculate the total of a numeric field. This is useful when you want the total of values rather than their average.
- [**count**](/apl/aggregation-function/count): The `count` function returns the total number of records. It’s useful when you want to count occurrences rather than averaging numerical values.
- [**min**](/apl/aggregation-function/min): The `min` function returns the minimum value of a numeric field. Use this when you’re interested in the smallest value in your dataset.
- [**max**](/apl/aggregation-function/max): The `max` function returns the maximum value of a numeric field. This is useful for finding the largest value in the data.
- [**stdev**](/apl/aggregation-function/stdev): This function calculates the standard deviation of a numeric field, providing insight into how spread out the data is around the mean.

apl/aggregation-function/avgif.mdx

+149
@@ -0,0 +1,149 @@
---
title: avgif
description: 'This page explains how to use the avgif aggregation function in APL.'
---

The `avgif` aggregation function in APL calculates the average value of a field, but only for records that satisfy a given condition. Use it when you need a filtered aggregation, such as the average response time for requests that returned a specific status code or that came from a particular geographic region. It is useful in log analysis, performance monitoring, and anomaly detection, where averaging over a relevant subset of the data gives a more meaningful result than averaging over everything.

## For users of other query languages

If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.

<AccordionGroup>
<Accordion title="Splunk SPL users">

In Splunk SPL, you achieve similar functionality by combining the `stats` command with a filter, for example a preceding `where` clause. In APL, `avgif` applies the filter inline as part of the aggregation function, which can simplify your queries.

<CodeGroup>
```sql Splunk example
| where status="200"
| stats avg(req_duration_ms) by id
```

```kusto APL equivalent
['sample-http-logs']
| summarize avgif(req_duration_ms, status == "200") by id
```
</CodeGroup>

</Accordion>
<Accordion title="ANSI SQL users">

In ANSI SQL, you can use a `CASE` expression inside an `AVG` function to achieve similar behavior. APL simplifies this with `avgif`, allowing you to specify the condition directly.

<CodeGroup>
```sql SQL example
SELECT id, AVG(CASE WHEN status = '200' THEN req_duration_ms ELSE NULL END)
FROM sample_http_logs
GROUP BY id
```

```kusto APL equivalent
['sample-http-logs']
| summarize avgif(req_duration_ms, status == "200") by id
```
</CodeGroup>

</Accordion>
</AccordionGroup>

## Usage

### Syntax

```kusto
summarize avgif(expr, predicate) [by grouping_field]
```

### Parameters

- **`expr`**: The field for which you want to calculate the average.
- **`predicate`**: A boolean condition that filters which records are included in the calculation.
- **`grouping_field`**: (Optional) A field by which you want to group the results.

### Returns

The function returns the average of the values from the `expr` field for the records that satisfy the `predicate`. If no records match the condition, the result is `null`.
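
The filtering behavior described above can be sketched in a few lines of Python. This is only a model of the semantics, not APL's implementation: the `avgif` helper below and the sample records are hypothetical, and Python's `None` stands in for the `null` result.

```python
def avgif(records, column, predicate):
    """Average `column` over records where `predicate(record)` is true."""
    matching = [record[column] for record in records if predicate(record)]
    if not matching:
        # Mirrors the `null` result when no record satisfies the predicate.
        return None
    return sum(matching) / len(matching)


# Hypothetical sample records; status is stored as a string.
logs = [
    {"status": "200", "req_duration_ms": 300.0},
    {"status": "200", "req_duration_ms": 400.0},
    {"status": "404", "req_duration_ms": 150.0},
]

# Only the two status "200" records are averaged.
print(avgif(logs, "req_duration_ms", lambda r: r["status"] == "200"))  # prints 350.0

# No record matches, so the result is None.
print(avgif(logs, "req_duration_ms", lambda r: r["status"] == "500"))  # prints None

# Converting the string status to an integer allows a numeric comparison.
print(avgif(logs, "req_duration_ms", lambda r: int(r["status"]) >= 400))  # prints 150.0
```

Records that fail the predicate are ignored entirely, so they affect neither the sum nor the count used for the average.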

## Use case examples

<Tabs>
<Tab title="Log analysis">

In this example, you calculate the average request duration for requests that returned HTTP status 200, grouped by city.

**Query**

```kusto
['sample-http-logs']
| summarize avgif(req_duration_ms, status == "200") by ['geo.city']
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/explorer?initForm=%7B%22apl%22%3A%22%5B%27sample-http-logs%27%5D%20%7C%20summarize%20avgif%28req_duration_ms%2C%20status%20%3D%3D%20%22200%22%29%20by%20%5B%27geo.city%27%5D%22%7D)

**Output**

| geo.city | avg_req_duration_ms |
|----------|---------------------|
| New York | 325                 |
| London   | 400                 |
| Tokyo    | 275                 |

This query calculates the average request duration (`req_duration_ms`) for HTTP requests that returned a status of 200 (`status == "200"`), grouped by the city where the request originated (`geo.city`).

</Tab>
<Tab title="OpenTelemetry traces">

In this example, you calculate the average span duration for traces that ended with HTTP status 500.

**Query**

```kusto
['otel-demo-traces']
| summarize avgif(duration, status == "500") by ['service.name']
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/explorer?initForm=%7B%22apl%22%3A%22%5B%27otel-demo-traces%27%5D%20%7C%20summarize%20avgif%28duration%2C%20status%20%3D%3D%20%22500%22%29%20by%20%5B%27service.name%27%5D%22%7D)

**Output**

| service.name    | avg_duration |
|-----------------|--------------|
| checkoutservice | 500ms        |
| frontend        | 600ms        |
| cartservice     | 475ms        |

This query calculates the average span duration (`duration`) for traces where the status code is 500 (`status == "500"`), grouped by the service name (`service.name`).

</Tab>
<Tab title="Security logs">

In this example, you calculate the average request duration for failed HTTP requests (status code 400 or higher) by country.

**Query**

```kusto
['sample-http-logs']
| summarize avgif(req_duration_ms, toint(status) >= 400) by ['geo.country']
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/explorer?initForm=%7B%22apl%22%3A%22%5B%27sample-http-logs%27%5D%20%7C%20summarize%20avgif%28req_duration_ms%2C%20toint%28status%29%20%3E%3D%20400%29%20by%20%5B%27geo.country%27%5D%22%7D)

**Output**

| geo.country | avg_req_duration_ms |
|-------------|---------------------|
| USA         | 450                 |
| Canada      | 500                 |
| Germany     | 425                 |

This query calculates the average request duration (`req_duration_ms`) for failed HTTP requests (`toint(status) >= 400`), grouped by the country of origin (`geo.country`). The other examples compare `status` to a string, so `toint` converts it to an integer before the numeric comparison.

</Tab>
</Tabs>

## List of related aggregations

- [**minif**](/apl/aggregation-function/minif): Returns the minimum value of an expression, filtered by a predicate. Use when you want to find the smallest value for a subset of data.
- [**maxif**](/apl/aggregation-function/maxif): Returns the maximum value of an expression, filtered by a predicate. Use when you are looking for the largest value within specific conditions.
- [**countif**](/apl/aggregation-function/countif): Counts the number of records that match a condition. Use when you want to know how many records meet a specific criterion.
- [**sumif**](/apl/aggregation-function/sumif): Sums the values of a field that match a given condition. Ideal for calculating the total of a subset of data.
