Commit 6c07d2e

Merge branch 'main' into mano/rewrite-get-started
2 parents 0b6dd2b + 3fb3f26 commit 6c07d2e

74 files changed

+1610
-524
lines changed
@@ -0,0 +1,162 @@
1+
---
2+
title: percentiles_array
3+
description: 'This page explains how to use the percentiles_array function in APL.'
4+
---
5+
6+
Use the `percentiles_array` aggregation function in APL to calculate multiple percentile values over a numeric expression in one pass. This function is useful when you want to understand the distribution of numeric data points, such as response times or durations, by summarizing them at several key percentiles like the 25th, 50th, and 95th.
7+
8+
You can use `percentiles_array` to:
9+
10+
- Analyze latency or duration metrics across requests or operations.
11+
- Identify performance outliers.
12+
- Visualize percentile distributions in dashboards.
13+
14+
## For users of other query languages
15+
16+
If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.
17+
18+
<AccordionGroup>
19+
<Accordion title="Splunk SPL users">
20+
21+
In Splunk, you typically calculate percentiles one at a time using the `perc` function. To get multiple percentiles, you repeat the function with different percentile values. In APL, `percentiles_array` lets you specify multiple percentiles in a single function call and returns them as an array.
22+
23+
<CodeGroup>
24+
```sql Splunk example
... | stats perc95(duration), perc50(duration), perc25(duration) by service
```
27+
28+
```kusto APL equivalent
['otel-demo-traces']
| summarize percentiles_array(duration, 25, 50, 95) by ['service.name']
```
32+
</CodeGroup>
33+
34+
</Accordion>
35+
<Accordion title="ANSI SQL users">
36+
37+
Standard SQL typically lacks a built-in function to calculate multiple percentiles in a single operation. Instead, you use `PERCENTILE_CONT` or `PERCENTILE_DISC` with `WITHIN GROUP`, repeated for each desired percentile. In APL, `percentiles_array` simplifies this with a single function call that returns all requested percentiles as an array.
38+
39+
<CodeGroup>
40+
```sql SQL example
SELECT
  service,
  PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY duration) AS p25,
  PERCENTILE_CONT(0.50) WITHIN GROUP (ORDER BY duration) AS p50,
  PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY duration) AS p95
FROM traces
GROUP BY service
```
49+
50+
```kusto APL equivalent
['otel-demo-traces']
| summarize percentiles_array(duration, 25, 50, 95) by ['service.name']
```
54+
</CodeGroup>
55+
56+
</Accordion>
57+
</AccordionGroup>
58+
59+
## Usage
60+
61+
### Syntax
62+
63+
```kusto
percentiles_array(Field, Percentile1, Percentile2, ...)
```
66+
67+
### Parameters
68+
69+
- `Field` is the name of the field for which you want to compute percentile values.
70+
- `Percentile1`, `Percentile2`, ... are numeric percentile values between 0 and 100.
71+
72+
### Returns
73+
74+
An array of numbers where each element is the value at the corresponding percentile.
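
Because the result is a single array, you may want to pull individual percentiles into their own columns. The following is a minimal sketch, assuming that APL supports zero-based bracket indexing on dynamic arrays as Kusto does. The column names `p`, `p25`, `p50`, and `p95` are illustrative and not part of the function.

```kusto
['sample-http-logs']
// Name the array so that later operators can reference it
| summarize p = percentiles_array(req_duration_ms, 25, 50, 95) by method
// Elements follow the order of the percentile arguments (assumes zero-based indexing)
| extend p25 = p[0], p50 = p[1], p95 = p[2]
```

If the indexing assumption holds, each row keeps the original array in `p` alongside one scalar column per requested percentile.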
75+
76+
## Use case examples
77+
78+
<Tabs>
79+
<Tab title="Log analysis">
80+
81+
Use `percentiles_array` to understand the spread of request durations per HTTP method, highlighting performance variability.
82+
83+
**Query**
84+
85+
```kusto
['sample-http-logs']
| summarize percentiles_array(req_duration_ms, 25, 50, 95) by method
```
89+
90+
[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'sample-http-logs'%5D%20%7C%20summarize%20percentiles_array(req_duration_ms%2C%2025%2C%2050%2C%2095)%20by%20method%22%7D)
91+
92+
**Output**
93+
94+
| method | P25       | P50       | P95      |
|--------|-----------|-----------|----------|
| GET    | 0.3981 ms | 0.7352 ms | 1.981 ms |
| POST   | 0.3261 ms | 0.7162 ms | 2.341 ms |
| PUT    | 0.3324 ms | 0.7772 ms | 1.341 ms |
| DELETE | 0.2332 ms | 0.4652 ms | 1.121 ms |
100+
101+
This query calculates the 25th, 50th, and 95th percentiles of request durations for each HTTP method, which helps you compare performance across methods.
102+
103+
</Tab>
104+
<Tab title="OpenTelemetry traces">
105+
106+
Use `percentiles_array` to analyze the distribution of span durations by service to detect potential bottlenecks.
107+
108+
**Query**
109+
110+
```kusto
['otel-demo-traces']
| summarize percentiles_array(duration, 50, 90, 99) by ['service.name']
```
114+
115+
[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'otel-demo-traces'%5D%20%7C%20summarize%20percentiles_array(duration%2C%2050%2C%2090%2C%2099)%20by%20%5B'service.name'%5D%22%7D)
116+
117+
**Output**
118+
119+
| service.name          | P50      | P90       | P99       |
|-----------------------|----------|-----------|-----------|
| recommendationservice | 1.96 ms  | 2.965 ms  | 3.477 ms  |
| frontendproxy         | 3.767 ms | 13.101 ms | 39.735 ms |
| shippingservice       | 2.119 ms | 3.085 ms  | 9.739 ms  |
| checkoutservice       | 1.454 ms | 12.342 ms | 29.542 ms |
125+
126+
This query shows latency patterns across services by computing the 50th (median), 90th, and 99th percentiles of span durations.
127+
128+
</Tab>
129+
<Tab title="Security logs">
130+
131+
Use `percentiles_array` to assess outlier response times per status code, which can reveal abnormal activity or service issues.
132+
133+
**Query**
134+
135+
```kusto
['sample-http-logs']
| summarize percentiles_array(req_duration_ms, 50, 95, 99) by status
```
139+
140+
[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'sample-http-logs'%5D%20%7C%20summarize%20percentiles_array(req_duration_ms%2C%2050%2C%2095%2C%2099)%20by%20status%22%7D)
141+
142+
**Output**
143+
144+
| status | P50       | P95      | P99      |
|--------|-----------|----------|----------|
| 200    | 0.7352 ms | 1.981 ms | 2.612 ms |
| 201    | 0.7856 ms | 1.356 ms | 2.234 ms |
| 301    | 0.8956 ms | 1.547 ms | 2.546 ms |
| 500    | 0.6587 ms | 1.856 ms | 2.856 ms |
150+
151+
This query helps identify whether requests resulting in errors (like 500) are significantly slower than successful ones.
152+
153+
</Tab>
154+
</Tabs>
155+
156+
## List of related functions
157+
158+
- [avg](/apl/aggregation-function/avg): Returns the average value. Use it when a single central tendency is sufficient.
159+
- [percentile](/apl/aggregation-function/percentile): Returns a single percentile value. Use it when you only need one percentile.
160+
- [percentileif](/apl/aggregation-function/percentileif): Returns a single percentile value for the records that satisfy a condition.
161+
- [percentiles_arrayif](/apl/aggregation-function/percentiles-arrayif): Returns an array of percentile values for the records that satisfy a condition.
162+
- [sum](/apl/aggregation-function/sum): Returns the sum of a numeric column.
@@ -0,0 +1,156 @@
1+
---
2+
title: percentiles_arrayif
3+
description: 'This page explains how to use the percentiles_arrayif function in APL.'
4+
---
5+
6+
Use `percentiles_arrayif` to calculate approximate percentile values for a numeric expression when a certain condition evaluates to true. This function is useful when you want an array of percentiles instead of a single percentile. You can use it to understand data distributions in scenarios such as request durations, event processing times, or security alert severities, while filtering on specific criteria.
7+
8+
## For users of other query languages
9+
10+
If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.
11+
12+
<AccordionGroup>
13+
<Accordion title="Splunk SPL users">
14+
15+
In Splunk SPL, you use statistical functions such as `perc<percent>` (for example, `perc90`) to compute percentile estimates, and you narrow the input with search filters. In APL, you use `percentiles_arrayif` and provide a predicate to define which records to include in the computation.
16+
17+
<CodeGroup>
18+
```sql Splunk example
index=main sourcetype=access_combined status=200
| stats perc90(req_duration_ms) AS p90, perc99(req_duration_ms) AS p99
```
22+
23+
```kusto APL equivalent
['sample-http-logs']
| summarize Dist=percentiles_arrayif(req_duration_ms, dynamic([90, 99]), status == '200')
```
27+
</CodeGroup>
28+
29+
</Accordion>
30+
<Accordion title="ANSI SQL users">
31+
32+
In ANSI SQL, you typically use aggregate functions such as `PERCENTILE_DISC` or `PERCENTILE_CONT` with `WITHIN GROUP`, and either filter with `WHERE` or write `CASE` expressions for conditional aggregation. In APL, you can achieve similar functionality with `percentiles_arrayif` by passing the numeric field, the array of percentiles, and the condition to the function.
33+
34+
<CodeGroup>
35+
```sql SQL example
SELECT
  PERCENTILE_DISC(0.90) WITHIN GROUP (ORDER BY req_duration_ms) AS p90,
  PERCENTILE_DISC(0.99) WITHIN GROUP (ORDER BY req_duration_ms) AS p99
FROM sample_http_logs
WHERE status = '200';
```
42+
43+
```kusto APL equivalent
['sample-http-logs']
| summarize Dist=percentiles_arrayif(req_duration_ms, dynamic([90, 99]), status == '200')
```
47+
</CodeGroup>
48+
49+
</Accordion>
50+
</AccordionGroup>
51+
52+
## Usage
53+
54+
### Syntax
55+
56+
```kusto
percentiles_arrayif(Field, Array, Condition)
```
59+
60+
### Parameters
61+
62+
- `Field` is the name of the field for which you want to compute percentile values.
63+
- `Array` is a dynamic array of one or more numeric percentile values (between 0 and 100).
64+
- `Condition` is a Boolean expression that indicates which records to include in the calculation.
65+
66+
### Returns
67+
68+
The function returns an array of percentile values for the records that satisfy the condition. The position of each returned percentile in the array matches the order in which it appears in the function call.
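
One reason to prefer the conditional variant over a preceding `where` filter is that it can sit next to an unconditional aggregation in the same `summarize`. The following is a minimal sketch that uses the dataset and fields from the examples on this page. The output names `all_requests` and `ok_requests` are hypothetical.

```kusto
['sample-http-logs']
| summarize
    // Percentiles over every record in the group
    all_requests = percentiles_array(req_duration_ms, 50, 95),
    // Percentiles over only the records where the condition is true
    ok_requests = percentiles_arrayif(req_duration_ms, dynamic([50, 95]), status == '200')
    by method
```

Each row then carries two arrays, which makes it possible to compare the filtered distribution against the overall one without running a second query.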
69+
70+
## Use case examples
71+
72+
<Tabs>
73+
<Tab title="Log analysis">
74+
75+
You can use `percentiles_arrayif` to analyze request durations in HTTP logs while filtering for specific criteria, such as certain HTTP statuses or geographic locations.
76+
77+
**Query**
78+
79+
```kusto
['sample-http-logs']
| summarize percentiles_arrayif(req_duration_ms, dynamic([50, 90, 95, 99]), status == '200') by bin_auto(_time)
```
83+
84+
[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'sample-http-logs'%5D%20%7C%20summarize%20percentiles_arrayif(req_duration_ms%2C%20dynamic(%5B50%2C%2090%2C%2095%2C%2099%5D)%2C%20status%20%3D%3D%20'200')%20by%20bin_auto(_time)%22%7D)
85+
86+
**Output**
87+
88+
| percentiles_req_duration_ms |
|-----------------------------|
| 0.7352 ms                   |
| 1.691 ms                    |
| 1.981 ms                    |
| 2.612 ms                    |
94+
95+
96+
This query filters records to those with a status of 200 and returns the percentile values for the request durations.
97+
98+
</Tab>
99+
<Tab title="OpenTelemetry traces">
100+
101+
Use `percentiles_arrayif` to track the performance of spans while filtering on a specific HTTP method. This lets you quickly gauge how span durations are distributed for `POST` requests.
102+
103+
**Query**
104+
105+
```kusto
['otel-demo-traces']
| summarize percentiles_arrayif(duration, dynamic([50, 90, 99]), ['method'] == 'POST') by bin_auto(_time)
```
109+
110+
[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'otel-demo-traces'%5D%20%7C%20summarize%20percentiles_arrayif(duration%2C%20dynamic(%5B50%2C%2090%2C%2099%2C%2099%5D)%2C%20%5B'method'%5D%20%3D%3D%20'POST')%20by%20bin_auto(_time)%22%7D)
111+
112+
**Output**
113+
114+
| percentiles_duration |
|----------------------|
| 5.166 ms             |
| 25.18 ms             |
| 71.996 ms            |
119+
120+
This query returns the percentile values for span durations for requests with the POST method.
121+
122+
</Tab>
123+
<Tab title="Security logs">
124+
125+
You can focus on server issues by filtering for specific status codes, then see how request durations are distributed in those scenarios.
126+
127+
**Query**
128+
129+
```kusto
['sample-http-logs']
| summarize percentiles_arrayif(req_duration_ms, dynamic([50, 90, 95, 99]), status startswith '5') by bin_auto(_time)
```
133+
134+
[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'sample-http-logs'%5D%20%7C%20summarize%20percentiles_arrayif(req_duration_ms%2C%20dynamic(%5B50%2C%2090%2C%2095%2C%2099%5D)%2C%20status%20startswith%20'5')%20by%20bin_auto(_time)%22%7D)
135+
136+
**Output**
137+
138+
| percentiles_req_duration_ms |
|-----------------------------|
| 0.7352 ms                   |
| 1.691 ms                    |
| 1.981 ms                    |
| 2.612 ms                    |
144+
145+
This query calculates percentile values for the durations of requests that return a status code starting with 5, which indicates a server error.
146+
147+
</Tab>
148+
</Tabs>
149+
150+
## List of related functions
151+
152+
- [avg](/apl/aggregation-function/avg): Returns the average of a numeric column.
153+
- [percentile](/apl/aggregation-function/percentile): Returns a single percentile value.
154+
- [percentileif](/apl/aggregation-function/percentileif): Returns a single percentile value for the records that satisfy a condition.
155+
- [percentiles_array](/apl/aggregation-function/percentiles-array): Returns an array of percentile values for all rows.
156+
- [sum](/apl/aggregation-function/sum): Returns the sum of a numeric column.

apl/aggregation-function/statistical-functions.mdx

+4-2
@@ -28,10 +28,12 @@ The table summarizes the aggregation functions available in APL. Use all these a
2828
| [min](/apl/aggregation-function/min) | Returns the minimum value across the group. |
2929
| [minif](/apl/aggregation-function/minif) | Returns the minimum of an expression in records for which the predicate evaluates to true. |
3030
| [percentile](/apl/aggregation-function/percentile) | Calculates the requested percentiles of the group and produces a timeseries chart. |
31-
| [percentileif](/apl/aggregation-function/percentileif) | Calculates the requested percentiles of the field for the rows where the predicate evaluates to true. |
31+
| [percentileif](/apl/aggregation-function/percentileif) | Calculates the requested percentiles of the field for the rows where the predicate evaluates to true. |
32+
| [percentiles_array](/apl/aggregation-function/percentiles-array) | Returns an array of numbers where each element is the value at the corresponding percentile. |
33+
| [percentiles_arrayif](/apl/aggregation-function/percentiles-arrayif) | Returns an array of percentile values for the records that satisfy the condition. |
3234
| [rate](/apl/aggregation-function/rate) | Calculates the rate of values in a group per second. |
3335
| [stdev](/apl/aggregation-function/stdev) | Calculates the standard deviation of an expression across the group. |
34-
| [stdevif](/apl/aggregation-function/stdevif) | Calculates the standard deviation of an expression in records for which the predicate evaluates to true. |
36+
| [stdevif](/apl/aggregation-function/stdevif) | Calculates the standard deviation of an expression in records for which the predicate evaluates to true. |
3537
| [sum](/apl/aggregation-function/sum) | Calculates the sum of an expression across the group. |
3638
| [sumif](/apl/aggregation-function/sumif) | Calculates the sum of an expression in records for which the predicate evaluates to true. |
3739
| [topk](/apl/aggregation-function/topk) | Calculates the top values of an expression across the group in a dataset. |
