Aggregation is the process of summarizing a set of metrics within Impact Framework. IF supports two main types of aggregation: time series (horizontal) aggregation, which condenses a time series into a single value, and tree (vertical) aggregation, which aggregates metrics across components in a hierarchical structure. Aggregation is configured in the manifest file and is crucial for understanding overall impacts. For detailed configuration options, see the Aggregation documentation at https://if.greensoftware.foundation/docs/aggregation.
Aggregation within the Impact Framework is the essential process of summarising a set of metrics to provide a more holistic view of the environmental impact. IF supports two primary types of aggregation: time series aggregation (also known as horizontal aggregation) and tree aggregation (also known as vertical aggregation).
-
Time series (horizontal) aggregation focuses on taking a time series of metric values for a single component and condensing it down into a single number that represents the entire observation period. For example, if you have a time series of hourly energy consumption for a server over a 24-hour period, time series aggregation could sum these values to provide the total energy consumed by that server for the day. The specific method of aggregation (e.g., sum, average, min, max) can depend on the nature of the metric being aggregated and the insight you wish to gain. The result of horizontal aggregation is typically added as a new field called
aggregated
to the node whose time series was processed. -
Tree (vertical) aggregation operates across the hierarchical structure of components within the manifest file's tree. Where multiple child components under a parent node have their own time series data for a particular metric, tree aggregation combines these time series together into a single summary time series that is then associated with the parent node. This aggregation happens element-wise across the child components' time series for each timestep. For instance, if a web application has multiple front-end server components, their individual hourly carbon emissions time series could be aggregated (e.g., summed) to produce a single hourly carbon emissions time series representing the total impact of the front-end tier. The result of vertical aggregation is often a new array of output observations named
outputs
at the parent level, containing the aggregated metrics along with timestamps and durations.
The Impact Framework's aggregate
feature is built-in, meaning you do not need to explicitly initialise it as a plugin. To configure aggregation, you simply add a small aggregate
section to your manifest file. This configuration allows you to specify the metrics
(an array of metric names, like 'carbon' or 'energy') that you want to aggregate and the type
of aggregation to perform, which can be time
, component
, or both
. The choice of aggregation method (e.g., sum, average) is often implicitly determined by the metric itself or can be influenced by the type of plugin used in the pipeline. Understanding aggregation is crucial for gaining a high-level overview of environmental impacts across different parts of your software system and over time. For detailed configuration options, see the Aggregation documentation.
With multi-observation and multi-component IMPs, it is very important to be able to aggregate values across time and across components. Impact Framework has a feature that enables this.
There are two types of aggregation - time aggregation and component aggregation.
Time aggregation takes each inputs
time series and aggregates the time series down to a single value. For example, if your time series has three carbon values, [1, 2, 3]
than your aggregate value is 1 + 2 + 3 = 6
.
Component aggregation takes all the time series from all the different components at a particular level in the tree and aggregates them into a single time series. For example, if you have two components with carbon values [1, 2, 3]
and [4, 5, 6]
, your aggregate is an array of values [1+4, 2+5, 3+6] = [5, 7, 9]
.
The method used to aggregate is configurable, so that you do not inappropriately sum values that should not be summed, for example constants and proportions. We cover this in the breakout exercise 7 (below).
Toggling aggregation in Impact Framework is as easy as adding three lines of config to the IMP’s context, as follows:
aggregation:
metrics:
- carbon
- sci
type: both
Any metrics you want to aggregate are defined in metrics
. You can choose to do time aggregation, component aggregation or both.
The result is a new yaml field, called aggregated
for each component, with the time-aggregated values for each component:
aggregated:
carbon: 2418.125783086251
sci: 0.12854046793484886
Where you have a parent component composed of multiple children, aggregate
adds a new ouputs
array to the parent, so that every component in the tree has an outputs
array, and it is always the aggregated time series of all of its children.