Context Processor

Status
Stability: alpha (traces, metrics, logs)
Distributions: contrib
Warnings: Identity Conflict
Code Owners: @jriguera

Description

The context processor modifies the context metadata associated with spans, logs, and metrics. Please refer to config.go for the config spec.

Typical use cases:

  • Dynamically define tenants for Mimir/Cortex.
  • Dynamically set metadata attributes in the context, providing a way to pass resource attributes on to extensions.
  • Modify metadata generated by the receivers.

Configuration

The processor takes a list of actions, which are performed in the order specified in the config. The supported actions are:

  • insert: Inserts a new attribute in input data where the key does not already exist.
  • update: Updates an attribute in input data where the key does exist.
  • upsert: Performs insert or update. Inserts a new attribute in input data where the key does not already exist and updates an attribute in input data where the key does exist.
  • delete: Deletes an attribute from the input data.

For the actions insert, update and upsert:

  • key is required
  • value and/or from_attribute are required
  • action is required
# Key specifies the attribute to act upon.
- key: <key>
  action: {insert, update, upsert}
  # Value specifies the value to populate for the key.
  value: <value>

# Key specifies the attribute to act upon.
- key: <key>
  action: {insert, update, upsert}
  # FromAttribute specifies the attribute from the context metadata to use to populate
  # the value. If the attribute doesn't exist, value is used.
  from_attribute: <other key>
  value: <value>

For the delete action:

  • key is required
  • action: delete is required
# Key specifies the attribute to act upon.
- key: <key>
  action: delete

The list of actions can be composed to create rich scenarios, such as backfilling attributes, copying values to a new key, or redacting sensitive information. The following is a sample configuration:

processors:
  context/example:
    actions:
    - action: upsert
      key: tenant
      value: anonymous
      from_attribute: service.name
    - action: delete
      key: tenant

Usage

It is highly recommended to use this processor together with the groupbyattrs processor; optionally, the batch processor can be added as well. This is an example configuration:

extensions:
  headers_setter:
    headers:
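      # Read x-scope-orgid from the context metadata of each request and set it
      # as the X-Scope-OrgID header on outgoing requests of the exporters that
      # use this authenticator.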
      - action: update
        key: X-Scope-OrgID
        from_context: x-scope-orgid

receivers:
  otlp:
    protocols:
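      # include_metadata propagates the client's request headers (e.g. x-scope-orgid)
      # into the context metadata, so processors and extensions can read them.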
      grpc:
        include_metadata: true
      http:
        include_metadata: true


processors:
  batch/tenant:
    send_batch_size: 1000
    send_batch_max_size: 2000
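    # Batch per distinct x-scope-orgid metadata value, so each outgoing batch
    # contains data for a single tenant only.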
    metadata_keys:
    - x-scope-orgid
    timeout: 1s

  groupbyattrs/tenant:
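    # Regroup telemetry by the tenant attribute, moving it to resource level so
    # all data belonging to one tenant shares the same resource group.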
    keys: [tenant]

  context/tenant:
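    # Populate the x-scope-orgid metadata key that headers_setter forwards; per
    # the config spec above, value ("anonymous") is used when from_attribute
    # cannot be resolved.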
    actions:
    - action: upsert
      key: x-scope-orgid
      value: anonymous
      from_attribute: tenant

  groupbyattrs:
    # Create groups where all telemetry in each group belongs to the same tenant.
    # Useful when attributes are at the datapoint level and must be moved to resource level.
    # This is an example! Here, all metrics sharing the datapoint attribute source_id will
    # belong to the same tenant.
    keys: [source_id]

  # In this example, each tenant has its own namespace. Data can come from different clusters!
  transform/tenant:
    error_mode: ignore
    metric_statements:
      - context: resource
        statements:
          - set(resource.cache["tenant"], "anonymous")
          - set(resource.cache["tenant"], resource.attributes["k8s.namespace.name"])
          - set(resource.attributes["tenant"], resource.cache["tenant"]) where resource.attributes["tenant"] == ""
    log_statements:
      - context: resource
        statements:
          - set(resource.cache["tenant"], "anonymous")
          - set(resource.cache["tenant"], resource.attributes["k8s.namespace.name"])
          - set(resource.attributes["tenant"], resource.cache["tenant"]) where resource.attributes["tenant"] == ""
    trace_statements:
      - context: resource
        statements:
          - set(resource.cache["tenant"], "anonymous")
          - set(resource.cache["tenant"], resource.attributes["k8s.namespace.name"])
          - set(resource.attributes["tenant"], resource.cache["tenant"]) where resource.attributes["tenant"] == ""

exporters:
  prometheusremotewrite/mimir:
    endpoint: "http://mimir-gateway/api/v1/push"
    resource_to_telemetry_conversion:
      enabled: true
    auth:
      authenticator: headers_setter

  otlphttp/loki:
    endpoint: "http://loki-gateway/loki/otlp/v1/logs"
    tls:
      insecure: true
    auth:
      authenticator: headers_setter

  otlp/tempo:
    endpoint: "dns:///tempo-distributor-discovery.ns.svc.cluster.local:4317"
    compression: "gzip"
    tls:
      insecure: true
    auth:
      authenticator: headers_setter


service:
  extensions:
  - headers_setter
  pipelines:
    metrics:
      receivers: [otlp]
      # groupbyattrs may not be needed if the attribute used to map the tenant is already at resource level!
      # If the attribute is at a lower level (e.g. spans, datapoints, log lines), it is mandatory in order
      # to create groups with the attribute at resource level. For example, if the data comes from the
      # filelog receiver and each file belongs to its own tenant, groupbyattrs is not needed.
      processors: [groupbyattrs, transform/tenant, groupbyattrs/tenant, context/tenant, batch/tenant]
      exporters: [prometheusremotewrite/mimir]
    logs:
      receivers: [otlp]
      # See the comment on the metrics pipeline: groupbyattrs is only needed when the tenant
      # attribute lives below resource level.
      processors: [groupbyattrs, transform/tenant, groupbyattrs/tenant, context/tenant, batch/tenant]
      exporters: [otlphttp/loki]
    traces:
      receivers: [otlp]
      # See the comment on the metrics pipeline: groupbyattrs is only needed when the tenant
      # attribute lives below resource level.
      processors: [groupbyattrs, transform/tenant, groupbyattrs/tenant, context/tenant, batch/tenant]
      exporters: [otlp/tempo]

This processor can be combined with the k8sattributes processor to assign tenants based on annotations set on pods or namespaces. The pipeline would be:

    processors: [k8sattributes, filter/by-annotations, transform/get-tenant-from-annotations, groupbyattrs/tenant, context/tenant]
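
One way the hypothetical filter and transform steps above could be realized is to let k8sattributes extract the annotation directly as a tenant attribute. A minimal sketch, assuming an illustrative annotation key example.com/tenant on the namespace (the annotation key and processor names are not part of this repository):

processors:
  k8sattributes:
    extract:
      annotations:
      # Illustrative annotation key; adjust to your cluster's convention.
      - key: example.com/tenant
        tag_name: tenant
        from: namespace

  groupbyattrs/tenant:
    keys: [tenant]

  context/tenant:
    actions:
    - action: upsert
      key: x-scope-orgid
      value: anonymous
      from_attribute: tenant

With the annotation extracted straight into the tenant attribute, a dedicated transform step may become unnecessary.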

Warnings

In general, the context processor is very safe to use, but depending on the attribute used for the tenant and on the receiver, it can cause a lot of fragmentation, which can hurt performance when sending data to the next system. The recommendation is to use it together with the Group by Attributes processor and the Batch processor, as shown above.