Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove provider configuration from HttpMetricsLayerBuilder. #150

Merged
merged 3 commits into from
Jan 7, 2025

Conversation

fourfs
Copy link
Contributor

@fourfs fourfs commented Jan 6, 2025

To preface the below proposal, thank you for this library! It already helps immensely to instrument axum servers.

This PR stems from a fork I had to create to reduce responsibility of the library and make it suitable for a personal project.

As-is, the HttpMetricsLayerBuilder::new().build() function implicitly sets the global meter provider:

global::set_meter_provider(provider.clone());

This feels vastly out of scope of a middleware and complicates a number of concerns:

  • Most of the HttpMetricsLayerBuilder API has to do with indirectly configuring the provider, instead of setting up the metrics layer.
  • The signature of HttpMetricsLayerBuilder has to account for generics to support custom opentelemetry_sdk::metrics::reader::MetricReader.
  • Most importantly, applications using this library are forced to either configure their global meter provider through this creates' builder API, or work around it by explicitly initializing this library before initializing the global provider.

I suspect this design choice stems from convenience for minimal applications and probably supports a majority of dependents.
However, I can't help but find this blurry separation of concerns detrimental to wider adoption.

As per opentelemetry crate Global Metrics API:

Application owners have the responsibility to set the global meter provider.
[...]
3rd party middleware or library code can be written against this generic API and not constrain users to a specific implementation choice.

This library seems to fall squarely within the intended usage of the global API.
The added value of ease of use for users could be achieved through documentation, rather than implicit configuration within the library. WDYT?

@ttys3
Copy link
Owner

ttys3 commented Jan 6, 2025

thanks, the changes LGTM

However, as you can see, the build part of the global provider actually contains quite a lot of code. In other words, this crate also serves partly to simplify the initialization of the global provider in the app. I wonder if this part could perhaps be separated into a helper crate?

@fourfs
Copy link
Contributor Author

fourfs commented Jan 6, 2025

I'm not sure how much the provider specific init logic belongs in any particular crate per se.
There isn't that much to abstract over the opentelemetry_sdk::metrics::SdkMeterProvider::builder() without loosing potentially important configuration details and a "sane defaults" minimal configuration seems to be more in the realm of good docs / examples than library code.

That being said, those examples could be provided as part of the crate, perhaps in an examples/ directory. I'm thinking 2 minimal hello world servers, one for prom and one for otlp. Do you think that makes sense?

Skimming through the current impl, the two main pieces of extra logic beyond using otel API directly are:

  1. parsing some env vars/builder opts to provide a resource definition:
    let mut resource = vec![];
    let ns = env::var("INSTANCE_NAMESPACE").unwrap_or_default();
    if !ns.is_empty() {
    resource.push(KeyValue::new(SERVICE_NAMESPACE, ns.clone()));
    }
    let instance_ip = env::var("INSTANCE_IP").unwrap_or_default();
    if !instance_ip.is_empty() {
    resource.push(KeyValue::new(SERVICE_INSTANCE, instance_ip));
    }
    if let Some(service_name) = self.service_name.clone() {
    // `foo.ns`
    if !ns.is_empty() && !service_name.starts_with(format!("{}.", &ns).as_str()) {
    resource.push(KeyValue::new(SERVICE_NAME, format!("{}.{}", service_name, &ns)));
    } else {
    resource.push(KeyValue::new(SERVICE_NAME, service_name));
    }
    }
    if let Some(service_version) = self.service_version.clone() {
    resource.push(KeyValue::new(SERVICE_VERSION, service_version));
    }
    let res = Resource::from_detectors(
    Duration::from_secs(6),
    vec![
    // set service.name from env OTEL_SERVICE_NAME > env OTEL_RESOURCE_ATTRIBUTES > option_env! CARGO_BIN_NAME > unknown_service
    Box::new(SdkProvidedResourceDetector),
    // detect res from env OTEL_RESOURCE_ATTRIBUTES (resources string like key1=value1,key2=value2,...)
    Box::new(EnvResourceDetector::new()),
    // set telemetry.sdk.{name, language, version}
    Box::new(TelemetryResourceDetector),
    ],
    );
    let res = if !resource.is_empty() {
    res.merge(&mut Resource::new(resource))
    } else {
    res
    };
    let res = res.merge(&mut Resource::new(self.target_labels.clone()));
    let mut builder = SdkMeterProvider::builder().with_resource(res);
  2. parsing some more env vars to select the exporter protocol:
    /// init otlp metrics exporter
    /// read from env var:
    /// OTEL_EXPORTER_OTLP_METRICS_ENDPOINT, OTEL_EXPORTER_OTLP_METRICS_HEADERS,OTEL_EXPORTER_OTLP_METRICS_TIMEOUT
    /// ref https://github.com/tokio-rs/tracing-opentelemetry/blob/5e3354ec24debcfbf856bfd1eb7022459dca1e6a/examples/opentelemetry-otlp.rs#L32
    fn build_otlp(&self) -> PeriodicReader {
    let protocol = match env::var("OTEL_EXPORTER_OTLP_METRICS_PROTOCOL")
    .ok()
    .or(env::var("OTEL_EXPORTER_OTLP_PROTOCOL").ok())
    {
    Some(val) => val,
    None => "http/protobuf".to_string(),
    };
    let exporter = if protocol.starts_with("http") {
    opentelemetry_otlp::MetricExporter::builder()
    .with_http()
    .with_temporality(Temporality::default())
    .build()
    .unwrap()
    } else {
    opentelemetry_otlp::MetricExporter::builder()
    .with_tonic()
    .with_temporality(Temporality::default())
    .build()
    .unwrap()
    };
    PeriodicReader::builder(exporter, opentelemetry_sdk::runtime::Tokio)
    .with_interval(std::time::Duration::from_secs(30))
    .build()
    }

    From the above, I think setting up the provider in 2. has ample explanation between opentelemetry-rust crate examples and respective exporter implementations. There is also a lot of detail there, I wouldn't go beyond a minimal example as to not confuse anyone.
    For 1. we could point to https://opentelemetry.io/docs/specs/semconv/resource/#service and fill out the example appropriately.

@ttys3
Copy link
Owner

ttys3 commented Jan 7, 2025

Using examples is certainly good. However, as a document, it might be better because what everyone sees at first glance is likely the document, rather than searching for examples.

I think we can merge this first, and the document can be submitted as another PR.

@ttys3 ttys3 merged commit 5fa718c into ttys3:main Jan 7, 2025
1 check passed
@ttys3
Copy link
Owner

ttys3 commented Jan 7, 2025

I've creates another issue #153

PR is welcome if you have time @fourfs

@fourfs
Copy link
Contributor Author

fourfs commented Jan 8, 2025

I've creates another issue #153

PR is welcome if you have time @fourfs

Awesome! I'll take a jab at it over the weekend most likely, maybe earlier if I find extra time

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants