Associate multiple values with the same tag key #1432
You can get this benefit by using multiple counters.
Is there any reason this wouldn't work? (Unless you are trying to tie this metric to some sort of requests total and this action would inflate that count, in which case I would recommend creating a separate counter to track that.)
For a single key that might be a solution. However, imagine a case where you add different tags over the lifetime of a request and the metric is reported only once at the end, e.g. some service_1 called for metric_value_1 and some other service_2 called for metric_value_2. In this case we end up with multiple tag keys that each have multiple values, which means a combinatorial number of metrics to report instead of only one. Another example where we cannot use the suggested approach is:
That is how tags work if you keep making unique combinations of keys/values. There is one time series for each unique metric name/type and tag key/value. Is the difference that you're expecting previous tag key/values to not be reported if they haven't been recorded recently? Otherwise I'm not understanding the issue here. You can already make metrics with the same tag name and different tag values using the current API. Could you illustrate the issue with a code sample to make it clearer?
Hello, I think the use case we are facing is pretty simple, no need for a code sample. We would like to be able to report a metric with the following tags: key:value_1, key:value_2. This is currently possible with the DataDog client, where you don't need to provide a key/value pair at all; but if the string value you provide includes a : (colon), the left-hand side of the string is treated as the key and the right-hand side as the value. Currently this is not supported because the dedup method of the Tags class prevents it. I have tried sending metrics with this use case and only one of the tags is sent; I even captured network packets to make sure the problem is not on the DataDog monitoring side.
Thank you for helping me understand. I didn't realize we were always deduplicating tags by key name. I was able to reproduce the issue now.
Hi @shakuzen, has a fix for this use case been implemented? We also have multiple tags with different values for the same key.
I am also still curious if we have a solution for this. |
Any updates on this? |
I'd also like to upvote this. We have a number of use cases where our data model is multi-valued, and it would be nice to represent each value as a tag. If we understand the way tags are persisted (i.e. split off into separate, independent time series), this should be feasible, except for the deduplication on input.
Sorry for the lack of update on this. While I understand the technical limitation we have right now in Micrometer and that Datadog allows such a tagging model, I'm not sure we can easily justify supporting this for Datadog when it isn't portable to other backends and we would have to deduplicate for reporting to all other backends. Just so everyone in this issue is clear, you can already record metrics that have different possible values for the same tag. For example:
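The code sample that followed appears to have been lost in formatting; based on the surrounding text it presumably showed counters recorded with the same tag key and different values, along these lines (metric and tag names are illustrative):

```java
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.simple.SimpleMeterRegistry;

public class SameKeyDifferentValues {
    public static void main(String[] args) {
        MeterRegistry registry = new SimpleMeterRegistry();

        // Each unique key/value combination is its own time series;
        // recording different values for the same key already works today.
        registry.counter("http.requests", "method", "GET").increment();
        registry.counter("http.requests", "method", "GET").increment();
        registry.counter("http.requests", "method", "POST").increment();

        System.out.println(registry.counter("http.requests", "method", "GET").count());  // prints 2.0
        System.out.println(registry.counter("http.requests", "method", "POST").count()); // prints 1.0
    }
}
```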
From this you can derive the total number of requests by summing across the method tag values.
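The illustration of the multi-valued proposal itself was also lost; judging from the interpretation given in the next sentence (3 GET, 2 POST, 4 with both), it was presumably along these lines (metric name is illustrative):

```java
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.simple.SimpleMeterRegistry;

public class MultiValueTagProposal {
    public static void main(String[] args) {
        MeterRegistry registry = new SimpleMeterRegistry();

        registry.counter("http.requests", "method", "GET").increment(3);
        registry.counter("http.requests", "method", "POST").increment(2);
        // Hypothetical: the duplicate "method" key below is what this issue
        // asks for; with the current API, dedup collapses it to one value.
        registry.counter("http.requests", "method", "GET", "method", "POST").increment(4);
    }
}
```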
While not a realistic example, I suppose this would be interpreted as 3 GET requests, 2 POST requests, and 4 requests where the method was both POST and GET. I am surprised there are this many people with use cases for this. Again, the lack of portability and the breaking change required make this a significant hurdle to get over. We would have to allow this in the common API, which affects all users.
Our use case is that we use cumulative distribution bins for latency metrics, so e.g. a request that takes 34 msec would carry a tag for every bucket at or above its latency. This allows for an easier setup of monitors in Datadog, whereby unacceptable requests can be queried with a simple tag filter.
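The tag list was lost in formatting; for cumulative bins it would conventionally look something like this, where the 34 ms request gets every bucket tag at or above its latency (the key name and boundaries here are made up):

```
latency:under_50ms
latency:under_100ms
latency:under_250ms
latency:under_1s
```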
@datibbaw The use case you describe sounds similar to what we do with histogram buckets. In registries that support percentile histograms (Atlas, Prometheus, Wavefront) we ship those when the configuration is enabled. If you are using a registry that doesn't support that, you can still set SLO boundaries to publish a count per bucket.
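A sketch of what shipping SLO buckets from the app might look like (boundaries are illustrative; in older Micrometer versions this builder method was called `sla` rather than `serviceLevelObjectives`):

```java
import java.time.Duration;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import io.micrometer.core.instrument.simple.SimpleMeterRegistry;

public class SloExample {
    public static void main(String[] args) {
        MeterRegistry registry = new SimpleMeterRegistry();

        // Publishes a cumulative count per SLO boundary, so "unacceptable"
        // requests can be queried without multi-valued tags.
        Timer timer = Timer.builder("http.request.latency")
                .serviceLevelObjectives(Duration.ofMillis(50), Duration.ofMillis(250), Duration.ofSeconds(1))
                .register(registry);

        timer.record(Duration.ofMillis(34));
    }
}
```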
Hi, did you try this thread?
@shakuzen Sorry for the tardy reply on this. Would it be accurate to say that your suggestion is effectively to define the SLO within the app itself? I'm not sure how heavy it would be to create a gauge for each bucket, but it sounds rather impactful. The good part of having range tags done the way I'm proposing (albeit working only for Datadog, afaict) is that metric production and SLO definition can be kept separate, and we're not burning through the number of metrics we would have to keep around for each SLO. Maybe I didn't understand you correctly, and I'm not intimately familiar with Atlas, Prometheus, etc. (but we're not likely to move away from Datadog either); if I'm mistaken in my interpretation, please correct me!
Here is our use case (DataDog): we are monitoring metrics that track how many times we update records in a third-party system. We are interested in the success/failure rate of these updates at both the object level and the field level. An important caveat is that the third-party system supports both full object updates and partial updates. We want to include tags on our metrics for both the object name and the individual field names that were updated.
@shakuzen Would it be possible to make the deduplication configurable, or abstract it out to the backend implementation in use? The lack of support for this tagging model seems like a limitation of the other backends, not vice versa, so I would think those backend implementations should be responsible for the deduplication if they require it.
@patrickbadley Does this help to answer your question? #1432 (comment)
I think I see this differently (by the way, is there another backend that does this?); to me, not being able to do this helps keep things simple. In your case, instead of listing the field names, I think you can track a flag for each of them to see which was updated and which was not, e.g.:
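The first of the two compared snippets was lost in formatting; from the surrounding text it was presumably the multi-valued form this issue requests, roughly like this (metric, object, and field names are illustrative, and the duplicate key is not valid with the current API semantics):

```java
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.simple.SimpleMeterRegistry;

public class FieldListTags {
    public static void main(String[] args) {
        MeterRegistry registry = new SimpleMeterRegistry();

        // Hypothetical: repeating the "field" key once per updated field.
        // With the current API the duplicate key would be deduplicated.
        registry.counter("record.update", "object", "Account",
                "field", "name", "field", "email").increment();
    }
}
```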
You can (should?) do this:
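The snippet that followed was lost in formatting; from the description ("you can see what was updated as well as what was not") it was presumably the per-field flag form, roughly (names are illustrative):

```java
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.simple.SimpleMeterRegistry;

public class FieldFlagTags {
    public static void main(String[] args) {
        MeterRegistry registry = new SimpleMeterRegistry();

        // One tag key per field, with an updated/not_updated value:
        // explicit, and valid with every backend.
        registry.counter("record.update", "object", "Account",
                "name", "updated", "email", "updated", "phone", "not_updated")
                .increment();
    }
}
```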
To me the second one seems more explicit (you can see what was updated as well as what was not), and it seems easier to read, aggregate, and reason about.
@jonatan-ivanov Thank you for the prompt response. I had read #1432 (comment) before posting my comment, but felt my post might identify a real-world use case for this feature. Before responding to your comments: there have been multiple comments and reactions on this request indicating that multiple users would like this functionality. Would it be difficult to add a simple configuration option letting users opt out of the dedup() call for tags? I'm not familiar enough with the Micrometer code to propose a PR, but if we could set an env var or some other config value and have the Tags class do something like this:
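The snippet was lost in formatting; presumably it guarded the dedup behind a flag. A stdlib-only simulation of the idea (the `SKIP_DEDUP` flag, its property name, and the pair-list model of tags are hypothetical, not real Micrometer API):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DedupSketch {
    // Hypothetical flag, e.g. driven by a system property or env var.
    static final boolean SKIP_DEDUP = Boolean.getBoolean("micrometer.tags.skipDedup");

    // Tags modeled as key/value pairs; dedup keeps the last value per key.
    static List<Map.Entry<String, String>> dedup(List<Map.Entry<String, String>> tags) {
        if (SKIP_DEDUP) {
            return tags; // opt-out: keep duplicate keys, as proposed
        }
        Map<String, String> byKey = new LinkedHashMap<>();
        for (Map.Entry<String, String> t : tags) {
            byKey.put(t.getKey(), t.getValue());
        }
        return new ArrayList<>(byKey.entrySet());
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> tags = new ArrayList<>();
        tags.add(Map.entry("key", "value_1"));
        tags.add(Map.entry("key", "value_2"));
        System.out.println(dedup(tags).size()); // prints 1 (dedup on by default)
    }
}
```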
that would be a low-cost solution to our problem. A more robust solution might use the strategy pattern, or move the dedup call somewhere else in the logic so the backend can determine whether the tags need to be deduplicated.

Regarding your other comments: I appreciate the potential workaround, and we may end up implementing something along the lines of your suggestion, barring support from Micrometer for multiple values for the same tag, which we would still prefer. Without getting too far into the weeds of our implementation, we wouldn't be able to easily know which fields are NOT getting synced, so we would only be publishing the updated fields.

I do not know if another backend allows multiple values for a single tag like Datadog does; I have only worked with DataDog. I don't agree with your comment that not supporting this "makes things simpler". To take the argument to an extreme (at the risk of sounding like a jerk 😬), it would be "simpler" to not have a tag feature at all, but that wouldn't make the service better. Thanks again for your thoughtful response.
I think so, mainly for two reasons:
If you "just" remove dedup (please feel free to try it out), you will have two `test.counter` meters after these two calls:

```java
registry.counter("test.counter", "a", "0", "a", "1").increment();
registry.counter("test.counter", "a", "1", "a", "0").increment();
```

But you really should have just one, since the two tag sets differ only in ordering.
I think we have the same amount of cardinality with both approaches: we did not reduce or increase the amount of data (time series), we just presented it in a different format.
This is not as extreme as you might think, since backends did not have tags in the past (Graphite, Ganglia, JMX, and Etsy's StatsD still don't). Since you still need multiple time series, what these "hierarchical" backends did was add the "tags" (dimensions) to the name, e.g.:
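The example was lost in formatting; hierarchical backends typically encode the dimensions into the metric name in one of these shapes (names illustrative):

```
http.requests.GET          (value only)
http.requests.method.GET   (key and value)
```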
Handling this is not simple: sometimes only the values were available, without the keys. Let me talk to @shakuzen about this on Monday. I still feel this is asking for trouble, but at least I can provide a hack for you:
While considering hacks, I would also like to point out that I have a suggested patch that would accomplish this idea very specifically for supplying multiple tags with the same name; you can find the implementation at #2764. Note that my approach wouldn't avoid dedup at all; it just changes the rules by which tags are compared, and this could theoretically be done differently for just one backend. In fact, it only works in one backend, so it already falls into the category of advanced usage.
@datibbaw I'm not considering the hack I mentioned above; I provided it for @patrickbadleyupstart, who can consider using it if he wants to. I guess you both read this comment from Tommy:
Because of this and the things I mentioned in my previous comment, I would consider that solution a "hack", but I think it's less risky than a dedup on/off feature. We will discuss this on Monday and get back to you.
@jonatan-ivanov I was wondering if you were able to discuss #2764 with @shakuzen and had any update on this issue? Thank you. |
Sorry for the late reply. Right now we are not planning to do this, since one of the main ideas behind Micrometer is the ability to switch between metrics backends with minimal pain. This would encourage users to add lots of Datadog-specific code to their codebases, which would make switching significantly harder and the code less portable. It could also invite misuse, which can lead to other issues.
Hi @jonatan-ivanov, I understand that you are trying to stick to a core value of your project, but I don't think this change has all of the negative consequences you mention.
I hope you change your mind and add this simple change so that my team, and others, can leverage the full feature set provided by Datadog without negatively impacting any other users of your library. |
Hello,
We are currently using DataDog for reporting our metrics and wanted to replace it with Micrometer and StatsD using the DATADOG flavor. However, we found that a use case currently working with DataDog is not supported in Micrometer.
DataDog allows tags to be either a single value or in key:value format in the reports; see the documentation.
This also allows us to report a metric with the same key repeating with different values.
e.g. metric_key:metric_value_1, metric_key:metric_value_2
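In the DogStatsD datagram format, tags travel after a `#` separator, so the repeated key would look like this on the wire (metric name and values are illustrative):

```
my.metric:1|c|#metric_key:metric_value_1,metric_key:metric_value_2
```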
I have come across issue #1170. In our case, we receive a list of values as input and want to report these values individually under the same tag key, which in turn allows us to use graphs like sum/avg by metric_key.
I am well aware that not all reporting APIs support this, but maybe the individual API integrations should take this use case into account where it is valid, as it is for DataDog.
Best regards,
Cuneyt