Preliminary support for tracing workflow activity #97

Merged
mxgrey merged 23 commits into main from trace
Jul 17, 2025

Contributor

@mxgrey mxgrey commented Jul 10, 2025

This PR introduces basic support for tracing the activity of operations in a workflow. I call it preliminary because there are a few details that could be improved upon:

  • OperationRef is being used to uniquely identify the operations mentioned in the trace, but this might not be the most ergonomic representation for external consumers such as UIs. This is worth discussing further.
  • This does not introduce tracing for the Listen operation, although any operation downstream of Listen will be traced.
  • This does not provide an ergonomic API for tracing workflows built with the native API (traces are only added automatically via the diagram builder), although there is nothing to stop a user from adding Trace components manually.
  • This does not provide any indication of how long an operation is active after it has started; it only triggers an event each time the operation starts.

In the interest of unblocking progress on the tracing capabilities of the workflow editor, I think the above issues can be left for follow-up work. The capabilities introduced in this PR should be enough to move forward with a proof of concept in the workflow editor.

Note that, related to this comment, I've changed the name: fields for node and section builders to default_display_text:, and I've added display text override options for all diagram operations.

To get a sense of how to read the traces, I recommend looking at the record_traces system in the trace.rs test module.

@mxgrey mxgrey requested a review from koonpeng July 10, 2025 14:57
@mxgrey mxgrey added this to PMC Board Jul 10, 2025
@github-project-automation github-project-automation bot moved this to Inbox in PMC Board Jul 10, 2025
@mxgrey mxgrey moved this from Inbox to In Review in PMC Board Jul 10, 2025
mxgrey added 4 commits July 10, 2025 22:58
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
@@ -3,6 +3,10 @@
"title": "Diagram",
"type": "object",
"properties": {
"default_trace": {
"description": "Whether the operations in the workflow should be traced by default.\n Being traced means each operation will emit an event each time it is\n triggered. You can decide whether that event contains the serialized\n message data that triggered the operation.\n\n If bevy_impulse is not compiled with the \"trace\" feature then any attempt\n to turn tracing on will result in a [`DiagramErrorCode::TraceFeatureDisabled`].",
Collaborator

If an app is built without the trace feature, the schema diverges, since true is no longer a valid value. I think it would be better to do one of the following:

  • Always have tracing available.
  • If trace stays an optional feature, print a warning instead of throwing an error.

I think we should also add a field to DiagramElementRegistry to indicate if tracing is supported.

Contributor Author

I don't want to make tracing always required since it does add slight overhead to some potentially very hot code paths.

I wasn't sure whether to go with an error or a warning. I decided to go with error in case the user's workflow strongly depends on tracing being available, but you make a good point that the schema becomes invalid if we do it that way.

I like the idea of specifying the support in the DiagramElementRegistry. I think at that point we can just silently ignore if the user asks for tracing when we don't support it, because we've already communicated that it won't be available. At that point it's up to the front-end to make sure the user knows this.

I've made the relevant changes in this commit: 725be7c
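
The behavior agreed on in this thread can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `Registry`, `TraceRequest`, the `trace_supported` field, and `effective_trace` are hypothetical names standing in for DiagramElementRegistry and whatever the diagram schema actually uses. The registry advertises whether tracing is available, and a request for tracing is quietly downgraded to "off" when it is not.

```rust
/// Hypothetical trace settings a diagram might request.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TraceRequest {
    Off,
    On,
    /// Trace and also include the serialized message data.
    OnWithMessages,
}

/// Hypothetical, heavily reduced stand-in for DiagramElementRegistry.
/// Only the "is tracing supported" flag discussed above is modeled.
struct Registry {
    trace_supported: bool,
}

impl Registry {
    /// Resolve what the executor should actually do with a trace request.
    /// When tracing is unsupported the request is silently ignored
    /// instead of producing an error.
    fn effective_trace(&self, requested: TraceRequest) -> TraceRequest {
        if self.trace_supported {
            requested
        } else {
            TraceRequest::Off
        }
    }
}

fn main() {
    let with_trace = Registry { trace_supported: true };
    let without_trace = Registry { trace_supported: false };

    // Supported: the request passes through unchanged.
    assert_eq!(
        with_trace.effective_trace(TraceRequest::OnWithMessages),
        TraceRequest::OnWithMessages
    );
    // Unsupported: the same diagram still loads, tracing is just off.
    assert_eq!(
        without_trace.effective_trace(TraceRequest::On),
        TraceRequest::Off
    );
}
```

With this shape the schema stays the same regardless of how the crate was compiled, and a front-end can read the flag up front to tell the user that tracing is unavailable.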

Comment on lines 379 to 381
builder: Some(Arc::clone(builder)),
config: Some(Arc::clone(config)),
display_text: Some(Arc::clone(display_text)),
Collaborator

@koonpeng koonpeng Jul 14, 2025

I'm not well versed in the best practices for Arc, but I think a constructor function should generally take the value and leave the clone to the caller.

Contributor Author

It's highly situational as there are advantages and disadvantages both ways. For example, if you always take by Arc<T> but then don't actually capture the value due to conditional logic, then you've forced the user to do a clone when it wasn't needed.

In this case, these methods are being called in many places, so if we took by Arc<T> then we're just needlessly forcing ourselves to put .clone() or Arc::clone(_) at all of those call sites. We also know in all of those cases that a clone will be necessary because none of those call sites can hand over ownership of the Arc<T> to this function. The only advantage to making it an Arc<T> argument is that if the call site can hand over ownership, then you prevent some unnecessary reference counting, but we know this is never the case for these methods, so we get no benefit from it.

Given the lack of any benefits for using Arc<T> arguments, I'd like to stick with &Arc<T>.

Collaborator

I think of it another way. No branch in this constructor function avoids the clone, so taking &Arc has no advantage: even if the caller can give up ownership, the value will still be cloned (though the compiler can probably do some optimization). I also don't think we can assume that no caller of this function will ever be able to give up ownership. This is a pub function, so that extends to all potential downstream crates. The only disadvantage I see is that the current callers have to do an explicit clone, which imo is the idiomatic Rust pattern anyway.
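
The trade-off debated in this thread can be seen in a small standalone sketch. `TraceInfo`, `from_ref`, and `from_owned` are hypothetical names chosen for illustration and are not the constructor under review. The sketch only uses `std::sync::Arc` and `Arc::strong_count` to show when the reference count changes under each signature.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for the struct built by the constructor under review.
struct TraceInfo {
    display_text: Option<Arc<str>>,
}

impl TraceInfo {
    /// Borrowing signature: the clone always happens inside the constructor.
    fn from_ref(display_text: &Arc<str>) -> Self {
        Self { display_text: Some(Arc::clone(display_text)) }
    }

    /// Owning signature: the caller chooses between cloning and moving.
    fn from_owned(display_text: Arc<str>) -> Self {
        Self { display_text: Some(display_text) }
    }
}

fn main() {
    let text: Arc<str> = Arc::from("my node");

    // &Arc<T>: the call site stays short, the count goes up by one.
    let a = TraceInfo::from_ref(&text);
    assert_eq!(Arc::strong_count(&text), 2);

    // Arc<T> with an explicit clone at the call site: same cost, more typing.
    let b = TraceInfo::from_owned(Arc::clone(&text));
    assert_eq!(Arc::strong_count(&text), 3);

    // Arc<T> with a move: the caller gives up its handle, so the count
    // does not change. This is the only case where Arc<T> saves work.
    let c = TraceInfo::from_owned(text);
    let kept = c.display_text.as_ref().unwrap();
    assert_eq!(Arc::strong_count(kept), 3);

    drop((a, b));
    assert_eq!(Arc::strong_count(kept), 1);
}
```

Both positions in the thread are visible here: if every call site must keep its own handle, the owning signature only moves the clone to the caller, while a caller that can hand over ownership avoids one increment and decrement of the reference count.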

Contributor Author

mxgrey commented Jul 17, 2025

> Taking &Arc has no advantage

If we go this route then we'll end up passing in Arc<JsonMessage> so then this concern would be gone.

mxgrey added 3 commits July 17, 2025 03:39
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
@mxgrey mxgrey merged commit f769a61 into main Jul 17, 2025
5 checks passed
@mxgrey mxgrey deleted the trace branch July 17, 2025 13:44
@github-project-automation github-project-automation bot moved this from In Review to Done in PMC Board Jul 17, 2025