Skip to content

Fix dispatch dict passed to MergeStrategy not getting updated for future fetch's #174

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jul 29, 2025

Conversation

Marenz
Copy link
Contributor

@Marenz Marenz commented Jul 16, 2025

  • We are also switching to use the same now while evaluating other dispatches in the MergeStrategy, but this is just an enhancement and not a fix
  • We still used logging without "path" in some places, that was fixed

@Copilot Copilot AI review requested due to automatic review settings July 16, 2025 11:45
@Marenz Marenz requested a review from a team as a code owner July 16, 2025 11:45
@github-actions github-actions bot added part:tooling Affects the development tooling (CI, deployment, dependency management, etc.) part:dispatcher Affects the high-level dispatcher interface labels Jul 16, 2025
Copilot

This comment was marked as outdated.

Marenz added 2 commits July 16, 2025 17:37
Signed-off-by: Mathias L. Baumann <[email protected]>
Signed-off-by: Mathias L. Baumann <[email protected]>
@github-actions github-actions bot added the part:docs Affects the documentation label Jul 16, 2025
@llucax llucax requested a review from Copilot July 17, 2025 11:56
Copilot

This comment was marked as outdated.

Copy link
Contributor

@llucax llucax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As usual, the PR and commit title and description could use some improvement. It is hard to review when they say so little. Could you please please please put a bit more thought into those? Please 🙏 ? For the PR description just copying the copilot review summary is a big improvement.

For example in this PR it is mostly irrelevant that you are using started_at, that tells me almost nothing. What is important is you are fixing a bug caused by not having a consistent now used in different path of the code to decide on how to merge dispatches.

@Marenz Marenz changed the title Use started_at Fix dispatch dict passed to MergeStrategy not getting updated for future fetch's Jul 21, 2025
Requires v0.11.1 of `frequenz-client-dispatch` for the new `started_at` method.

Signed-off-by: Mathias L. Baumann <[email protected]>
@Marenz
Copy link
Contributor Author

Marenz commented Jul 21, 2025

commit title and description

tbh, I disagree. The commit titles and descriptions are all perfectly fine imo.

PR, yeah, okay, that grew a bit and needed updating.

I also don't think a required dependency update necessarily needs to be mentioned if it is so minor as here, and the fact that its part of that particular commit is in itself a documentation of its need, at least that was my perspective....

I don't really see how any of the other commit titles & content is lacking anything to make them understandable.

@Marenz
Copy link
Contributor Author

Marenz commented Jul 21, 2025

  • Updated to print all other active dispatch ids
  • Updated to mention the need of the version update in the according PR description.
  • Updated another commit description to be even more clear

Copilot

This comment was marked as outdated.

@llucax
Copy link
Contributor

llucax commented Jul 21, 2025

Without wanting to make a huge thing about this, I will just comment on this one, as I think it can bring some clarity:

I also don't think a required dependency update necessarily needs to be mentioned if it is so minor as here, and the fact that its part of that particular commit is in itself a documentation of its need, at least that was my perspective....

I've seen many times people that sneak an unrelated change into a commit just because it was there. And I don't even think that's obscenely wrong, I can live with that when there is intention. The problem is when it was a mistake, and you were working on something else that needed the bump, but this PR doesn't really need it, and that dependency bump can have real world impact.

So what it is important for me is declaring intention. If the bump was needed by the commit, that's good to know, and good to know why, even if it could be inferred from the diff, just as an extra verification. If the bump was just done because one is lazy about putting it in a separate commit, and the bump is harmless, that should also be explicitly stated, so I can actually do a review, which is about detecting issues early.

This is my main point. I want to be able to tell between mistakes and intended changes.

Copilot

This comment was marked as outdated.

Copy link
Contributor

@llucax llucax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just one more suggestion, removing auto-merge in case you want to change it.

@@ -366,8 +366,8 @@ async def _fetch(self, timer: Timer) -> None:
"""
self._initial_fetch_event.clear()

old_dispatches = self._dispatches
self._dispatches = {}
old_dispatches = set(self._dispatches.keys())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
old_dispatches = set(self._dispatches.keys())

Comment on lines 406 to 416
for dispatch_id in old_dispatches - new_dispatches.keys():
try:
dispatch = self._dispatches.pop(dispatch_id)
_logger.debug("Deleted dispatch: %s", dispatch)
await self._update_dispatch_schedule_and_notify(None, dispatch, timer)

# Set deleted only here as it influences the result of dispatch.started
# which is used in the func call above ^
dispatch._set_deleted() # pylint: disable=protected-access
await self._lifecycle_events_tx.send(Deleted(dispatch=dispatch))
except KeyError:
_logger.warning(
"Inconsistency in cache detected. Tried to delete non-existing dispatch %s",
dispatch_id,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is usually recommended to narrow the try/except to only the code you expect to fail, otherwise you could catch other failures that maybe are not OK to ignore. For example if some code in await self._lifecycle_events_tx.send(Deleted(dispatch=dispatch)) (suppose send() uses some internal cache and there is a bug that also make it raise a KeyError).

Also if old_dispatches == self._dispatches.keys(), maybe it makes more sense to use self._dispatches.keys() directly now, I see in the new code old_dispatches is only being used here. This would make the code more obvious and actually guarantee the pop() can't fail, old_dispatches and self._dispatches can't get out of sync.

So I suggest removing old_dispatches and just do:

Suggested change
for dispatch_id in old_dispatches - new_dispatches.keys():
try:
dispatch = self._dispatches.pop(dispatch_id)
_logger.debug("Deleted dispatch: %s", dispatch)
await self._update_dispatch_schedule_and_notify(None, dispatch, timer)
# Set deleted only here as it influences the result of dispatch.started
# which is used in the func call above ^
dispatch._set_deleted() # pylint: disable=protected-access
await self._lifecycle_events_tx.send(Deleted(dispatch=dispatch))
except KeyError:
_logger.warning(
"Inconsistency in cache detected. Tried to delete non-existing dispatch %s",
dispatch_id,
)
# We make a copy because we mutate self._dispatch.keys() inside the loop
for dispatch_id in frozenset(self._dispatches.keys() - new_dispatches.keys()):
dispatch = self._dispatches.pop(dispatch_id)
_logger.debug("Deleted dispatch: %s", dispatch)
await self._update_dispatch_schedule_and_notify(None, dispatch, timer)
# Set deleted only here as it influences the result of dispatch.started
# which is used in the func call above ^
dispatch._set_deleted() # pylint: disable=protected-access
await self._lifecycle_events_tx.send(Deleted(dispatch=dispatch))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if the copy is necessary, as we are using the - which I guess produces a copy in itself, but I'm not 100% sure. Also making it a frozenset guarantees our items being iterated will never change, so it makes everything more explicit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and actually guarantee the pop() can't fail,

I don't think that's actually true. It can still fail, the maps can be modified between any call to await.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, OK ,yeah, then maybe it is worth a comment in the try, as it might not be obvious to the casual reader.

@llucax llucax disabled auto-merge July 23, 2025 08:47
llucax
llucax previously approved these changes Jul 23, 2025
Copilot

This comment was marked as outdated.

@Marenz Marenz requested a review from Copilot July 28, 2025 09:50
Copilot

This comment was marked as outdated.

@Marenz Marenz requested a review from Copilot July 28, 2025 10:19
Copilot

This comment was marked as outdated.

@Marenz Marenz requested a review from Copilot July 28, 2025 13:09
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes a critical bug where the dispatch dictionary passed to MergeStrategy was not being updated for future fetches, causing merge strategies to operate on stale data. Additionally, it enhances consistency by using the same timestamp across dispatch evaluations and improves logging consistency.

  • Fixed dispatch dictionary not being updated after fetch operations
  • Enhanced merge strategy to use consistent timestamp for all dispatch evaluations
  • Improved logging consistency by using proper logger paths

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/frequenz/dispatch/_merge_strategies.py Added proper logger, enhanced filtering logic with consistent timestamp usage
src/frequenz/dispatch/_dispatch.py Added started_at method to accept explicit timestamp parameter
src/frequenz/dispatch/_bg_service.py Fixed dispatch dictionary update logic to preserve state between fetches
src/frequenz/dispatch/_actor_dispatcher.py Improved log messages for actor lifecycle events
pyproject.toml Updated dependency version for frequenz-client-dispatch
RELEASE_NOTES.md Updated release notes to reflect bug fixes

shsms
shsms previously approved these changes Jul 29, 2025
llucax
llucax previously approved these changes Jul 29, 2025
Copy link
Contributor

@llucax llucax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM too, one small optional comment about maybe adding a comment to the try for pop() as it might not be obvious.

By using a workflow that avoids reassigning the dispatches dict.

When `fetch()`, which runs periodically, would assign
a fresh map to `self._dispatches` the filter would
not see the latest dispatches, as it would still
reference the old map.

Signed-off-by: Mathias L. Baumann <[email protected]>
@Marenz Marenz dismissed stale reviews from llucax and shsms via 7762602 July 29, 2025 08:07
@Marenz Marenz enabled auto-merge July 29, 2025 08:07
@Marenz Marenz disabled auto-merge July 29, 2025 08:13
@Marenz Marenz merged commit 49ebc13 into frequenz-floss:v0.x.x Jul 29, 2025
5 checks passed
@Marenz Marenz deleted the new_dispatchy branch July 29, 2025 08:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
part:dispatcher Affects the high-level dispatcher interface part:docs Affects the documentation part:tooling Affects the development tooling (CI, deployment, dependency management, etc.)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants