Skip to content

Conversation

bisgaard-itis
Copy link
Contributor

@bisgaard-itis bisgaard-itis commented Sep 10, 2025

What do these changes do?

  • This PR fixes a bug which is an unhandled exception in the caching service-layer function for function jobs. It does so by refactoring that service layer.
  • After several discussions with @wvangeit it has become clear that the FunctionJobService should depend on the Celery client. This is because e.g. the status of a function job now depends critically on celery: Before the backend project is created, the status of the job is essentially the status of the celery task which "is responsible" for creating the backend project.
  • On the other hand, I prefer to not open the door to celery tasks triggering other celery tasks until it is absolutely necessary. Because of that I didn't want to add the celery client to the FunctionJobService in the first place.
  • This PR splits the FunctionJobService in two: One part which only depends on the lower service layers of the api-server (jobs and studies) and a second part which in addition depends on the celery task client (TaskManager). See also https://github.com/bisgaard-itis/osparc-simcore/blob/bugfix-use-celery-task-manager-in-function-job-service/services/api-server/docs/api-server.drawio.svg for details
  • The main aim of this splitting is to avoid creating the risk of deadlocks by exposing the celery client inside the celery worker, which could create tasks waiting for other tasks which are behind them in the rabbitmq queue.

Related issue/s

How to test

  • This PR doesn't introduce any new functionality. So running the api-server unit tests is sufficient.

Dev-ops

@bisgaard-itis bisgaard-itis added this to the Cheops milestone Sep 10, 2025
Copy link

codecov bot commented Sep 10, 2025

Codecov Report

❌ Patch coverage is 86.33540% with 22 lines in your changes missing coverage. Please review.
✅ Project coverage is 68.93%. Comparing base (9d30172) to head (d81d40a).
⚠️ Report is 1 commits behind head on master.

❗ There is a different number of reports uploaded between BASE (9d30172) and HEAD (d81d40a). Click for more details.

HEAD has 29 uploads less than BASE
Flag BASE (9d30172) HEAD (d81d40a)
unittests 30 1
Additional details and impacted files
@@             Coverage Diff             @@
##           master    #8352       +/-   ##
===========================================
- Coverage   89.16%   68.93%   -20.23%     
===========================================
  Files        1745      883      -862     
  Lines       68443    39551    -28892     
  Branches      832      175      -657     
===========================================
- Hits        61024    27263    -33761     
- Misses       7197    12231     +5034     
+ Partials      222       57      -165     
Flag Coverage Δ
integrationtests 63.94% <ø> (-0.04%) ⬇️
unittests 91.93% <86.33%> (+3.89%) ⬆️
Components Coverage Δ
pkg_aws_library ∅ <ø> (∅)
pkg_celery_library ∅ <ø> (∅)
pkg_dask_task_models_library ∅ <ø> (∅)
pkg_models_library ∅ <ø> (∅)
pkg_notifications_library ∅ <ø> (∅)
pkg_postgres_database ∅ <ø> (∅)
pkg_service_integration ∅ <ø> (∅)
pkg_service_library ∅ <ø> (∅)
pkg_settings_library ∅ <ø> (∅)
pkg_simcore_sdk 76.75% <ø> (-8.19%) ⬇️
agent ∅ <ø> (∅)
api_server 91.93% <86.33%> (-0.02%) ⬇️
autoscaling ∅ <ø> (∅)
catalog ∅ <ø> (∅)
clusters_keeper ∅ <ø> (∅)
dask_sidecar ∅ <ø> (∅)
datcore_adapter ∅ <ø> (∅)
director ∅ <ø> (∅)
director_v2 78.11% <ø> (-12.88%) ⬇️
dynamic_scheduler ∅ <ø> (∅)
dynamic_sidecar 81.87% <ø> (ø)
efs_guardian ∅ <ø> (∅)
invitations ∅ <ø> (∅)
payments ∅ <ø> (∅)
resource_usage_tracker ∅ <ø> (∅)
storage ∅ <ø> (∅)
webclient ∅ <ø> (∅)
webserver 58.78% <ø> (-29.25%) ⬇️

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9d30172...d81d40a. Read the comment docs.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Contributor

mergify bot commented Sep 10, 2025

🧪 CI Insights

Here's what we observed from your CI run for d81d40a.

✅ Passed Jobs With Interesting Signals

Pipeline Job Signal Health on master Retries 🔍 CI Insights 📄 Logs
CI system-tests Base branch is broken, but the job passed. Looks like this might be a real fix 💪 Broken 0 View View

@bisgaard-itis bisgaard-itis self-assigned this Sep 10, 2025
@bisgaard-itis bisgaard-itis marked this pull request as ready for review September 10, 2025 14:30
@bisgaard-itis bisgaard-itis changed the title 🐛 Use celery task manager in function job task client service ♻️🐛 Use celery task manager in function job task client service Sep 10, 2025
Copy link
Contributor

@GitHK GitHK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, some minor things

Copy link
Member

@pcrespov pcrespov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thx a lot!

@bisgaard-itis bisgaard-itis enabled auto-merge (squash) September 16, 2025 10:37
@bisgaard-itis
Copy link
Contributor Author

@Mergifyio queue

Copy link
Contributor

mergify bot commented Sep 16, 2025

queue

🛑 Configuration not compatible with a branch protection setting

The branch protection setting Require branches to be up to date before merging is not compatible with max_parallel_checks>1, queue_conditions != merge_conditions and must be unset.

@bisgaard-itis bisgaard-itis added the 🤖-automerge marks PR as ready to be merged for Mergify label Sep 16, 2025
Copy link

@bisgaard-itis bisgaard-itis merged commit 15fd04e into ITISFoundation:master Sep 16, 2025
93 of 95 checks passed
@bisgaard-itis bisgaard-itis deleted the bugfix-use-celery-task-manager-in-function-job-service branch September 16, 2025 14:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🤖-automerge marks PR as ready to be merged for Mergify a:apiserver api-server service Meta-modeling
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants