Skip to content

Conversation

taegyunkim
Copy link
Contributor

@taegyunkim taegyunkim commented May 30, 2025

tl;dr

  1. Use uwsgi>=2.0.30 which has fix: install atexit handlers after apps are initialized unbit/uwsgi#2726
  2. When using uwsgi<2.0.30 with --lazy-apps, also set --skip-atexit, if you see crashes when workers exit.

The following explanation happens on uwsgi<2.0.30

When uwsgi is configured with --lazy-apps, master doesn't load the application, and simply fork() to create workers. Then, the workers load the application. This is different from when it's not set. Without --lazy-apps, master process first load the application then fork().

dd-trace-py profiler uses Python stdlib atexit module to export the profile when the process exits. The code run to export references global static std::string variables. However, when --lazy-apps option is set, they are destructed before when the profiler needs them to export the profile, and this leads to an undefined behavior.

When --lazy-apps is not set the following sequence of events happens

  1. uwsgi master process, initialize Python plugin, loads the application.
  2. The application calls import ddtrace.profiling.auto and this leads all native extensions imported to the Python app. During the process, the constructors for global static variables are called and their destructors would be registered to be called by atexit() when the program exits.
  3. The application also registers Profiler.stop() function to uwsgi python module's atexit attribute.
  4. Then uwsgi calls atexit(uwsgi_plugins_atexit) to register each plugins atexit callbacks, for Python uwsgi_python_atexit() function. And the function will call Py_Finalize()
  5. Master fork()s workers
  6. When SIGTERM is received at the master, it sends SIGINT to its workers
  7. In each worker, uwsgi_python_atexit() is called, invoking Profiler.stop() and Py_Finalize() is called, then destructors for global static variables are called.

When --lazy-apps is set

  1. uwsgi matser process still intializes Python interpreter, but skips loading the application
  2. Then uwsgi calls atexit(uwsgi_plugins_atexit) to register each plugins atexit callbacks, for Python uwsgi_python_atexit() function. And the function will call Py_Finalize()
  3. Master fork()s workers
  4. Workers load the application. The application calls import ddtrace.profiling.auto and this leads all native extensions imported to the Python app. During the process, the constructors for global static variables are called and their destructors would be registered to be called by C atexit() when the program exits.
  5. When --lazy-apps is set, ddtrace simply uses Python's atexit module to call Profiler.stop() when the program exits.
  6. When SIGTERM is received at the master, it sends SIGINT to its workers
  7. In each worker, since global destructors are registered later than uwsgi_python_atexit() , they're called first.
  8. Then Py_Finalize() is called in part of uwsgi_python_atexit() calling Profiler.stop(). Then it will access those destructed global static strings leading to undefined behavior.

To workaround this I have tried three different approaches.

  1. chore(profiling): add ddup_export_to_file() to address problems with uwsgi tests #11125: Simply pass the needed std::string from Python side to native when exporting pprof file for test
  2. chore: leak uploader config to fix crashes on uwsgi tests #13520: Leak the strings needed for exporting profile to prevent them from destructed
  3. chore(profiling): per upload config #13535: Bundle all needed strings needed for exporting profile, and create them whenever we try to upload.

The first approach only addresses issues from tests.

The second approach mimics what's suggested in pytorch/pytorch#42125, but since there are a few other global static variables we have throughout the profiler, this still showed crashes from test.

And we got similar results from the third approach, as we did from the second.

Another way to completely work around this problem is setting --skip-atexit as suggested in ansible/ansible-ai-connect-service#248.

When --skip-atexit is set, uwsgi uses _exit() instead of exit(), and the former doesn't call any functions registered with atexit() or on_exit(). So, the static variable destructors are never called, and the app no longer crashes.

The profiler also doesn't flush the profile via atexit() when --skip-atexit is set, but I'd argue losing a single profile at the end is better than crashing the application even if the application is exiting. So we'd need to require our customers to set --skip-atexit when --lazy-apps is set.

Checklist

  • PR author has checked that all the criteria below are met
  • The PR description includes an overview of the change
  • The PR description articulates the motivation for the change
  • The change includes tests OR the PR description describes a testing strategy
  • The PR description notes risks associated with the change, if any
  • Newly-added code is easy to change
  • The change follows the library release note guidelines
  • The change includes or references documentation updates if necessary
  • Backport labels are set (if applicable)

Reviewer Checklist

  • Reviewer has checked that all the criteria below are met
  • Title is accurate
  • All changes are related to the pull request's stated goal
  • Avoids breaking API changes
  • Testing strategy adequately addresses listed risks
  • Newly-added code is easy to change
  • Release note makes sense to a user of the library
  • If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
  • Backport labels are set in a manner that is consistent with the release branch maintenance policy

@taegyunkim taegyunkim requested review from a team as code owners May 30, 2025 15:39
@taegyunkim taegyunkim added the Profiling Continous Profling label May 30, 2025
Copy link
Contributor

github-actions bot commented May 30, 2025

CODEOWNERS have been resolved as:

releasenotes/notes/profiling-uwsgi-skip-atexit-2d931caef96837b1.yaml    @DataDog/apm-python
tests/profiling_v2/test_uwsgi.py                                        @DataDog/profiling-python
docs/advanced_usage.rst                                                 @DataDog/python-guild

Copy link
Contributor

github-actions bot commented May 30, 2025

Bootstrap import analysis

Comparison of import times between this PR and base.

Summary

The average import time from this PR is: 236 ± 3 ms.

The average import time from base is: 237 ± 3 ms.

The import time difference between this PR and base is: -1.0 ± 0.1 ms.

Import time breakdown

The following import paths have shrunk:

ddtrace.auto 1.862 ms (0.79%)
ddtrace.bootstrap.sitecustomize 1.192 ms (0.51%)
ddtrace.bootstrap.preload 1.192 ms (0.51%)
ddtrace.internal.remoteconfig.client 0.633 ms (0.27%)
ddtrace 0.671 ms (0.28%)
ddtrace.internal._unpatched 0.024 ms (0.01%)

@pr-commenter
Copy link

pr-commenter bot commented May 30, 2025

Benchmarks

Benchmark execution time: 2025-06-03 16:14:30

Comparing candidate commit d03bfb7 in PR branch taegyunkim/uwsgi-atexit-hook with baseline commit 4607d9e in branch main.

Found 0 performance improvements and 2 performance regressions! Performance is the same for 503 metrics, 3 unstable metrics.

scenario:iastaspectsospath-ospathnormcase_aspect

  • 🟥 execution_time [+413.433ns; +466.402ns] or [+12.000%; +13.538%]

scenario:telemetryaddmetric-1-distribution-metric-1-times

  • 🟥 execution_time [+285.944ns; +328.294ns] or [+9.833%; +11.289%]

Copy link
Contributor

@nsrip-dd nsrip-dd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work fixing the problem upstream!

@taegyunkim taegyunkim merged commit 22ded2c into main Jun 3, 2025
312 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Profiling Continous Profling
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants