-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add monitor_django_management_command #482
base: master
Are you sure you want to change the base?
Add monitor_django_management_command #482
Conversation
7b543c3
to
4e4c320
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Is this something you've tested using (2U private link) https://2u-internal.atlassian.net/wiki/spaces/ENG/pages/1173618788/Running+Datadog+in+devstack?
a37899a
to
f14d319
Compare
d39da3f
to
62e8964
Compare
Yes, I have |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
from django.core.management import execute_from_command_line | ||
from edx_django_utils.monitoring import monitor_django_management_command | ||
with monitor_django_management_command(sys.argv[1]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it expected that this context manager will be added to manage.py in some public Django apps? I'm concerned about the potential overhead, in particular in terms of import time. For instance, edx-platform is notorious for its very slow manage.py. If we decorate every manage.py with this context manager, we will be pulling all the imports from edx_django_utils/monitoring/init.py in every command.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I measured the import time using time.perf_counter()
, and here are the results from multiple runs:
- 0.198723 seconds
- 0.227716 seconds
- 0.096847 seconds
- 0.150504 seconds
- 0.146623 seconds
- 0.126719 seconds
These numbers suggest that the import overhead is relatively low—typically well under a quarter of a second. Given this, I don’t anticipate a significant impact. That said, I’m happy to explore optimizations or alternatives. Let me know what you think!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you considering adding this context manager to some public Django apps?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@regisb: Are these times ok with you? If not, we can update the example code with a comment that checks ENABLE_CUSTOM_MANAGEMENT_COMMAND_MONITORING
(without referencing edx-django-utils), and decides whether to import or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For reference, this is the output of a simple command in edx-toggles on my computer:
$ time ./manage.py shell -c "print()"
real 0m0,394s
user 0m0,335s
sys 0m0,059s
So that 0.15s overhead is actually ~35% of the total time. In absolute terms that's not a big deal, but this added lag is really a code smell: it's an indication that we are adding a whole lot of complexity on the path of every single manage.py command. And I'm not quite convinced about the usefulness of that feature, since I'm guessing that 99% of Open edX platforms out there don't define OPENEDX_TELEMETRY.
This is why I was asking whether you intend to add this piece of code to public Django apps. If this feature is limited to 2U-only apps, then I don't have any issue. But if the goal is to add this context manager to multiple apps, then my opinion would be that this is too much bloat.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our intention is to add this into Open edX services. For now, we will integrate it only with course discovery.
Other teams can integrate this if they want monitoring support for their management commands.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's a bad pattern to import large libraries in manage.py. If I wanted to monitor manage.py commands, I think I would avoid modifying the manage.py code, and instead use a CLI similar to myinstrumenter ./manage.py <args>
.
Plus, there's nothing special about the monitor_django_management_command
: this function is just a thin wrapper on top of function_trace
. So I think that neither the changes to manage.py nor the added function to edx_django_utils are really necessary. The added bloat, in the form of two new cross-app Django settings and custom code on the path of every call to manage.py are just not worth it.
But I'm just one guy with an opinion here. Feel free to discuss this with other people and ignore my comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the concern about adding bloat and modifying manage.py
, but we already use a CLI (ddtrace-run). Even using this CLI, Datadog does not automatically instrument Django management commands, unlike New Relic, which already offers this functionality.
We raised a Datadog support ticket (#1990940) to explore options, as we want to avoid adding custom instrumentation for every command.
Then we opted to use function_trace
. However, the problem we're facing is that the function_trace
command takes a single argument that is used for both the operation
and resource
names.
This limitation means that management commands do not get a distinct, consistent operation_name
(e.g., django.command
), which is necessary for proper observability.
To address this, we implemented monitor_django_management_command()
. This packages all the boilerplate logic, ensuring minimal modifications to manage.py
while achieving the desired instrumentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have updated the example code to only import if ENABLE_CUSTOM_MANAGEMENT_COMMAND_MONITORING
is enabled
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@regisb: I don't know of a way to inject the code we need from outside manage.py
, but we can explore that a little more to see if we missed anything.
If we do need to add this monitoring capability to services in general, this new thin wrapper is also wrapping other calls, like function_trace
, that are a part of our larger abstraction layer to various monitoring tools. I think it makes sense to document in one place and use the abstraction layer. For those that don't require monitoring, this will be disabled by default.
21e7350
to
73efbbc
Compare
monitoring_enabled = getattr(settings, "ENABLE_CUSTOM_MANAGEMENT_COMMAND_MONITORING", False) | ||
if monitoring_enabled and len(sys.argv) > 1: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is a scenario where len(sys.argv) > 1
, and would you really want to drop monitoring altogether, or would you want to just pass a different (static) value to monitor_django_management_command
?
@hamza-56: Please don't squash new commits until this is ready to merge again. Thank you. |
Add monitor_django_management_command
monitor_django_management_command
creates a monitoring span usingfunction_trace
,allowing the execution of the command to be tracked explicitly in monitoring tools.
It also sets the resource name to the given
name
usingset_monitoring_transaction_name
.