Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unasyncify Codegen #99

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open

Unasyncify Codegen #99

wants to merge 4 commits into from

Conversation

rtpg
Copy link

@rtpg rtpg commented Oct 29, 2024

This DEP follows discussion in the forum followed by a reference implementation and circulating this DEP in a pre-draft state to several members of the community.

I do not have a Shepherd for this, and would appreciate anyone willing to be it for this.

I am tracking my reference implementation in this PR, which is a total mess at the moment but might be useful to curious people.

generate_unasynced() are where chunks of work relevant to this PR are.

@carltongibson
Copy link
Member

carltongibson commented Oct 29, 2024

Thanks for starting this @rtpg — very cool.

"Code gen" is a loaded term in Django's history, so this'll need some consideration 😅

There's a lot I need to play with to be able to comment properly here, but I wanted to leave an initial reply. An ACK at least, better than silence.

Good work 🏅


Specification
=============
A new module is added, ``django.utils.codegen``. Inside it are several decorators:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want this to be a public API that third party modules can use? That would be an interesting extension in the future, but perhaps initially we could move the generate_ functions into a ._unasyncify sub-module to mark them as private, and keep the from_codegen decorator inside the .codegen module?

Moreover, there may be future codegen-related things we add, so moving it to a specific sub-module can keep things cleaner?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • My intention is for this module to be for Django internal usage, and did not mean to imply this was meant to be consumed by third parties
  • I am comfortable with generate_ functions being in a separate ._unasyncify module. I have no strong opinion with having from_codegen being in .codegen or in ._unasyncify
  • I am also comfortable with django.utils._codegen being the namespace here in general

(To further elaborate my feelings regarding third party consumption: I think third party availability should occur after at least two Django internals people work on a third party lib and think "I really want the async codegen from Django". Don't want to couple these internals to publicly facing behavior until we are very confident in this being right)


This module also includes the following:

* ``ASYNC_TRUTH_MARKER``
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name is a bit of a mouthful and isn’t super descriptive. Maybe ASYNC_CODEGEN would be better?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am comfortable with ASYNC_CODEGENas a name (though I feel like it's a bit odd... IN_TYPING is clearly about being in the typing context, ASYNC_CODEGEN doesn't feel like that to me).

Maybe ASYNC_CONTEXT? Though context has become a loaded term. IN_ASYNC_VARIANT would be another proposal that feels right to me (I have been using "variant" a lot to describe this problem space)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ASYNC_CONTEXT is pretty good - context is overloaded, but I think it would be pretty obvious from the context (😅).

In the vein of IS_TYPING: why not just IS_ASYNC? IN_ASYNC_VARIANT and ASYNC_CODEGEN may be more literally correct, but I think what we’re looking for here is not necessarily the most dictionary-correct term but a term that has a good “developer experience”: something as short as possible that contains the keyword ASYNC, allowing it to “fade into the background” and not cause too much noise.

This is ``True``. But during codegen this gets replaced with ``False``. This allows us to easily mix async and sync "branches" within the same implementation for the times we need it.
This is a bit similar to ``typing.TYPE_CHECKING``.

A new script is added, ``scripts/run_codegen.sh``, that scans Django's source tree for functions that need unasync codegen applied.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this primarily Python module/command rather than a shell script? This would be more extensible and might allow us to automatically run it when executing the test suite during development to have a better iteration loop?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be quite straightforward to make this either a Django management command (though then it would show up in management command lists....) or just a plain-old argparse-y script.

Given libCST itself is Python I'm sure I can just call directly to the library's own tooling rather than any subprocess shenanigans

* ``if ASYNC_TRUTH_MARKER`` blocks are flattened in the sync variant
Concreteley, this means that the following::

if ASYNC_TRUTH_MARKER:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we make it an error if the conditional is anything except checking this attribute (I.e. if … and something_else), or are there any use cases for this?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe it's worth worrying about trying to figure out "odd" statements (especially given that when you start doing things like generating if False and something_else you'll get into lint-capturing territory), unless we identify things that are both clearly "wrong" and also things that we would be susceptible to accidentally do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants