feat(bzlmod): add a flag to control if the parallel downloading is used #1854


Merged
4 changes: 3 additions & 1 deletion CHANGELOG.md
@@ -66,7 +66,9 @@ A brief description of the categories of changes:
`experimental_index_url_overrides` to `pip.parse` for using the bazel
downloader. If you see any issues, report in
[#1357](https://github.com/bazelbuild/rules_python/issues/1357). The URLs for
  the whl and sdist files will be written to the lock file. Whether the
  metadata is downloaded in parallel can be controlled with the
  `parallel_download` attribute.

[0.XX.0]: https://github.com/bazelbuild/rules_python/releases/tag/0.XX.0
[python_default_visibility]: gazelle/README.md#directive-python_default_visibility
18 changes: 18 additions & 0 deletions python/private/bzlmod/pip.bzl
@@ -199,6 +199,7 @@ def _create_whl_repos(module_ctx, pip_attr, whl_map, whl_overrides, simpleapi_ca
auth_patterns = pip_attr.auth_patterns,
),
cache = simpleapi_cache,
parallel_download = pip_attr.parallel_download,
)

major_minor = _major_minor_version(pip_attr.python_version)
@@ -539,6 +540,23 @@ hubs can be created, and each program can use its respective hub's targets.
Targets from different hubs should not be used together.
""",
),
"parallel_download": attr.bool(
doc = """\
The flag allows making use of the parallel download feature in bazel 7.1 and
above when the bazel downloader is used. This is enabled by default as it
improves performance considerably, but you may want to disable it when the
queries to the simple API are very expensive or when debugging authentication
issues.

NOTE: this will download (potentially duplicate) data for multiple packages if
there is more than one index available, but in general this should be
negligible because the simple API calls are very cheap and the user should not
notice any extra overhead.

If extra indexes are specified and we are in synchronous mode, then we will
use the first result that we find.
Contributor commented:

"indices"?

EDIT: Nevermind! Looks like both work 🤦

""",
default = True,
),
"python_version": attr.string(
mandatory = True,
doc = """
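For context, a minimal sketch of how a consumer might set the new attribute from a `MODULE.bazel` file; the hub name, Python version, and requirements path below are hypothetical, not taken from this PR:

```starlark
# MODULE.bazel (hypothetical consumer setup)
pip = use_extension("@rules_python//python/extensions:pip.bzl", "pip")
pip.parse(
    hub_name = "pip",  # hypothetical hub name
    python_version = "3.11",
    requirements_lock = "//:requirements_lock.txt",  # hypothetical path
    # Fall back to sequential Simple API fetches, e.g. while debugging
    # authentication against a private index.
    parallel_download = False,
)
use_repo(pip, "pip")
```

Leaving `parallel_download` unset keeps the default (`True`), so existing users get the faster behavior without changing their module files.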
16 changes: 4 additions & 12 deletions python/private/pypi_index.bzl
@@ -23,7 +23,7 @@ load(":auth.bzl", "get_auth")
load(":envsubst.bzl", "envsubst")
load(":normalize_name.bzl", "normalize_name")

def simpleapi_download(ctx, *, attr, cache):
def simpleapi_download(ctx, *, attr, cache, parallel_download = True):
"""Download Simple API HTML.

Args:
@@ -49,6 +49,7 @@ def simpleapi_download(ctx, *, attr, cache):
undesirable because additions to the PyPI index would not be
reflected when re-evaluating the extension unless we do
`bazel clean --expunge`.
parallel_download: A boolean to enable usage of bazel 7.1 non-blocking downloads.

Returns:
dict of pkg name to the parsed HTML contents - a list of structs.
@@ -60,17 +61,8 @@ def simpleapi_download(ctx, *, attr, cache):

download_kwargs = {}
if bazel_features.external_deps.download_has_block_param:
download_kwargs["block"] = False

# Download in parallel if possible. This will download (potentially
# duplicate) data for multiple packages if there is more than one index
# available, but that is the price of convenience. However, that price
# should be mostly negligible because the simple API calls are very cheap
# and the user should not notice any extra overhead.
#
# If we are in synchronous mode, then we will use the first result that we
# find.
#
download_kwargs["block"] = not parallel_download

# NOTE @aignas 2024-03-31: we are not merging results from multiple indexes
# to replicate how `pip` would handle this case.
async_downloads = {}
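The `download_kwargs["block"] = not parallel_download` line above drives bazel 7.1's non-blocking download mode. A hedged sketch of the overall pattern, not the PR's exact code; the function and variable names are illustrative, but the `block = False` parameter and the `wait()` call on the returned token are the real bazel 7.1 API:

```starlark
# Illustrative sketch of the parallel Simple API fetch pattern.
def _fetch_all(ctx, urls_by_pkg, parallel_download):
    download_kwargs = {}
    if parallel_download:
        # With block = False, ctx.download returns immediately with a
        # pending token instead of the final result (bazel 7.1+).
        download_kwargs["block"] = False

    pending = {}
    for pkg, url in urls_by_pkg.items():
        pending[pkg] = ctx.download(
            url = url,
            output = pkg + ".html",
            **download_kwargs
        )

    results = {}
    for pkg, token in pending.items():
        # wait() joins the in-flight download when block = False was used;
        # in synchronous mode the value is already the final result.
        results[pkg] = token.wait() if parallel_download else token
    return results
```

The benefit is that all index queries are issued up front and only joined afterwards, so total latency approaches that of the slowest single request rather than the sum of all of them.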