Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

generate a multi-level PyPi index as per PEP 503 #1

Draft
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

tdyas
Copy link
Contributor

@tdyas tdyas commented Dec 2, 2024

PEP 503 describes a multi-level PyPi index with a root index file with just links to the supported "projects" and project-specific index files with the actual links to release files.

Update generate_index.py to generate a multi-level index and modify the test accordingly.

@tdyas tdyas requested review from benjyw, huonw and thejcannon December 2, 2024 01:26
@tdyas
Copy link
Contributor Author

tdyas commented Dec 2, 2024

I tried to solve with dumb-pypi, but that tool cannot handle asset URLs which do not share the same prefix.

Copy link

@huonw huonw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me, thanks for picking it up


def main(args):
parser = argparse.ArgumentParser()
parser.add_argument("--url-path-prefix", default="/", action="store")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My read of pep-0503 is that the URLs must always be exactly /<project>/, i.e. customising this --url-path-prefix value may not work at all, and so potentially shouldn't be exposed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My read of PEP 503 is that they are the full path with trailing /. And checking PyPi itself shows that the links include the prefix since they are /simple/PKG_NAME/. (For example, look at curl https://pypi.org/simple/ -o pypi.html.)

packages = defaultdict(lambda: defaultdict(list))

for asset in pants_wheel_assets:
name, version, build_tag, tags = parse_wheel_filename(asset.name)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PEP 503 seems to suggest we MUST be using the normalised names in various places. Does this guarantee name is the normalised package name? Or maybe the wheel file names are guaranteed to be normalised?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link

@benjyw benjyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious what prompted this?

@tdyas
Copy link
Contributor Author

tdyas commented Dec 2, 2024

Curious what prompted this?

Slack request from a user: https://pantsbuild.slack.com/archives/C046T6T9U/p1732700284097479

@tdyas
Copy link
Contributor Author

tdyas commented Dec 2, 2024

I pulled the test out to #2 and added a GitHub Actions workflow for it. Once that other PR lands, I'll update this PR accordingly.

@tdyas tdyas force-pushed the tdyas/multilevel_index branch from 676e8e3 to 31acff8 Compare December 2, 2024 03:23
@tdyas
Copy link
Contributor Author

tdyas commented Dec 2, 2024

@huonw / @benjyw: scie-pants has a reference to the single index file on https://wheels.pantsbuild.org/simple/. https://github.com/pantsbuild/scie-pants/blob/be453bb700dfe1728efdc41a68c41c266144bc58/tools/src/scie_pants/pants_version.py#L77

Is this something which has the potential to break if this PR lands?

@tdyas tdyas marked this pull request as draft December 2, 2024 03:34
@tdyas
Copy link
Contributor Author

tdyas commented Dec 2, 2024

Back to draft to prevent landing this PR until after we determine any effects on scie-pants.

@huonw
Copy link

huonw commented Dec 2, 2024

Ah, hm, good question. I suspect it's probably fine for new versions of scie-pants: a few lines later it is suggested that code path is only used for pants < 2, and I think this index file only supports pants >= 2, so the wheels referenced here shouldn't be useful.

https://github.com/pantsbuild/scie-pants/blob/be453bb700dfe1728efdc41a68c41c266144bc58/tools/src/scie_pants/pants_version.py#L81

Things I don't know:

  • if changing this will cause modern scie-pants to start crashing when installing very old Pants versions
  • if there's older still-in-use versions of scie-pants for which this might actually matter functionally (i.e. these find-links are actually used)

@tdyas
Copy link
Contributor Author

tdyas commented Dec 2, 2024

Ah, hm, good question. I suspect it's probably fine for new versions of scie-pants: a few lines later it is suggested that code path is only used for pants < 2, and I think this index file only supports pants >= 2, so the wheels referenced here shouldn't be useful.

scie-pants supports installing Pants v1? That seems odd to me.

@huonw
Copy link

huonw commented Dec 2, 2024

Yeah, in theory it does: one-stop-shop for running all Pantses. There's even a test of running 1.30.5rc1: https://github.com/pantsbuild/scie-pants/blob/be453bb700dfe1728efdc41a68c41c266144bc58/package/src/test.rs#L830-L851

In practice, just now, I've not been able to get this running locally (e.g. in a x86-64 linux docker container) due to errors like: ERROR: Could not build wheels for thriftpy2, which is required to install pyproject.toml-based projects.

Installing pantsbuild.pants==1.30.4 into a virtual environment at /root/.cache/nce/45436524b7fa68f7a41eb7ac53bc7bf9441def442ddb1ac1d5610
4ae54ed442d/bindings/venvs/1.30.4
  Installing build dependencies ... done
  WARNING: Missing build requirements in pyproject.toml for thriftpy2>=0.4.0 from https://files.pythonhosted.org/packages/f8/3a/d983b26d
f17583a3cc865a9e1737bb8faacfa1e16e3ed17353ef48847e6b/thriftpy2-0.5.2.tar.gz (from py-zipkin==0.20.0->pantsbuild.pants==1.30.4).
  WARNING: The project does not specify a build backend, and pip cannot fall back to setuptools without 'wheel'.
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
  error: subprocess-exited-with-error
  
  × Building wheel for thriftpy2 (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> See above for output.
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  Building wheel for thriftpy2 (pyproject.toml) ... error
  ERROR: Failed building wheel for thriftpy2
ERROR: Could not build wheels for thriftpy2, which is required to install pyproject.toml-based projects

[notice] A new release of pip available: 22.3.1 -> 24.3.1
[notice] To update, run: python -m pip install --upgrade pip

The version of pip mentioned there seems to come from https://github.com/pantsbuild/scie-pants/blob/be453bb700dfe1728efdc41a68c41c266144bc58/tools/src/scie_pants/install_pants.py#L70

Not sure why there's the difference between scie-pants CI (which is passing) vs. running locally.

I've posted pantsbuild/scie-pants#431 to simulate this change and see what scie-pants' CI says.

@benjyw
Copy link

benjyw commented Dec 3, 2024

I'd definitely like to understand the full implications before merging. Can we not generate both indexes? The new one can go to some new location, since people who want to use it are aware of it. That said, scie-pants could perhaps also benefit from the faster lookup?

@tdyas
Copy link
Contributor Author

tdyas commented Jan 4, 2025

I'd definitely like to understand the full implications before merging. Can we not generate both indexes? The new one can go to some new location, since people who want to use it are aware of it. That said, scie-pants could perhaps also benefit from the faster lookup?

We can generate both indices easily enough. It's all just HTML. The question which arises to me then is whether pip will still want to read through all of the links, or whether if the complaint links are first in the file, will it stop at that point and retrieve the sub-page for the relevant Pants packages.

@tdyas
Copy link
Contributor Author

tdyas commented Jan 4, 2025

The new one can go to some new location, since people who want to use it are aware of it.

Good idea. Should it still be under wheels.pantsbuild.org then or a new subdomain of pantsbuild.org?

@benjyw
Copy link

benjyw commented Jan 5, 2025

The new one can go to some new location, since people who want to use it are aware of it.

Good idea. Should it still be under wheels.pantsbuild.org then or a new subdomain of pantsbuild.org?

Same domain, different path? Is that feasible?

@tdyas
Copy link
Contributor Author

tdyas commented Jan 5, 2025

Good idea. Should it still be under wheels.pantsbuild.org then or a new subdomain of pantsbuild.org?

Same domain, different path? Is that feasible?

Very easy to do. This "hosting site" is just a bunch of static files.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants