BLD: providing binary wheels on PyPI #41

Open
tylerjereddy opened this issue Jun 30, 2022 · 9 comments

Comments

@tylerjereddy
Contributor

@namehta4 @NaderAlAwar

Since pykokkos-base is not needed to build pykokkos (a genuine build-time vs. runtime dependency distinction), and because PEP 518-based pip installs build pykokkos in an isolated env before installing it into the local user env, it remains up to the user to provide a suitable version of pykokkos-base in their environment, however we install pykokkos with pip. The same applies to providing a suitable version of NumPy when working with SciPy, for example.

So, in the pip/PyPI ecosystem, I suspect the only way for us to reduce build/install friction is to:

The latter would likely be a substantial lift, and I'm not sure how we'd handle OMP, CUDA backend library shipping, though some libs like pytorch or tensorflow could likely be used as inspiration for that.
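The build-time vs. runtime distinction above could be made friendlier with an explicit runtime check. The following is a minimal sketch, not pykokkos code: the helper names are hypothetical, and the module name passed in (for example `kokkos`, assumed here to be the import name that pykokkos-base provides) is an assumption.

```python
import importlib.util


def runtime_dependency_available(module_name: str) -> bool:
    """Return True if a top-level module can be found in the current environment."""
    # find_spec() locates the module without importing it.
    return importlib.util.find_spec(module_name) is not None


def require_runtime_dependency(module_name: str, install_hint: str) -> None:
    """Raise a descriptive ImportError if a runtime dependency is missing."""
    if not runtime_dependency_available(module_name):
        raise ImportError(
            f"'{module_name}' is required at runtime but was not found. "
            f"{install_hint}"
        )


if __name__ == "__main__":
    # Hypothetical usage: 'kokkos' is assumed to be the pykokkos-base import name.
    try:
        require_runtime_dependency(
            "kokkos", "Install pykokkos-base separately before using pykokkos."
        )
    except ImportError as exc:
        print(exc)
```

A check like this does not install anything; it only turns a missing runtime dependency into a clear message instead of a failure deep inside an import chain.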

@jrmadsen
Contributor

jrmadsen commented Jun 30, 2022

> I'm not sure how we'd handle OMP, CUDA backend library shipping, though some libs like pytorch or tensorflow could likely be used as inspiration for that.

Create conda-forge packaging.

tylerjereddy added a commit to tylerjereddy/pykokkos that referenced this issue Jun 30, 2022
* modernize our `pip install` by using the PEP518
`pyproject.toml` configuration, which basically switches
us to the modern process where `pip` will first build
a wheel in an isolated env and then install that binary wheel
into the current user env

* note that unfortunately this doesn't really make dealing
with the friction of needing the complex `pykokkos-base`
package any easier, as I describe in
kokkos/pykokkos-base#41

* we don't actually need `pykokkos-base` to *build*
`pykokkos`, so much like a user needs to provide a suitable
version of NumPy when working with SciPy, we need the user
to provide a suitable version of `pykokkos-base` *separately*
from the build/install of `pykokkos`; I suspect the best
way to make this easier would be to provide `pykokkos-base`
wheels on PyPI, which I don't see at the moment (likely a substantial
lift, but could be worth it)
@tylerjereddy
Contributor Author

Conda and PyPI are two completely different ecosystems, and mixing them is not really recommended.

@namehta4

namehta4 commented Jun 30, 2022 via email

@jrmadsen
Contributor

jrmadsen commented Jun 30, 2022

> Conda and PyPI are two completely different ecosystems, and mixing them is not really recommended.

@tylerjereddy

Yes, but when it comes to mixing C++ and Python, pip is not designed to handle compilation and build variants well at all. Pip is a Python-specific package manager; conda is a generic one. That makes a big difference for compilation and build variants. I've found that using pip for source installs and conda for pre-built binaries works best.

@jrmadsen
Contributor

jrmadsen commented Jun 30, 2022

> I think the problem here is to create a one-line pip install for pykokkos which will install pykokkos-base as a dependency

@namehta4

I am pretty sure that you could add pykokkos-base to your `requirements.txt` and then, very early in your pykokkos `setup.py`, set `os.environ["PYKOKKOS_BASE_SETUP_ARGS"] = "<cmake arguments>"` so that when the requirements get resolved, pykokkos-base will inherit that environment variable. AFAIK, pip build isolation just limits which packages you can import; it doesn't clear the environment.
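The inheritance this suggestion relies on can be sketched as follows. This is only an illustration of ordinary child-process environment inheritance, using a hypothetical helper name; it does not show what pip itself does during dependency resolution, and the `-DENABLE_CUDA=OFF` value is just a sample argument.

```python
import os
import subprocess
import sys


def child_sees_variable(name: str, value: str) -> str:
    """Set an environment variable in this process and report what a child sees."""
    # Variables set through os.environ are inherited by subprocesses started later.
    os.environ[name] = value
    result = subprocess.run(
        [sys.executable, "-c", f"import os; print(os.environ.get({name!r}, ''))"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(child_sees_variable("PYKOKKOS_BASE_SETUP_ARGS", "-DENABLE_CUDA=OFF"))
```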

@tylerjereddy
Contributor Author

tylerjereddy commented Jun 30, 2022

SciPy and NumPy both mix C++ and Python (and Fortran for SciPy) and provide solutions in both ecosystems. I think tensorflow and pytorch do binaries in both ecosystems as well. If you don't want to provide a solution in the PyPI ecosystem, fair enough, but there certainly are PEPs/standards for doing these kinds of things; it just takes time, and @NaderAlAwar wanted us to look into it a bit.

> I think the problem here is to create a one-line pip install for pykokkos which will install pykokkos-base as a dependency, which I was unable to do

`pip install` is meant to install a single package into the user env by design, so that isn't really going to work. A user may have their own pykokkos-base that they won't want superseded, which is why I'm suggesting that we use a PyPI wheel to at least simplify the manual provision of a binary. If @jrmadsen wants to document/encourage a conda install in an env that is otherwise PyPI-based, fair enough, though I can't recommend that in general.

> I am pretty sure that you could add pykokkos-base in your requirements.txt and then in your pykokkos setup.py (very early on) you could do os.environ["PYKOKKOS_BASE_SETUP_ARGS"] = "<cmake arguments>" so that when the requirements get resolved, pykokkos-base will inherit that environment variable. AFAIK, the pip build isolation just limits the scope of which packages you can import but doesn't clear the environment

I wouldn't recommend this: you'd be polluting the user environment with packages they didn't ask for, which is part of what the various PEPs/standards are designed to protect against.

@jrmadsen
Contributor

jrmadsen commented Jun 30, 2022

> SciPy and NumPy both mix C++ and Python (and Fortran for SciPy) and provide solutions in both ecosystems.

Yes, but they don't have to deal with multiple backends causing build variants.

> I think tensorflow and pytorch do binaries in both ecosystems as well.

Yes, but those are wheels without any acceleration. If anything, I think it would be better to just create new packages like `pykokkos-base-openmp`, which are wheels for variants.
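If per-variant wheels existed, a consumer could discover which one is installed through distribution metadata. This is a rough sketch under stated assumptions: only `pykokkos-base-openmp` is named in this thread, `pykokkos-base-cuda` is a hypothetical name added for illustration, and none of these variant packages is known to be published.

```python
from importlib import metadata
from typing import Callable, Dict, Iterable

# Assumed names: only "pykokkos-base-openmp" appears in the discussion;
# "pykokkos-base-cuda" is hypothetical.
CANDIDATE_VARIANTS = ("pykokkos-base-cuda", "pykokkos-base-openmp", "pykokkos-base")


def installed_variants(
    candidates: Iterable[str] = CANDIDATE_VARIANTS,
    version_of: Callable[[str], str] = metadata.version,
) -> Dict[str, str]:
    """Map each installed candidate distribution to its version string."""
    found = {}
    for name in candidates:
        try:
            found[name] = version_of(name)
        except metadata.PackageNotFoundError:
            # Distribution is not installed in this environment; skip it.
            continue
    return found


if __name__ == "__main__":
    print(installed_variants())
```

The lookup function is injectable so that the logic can be exercised without any of the packages being installed.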

@tylerjereddy
Contributor Author

> Yes, but they don't have to deal with multiple backends causing build variants.

There are variants in the linear algebra backends the libraries are built against. At least for SciPy, that currently means choosing one variant and shipping it as the default (i.e., OpenBLAS instead of the reference implementation or Intel MKL).

conda is indeed currently better for swapping backends, but there's still a tendency to serve both communities/ecosystems because they're both large, etc.

@jrmadsen
Contributor

jrmadsen commented Jun 30, 2022

I just did a `pip install pykokkos-base` and ran into an issue because it defaulted to enabling CUDA and I didn't have nvcc in my path. All I had to do was `export PYKOKKOS_BASE_SETUP_ARGS="-DENABLE_CUDA=OFF"`. The quickest, easiest way to make this less painful would be to not default to CUDA: once that was fixed, installing from source took less than 3 minutes.
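One way to avoid the failing default could look like the sketch below. It is an assumption about how a setup script might choose defaults, not the actual pykokkos-base build logic; the function names are hypothetical, and only `PYKOKKOS_BASE_SETUP_ARGS` and `-DENABLE_CUDA=OFF` come from this thread.

```python
import os
import shutil
from typing import Callable, Mapping, Optional


def default_setup_args(which: Callable[[str], Optional[str]] = shutil.which) -> str:
    """Disable CUDA by default when nvcc cannot be found on PATH."""
    if which("nvcc") is None:
        return "-DENABLE_CUDA=OFF"
    # nvcc is available: add nothing and leave the build's own defaults in place.
    return ""


def resolve_setup_args(
    environ: Mapping[str, str] = os.environ,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Prefer an explicit PYKOKKOS_BASE_SETUP_ARGS; otherwise fall back to defaults."""
    explicit = environ.get("PYKOKKOS_BASE_SETUP_ARGS")
    if explicit is not None:
        return explicit
    return default_setup_args(which)


if __name__ == "__main__":
    print(resolve_setup_args())
```

An explicit value from the user always wins here, so anyone who exports the variable keeps full control over the CMake arguments.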


3 participants