enhancement to help kernel provisioners introspect notebooks for their dependencies #1055

Open
itcarroll opened this issue Feb 17, 2025 · 4 comments

Comments

@itcarroll

itcarroll commented Feb 17, 2025

Multiple communities are interested in embedding environment data within a Jupyter notebook's metadata (1, 2, 3, 4, 5). The goal is to have tools that prepare an environment equipped to execute a given notebook. One recent thread identified the new kernel provisioning abstraction as a key part of achieving this goal. I am a data scientist, not a developer, but it looks to me like achieving this goal requires something not present in the kernel provisioning concept.

Do you envision any standard way that the kernel provisioner should get metadata from a notebook to which it's attached? The kernel.json has metadata.kernel_provisioner.config but there's no specified field in the notebook metadata:

{
    "cells": [...],
    "metadata": {
        "kernelspec": {
            "display_name": "Python 3 (ipykernel)",
            "language": "python",
            "name": "python3",
            # Standardize a keyword in here that kernel provisioners would check for data?
        }
    }
}
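
For illustration, a tool could read such a field with nothing more than the standard library. The sketch below is only an assumption about what that might look like: the key name `provisioner_data` and its `dependencies` content are hypothetical, since no such field is standardized in the notebook format today.

```python
import json

# Hypothetical key name, used only for illustration; nothing in the
# notebook format currently reserves a field for provisioner data.
HYPOTHETICAL_KEY = "provisioner_data"


def read_provisioner_data(notebook_text: str) -> dict:
    """Return the dict stored under the hypothetical key, or {} if absent."""
    notebook = json.loads(notebook_text)
    kernelspec = notebook.get("metadata", {}).get("kernelspec", {})
    data = kernelspec.get(HYPOTHETICAL_KEY, {})
    # Ignore values that are not a JSON object rather than failing.
    return data if isinstance(data, dict) else {}


# A small in-memory notebook carrying the hypothetical field.
example = json.dumps({
    "cells": [],
    "metadata": {
        "kernelspec": {
            "display_name": "Python 3 (ipykernel)",
            "language": "python",
            "name": "python3",
            HYPOTHETICAL_KEY: {"dependencies": ["numpy"]},
        }
    },
    "nbformat": 4,
    "nbformat_minor": 5,
})

print(read_provisioner_data(example))
```

A notebook without the field simply yields an empty dict, so existing notebooks would be unaffected by tools that look for it.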

There are not many kernel provisioners in the wild, but I don't see anything stopping them from introspecting notebook metadata. I think providing a standard place to do that would foster synergies towards achieving the long-standing goal of self-contained, reproducible notebooks. Thank you for considering!

@davidbrochart
Member

I think that would require a JEP. In the meantime, have you looked at juv?

@itcarroll
Author

Thanks, the juv project is news to me.

Can you say whether reading environment data from notebooks is an intended use of kernel provisioners? Do you think it's worth having that data in a standard location?

@davidbrochart
Member

Hmm, I cannot really tell, since I was not involved in the development of the kernel provisioners. Maybe @kevin-bates can answer?

@kevin-bates
Member

Hi @itcarroll. Kernel provisioners are responsible for provisioning the environment within which the kernel process runs. They are essentially agnostic about the actual notebook being executed. While they do have access to the kernel's specification (kernelspec), there is no mechanism in place for accessing the physical content of the notebook and, frankly, some provisioners may not have access to that in the first place.

This seems very similar to kernel parameterization, in which there would be metadata from within the notebook that is conveyed to provisioners. On first launch, the notebook might be seeded with a set of defaults that are then "customized" via the UI, after which these inputs are persisted with the notebook for future use. Since a notebook could be launched using different provisioners, the provisioners would make a "best effort" in resolving the parameters.
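
The "best effort" resolution described above could look roughly like the following. This is a sketch under stated assumptions, not an existing API: the function name `resolve_parameters` and the parameter names (`cpus`, `memory_gb`, `gpus`) are hypothetical, and a real provisioner might validate or report ignored values differently.

```python
def resolve_parameters(supported_defaults: dict, notebook_params: dict) -> tuple:
    """Merge parameters persisted in a notebook over a provisioner's defaults.

    Parameters the provisioner does not support are not applied; their names
    are returned so the caller can log or surface them.
    """
    resolved = dict(supported_defaults)  # start from the provisioner's defaults
    ignored = []
    for name, value in notebook_params.items():
        if name in supported_defaults:
            resolved[name] = value       # notebook value overrides the default
        else:
            ignored.append(name)         # unknown to this provisioner
    return resolved, sorted(ignored)


# Hypothetical defaults for one provisioner and values saved with a notebook.
defaults = {"cpus": 1, "memory_gb": 2}
persisted = {"memory_gb": 8, "gpus": 1}

resolved, ignored = resolve_parameters(defaults, persisted)
print(resolved)
print(ignored)
```

Here the notebook's `memory_gb` overrides the default, while `gpus` is skipped because this hypothetical provisioner does not support it. The same notebook launched with a different provisioner could resolve a different subset.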
