-
Notifications
You must be signed in to change notification settings - Fork 384
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: top level imports #3779
base: main
Are you sure you want to change the base?
feat: top level imports #3779
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
for more information, see https://pre-commit.ci
@dmadisetti what's the matter with hashing top-level functions in |
@leventov caching is dependent on the runtime of the app. This PR is more to expose cells as usable functions to be exported from other module + some tweaks to make notebooks look more "pythonic" for linters. When marimo first loads this file, no runtime has been initialized. Cell level caching is going to be coupled more with changes in |
Eh. Just noticed import cells without a return are not liked by ruff. That was a bit of a last minute choice to try and clean up the whitespace- I'll put it back in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, getting closer to the ideal of reusable code!
Most of the below is discussion — doesn't need to be immediately addressed, but should be addressed before top level functions are enabled.
But also issues, since a missing dep could stop the notebook from running at all (goes against the "batteries included" ethos).
Yea, this is an issue for sure. As long as the user has marimo
installed, marimo edit nb.py
should always work, no matter if top-level imports are missing.
This can be mitigated with static analysis over just an import (markdown does this for instance), or marimo can re-serialize the notebook in the "safe" form, if it comes across issues in import.
The former option sounds better. I wonder if we should define the file format to consist of three sections:
- A user-defined section, containing arbitrary text (the "header"), except for perhaps a special delimiter token.
- A generated section containing top-level imports, if they are missing from the user-defined section, followed by a special delimiter.
- Today's generated section:
import marimo
__generated_with = ...
app = marimo.App()
@app.function
def foo():
...
@app.cell
def bar():
...
In this way, marimo's Python file reader would simply skip sections (1) and (2) (based on the presence of the delimiter token), and programmatically read section 3 as it does today. If the delimiter were missing (user edited the file, or wrote from scratch), marimo would try to read the file programmatically as it does today. Just one proposal, and maybe this is similar to what you've implemented, but I do think it's worth it to write a specification for this very concretely and to document it in the codebase.
I think we should also very clearly define and document what is okay for the user to edit, and how, and what is not okay. One proposal: section 1 is fine to edit arbitrarily (except for a special delimiter?); section 2 should not be edited; section 3's cell and function definitions can be edited, cells and functions can be added, and cells and functions can be removed.
marimo/_ast/codegen.py
Outdated
if cell.import_workspace.is_import_block: | ||
# maybe a bug, but import_workspace.imported_defs does not | ||
# contain the information we need. | ||
toplevel_imports |= cell.defs | ||
if toplevel_fn: | ||
# TODO: Consider fn="imports" for @app.imports? | ||
# Distinguish that something is special about the block | ||
# Also remove the "return" in this case. | ||
definitions[idx] = to_general_functiondef(cell, names[idx]) | ||
else: | ||
definitions[idx] = to_functiondef(cell, names[idx]) | ||
import_blocks.append(code.strip()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If only import blocks are used, then in the below, foo
won't get saved as a function. I can see this being a bit confusing for users. I'm wondering if imports could be saved top-level even if they weren't in import blocks.
cell:
import random
...
Another cell
def foo():
return random.randint(0, 43)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes they can- but I think restricting to import only blocks makes sense. Consider the following block:
@app.cell
def _(run_button):
mo.stop(run_button.value)
import something_very_expensive_with_side_effects
marimo/_ast/codegen.py
Outdated
# notice to separate the imports from the rest of the code. | ||
filecontents = [NOTICE, ""] | ||
|
||
filecontents.append("\n\n".join(import_blocks)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should imports be added unconditionally, as you've written, or should imports only be added if they are used in top-level functions?
One thought, if imports are added to the top of the file unconditionally, perhaps we should remove their corresponding defs from cell signatures, so that code completion in editors works better. However, maybe the right thing to do is just bite the bullet and write editor plugins / an LSP-like thing that handle completions for marimo notebook files, in which case my suggestion here is moot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, can we ruff format
the import section?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps we should remove their corresponding defs from cell signatures, so that code completion in editors works better
Yep, this PR already does this
Also, can we ruff format the import section?
Yes, I'm leaning towards removing the statement block, stripping comments and formatting the imports.
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that sounds good ... also see my response to your import guard idea.
I was struggling with this because I recognized having the many imports mixed with comments seemed to leave the notebook feeling a little messy, and more confusing to the intro user. I also think that for the most part, the current serialization is great. I wonder if part of the UI is a "library mode" flag which is required before activating this. Means we don't have to communicate this information to the casual user, and the user looking for the functionality of exports, reuse, and linting will take the time to understand "library mode". But also, here's another potential serialization that makes these "sections" a bit more evident: # Header comments
"""Doc strings allowed too"""
import marimo
if marimo.import_guard():
# Note these imports reflect the cell content below.
# Editing this block will not change the notebook imports.
import io
import textwrap
import typing
from pathlib import Path
import marimo as mo
__generated_with = "0.11.2"
app = marimo.App(_toplevel_fn=True)
... Which also mitigates potential breakage, since |
Yea, appreciate your attention to the intro user. Hmm, I'd prefer not to introduce a library mode if possible, but can consider it. As an alternative I think the # Header comments
"""Doc strings allowed too"""
import marimo
if marimo.import_guard():
import numpy as np
__generated_with = "0.11.2"
app = marimo.App(_toplevel_fn=True)
@app.function
def my_function():
return np.random.randn(10, 10)
...
for
```python
from my_notebook import my_function to work, with marimo._ast.block_imports(): # makes import_guard() evaluate to False.
# load the notebook ... Not sure yet if this is a good idea. Just brainstorming ... |
|
📝 Summary
Followup to #3755 for #2293 allowing for "top level imports"
For completion of #2293, I thin UI changes and needed for enabling this behavior. Notably:
+ docs
This also increases security risk since code is run outside of runtime. This was always possible, but now marimo can save in a format that could skip the marimo runtime all together on restart.
There are opportunities here. marimo could lean into this, and leverage external code running as a chance to hook in (almost a plugin system for free)
But also issues, since a missing dep could stop the notebook from running at all (goes against the "batteries included" ethos). This can be mitigated with static analysis over just an import (markdown does this for instance), or marimo can re-serialize the notebook in the "safe" form, if it comes across issues in import.
🔍 Description of Changes
Includes a bit of a refactor to codegen since there were a fair amount of changes.
Allows top level imports of "import only" cells. The contents are pasted at the top of the file, with a bit of care not to break header extraction.
Top level refs (this includes
@app.function
s) are ignored in the signatures. E.g.Since I was also in there, I added static analysis to ignore returning dangling defs.
LMK if too far reaching and we can break it up/ refactor. A bit more opinionated than the last PR
Test border more on being more smoke tests than unit tests, but hit the key issues I was worried about. I can break them down more granularly if needed. Also LMK if you can think of some more edgecases.
📜 Reviewers
@akshayka OR @mscolnick