feat: top level imports #3779

dmadisetti · 2025-02-13T05:08:55Z

📝 Summary

Followup to #3755 for #2293 allowing for "top level imports"

For completion of #2293, I think UI changes are needed to enable this behavior. Notably:

Indicate when in function mode (maybe top level import too)
Provide hints when pushed out of function mode
Maybe allow the user to opt out of function mode?

+ docs

This also increases security risk, since code is run outside of the runtime. This was always possible, but now marimo can save in a format that could skip the marimo runtime altogether on restart.

There are opportunities here. marimo could lean into this, and leverage external code running as a chance to hook in (almost a plugin system for free)

But also issues, since a missing dep could stop the notebook from running at all (goes against the "batteries included" ethos). This can be mitigated with static analysis over just an import (markdown does this for instance), or marimo can re-serialize the notebook in the "safe" form, if it comes across issues in import.
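A minimal sketch of the static-analysis mitigation, assuming the notebook source is available as a string. It collects module-level imports with `ast` without executing anything, then checks availability with `importlib.util.find_spec`. The helper names `toplevel_imports` and `missing_modules` are hypothetical and not part of marimo.

```python
import ast
import importlib.util


def toplevel_imports(source: str) -> list[str]:
    """Return root module names imported at module level, without running the code."""
    modules: list[str] = []
    for node in ast.parse(source).body:
        if isinstance(node, ast.Import):
            modules.extend(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules.append(node.module.split(".")[0])
    return modules


def missing_modules(source: str) -> list[str]:
    """Names from toplevel_imports that cannot be found in this environment."""
    return [m for m in toplevel_imports(source) if importlib.util.find_spec(m) is None]


example = "import json\nimport surely_not_installed_pkg_xyz\nfrom os import path\n"
print(toplevel_imports(example))
print(missing_modules(example))
```

A loader could use a check like this to decide whether to fall back to the "safe" serialization before importing the file.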

🔍 Description of Changes

Includes a bit of a refactor to codegen since there were a fair amount of changes.
Allows top level imports of "import only" cells. The contents are pasted at the top of the file, with a bit of care not to break header extraction.

# Normal headers are retained
# Use a notice to denote where generated imports start
# Notice maybe needs some copy edit

# 👋 This file was generated by marimo. You can edit it, and tweak
# things- just be conscious that some changes may be overwritten if opened in
# the editor. For instance top level imports are derived from a cell, and not
# the top of the script. This notice signifies the beginning of the generated
# import section.

# Could also make this app.imports? But maybe increasing surface area for no reason
import numpy
# Note, import cells intentionally do not have a `return`
# for static analysis feature below

import marimo


__generated_with = "0.11.2"
app = marimo.App(_toplevel_fn=True)


@app.cell
def import_cell():
    # Could also make this app.imports? But maybe increasing surface area for no reason
    import numpy
    # Note, import cells intentionally do not have a `return`
    # for static analysis feature below

Top-level refs (this includes @app.function definitions) are omitted from cell signatures. E.g., mo is not a parameter of md_cell below:

import marimo as mo

# ...

@app.cell
def md_cell():
    mo.md("Hi")
    return
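The signature rule can be sketched as a set difference, assuming each cell's refs and the set of top-level names are already known. `cell_signature` is a hypothetical helper for illustration, not the function used in codegen.py.

```python
def cell_signature(name: str, refs: set[str], toplevel: set[str]) -> str:
    """Build a `def` line whose parameters are the cell's refs minus top-level names."""
    params = sorted(refs - toplevel)
    return f"def {name}({', '.join(params)}):"


# `mo` is imported at the top of the file, so it is dropped from the signature.
print(cell_signature("md_cell", {"mo"}, {"mo", "marimo"}))
print(cell_signature("plot", {"mo", "df"}, {"mo", "marimo"}))
```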

Since I was also in there, I added static analysis to ignore returning dangling defs.

@app.cell
def cell_with_dangling_def():
    a = 1
    b = 2
    return (a,) # No longer returns b since it's not used anywhere. Allowing for linters like ruff to complain.

@app.cell
def ref_cell(a):
    a + 1
    return
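A rough sketch of the dangling-def rule, assuming each cell is summarized by its set of defs and refs. The `returned_defs` helper and the dict layout are assumptions made for illustration only.

```python
def returned_defs(cells: dict[str, dict[str, set[str]]]) -> dict[str, tuple[str, ...]]:
    """For each cell, keep only the defs that some other cell references."""
    result = {}
    for name, cell in cells.items():
        used_elsewhere: set[str] = set()
        for other, other_cell in cells.items():
            if other != name:
                used_elsewhere |= other_cell["refs"]
        result[name] = tuple(sorted(cell["defs"] & used_elsewhere))
    return result


cells = {
    "cell_with_dangling_def": {"defs": {"a", "b"}, "refs": set()},
    "ref_cell": {"defs": set(), "refs": {"a"}},
}
print(returned_defs(cells))
```

Here `b` is dropped from the return tuple because no other cell refers to it.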

LMK if too far reaching and we can break it up/ refactor. A bit more opinionated than the last PR

The tests border on being smoke tests rather than unit tests, but they hit the key issues I was worried about. I can break them down more granularly if needed. Also LMK if you can think of more edge cases.

📜 Reviewers

@akshayka OR @mscolnick

vercel · 2025-02-13T05:09:00Z

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name	Status	Preview	Comments	Updated (UTC)
marimo-docs	🛑 Canceled (Inspect)			Feb 14, 2025 0:31am
marimo-storybook	✅ Ready (Inspect)	Visit Preview	💬 Add feedback	Feb 14, 2025 0:31am

leventov · 2025-02-13T09:13:13Z

@dmadisetti what's the matter with hashing top-level functions in hash.py, Hasher class and surrounding logic? Will it be treated as "normal" top-level/"pure" functions whose code is hashed in the module_hash calculation, with no special treatment needed? Or maybe serialize_and_dequeue_content_refs() can add a check if the function is app.fn for fast-track instead of calling is_pure_function()? Or, is_pure_function() should be changed itself to add such a fast-track? Should/could the code hash of app.fns be saved in a field of the corresponding Cell such that it doesn't need to be re-computed?

dmadisetti · 2025-02-13T15:16:21Z

@leventov caching is dependent on the runtime of the app. This PR is more about exposing cells as usable functions that can be imported from other modules, plus some tweaks to make notebooks look more "pythonic" for linters. When marimo first loads this file, no runtime has been initialized.

Cell level caching is going to be coupled more with changes in _runtime.executor

dmadisetti · 2025-02-13T15:19:29Z

Eh. Just noticed import cells without a return are not liked by ruff. That was a bit of a last minute choice to try and clean up the whitespace- I'll put it back in

akshayka

Nice, getting closer to the ideal of reusable code!

Most of the below is discussion — doesn't need to be immediately addressed, but should be addressed before top level functions are enabled.

> But also issues, since a missing dep could stop the notebook from running at all (goes against the "batteries included" ethos).

Yea, this is an issue for sure. As long as the user has marimo installed, marimo edit nb.py should always work, no matter if top-level imports are missing.

> This can be mitigated with static analysis over just an import (markdown does this for instance), or marimo can re-serialize the notebook in the "safe" form, if it comes across issues in import.

The former option sounds better. I wonder if we should define the file format to consist of three sections:

1. A user-defined section, containing arbitrary text (the "header"), except for perhaps a special delimiter token.
2. A generated section containing top-level imports, if they are missing from the user-defined section, followed by a special delimiter.
3. Today's generated section:

import marimo

__generated_with = ...
app = marimo.App()

@app.function
def foo():
  ...

@app.cell
def bar():
  ...

In this way, marimo's Python file reader would simply skip sections (1) and (2) (based on the presence of the delimiter token), and programmatically read section 3 as it does today. If the delimiter were missing (user edited the file, or wrote from scratch), marimo would try to read the file programmatically as it does today. Just one proposal, and maybe this is similar to what you've implemented, but I do think it's worth it to write a specification for this very concretely and to document it in the codebase.
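A minimal sketch of how a reader might split the three proposed sections. The delimiter string and the `split_sections` helper are hypothetical; the thread does not fix a concrete token.

```python
DELIMITER = "# ---- marimo: end of section ----"  # hypothetical token


def split_sections(text: str) -> tuple[str, str, str]:
    """Split a file into (header, generated imports, notebook body).

    If fewer than two delimiters are present, the whole file is treated as
    the body, mirroring the fallback of reading the file as it is read today.
    """
    parts = text.split(DELIMITER + "\n", 2)
    if len(parts) < 3:
        return "", "", text
    header, imports, body = parts
    return header, imports, body


text = (
    '"""My notebook."""\n'
    + DELIMITER + "\n"
    + "import numpy\n"
    + DELIMITER + "\n"
    + "import marimo\napp = marimo.App()\n"
)
header, imports, body = split_sections(text)
print(repr(imports))
```

Limiting the split to two delimiters means any later occurrence of the token stays inside the body rather than creating extra sections.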

I think we should also very clearly define and document what is okay for the user to edit, and how, and what is not okay. One proposal: section 1 is fine to edit arbitrarily (except for a special delimiter?); section 2 should not be edited; section 3's cell and function definitions can be edited, cells and functions can be added, and cells and functions can be removed.

akshayka · 2025-02-14T00:14:48Z

marimo/_ast/codegen.py

+            if cell.import_workspace.is_import_block:
+                # maybe a bug, but import_workspace.imported_defs does not
+                # contain the information we need.
+                toplevel_imports |= cell.defs
+                if toplevel_fn:
+                    # TODO: Consider fn="imports" for @app.imports?
+                    # Distinguish that something is special about the block
+                    # Also remove the "return" in this case.
+                    definitions[idx] = to_general_functiondef(cell, names[idx])
+                else:
+                    definitions[idx] = to_functiondef(cell, names[idx])
+                import_blocks.append(code.strip())


If only import blocks are used, then in the below, foo won't get saved as a function. I can see this being a bit confusing for users. I'm wondering if imports could be saved top-level even if they weren't in import blocks.

Cell:

import random
...

Another cell:

def foo():
    return random.randint(0, 43)

Yes, they can, but I think restricting to import-only blocks makes sense. Consider the following block:

@app.cell
def _(run_button):
    mo.stop(run_button.value)
    import something_very_expensive_with_side_effects
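One way to express the "import only" restriction is a small `ast` check. This is a sketch under the assumption that a cell's code is available as a string; `is_import_only` is a hypothetical name, and the PR itself relies on `cell.import_workspace.is_import_block`.

```python
import ast


def is_import_only(code: str) -> bool:
    """True if the cell has at least one statement and all of them are imports."""
    body = ast.parse(code).body
    return bool(body) and all(isinstance(n, (ast.Import, ast.ImportFrom)) for n in body)


print(is_import_only("import random\nfrom pathlib import Path"))
guarded = "mo.stop(run_button.value)\nimport something_very_expensive_with_side_effects"
print(is_import_only(guarded))
```

The guarded cell is rejected, so its expensive import would stay inside the cell and behind the `mo.stop` call.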

akshayka · 2025-02-14T00:16:49Z

marimo/_ast/codegen.py

+            # notice to separate the imports from the rest of the code.
+            filecontents = [NOTICE, ""]
+
+    filecontents.append("\n\n".join(import_blocks))


Should imports be added unconditionally, as you've written, or should imports only be added if they are used in top-level functions?

One thought, if imports are added to the top of the file unconditionally, perhaps we should remove their corresponding defs from cell signatures, so that code completion in editors works better. However, maybe the right thing to do is just bite the bullet and write editor plugins / an LSP-like thing that handle completions for marimo notebook files, in which case my suggestion here is moot.

Also, can we ruff format the import section?

> perhaps we should remove their corresponding defs from cell signatures, so that code completion in editors works better

Yep, this PR already does this

> Also, can we ruff format the import section?

Yes, I'm leaning towards removing the statement block, stripping comments and formatting the imports.
Thoughts?
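A rough sketch of stripping comments and normalizing the import blocks, assuming each block is a string of cell code. `normalize_imports` is a hypothetical helper, and the plain alphabetical sort only stands in for a real formatter such as ruff.

```python
import ast


def normalize_imports(import_blocks: list[str]) -> str:
    """Drop comments and non-import statements, dedupe, and sort import lines."""
    lines: set[str] = set()
    for block in import_blocks:
        for node in ast.parse(block).body:
            if isinstance(node, (ast.Import, ast.ImportFrom)):
                lines.add(ast.unparse(node))  # unparse discards comments
    return "\n".join(sorted(lines))


blocks = [
    "# numerics\nimport numpy\nimport io",
    "import io  # duplicate\nfrom pathlib import Path",
]
print(normalize_imports(blocks))
```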

I think that sounds good ... also see my response to your import guard idea.

dmadisetti · 2025-02-14T02:28:17Z

> I think we should also very clearly define and document what is okay for the user to edit, and how, and what is not okay. One proposal: section 1 is fine to edit arbitrarily (except for a special delimiter?); section 2 should not be edited; section 3's cell and function definitions can be edited, cells and functions can be added, and cells and functions can be removed.

I was struggling with this because I recognized having the many imports mixed with comments seemed to leave the notebook feeling a little messy, and more confusing to the intro user. I also think that for the most part, the current serialization is great.

I wonder if part of the UI is a "library mode" flag which is required before activating this. Means we don't have to communicate this information to the casual user, and the user looking for the functionality of exports, reuse, and linting will take the time to understand "library mode".

But also, here's another potential serialization that makes these "sections" a bit more evident:

# Header comments
"""Doc strings allowed too"""

import marimo

if marimo.import_guard():
    # Note these imports reflect the cell content below.
    # Editing this block will not change the notebook imports.
    import io
    import textwrap
    import typing
    from pathlib import Path

    import marimo as mo


__generated_with = "0.11.2"
app = marimo.App(_toplevel_fn=True)

...

Which also mitigates potential breakage, since marimo.import_guard() could always return False, and still keep linters happy.
I'm sold on reformatting the imports and stripping comments before serialization.

akshayka · 2025-02-14T04:55:39Z

Yea, appreciate your attention to the intro user.

Hmm, I'd prefer not to introduce a library mode if possible, but can consider it. As an alternative I think the import_guard() idea is interesting. But it would need to return True sometimes right? For example, given

# Header comments
"""Doc strings allowed too"""

import marimo

if marimo.import_guard():
  import numpy as np


__generated_with = "0.11.2"
app = marimo.App(_toplevel_fn=True)

@app.function
def my_function():
  return np.random.randn(10, 10)
...

for

```python
from my_notebook import my_function
```

to work, import_guard() would need to evaluate as True. Maybe import_guard would be True by default, but perhaps when reading notebook files in marimo, we'd have a context manager:

with marimo._ast.block_imports():  # makes import_guard() evaluate to False.
  # load the notebook ...

Not sure yet if this is a good idea. Just brainstorming ...
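To make the brainstorm concrete, here is a minimal sketch of how a guard and a blocking context manager could interact, assuming a `ContextVar` holds the state. Both `import_guard` and `block_imports` are written from scratch for illustration and are not marimo's implementation.

```python
from contextlib import contextmanager
from contextvars import ContextVar

_imports_blocked: ContextVar[bool] = ContextVar("imports_blocked", default=False)


def import_guard() -> bool:
    """True by default so that `from my_notebook import my_function` works."""
    return not _imports_blocked.get()


@contextmanager
def block_imports():
    """Make import_guard() evaluate to False while a notebook file is being loaded."""
    token = _imports_blocked.set(True)
    try:
        yield
    finally:
        _imports_blocked.reset(token)


print(import_guard())
with block_imports():
    print(import_guard())
print(import_guard())
```

Resetting the token in `finally` restores the previous state even if loading the notebook raises.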

dmadisetti · 2025-02-14T12:29:02Z

import_guard was relatively easy to put in, and we can strip it out. I have import_guard return True for now, but there are a few cases where False might make sense.

feat: top level imports (72a1643)
[pre-commit.ci] auto fixes from pre-commit.com hooks (022e799)

dmadisetti added 3 commits February 13, 2025 00:32

fix: nvm imports _should_ have returns (314717a)
Merge branch 'main' of github:marimo-team/marimo into dm/toplevel-imports (14c50f4)
Merge branch 'dm/toplevel-imports' of github:marimo-team/marimo into dm/toplevel-imports (042ce29)

mscolnick reviewed Feb 13, 2025

akshayka reviewed Feb 14, 2025

temp: import_guard relatively easy to put in (7db678f)

dmadisetti and others added 2 commits February 14, 2025 07:27

Merge branch 'main' into dm/toplevel-imports (8d94f03)
[pre-commit.ci] auto fixes from pre-commit.com hooks (8281b78)
