Skip to content

Conversation

@jpopesculian
Copy link
Contributor

@jpopesculian jpopesculian commented Oct 7, 2025

Summary by CodeRabbit

  • New Features

    • Added a "dataset marimo" command and corresponding notebook/template to create Marimo datasets with SQL editor and data initialization.
    • Included templated project scaffolding and runtime helpers for validated dataset creation.
  • Chores

    • Added Marimo environment configuration defaults (editor, LSPs, formatting, runtime).
    • Improved Docker build and release workflow, added bake configuration and optimized ignore rules for builds.

✏️ Tip: You can customize this high-level summary in your review settings.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

@jpopesculian jpopesculian force-pushed the feat-kubimo-template branch 7 times, most recently from a9fbf6d to d3b55f9 Compare October 14, 2025 15:53
@jpopesculian jpopesculian force-pushed the feat-kubimo-template branch 2 times, most recently from 36f357b to 65da805 Compare October 21, 2025 11:48
@coderabbitai
Copy link

coderabbitai bot commented Nov 21, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 44.44% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat: add template for kubimo' accurately reflects the primary change: introducing a new Kubimo template with associated Docker configuration, Rust CLI commands, and template files.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat-kubimo-template

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (6)
docker/kubimo/marimo/marimo.toml (1)

40-40: Consider increasing line length for better readability.

The line length is set to 79, which follows PEP 8's historical limit. Modern Python formatters like Black default to 88, and many projects use 100+ for better code readability without excessive wrapping.

If this aligns with your project's style guidelines, consider:

-line_length = 79
+line_length = 88
docker/kubimo/marimo/marimo.css (1)

1-1: Consider self-hosting fonts or documenting the external dependency.

Loading fonts from Google's CDN introduces an external dependency that:

  • May have privacy implications (user IP addresses sent to Google)
  • Could fail if the CDN is unavailable
  • Requires internet connectivity

Consider one of these approaches:

  1. Self-host the fonts within the Docker image
  2. Document this external dependency in the README
  3. Ensure offline functionality with appropriate fallback fonts (currently handled by the sans-serif and monospace fallbacks)
template/assets/dataset_marimo/pyproject.toml.hbs (2)

7-8: Consider using more restrictive version constraints.

The >= constraints allow any future version including major version upgrades, which could introduce breaking changes. Consider using ~= (compatible release) or >=,<major+1 constraints for better stability.

Example:

 dependencies = [
-  "marimo[recommended,lsp]>={{ marimo_version }}",
-  "aqora-cli[pyarrow]>={{ cli_version }}",
+  "marimo[recommended,lsp]~={{ marimo_version }}",
+  "aqora-cli[pyarrow]~={{ cli_version }}",
 ]

Note: ~= allows patch and minor version updates but blocks major version changes (e.g., ~=1.2.3 allows >=1.2.3,<2.0.0).


12-12: Consider pinning dev dependency versions for reproducibility.

The dev dependencies ty and watchdog lack version constraints, which could lead to non-reproducible builds when these packages are updated.

Consider adding minimum versions or using a lock file mechanism:

-dev = ["ty", "watchdog"]
+dev = ["ty>=0.6", "watchdog>=3.0"]
template/src/utils.rs (1)

3-59: Slug/username validators currently accept empty strings

is_slug and is_username both use * in their regexes:

static ref SLUG_REGEX: Regex = Regex::new(r"^[-a-zA-Z0-9_]*$").unwrap();
static ref USERNAME_REGEX: Regex = Regex::new(r"^[a-zA-Z0-9_]*$").unwrap();

This means "" is treated as a valid slug/username. Callers like UseCaseTemplate and DatasetMarimoTemplate already ensure the options are present, but a user can still pass an empty string and have it accepted.

If you want to enforce non‑empty identifiers, consider tightening these to require at least one character (or adding an explicit emptiness check), e.g.:

-        static ref SLUG_REGEX: Regex = Regex::new(r"^[-a-zA-Z0-9_]*$").unwrap();
+        static ref SLUG_REGEX: Regex = Regex::new(r"^[-a-zA-Z0-9_]+$").unwrap();
...
-        static ref USERNAME_REGEX: Regex = Regex::new(r"^[a-zA-Z0-9_]*$").unwrap();
+        static ref USERNAME_REGEX: Regex = Regex::new(r"^[a-zA-Z0-9_]+$").unwrap();

leaving the assert_* helpers unchanged.

template/assets/dataset_marimo/readme.py.hbs (1)

9-9: Remove redundant assignment.

The assignment _ = data serves no purpose since data is already available as a function parameter and is implicitly used by the SQL query (when referencing data table in DuckDB context).

 @app.cell(hide_code=True)
 def _(data, mo, sql_editor):
-    _ = data
     _df = mo.sql(
         f"""
         SET disabled_filesystems = 'LocalFileSystem';
         {sql_editor.value}
         """
     )
     return
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3df77d3 and aa67f48.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (19)
  • .dockerignore (1 hunks)
  • .github/workflows/cd.yaml (1 hunks)
  • Dockerfile (0 hunks)
  • docker-bake.dev.hcl (1 hunks)
  • docker-bake.hcl (1 hunks)
  • docker/Dockerfile.kubimo (1 hunks)
  • docker/kubimo/marimo/marimo.css (1 hunks)
  • docker/kubimo/marimo/marimo.toml (1 hunks)
  • src/commands/new/dataset_marimo.rs (1 hunks)
  • src/commands/new/mod.rs (2 hunks)
  • src/graphql/get_latest_dataset_version.graphql (1 hunks)
  • template/assets/dataset_marimo/.gitignore.hbs (1 hunks)
  • template/assets/dataset_marimo/pyproject.toml.hbs (1 hunks)
  • template/assets/dataset_marimo/readme.py.hbs (1 hunks)
  • template/src/dataset_marimo.rs (1 hunks)
  • template/src/lib.rs (1 hunks)
  • template/src/registry.rs (3 hunks)
  • template/src/use_case.rs (2 hunks)
  • template/src/utils.rs (1 hunks)
💤 Files with no reviewable changes (1)
  • Dockerfile
🧰 Additional context used
🧬 Code graph analysis (5)
template/src/dataset_marimo.rs (2)
template/src/utils.rs (3)
  • assert_semver (22-27)
  • assert_slug (38-43)
  • assert_username (54-59)
template/src/use_case.rs (4)
  • builder (27-29)
  • validate (37-46)
  • render (31-33)
  • render (48-52)
src/commands/new/dataset_marimo.rs (3)
src/commands/global_args.rs (1)
  • graphql_client (165-174)
src/error.rs (1)
  • format_permission_error (89-97)
template/src/dataset_marimo.rs (1)
  • builder (34-36)
template/src/lib.rs (2)
src/commands/new/dataset_marimo.rs (1)
  • dataset_marimo (31-91)
template/src/registry.rs (1)
  • handlebars (83-84)
template/src/use_case.rs (1)
template/src/utils.rs (3)
  • assert_no_control_chars (67-72)
  • assert_semver (22-27)
  • assert_slug (38-43)
src/commands/new/mod.rs (1)
src/commands/new/dataset_marimo.rs (2)
  • dataset_marimo (31-91)
  • global (60-63)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
  • GitHub Check: Build and publish npm package
  • GitHub Check: Test WASM (chrome)
  • GitHub Check: macos (aarch64)
  • GitHub Check: Build and Push Docker Image
  • GitHub Check: Test Workspace
  • GitHub Check: Test WASM (safari)
  • GitHub Check: linux (aarch64)
  • GitHub Check: linux (x86_64)
  • GitHub Check: macos (x86_64)
  • GitHub Check: windows (x64)
  • GitHub Check: linux (x86)
  • GitHub Check: Test WASM (firefox)
  • GitHub Check: windows (x86)
🔇 Additional comments (17)
template/assets/dataset_marimo/.gitignore.hbs (1)

1-18: LGTM! Comprehensive and well-organized gitignore template.

The ignore patterns appropriately cover Python build artifacts, virtual environments, and marimo/aqora-specific directories. The file is well-commented with logical grouping for maintainability.

src/graphql/get_latest_dataset_version.graphql (1)

1-7: LGTM!

The GraphQL query is well-structured with appropriate non-null constraints on the input parameters and a focused selection set that retrieves only the necessary version information.

.dockerignore (1)

1-6: LGTM!

The ignore patterns appropriately exclude common build artifacts, environment files, and Python-related outputs from the Docker build context, which helps reduce image size and prevent leaking sensitive information.

src/commands/new/mod.rs (1)

1-25: LGTM!

The integration of the DatasetMarimo subcommand follows the existing pattern established by the UseCase subcommand. The subcommand is appropriately hidden, suggesting it's experimental or intended for internal use.

docker/kubimo/marimo/marimo.css (1)

3-17: LGTM!

The CSS variables and conditional layout rules are well-structured. The use of attribute selectors for conditional padding based on data-config-width and data-command attributes provides flexible layout control.

template/src/registry.rs (2)

65-72: LGTM!

The indent helper is well-implemented with proper handling of edge cases (empty strings via unwrap_or("")). The logic correctly preserves the first line's indentation while applying the indent string to subsequent lines.


87-87: LGTM!

The helper registration follows the same pattern as the existing toml_val helper.

docker-bake.dev.hcl (1)

1-7: LGTM!

The HCL configuration correctly defines a TAG variable and docker-metadata-kubimo target for development builds. The variable interpolation syntax is appropriate, and this provides a clean override mechanism for the bake workflow.

docker-bake.hcl (1)

1-11: Bake targets wired correctly for kubimo image

docker-metadata-kubimo as a placeholder plus kubimo inheriting from it matches the metadata-action + bake-action pattern, and group "default" targeting only kubimo keeps the release scope clear. No changes needed here.

template/src/lib.rs (1)

1-4: New template exports look consistent

Exposing dataset_marimo, utils, and re-exporting DatasetMarimoTemplate alongside UseCaseTemplate and RenderError gives a coherent public API for template consumers. No issues from this file alone.

Also applies to: 8-10

template/src/use_case.rs (1)

8-8: Validation refactor to shared helpers looks good

Routing python/CLI version checks through assert_semver and title/competition checks through assert_no_control_chars/assert_slug keeps this builder lean and consistent with other templates. No functional issues spotted here.

Also applies to: 36-45

template/src/dataset_marimo.rs (1)

1-67: DatasetMarimoTemplate builder invariants match CLI usage

The builder setup and validate logic correctly enforce the intended invariants:

  • All version fields (python_version, cli_version, marimo_version) go through assert_semver.
  • When raw_init is absent, owner, local_slug, and version are required and validated via assert_username/assert_slug/assert_semver.
  • When raw_init is present, those fields remain optional, matching the CLI’s --raw vs --slug modes.

OptionExt::flat_ref is a neat way to deal with Option<Option<_>> from strip_option. Overall, this template looks solid.

template/assets/dataset_marimo/readme.py.hbs (5)

1-4: LGTM!

The imports and app configuration are correct. The __generated_with metadata will help track which template version generated the notebook.


19-23: LGTM!

The SQL editor widget is properly configured with a sensible default query and 1-second debounce. The cell correctly displays the widget and returns it for use in dependent cells.


26-34: LGTM!

The data initialization cell correctly handles both custom initialization (via raw_init) and default dataset loading. The conditional template logic and the return statement properly integrate with marimo's reactive execution model.


37-40: LGTM!

The marimo import cell follows the standard pattern for making marimo's API available throughout the notebook as mo.


43-44: LGTM!

The main execution block follows the standard Python pattern for running marimo apps.

Comment on lines +141 to 165
- name: Docker meta kubimo
id: meta-kubimo
uses: docker/metadata-action@v5
with:
images: ghcr.io/aqora-io/cli
flavor: |
latest=false
images: ghcr.io/aqora-io/cli-kubimo
bake-target: docker-metadata-kubimo
tags: |
type=schedule
type=ref,event=branch,prefix=${{ env.TAG_PREFIX }}branch-
type=ref,event=tag,prefix=${{ env.TAG_PREFIX }}tag-
type=ref,event=pr,prefix=${{ env.TAG_PREFIX }}pr-
type=raw,value=latest,prefix=${{ env.TAG_PREFIX }},enable={{is_default_branch}}
- name: Docker build and push
uses: docker/build-push-action@v5
with:
push: ${{ github.event_name != 'pull_request' }}
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
cache-from: type=gha
cache-to: type=gha,mode=max
type=ref,event=branch
type=ref,event=pr
type=semver,pattern={{version}}
type=semver,pattern={{major}}.{{minor}}
type=semver,pattern={{major}}
- name: Build and push
uses: docker/bake-action@v6
with:
set: |
kubimo*.cache-from=type=registry,ref=ghcr.io/aqora-io/cli-kubimo:latest
*.cache-to=type=inline
files: |
./docker-bake.hcl
cwd://${{ steps.meta-kubimo.outputs.bake-file }}
# TODO: change before merging
# push: ${{ github.event_name != 'pull_request' }}
push: true

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Avoid unconditional image pushes from all events

The metadata + bake wiring for the kubimo image looks good now, but:

          # TODO: change before merging
          # push: ${{ github.event_name != 'pull_request' }}
          push: true

With push: true, this job will attempt to push images even for pull_request events, which is usually undesirable (and may fail for forked PRs due to permissions). Recommend switching back to an event-gated push before merging, e.g.:

-          # TODO: change before merging
-          # push: ${{ github.event_name != 'pull_request' }}
-          push: true
+          push: ${{ github.event_name != 'pull_request' }}

so PRs still build but only non‑PR refs publish images.

🤖 Prompt for AI Agents
.github/workflows/cd.yaml lines 141-165: current bake step unconditionally sets
push: true which causes image pushes on pull_request events; change the push
setting to be event-gated (for example set push to ${{ github.event_name !=
'pull_request' }} or only push for branch refs like main/release) so PRs still
build but images are only published for non-PR events or targeted branches.

Comment on lines 1 to 25
FROM ghcr.io/aqora-io/kubimo-marimo:main AS build

USER 0
RUN curl https://sh.rustup.rs -sSf | bash -s -- -y --no-update-default-toolchain
ENV PATH="/root/.cargo/bin:${PATH}"

WORKDIR /build
COPY rust-toolchain .
RUN rustc --version
COPY . .
RUN --mount=type=cache,target=/usr/local/cargo/registry,sharing=locked \
--mount=type=cache,target=/usr/local/cargo/git,sharing=locked \
--mount=type=cache,target=target,sharing=locked \
cargo build --release \
&& cp target/release/aqora /bin/aqora

WORKDIR /workspace
RUN aqora new dataset-marimo user/workspace -v 0.0.0 . && rm readme.py

FROM ghcr.io/aqora-io/kubimo-marimo:main

COPY --chown=1000:1000 ./docker/kubimo/marimo /home/me/.config/marimo
COPY --from=build /bin/aqora /bin/aqora
COPY --chown=1000:1000 --from=build /workspace/* .
RUN uv venv && uv sync
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Ensure workspace copy preserves dotfiles and runs in a well-defined working directory

Two things to tighten up here:

  1. COPY --chown=1000:1000 --from=build /workspace/* . will not copy dotfiles (e.g. .gitignore, .marimo, config directories) and only matches top-level entries. That’s likely to break the generated workspace layout.

  2. The final stage relies on the base image’s default WORKDIR for both the COPY ... . destination and uv venv && uv sync, which makes behavior dependent on the base image and less explicit.

Consider something like:

-FROM ghcr.io/aqora-io/kubimo-marimo:main
-
-COPY --chown=1000:1000 ./docker/kubimo/marimo /home/me/.config/marimo
-COPY --from=build /bin/aqora /bin/aqora
-COPY --chown=1000:1000 --from=build /workspace/* .
-RUN uv venv && uv sync
+FROM ghcr.io/aqora-io/kubimo-marimo:main
+
+WORKDIR /workspace
+COPY --chown=1000:1000 ./docker/kubimo/marimo /home/me/.config/marimo
+COPY --from=build /bin/aqora /bin/aqora
+COPY --chown=1000:1000 --from=build /workspace/ .
+RUN uv venv && uv sync

This preserves dotfiles and nested content from /workspace and makes the venv setup run in the intended project root.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
FROM ghcr.io/aqora-io/kubimo-marimo:main AS build
USER 0
RUN curl https://sh.rustup.rs -sSf | bash -s -- -y --no-update-default-toolchain
ENV PATH="/root/.cargo/bin:${PATH}"
WORKDIR /build
COPY rust-toolchain .
RUN rustc --version
COPY . .
RUN --mount=type=cache,target=/usr/local/cargo/registry,sharing=locked \
--mount=type=cache,target=/usr/local/cargo/git,sharing=locked \
--mount=type=cache,target=target,sharing=locked \
cargo build --release \
&& cp target/release/aqora /bin/aqora
WORKDIR /workspace
RUN aqora new dataset-marimo user/workspace -v 0.0.0 . && rm readme.py
FROM ghcr.io/aqora-io/kubimo-marimo:main
COPY --chown=1000:1000 ./docker/kubimo/marimo /home/me/.config/marimo
COPY --from=build /bin/aqora /bin/aqora
COPY --chown=1000:1000 --from=build /workspace/* .
RUN uv venv && uv sync
FROM ghcr.io/aqora-io/kubimo-marimo:main AS build
USER 0
RUN curl https://sh.rustup.rs -sSf | bash -s -- -y --no-update-default-toolchain
ENV PATH="/root/.cargo/bin:${PATH}"
WORKDIR /build
COPY rust-toolchain .
RUN rustc --version
COPY . .
RUN --mount=type=cache,target=/usr/local/cargo/registry,sharing=locked \
--mount=type=cache,target=/usr/local/cargo/git,sharing=locked \
--mount=type=cache,target=target,sharing=locked \
cargo build --release \
&& cp target/release/aqora /bin/aqora
WORKDIR /workspace
RUN aqora new dataset-marimo user/workspace -v 0.0.0 . && rm readme.py
FROM ghcr.io/aqora-io/kubimo-marimo:main
WORKDIR /workspace
COPY --chown=1000:1000 ./docker/kubimo/marimo /home/me/.config/marimo
COPY --from=build /bin/aqora /bin/aqora
COPY --chown=1000:1000 --from=build /workspace/ .
RUN uv venv && uv sync
🤖 Prompt for AI Agents
In docker/Dockerfile.kubimo around lines 1-25, the final-stage COPY uses
"/workspace/*" which skips dotfiles and nested content and the stage relies on
the base image's default WORKDIR; change to set an explicit WORKDIR for the
final stage (e.g., WORKDIR /home/me/workspace or /app) before copying, and
replace the COPY with a recursive dotfile-preserving form (copy the entire
directory contents, e.g., COPY --chown=1000:1000 --from=build /workspace/. .) so
hidden files and subdirectories are preserved and the subsequent RUN uv venv &&
uv sync executes in the intended project root.

code_editor_font_size = 14
dataframes = "rich"
reference_highlighting = false
custom_css = ["/home/me/.config/marimo/marimo.css"]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Fix hardcoded user-specific path.

The custom CSS path /home/me/.config/marimo/marimo.css is hardcoded with a specific username that won't work in containerized or multi-user environments.

Consider one of these approaches:

  • Use a container-relative path like /home/marimo/.config/marimo/marimo.css or /app/config/marimo.css
  • Use an environment variable for the home directory: ${HOME}/.config/marimo/marimo.css
  • Place the CSS file in a standard location that's consistent across deployments

Apply this diff to use a more generic path:

-custom_css = ["/home/me/.config/marimo/marimo.css"]
+custom_css = ["/home/marimo/.config/marimo/marimo.css"]
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
custom_css = ["/home/me/.config/marimo/marimo.css"]
custom_css = ["/home/marimo/.config/marimo/marimo.css"]
🤖 Prompt for AI Agents
In docker/kubimo/marimo/marimo.toml at line 70, the custom_css entry uses a
hardcoded user path (/home/me/...) which will fail in containerized or
multi-user environments; change it to a generic, deployable path such as
${HOME}/.config/marimo/marimo.css or a container-relative path like
/app/config/marimo.css (or /home/marimo/.config/marimo/marimo.css) and update
any container mounts or startup env to ensure HOME or the chosen directory
exists and contains marimo.css; if TOML parsing in this app doesn't expand
${HOME}, prefer the fixed container path (/app/config/marimo.css) and document
or mount that file in your Docker setup.

Comment on lines +31 to +90
pub async fn dataset_marimo(args: DatasetMarimo, global: GlobalArgs) -> Result<()> {
let pb = global
.spinner()
.with_message("Creating dataset marimo notebook");

let mut builder = DatasetMarimoTemplate::builder();

let dest = if let Some(raw) = args.raw_init {
builder.raw_init(raw);
args.dest.ok_or_else(|| {
error::user(
"Destination must be provided with raw",
"Please provide --dest",
)
})?
} else {
let slug = args.slug.ok_or_else(|| {
error::user(
"Slug or raw initializer must be provided",
"Please provide either --slug or --raw",
)
})?;
let (owner, local_slug) = slug.split_once('/').ok_or_else(|| {
error::user("Malformed slug", "Expected a slug like: {owner}/{dataset}")
})?;

let version = if let Some(version) = args.version {
version
} else {
global
.graphql_client()
.await?
.send::<GetLatestDatasetVersion>(get_latest_dataset_version::Variables {
owner: owner.to_string(),
local_slug: local_slug.to_string(),
})
.await?
.dataset_by_slug
.and_then(|d| d.latest_version)
.map(|v| v.version)
.ok_or_else(|| {
error::user(
"No version found",
"Please publish a version or specify a draft version",
)
})?
};
builder.owner(owner).local_slug(local_slug).version(version);
args.dest.unwrap_or_else(|| PathBuf::from(local_slug))
};

builder
.render(&dest)
.map_err(|e| format_permission_error("create dataset-marimo", &dest, &e))?;

pb.finish_with_message(format!(
"Created dataset marimo noteboook in '{}'",
dest.display()
));
Ok(())
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix typo in success message

The main flow (argument handling, GraphQL lookup, template rendering, and error mapping) looks good. There’s just a small user-facing typo in the final message:

pb.finish_with_message(format!(
-    "Created dataset marimo noteboook in '{}'",
+    "Created dataset marimo notebook in '{}'",
    dest.display()
));

Worth fixing to keep CLI output polished.

🤖 Prompt for AI Agents
In src/commands/new/dataset_marimo.rs around lines 31 to 90, the final spinner
success message contains a misspelling "noteboook"; update the format string
passed to pb.finish_with_message to read "Created dataset marimo notebook in
'{}'" (replace "noteboook" with "notebook") so the CLI output is correct.

Comment on lines +7 to +16
@app.cell(hide_code=True)
def _(data, mo, sql_editor):
_ = data
_df = mo.sql(
f"""
SET disabled_filesystems = 'LocalFileSystem';
{sql_editor.value}
"""
)
return
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

SQL query results are not displayed or returned.

The query result _df is computed but never displayed to the user. In marimo, variables must be either referenced without assignment (to display inline) or returned from the cell to make them available. Currently, users will execute SQL queries but see no output.

Apply this diff to display the query results:

 @app.cell(hide_code=True)
 def _(data, mo, sql_editor):
     _ = data
     _df = mo.sql(
         f"""
         SET disabled_filesystems = 'LocalFileSystem';
         {sql_editor.value}
         """
     )
-    return
+    return (_df,)

Alternatively, if you prefer inline display without returning:

 @app.cell(hide_code=True)
 def _(data, mo, sql_editor):
     _ = data
-    _df = mo.sql(
+    mo.sql(
         f"""
         SET disabled_filesystems = 'LocalFileSystem';
         {sql_editor.value}
         """
     )
     return
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
@app.cell(hide_code=True)
def _(data, mo, sql_editor):
_ = data
_df = mo.sql(
f"""
SET disabled_filesystems = 'LocalFileSystem';
{sql_editor.value}
"""
)
return
@app.cell(hide_code=True)
def _(data, mo, sql_editor):
_ = data
_df = mo.sql(
f"""
SET disabled_filesystems = 'LocalFileSystem';
{sql_editor.value}
"""
)
return (_df,)
Suggested change
@app.cell(hide_code=True)
def _(data, mo, sql_editor):
_ = data
_df = mo.sql(
f"""
SET disabled_filesystems = 'LocalFileSystem';
{sql_editor.value}
"""
)
return
@app.cell(hide_code=True)
def _(data, mo, sql_editor):
_ = data
mo.sql(
f"""
SET disabled_filesystems = 'LocalFileSystem';
{sql_editor.value}
"""
)
return
🤖 Prompt for AI Agents
In template/assets/dataset_marimo/readme.py.hbs around lines 7 to 16, the SQL
result is assigned to _df but never surfaced to the user; update the cell to
either return _df at the end of the function or reference _df (remove the
assignment) so Marimo will display it inline—specifically, after executing
mo.sql(...) either add "return _df" or change the assignment so the cell's last
expression is the dataframe itself (so it's rendered).

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
.github/workflows/cd.yaml (1)

153-165: Address the TODO: gate image pushes to non-PR events.

The workflow still has push: true unconditionally enabled. This will push images on pull request events, which is typically undesirable and may fail on forked PRs.

As noted in previous review comments, change to:

push: ${{ github.event_name != 'pull_request' }}
src/commands/new/dataset_marimo.rs (1)

86-89: Fix typo in success message.

The success message contains "noteboook" instead of "notebook".

Apply this diff:

-    "Created dataset marimo noteboook in '{}'",
+    "Created dataset marimo notebook in '{}'",
template/assets/dataset_marimo/readme.py.hbs (1)

7-16: SQL query results are not displayed.

The query result _df is computed but never shown to the user. The cell should either return (_df,) or reference the dataframe without assignment for inline display.

Apply this diff to return the results:

-    return
+    return (_df,)
🧹 Nitpick comments (1)
template/src/use_case.rs (1)

8-8: Good refactoring to shared validation utilities.

The migration from inline regex validation to shared utility functions (assert_semver, assert_no_control_chars, assert_slug) improves maintainability and eliminates code duplication. This aligns well with the validation approach used in the new dataset_marimo module.

Also applies to: 38-44

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aa67f48 and c4b937b.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (19)
  • .dockerignore (1 hunks)
  • .github/workflows/cd.yaml (1 hunks)
  • Dockerfile (0 hunks)
  • docker-bake.dev.hcl (1 hunks)
  • docker-bake.hcl (1 hunks)
  • docker/Dockerfile.kubimo (1 hunks)
  • docker/kubimo/marimo/marimo.css (1 hunks)
  • docker/kubimo/marimo/marimo.toml (1 hunks)
  • src/commands/new/dataset_marimo.rs (1 hunks)
  • src/commands/new/mod.rs (2 hunks)
  • src/graphql/get_latest_dataset_version.graphql (1 hunks)
  • template/assets/dataset_marimo/.gitignore.hbs (1 hunks)
  • template/assets/dataset_marimo/pyproject.toml.hbs (1 hunks)
  • template/assets/dataset_marimo/readme.py.hbs (1 hunks)
  • template/src/dataset_marimo.rs (1 hunks)
  • template/src/lib.rs (1 hunks)
  • template/src/registry.rs (3 hunks)
  • template/src/use_case.rs (2 hunks)
  • template/src/utils.rs (1 hunks)
💤 Files with no reviewable changes (1)
  • Dockerfile
✅ Files skipped from review due to trivial changes (1)
  • .dockerignore
🚧 Files skipped from review as they are similar to previous changes (10)
  • docker/kubimo/marimo/marimo.css
  • docker-bake.hcl
  • docker-bake.dev.hcl
  • docker/Dockerfile.kubimo
  • template/src/lib.rs
  • template/src/registry.rs
  • template/assets/dataset_marimo/pyproject.toml.hbs
  • template/src/utils.rs
  • template/assets/dataset_marimo/.gitignore.hbs
  • docker/kubimo/marimo/marimo.toml
🧰 Additional context used
🧬 Code graph analysis (4)
template/src/use_case.rs (1)
template/src/utils.rs (3)
  • assert_no_control_chars (67-72)
  • assert_semver (22-27)
  • assert_slug (38-43)
src/commands/new/mod.rs (1)
src/commands/new/dataset_marimo.rs (2)
  • dataset_marimo (31-91)
  • global (60-63)
template/src/dataset_marimo.rs (3)
template/src/registry.rs (2)
  • handlebars (83-84)
  • default (153-155)
template/src/utils.rs (3)
  • assert_semver (22-27)
  • assert_slug (38-43)
  • assert_username (54-59)
template/src/use_case.rs (4)
  • builder (27-29)
  • validate (37-46)
  • render (31-33)
  • render (48-52)
src/commands/new/dataset_marimo.rs (3)
src/commands/global_args.rs (1)
  • graphql_client (165-174)
src/error.rs (1)
  • format_permission_error (89-97)
template/src/dataset_marimo.rs (1)
  • builder (34-36)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
  • GitHub Check: Build and publish npm package
  • GitHub Check: Build and Push Docker Image
  • GitHub Check: windows (x86)
  • GitHub Check: macos (x86_64)
  • GitHub Check: macos (aarch64)
  • GitHub Check: linux (x86_64)
  • GitHub Check: windows (x64)
  • GitHub Check: linux (aarch64)
  • GitHub Check: linux (x86)
  • GitHub Check: Test Workspace
  • GitHub Check: Test WASM (chrome)
  • GitHub Check: Test WASM (safari)
  • GitHub Check: Test WASM (firefox)
🔇 Additional comments (8)
src/graphql/get_latest_dataset_version.graphql (1)

1-7: LGTM!

The GraphQL query is well-structured and focused on fetching the latest dataset version. The variable naming convention appropriately uses snake_case for GraphQL variables while the schema uses camelCase for field arguments.

src/commands/new/mod.rs (1)

1-1: LGTM!

The integration of the dataset_marimo command follows the established pattern for the existing UseCase command. The subcommand is intentionally hidden from CLI help, which is appropriate for a feature that may still be in development or limited release.

Also applies to: 11-11, 17-18, 24-24

template/src/dataset_marimo.rs (2)

14-31: LGTM!

The struct definition is well-designed with appropriate use of the builder pattern. The optional fields (owner, local_slug, version, raw_init) provide flexibility for different initialization modes, and the default values for version fields are sensible.


43-67: LGTM!

The validation logic correctly enforces that either raw_init or all of (owner, local_slug, version) must be provided. The use of appropriate validators (assert_username, assert_slug, assert_semver) ensures data quality, and the validation for version fields handles optional defaults properly.

src/commands/new/dataset_marimo.rs (2)

12-29: LGTM!

The command argument structure is well-designed, supporting both raw initialization and slug-based initialization modes. The GraphQL query integration is properly set up with appropriate schema references.


31-85: LGTM!

The command handler logic is well-structured with appropriate branching for raw vs. slug-based initialization. Error handling is comprehensive with user-friendly messages, and the GraphQL integration for version resolution is properly implemented.

template/assets/dataset_marimo/readme.py.hbs (2)

1-4: LGTM!

The marimo app initialization is properly configured with appropriate metadata tracking the generating tool version. The app settings (width="full", auto_download=["html"]) are sensible defaults for a dataset exploration notebook.


19-23: LGTM!

The remaining cells are properly structured:

  • SQL editor widget is correctly configured and returned
  • Data initialization appropriately handles both raw and slug-based modes with the conditional Handlebars block
  • Marimo import and main guard follow standard patterns

Also applies to: 26-34, 37-40, 43-44

Comment on lines +10 to +12
const DEFAULT_PYTHON_VERSION: &str = "3.11";
const DEFAULT_MARIMO_VERSION: &str = "0.17.7";
const DEFAULT_CLI_VERSION_STR: &str = env!("CARGO_PKG_VERSION");
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

Verify the Marimo version is current.

The default Marimo version is set to 0.17.7. Since this PR is from October 2025 and library versions change frequently, please confirm this is an appropriate version to use as the default.


🌐 Web query:

What is the latest stable version of marimo Python library?

💡 Result:

The latest stable version is marimo 0.18.0 (released/ uploaded to PyPI on Nov 20, 2025). [1][2]

Sources:
[1] PyPI - marimo (release 0.18.0).
[2] marimo GitHub releases.


Update Marimo version to the latest stable release.

The latest stable version of marimo is 0.18.0 (released November 20, 2025). The current default is set to 0.17.7, which is one minor version behind. Update DEFAULT_MARIMO_VERSION in template/src/dataset_marimo.rs (line 11) to "0.18.0".

🤖 Prompt for AI Agents
In template/src/dataset_marimo.rs around lines 10 to 12, the
DEFAULT_MARIMO_VERSION constant is outdated (currently "0.17.7"); update its
value to the latest stable release "0.18.0" by replacing the string literal for
DEFAULT_MARIMO_VERSION with "0.18.0" so the code uses the new default version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants