@bhoov bhoov commented Jun 3, 2025

Addresses #1461

Using Quarto and its VS Code extension, I find writing .qmd files to be a smoother interactive alternative to .ipynb files. Because .qmd files are plain text, they come with several advantages:

  1. .qmd integrates seamlessly with Cursor AI and other AI copilots.
  2. .qmd is fully compatible with standard git tooling.
  3. .qmd works better with Vim keybindings.
  4. .qmd files don't need a special nbdev_clean step to remove cell metadata and outputs, so your source files are not altered in any way by nbdev's transpilation process (something that bothers me immensely when developing in .ipynb).

It turns out that nbdev doesn't need many changes to implement this feature:

  1. Allow the export globbing functions to search for .qmd in addition to .ipynb.
  2. Implement read_qmd/write_qmd functions for converting .qmd to and from nbdev's AttrDict format. This means two-way sync (via nbdev_update) also works between a .qmd and its corresponding .py files.
  3. Because outputs are not stored inside .qmd files, I use execnb's run_all to generate outputs for the docs inside the .ipynb files cached under _proc/.
  4. Tweak the custom frontmatter parser so that cells can include general markdown after the custom frontmatter.
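
To make step 2 a little more concrete, here is a minimal sketch of splitting .qmd text into notebook-style cells. It is not the parser from this PR: `split_qmd`, `TICKS`, and the plain-dict cell layout are hypothetical names chosen for illustration, and the sketch assumes every executable cell is fenced with exactly three backticks followed by `{python}`.

```python
import re

TICKS = "`" * 3  # three backticks, built at runtime so this example does not nest fences

# Assumption: executable cells look like  <ticks>{python} ... <ticks>  on their own lines.
_FENCE = re.compile(
    rf"^{TICKS}\{{python\}}[ \t]*\n(.*?)^{TICKS}[ \t]*$\n?",
    flags=re.MULTILINE | re.DOTALL,
)

def split_qmd(text: str) -> list[dict]:
    "Split .qmd text into {'cell_type', 'source'} dicts (illustrative only)."
    cells, pos = [], 0
    for m in _FENCE.finditer(text):
        # Everything between the previous fence and this one is a markdown cell.
        md = text[pos:m.start()].strip()
        if md: cells.append({"cell_type": "markdown", "source": md})
        cells.append({"cell_type": "code", "source": m.group(1).rstrip("\n")})
        pos = m.end()
    # Trailing prose after the last fence, if any.
    tail = text[pos:].strip()
    if tail: cells.append({"cell_type": "markdown", "source": tail})
    return cells
```

A real implementation would also have to handle longer fences, other languages, and cell options, which is where most of the edge cases live.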

There have been other attempts to add .qmd support to nbdev (see this quarto issue) or plain-text support in general (see #1499). However, .qmd support is still missing in the current version of nbdev, and #1499 seems to introduce jupytext as an additional dependency, which uses the slow quarto convert command to pair a .ipynb with a .qmd (this PR introduces a faster .qmd <-> .ipynb parser). With this PR you can develop using a mix of .qmd and .ipynb, whichever you prefer, with no additional dependencies.

I've written up a small tutorial on setting good VS Code defaults in nbs/tutorials/develop_in_plain_text.qmd.

A few notes of caution and room for improvement:

  1. Ensure that all files under nbs/ have distinct names: do not keep both 00_core.ipynb and 00_core.qmd, as both would create the same intermediate file _proc/00_core.ipynb.
  2. Currently, nbdev_prepare runs executable cells in .qmd documents twice: once when testing and, because outputs aren't saved, once more when generating the docs.
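
The first caution can be checked mechanically. The following is a rough sketch, not part of this PR: `find_stem_clashes` is a hypothetical helper that lists sources sharing a path once the suffix is dropped, skipping folders that start with `_` or `.` in the same spirit as the `skip_folder_re='^[_.]'` argument used elsewhere in nbdev.

```python
from collections import defaultdict
from pathlib import Path

def find_stem_clashes(nbs_dir, exts=(".ipynb", ".qmd")):
    "Map each suffix-less relative path to its files when more than one source shares it."
    root = Path(nbs_dir)
    by_stem = defaultdict(list)
    for p in sorted(root.rglob("*")):
        if not p.is_file() or p.suffix not in exts: continue
        rel = p.relative_to(root)
        # Ignore generated or hidden folders such as _proc/ and .ipynb_checkpoints/.
        if any(part.startswith(("_", ".")) for part in rel.parts[:-1]): continue
        by_stem[rel.with_suffix("").as_posix()].append(p.name)
    return {stem: names for stem, names in by_stem.items() if len(names) > 1}
```

Calling `find_stem_clashes("nbs")` before building the docs returns an empty dict when every source has a distinct name.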

The PR is already in a pretty stable state (see this fork of nbdev rewritten entirely using .qmd files). There may be edge cases that I haven't considered, but overall I hope this is nearing a good shape to distribute.

bhoov added 30 commits May 29, 2025 11:51
TinasheMTapera added a commit to NSAPH-Data-Processing/era5_sandbox that referenced this pull request Aug 18, 2025
- Adopt Quarto for documentation and notebooks making use of
[this nbdev PR](AnswerDotAI/nbdev#1521) that allows full `.qmd` driven packages
- Convert all `ipynb` files to `.qmd` format
- Use nbdev_docs to generate the documentation website
- Adopt logger that solves #3 (#3)
@football-kowshik

What are the next steps here?

@TinasheMTapera

I'm running into a problem with this and wanted to check whether you've seen it before, or whether it's something external to your fork:

nbdev_proc_nbs:

"""
Traceback (most recent call last):
  File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/concurrent/futures/process.py", line 261, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/concurrent/futures/process.py", line 210, in _process_chunk
    return [fn(*args) for args in chunk]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/concurrent/futures/process.py", line 210, in <listcomp>
    return [fn(*args) for args in chunk]
            ^^^^^^^^^
  File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/site-packages/fastcore/parallel.py", line 63, in _call
    return g(item)
           ^^^^^^^
  File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/site-packages/nbdev/serve_drv.py", line 35, in main
    elif src.suffix=='.qmd': exec_qmd(src, dst, x)
                             ^^^^^^^^^^^^^^^^^^^^^
  File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/site-packages/nbdev/serve_drv.py", line 23, in exec_qmd
    cb()(nb)
  File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/site-packages/nbdev/processors.py", line 292, in __call__
    def __call__(self, nb): return self.nb_proc(nb).process()
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/site-packages/nbdev/process.py", line 130, in process
    for proc in self.procs: self._proc(proc)
                            ^^^^^^^^^^^^^^^^
  File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/site-packages/nbdev/process.py", line 122, in _proc
    if hasattr(proc,'begin'): proc.begin()
                              ^^^^^^^^^^^^
  File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/site-packages/nbdev/processors.py", line 108, in begin
    if getattr(cells[idx+1], 'has_sd', 0):
               ~~~~~^^^^^^^
IndexError: list index out of range
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/n/home03/ttapera/.conda/envs/era5_sandbox/bin/nbdev_proc_nbs", line 8, in <module>
    sys.exit(nbdev_proc_nbs())
             ^^^^^^^^^^^^^^^^
  File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/site-packages/fastcore/script.py", line 125, in _f
    return tfunc(**merge(args, args_from_prog(func, xtra)))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/site-packages/nbdev/quarto.py", line 217, in nbdev_proc_nbs
    _pre_docs(**kwargs)[0]
    ^^^^^^^^^^^^^^^^^^^
  File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/site-packages/nbdev/quarto.py", line 209, in _pre_docs
    cache = proc_nbs(path, n_workers=n_workers, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/site-packages/nbdev/serve.py", line 82, in proc_nbs
    parallel(nbdev.serve_drv.main, files, n_workers=n_workers, pause=0.01, **kw)
  File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/site-packages/fastcore/parallel.py", line 134, in parallel
    return L(r)
           ^^^^
  File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/site-packages/fastcore/foundation.py", line 100, in __call__
    return super().__call__(x, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/site-packages/fastcore/foundation.py", line 108, in __init__
    items = listify(items, *rest, use_list=use_list, match=match)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/site-packages/fastcore/basics.py", line 79, in listify
    elif is_iter(o): res = list(o)
                           ^^^^^^^
  File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/concurrent/futures/process.py", line 620, in _chain_from_iterable_of_lists
    for element in iterable:
  File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/concurrent/futures/_base.py", line 619, in result_iterator
    yield _result_or_cancel(fs.pop())
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/concurrent/futures/_base.py", line 317, in _result_or_cancel
    return fut.result(timeout)
           ^^^^^^^^^^^^^^^^^^^
  File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/concurrent/futures/_base.py", line 456, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
IndexError: list index out of range

Any thoughts? What else would you like to see to help debug?
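
For anyone reading the traceback: the innermost frame fails on `cells[idx+1]`, an expression that raises `IndexError` whenever `idx` points at the last element of `cells`. The sketch below reproduces only that failure mode in isolation and shows a bounds-checked lookahead. The function names and the `SimpleNamespace` cells are hypothetical stand-ins, not nbdev's actual code, and the sketch does not claim to identify the root cause here.

```python
from types import SimpleNamespace

def next_has_sd_unsafe(cells, idx):
    # Same expression as in the traceback: no check that idx+1 is in range.
    return bool(getattr(cells[idx + 1], "has_sd", 0))

def next_has_sd(cells, idx):
    # Hypothetical guard: a missing next cell is treated the same as a falsy has_sd.
    if idx + 1 >= len(cells): return False
    return bool(getattr(cells[idx + 1], "has_sd", 0))

# Two stand-in cells; only the second carries the has_sd attribute.
cells = [SimpleNamespace(source="a"), SimpleNamespace(source="b", has_sd=True)]
```

With these two cells, `next_has_sd_unsafe(cells, 1)` raises `IndexError`, while `next_has_sd(cells, 1)` returns `False`.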

bhoov commented Sep 22, 2025

I got this PR working for my personal use cases and didn't see much initial interest in bringing it into the main branch. It seems to have gained some traction since I first opened it, and I'm happy to push it forward.

What are the next steps here? @football-kowshik

From my side, it has been a while since I rebased onto main. I will do that, see what bugs and conflicts have come up since then, and try to resolve them. Beyond that, it's up to the maintainers to decide whether this is worth incorporating into the main branch (I think it definitely is, but I am biased: the .qmd workflow has proven much smoother for my use cases, and it is fully backward compatible with .ipynb files).

@TinasheMTapera I am not positive, but this bug looks a lot like the weird edge cases I encountered when trying to parse .qmd files as valid nbdev source. Could you share a minimal .qmd file that reproduces it? I'm a bit new to contributing to larger OSS projects on GitHub, but I feel that this bug doesn't need its own issue since it is pertinent only to this PR.

@kurianbenoy-sarvam

@bhoov you haven't requested Jeremy Howard or any of the maintainers to review the PR.

I recently asked in discord, why this PR was not reviewed and the answer I received from Jeremy was:

No one requested a review so i didn't see it 🙂

@TinasheMTapera

@bhoov I was able to figure out the problem, and it was not related to your PR, but rather to an edge case of nbdev itself, so please disregard!

bhoov commented Oct 20, 2025

> I recently asked in discord, why this PR was not reviewed and the answer I received from Jeremy was:
>
> No one requested a review so i didn't see it 🙂

GitHub actually does not allow me to request reviewers or assign people on this repository; otherwise I would have.

@jph00 can you review this PR? :)

jph00 commented Oct 20, 2025

Will do!

@jph00 jph00 self-requested a review October 20, 2025 21:34
@jph00 jph00 left a comment

Thanks for this. Tbh it's a long way from being something that we could merge at this stage. Rather than try to get this PR into shape, my suggestion would be to gradually add a series of PRs which have the minimal code necessary for each piece. Start with something that adds the most useful bit you need with the least amount of code.

if stdin: return _write(f_in=sys.stdin, f_out=sys.stdout)
if fname is None: fname = get_config().nbs_path
-for f in globtastic(fname, file_glob='*.ipynb', skip_folder_re='^[_.]'): _write(f_in=f, disp=disp)
+for f in globtastic(fname, file_re=r'.*\.ipynb$', skip_folder_re='^[_.]'): _write(f_in=f, disp=disp) # Don't clean .qmd files

can we leave the glob as it was then?

def cell(self, cell):
if cell.cell_type=='raw': self._update(_fm2dict, cell)
-elif cell.cell_type=='markdown' and 'title' not in self.fm: self._update(_md2dict, cell)
+elif (cell.cell_type=='markdown' and 'title' not in self.fm): self._update(_md2dict, cell)

not sure this needs changing?

_re_fm_nb = re.compile(_RE_FM_BASE+'$', flags=re.DOTALL)
_re_fm_md = re.compile(_RE_FM_BASE, flags=re.DOTALL)

_re_fm_title_desc = re.compile(r'^#\s+(\S.*?)(?:\n|$)(?:\s*\n)*(?:>\s+(\S.*?)(?:\n|$)(?:\s*\n)*)?', flags=re.MULTILINE)

This has gotten far too complicated. Perhaps reduce the scope to only support frontmatter titles etc in qmd? Try to get to a PR that adds as little code as possible, and keeps the code no more complex than what we had before.
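
As an illustration of what the pattern in this hunk captures, the snippet below runs the same regular expression on two made-up markdown cells. The pattern is copied from the diff; the sample strings are assumptions chosen for the example, not text from nbdev's test suite.

```python
import re

# Pattern copied from the diff hunk under review.
_re_fm_title_desc = re.compile(
    r'^#\s+(\S.*?)(?:\n|$)(?:\s*\n)*(?:>\s+(\S.*?)(?:\n|$)(?:\s*\n)*)?',
    flags=re.MULTILINE,
)

# Case 1: an H1 title followed by a blockquote description, then ordinary markdown.
with_desc = _re_fm_title_desc.match("# My Title\n\n> A short description\n\nBody text.")
title, desc = with_desc.group(1), with_desc.group(2)

# Case 2: a title with no blockquote; the optional description group stays unmatched.
no_desc = _re_fm_title_desc.match("# Title only\nBody text.")
```

Group 1 holds the title and group 2 the description, so in the second case `no_desc.group(2)` is `None`.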

check_fname = path/".last_checked"
last_checked = os.path.getmtime(check_fname) if check_fname.exists() else None
-nbs = globtastic(fname, file_glob='*.ipynb', skip_folder_re='^[_.]') if fname.is_dir() else [fname]
+nbs = globtastic(fname, file_re=r'.*\.ipynb$|.*\.qmd$', skip_folder_re='^[_.]') if fname.is_dir() else [fname]

We could simplify this in the multiple places it occurs, eg:

Suggested change
-nbs = globtastic(fname, file_re=r'.*\.ipynb$|.*\.qmd$', skip_folder_re='^[_.]') if fname.is_dir() else [fname]
+nbs = globtastic(fname, file_re=r'.*\.(ipynb|qmd)$', skip_folder_re='^[_.]') if fname.is_dir() else [fname]

Although it feels like a glob would be nicer. What if we added a file_exts param to globtastic which took a comma separated list of file extensions? Then we could just have file_exts='ipynb,qmd'.
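
A small sketch of how the proposed comma-separated extension list could map onto the suggested regex. `exts_to_re` is a hypothetical helper written for this illustration; `file_exts` is only a proposal in this comment, so the sketch does not call `globtastic` and simply shows the string-to-pattern translation using the standard `re` module.

```python
import re

def exts_to_re(file_exts: str) -> str:
    "Hypothetical: turn 'ipynb,qmd' into a pattern like the suggested file_re."
    # Strip whitespace and any leading dot, and escape each extension for regex use.
    exts = [re.escape(e.strip().lstrip(".")) for e in file_exts.split(",") if e.strip()]
    return r".*\.(" + "|".join(exts) + r")$"

pat = re.compile(exts_to_re("ipynb,qmd"))
names = ["00_core.ipynb", "index.qmd", "notes.md", "core.ipynb.bak"]
matched = [n for n in names if pat.match(n)]
```

For `'ipynb,qmd'` the helper produces the same pattern string as the suggested change, so behaviour would be unchanged while the call site gets shorter.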

"Process cells and nbdev comments in a notebook"
def __init__(self, path=None, procs=None, nb=None, debug=False, rm_directives=True, process=False):
-self.nb = read_nb(path) if nb is None else nb
+self.nb = read_nb_or_qmd(path) if nb is None else nb

Probably not worth breaking backwards compatibility to change the name of this function. It's OK if read_nb is slightly misleadingly named :)

Comment on lines +10 to +12
import re

import sys,os,inspect,shutil

Suggested change
-import re
-import sys,os,inspect,shutil
+import sys,os,inspect,shutil,re

if intermediate_md_source: raw_cells.append(_qmd_to_raw_cell(intermediate_md_source, 'markdown'))

# Construct the final notebook dictionary
notebook_dict = {

Don't hard code dicts like this. Use the functions we have.



# %% ../nbs/api/15_qmd.ipynb
def _get_fence_ticks(source:str):

Use a proper well-tested parser - don't do this kind of thing manually.


# %% ../nbs/api/15_qmd.ipynb
@call_parse
def ipynb_to_qmd(

Too much code here - and it shouldn't be printing on success. Keep things much simpler! :)

bhoov commented Oct 29, 2025

Thanks for the review, Jeremy. I definitely prioritized feature completeness at the expense of tasteful code and minimal changes 🙃. I'm still new to contributing to larger existing projects, but I'll get there.

Do you suggest closing this mammoth PR and instead introducing bite-sized PRs to the main branch? Or should I make smaller PRs to a dedicated "qmd_support" branch?

jph00 commented Oct 29, 2025 via email
