Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Markdown representations of Jupyter notebooks #15

Merged
merged 5 commits into from
Jul 30, 2024

Conversation

jmarshrossney
Copy link
Collaborator

This PR proposes to keep human-readable Markdown representations of Jupyter notebooks under version control, instead of the notebooks themselves. This makes it easier to track changes and reduces the chance of accidentally committing a bunch of generated outputs, which can increase the size of the repository for not necessarily good reasons.

The mechanism through which .ipynb and .md versions of the notebooks are synchronised is a package called Jupytext.

There are various fancy things one can do, including a pre-commit hook to sync notebooks, but perhaps the nice thing about this is the amount of extra effort required to implement this is tiny. You just need to remember to run jupytext --sync notebooks/* before committing - that's it!

This is not to suggest that we should never commit full notebooks with computed outputs, e.g. for showcasing results down the line.

@jmarshrossney
Copy link
Collaborator Author

Note - may need rebasing if #13 doesn't get merged.

@metazool
Copy link
Collaborator

Opinions welcomed on this from the rest of @NERC-CEH/eds-rse - I like the idea of having meaningful diffs, as notebook-based research code can present an awkward barrier to collaboration, but slightly reserved about adding extra workflow steps

@metazool
Copy link
Collaborator

maid/maid#324 i think here's the bug in the coverage action from forks with a link to discussion about mitigating it, but my suggested resolution is bypassing this in favour of making branches in this repo directly

jupytext --sync notebooks/*
```

If you modify the contents of a notebook, run the command after closing the notebook to re-sync the `.ipynb` and `.md` representations before committing.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I brought up this use of jupytext in the daily catchup @jmarshrossney and no opinions were forthcoming, so I'm going to call this a worthwhile contribution and try adopting it as a routine

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Jo - I appreciate it's a slightly annoying additional step. One nice thing is that it's completely trivial to change our minds and go back to tracking the .ipynb files down the line!

@metazool metazool merged commit 043c6a5 into NERC-CEH:main Jul 30, 2024
2 of 3 checks passed
@jmarshrossney
Copy link
Collaborator Author

maid/maid#324 i think here's the bug in the coverage action from forks with a link to discussion about mitigating it, but my suggested resolution is bypassing this in favour of making branches in this repo directly

Interesting, and noted.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants