
Add a "Dataset Importer" Windmill app #137

Open · wants to merge 51 commits into base: main


@rudokemper (Member) commented Aug 20, 2025

Goal

Closes #117.

Screenshots

image

What I changed

  • Added f/apps/gc_dataset_importer.app, which consists of:
    • A Windmill UI (defined in app.yaml) built around the Stepper component to guide the user through uploading a dataset.
    • A set of Python scripts to handle database interactions, file uploads, conversions, transformations, and writing to the database.
    • A set of frontend scripts to manage reactive state across the app.
  • Updated code previously merged under the GC Dataset Importer epic, specifically in data_conversion.py and file_operations.py. I’ve added comments and expanded the docstrings to explain why these changes were necessary.
  • Deleted the prototype f/apps/gc_uploader.app, which only supported Locus Map data.
  • Wrote an app README that I tried to make as helpful as possible, with usage examples, a diagram, and more. Please let me know if you think any of it is superfluous.

Notes to the reviewer

  • The easiest way to review this app is to explore it in a Windmill instance in the edit view (e.g. /apps/edit/f/apps/gc_dataset_importer) and walk through the steps to see how scripts are called, how state is managed, etc.
  • In terms of the code review, I'd recommend looking at the Python scripts in f/apps/gc_dataset_importer.app, and any changes to f/common_logic.
  • On that note: in building this, I tried to be mindful of where the code should live. The three Python scripts in f/apps/gc_dataset_importer.app mainly call functions from imported modules, handle errors, and log outcomes. Any broader business logic that I felt could be useful to other code went into the appropriate file in f/common_logic/. A few helper functions remain in the Python scripts in the app directory because they are highly specific to the app, so I did not think it was worth moving them into f/common_logic/. Opinions on my choices are welcome.
  • As it stands, this branch includes some changes I made to f/connectors/ code. I already filed those as separate PRs: "Add CSV to Postgres script" (#135) and "Generate unique ID if not present in GeoJSON feature" (#136). After these are merged, I will rebase on main so that the overlapping changes here are dropped.
  • There was a lot of iteration involved in building this, so I don't think the commit history is worth looking at. As is customary in this repo, I will squash and merge.
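
To make the code-placement note above more concrete, here is a minimal sketch of the "thin script" pattern it describes: an app-level script whose `main` entry point delegates to shared logic, handles errors, and logs the outcome. This is not the code in this PR. `convert_rows` is a hypothetical stand-in for a function that would live in f/common_logic/, and the shape of the returned dictionary is an assumption made for illustration.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger(__name__)

Rows = list[dict[str, Any]]


def convert_rows(rows: Rows) -> Rows:
    """Hypothetical shared logic; in the repo this kind of function would sit in f/common_logic/."""
    if not rows:
        raise ValueError("dataset is empty")
    # Normalise column names: strip surrounding whitespace and lower-case them.
    return [{key.strip().lower(): value for key, value in row.items()} for row in rows]


def main(rows: Rows, convert: Callable[[Rows], Rows] = convert_rows) -> dict[str, Any]:
    """Thin app-level script: delegate to shared logic, handle errors, log the outcome."""
    try:
        converted = convert(rows)
    except ValueError as exc:
        # Report the failure to the caller instead of letting the script crash.
        logger.error("Conversion failed: %s", exc)
        return {"ok": False, "error": str(exc), "row_count": 0}
    logger.info("Converted %d rows", len(converted))
    return {"ok": True, "error": None, "row_count": len(converted)}
```

Keeping `main` this small means the reusable part (`convert_rows` in this sketch) can be tested and imported on its own, while the app script only decides what to report back to the UI.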

What I'm not doing here

  • Several items that came up during review, which I felt were either out of scope or better handled as separate improvements because of the complexity involved. These are listed in a TODO section in the README.

Development

Successfully merging this pull request may close these issues.

Build a Windmill app for GC Dataset Importer