feat: Introduce Taskfile-based workflow #3
base: main
Conversation
@Adarsh321123 @motiwari please review!
This looks way, way better than having the user follow many different instructions (with many different ways things could go wrong) -- and I'm learning how to use Taskfiles effectively for the first time. Thank you for doing this! I reviewed the code changes and they look good. The only question I have is whether these changes preserve correctness. Does the code in this PR still produce the same results from the original paper?
Hi @motiwari! That's a great point. For me the runs take a very long time, and it's hard to tell. Do you happen to have a benchmark toy dataset that we can use for mocking? If not, we should create one.
Hi @vishnya. Thanks for this contribution! The Taskfile-based workflow is a huge improvement for developer experience and onboarding. For sanity-checking correctness, we can simply run the new `task run` workflow on a small set of repos (like just Compfiles and MIL) and compare key metrics/outputs against those in the paper. Moreover, to check that the entire workflow works, we can use a separate blank repo. You can quickly do these by following the
Hi folks! I was too busy with work last week and haven't had a chance to test. Running the new `task run` on a small set of repos, and separately on a blank repo, makes sense, although it seems fairer to compare the results against running the old flow on the same repos, rather than against the paper results.
1/ I want to confirm that we want to test the following way, and whether or not you think the test will be straightforward to implement (i.e. you've done something similar before):
1/ Yes, testing that way for (a) and (b) is straightforward. |
Hi @vishnya, my apologies for the delay in getting back to you after our 1:1 discussion. The steps @Adarsh321123 mentioned seem good. Let us know if you need more details on how to run everything. @Adarsh321123 and I are also discussing setting up a lighter testing framework in #4 and #5.
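One lightweight way to do the comparison discussed above is to have both the old flow and the new `task run` flow dump their key metrics to JSON, then diff the two files with a small helper. This is a sketch, not code from this repo; the file names and metric keys are hypothetical:

```python
import json
import math


def metrics_match(old_path, new_path, rel_tol=1e-6):
    """Return True if both runs produced the same metric keys with close values."""
    with open(old_path) as f:
        old = json.load(f)
    with open(new_path) as f:
        new = json.load(f)
    if set(old) != set(new):
        return False  # a metric appeared or disappeared between flows
    return all(math.isclose(old[k], new[k], rel_tol=rel_tol) for k in old)
```

Running both flows on the same small repo set (e.g., Compfiles and MIL) and asserting `metrics_match(...)` would catch regressions without re-deriving the paper numbers.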
PR: feat: Introduce Task-based workflow for all project operations

This PR introduces `Taskfile` as the new, unified entry point for all developer- and user-facing operations, such as setup, testing, and running experiments. The motivation is to replace a collection of standalone scripts and manual command sequences with a single, self-documenting, and reproducible workflow. This simplifies onboarding and ensures consistency across all environments.

Summary of Notable Changes (File by File)

- `Taskfile.yaml`: Defines named tasks (`setup`, `test`, `run`, `run_fisher`, etc.) that orchestrate all necessary environment setup, downloads, and script executions. All configuration variables are documented with inline comments.
- `run_compute_fisher.sh`: Superseded by the `run_fisher` task in `Taskfile.yaml`, removing redundancy.
- `replace_files.sh`: Now cleans up the `ld_path.txt` and `pl_path.txt` files upon completion. This keeps the project directory clean without needing `.gitignore` entries.
- `tests/test_taskfile.py`: New test for `Taskfile.yaml`. It ensures that critical tasks are defined and that the file does not contain unresolved template placeholders.
- `pytest.ini`: Configures `pytest` to look for tests exclusively within the `tests/` directory. This prevents it from discovering and running tests from downloaded dependency repositories (e.g., in `data/raid/repos_new`).
- `README.md`: Documents the `Taskfile`-based workflow, instructing users to run `task <command>` instead of using individual scripts.
- `.gitignore`
- `requirements.txt`
- `dynamic_database.py`
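For readers unfamiliar with Task, here is a minimal sketch of what a `Taskfile.yaml` with these task names could look like. The task names match this PR, but the bodies, entry points, and the Drive folder ID are illustrative assumptions, not the actual contents of the change:

```yaml
version: '3'

tasks:
  setup:
    desc: Install dependencies and fetch data/checkpoints (assumed steps)
    cmds:
      - pip install -r requirements.txt
      - task: download_checkpoint_data

  download_checkpoint_data:
    desc: Fetch large artifacts from Google Drive via gdown
    cmds:
      - gdown --folder <drive-folder-id>   # placeholder ID

  test:
    desc: Run the pytest suite (scoped to tests/ by pytest.ini)
    cmds:
      - pytest

  run:
    desc: Reproduce the main experiments
    cmds:
      - bash run.sh   # hypothetical entry point
```

The `task: other_task` form is how Task composes tasks, which is what lets `setup` absorb what used to be separate scripts.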
How to Test This PR
Reviewers can optionally validate these changes by checking out the branch and running the primary workflows, which now feel much cleaner:
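That validation could look like the following command transcript (the task names come from this PR; `--list` is a standard Task flag):

```sh
task --list   # enumerate the available tasks and their descriptions
task setup    # one-time environment setup, downloads, and checkpoints
task test     # run the pytest suite in tests/
task run      # execute the main experiment workflow
```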
Known Issues & Next Steps

- Download quota: The `download_checkpoint_data` task, part of the `setup` workflow, relies on `gdown` to fetch large files from Google Drive. During heavy testing, it's possible to hit a download quota, which appears to last up to 24 hours. A future improvement would be to host these artifacts on a more robust platform (e.g., Hugging Face Hub, AWS S3).
- `replace_files.sh`: The file-patching mechanism in `replace_files.sh` is effective but somewhat brittle. A more robust, Python-based solution for applying these patches would be a valuable next step.