Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make linear regression the quickstart notebook #310

Merged
merged 12 commits into from
Feb 14, 2025
Merged

Conversation

paul-buerkner
Copy link
Contributor

@paul-buerkner paul-buerkner commented Feb 12, 2025

This PR addresses #309 by making the linear regression notebook the quickstart notebook and removing the current WIP quickstart notebook.

If few things we need to do before this can be merged:

@codecov-commenter
Copy link

codecov-commenter commented Feb 12, 2025

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 8.18182% with 101 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
bayesflow/utils/dict_utils.py 8.82% 31 Missing ⚠️
bayesflow/diagnostics/plots/pairs_samples.py 12.50% 21 Missing ⚠️
bayesflow/diagnostics/plots/pairs_posterior.py 15.78% 16 Missing ⚠️
bayesflow/diagnostics/plots/recovery.py 0.00% 6 Missing ⚠️
...yesflow/diagnostics/plots/calibration_histogram.py 0.00% 5 Missing ⚠️
bayesflow/diagnostics/plots/z_score_contraction.py 0.00% 5 Missing ⚠️
bayesflow/diagnostics/metrics/calibration_error.py 0.00% 4 Missing ⚠️
bayesflow/diagnostics/plots/calibration_ecdf.py 0.00% 4 Missing ⚠️
...sflow/diagnostics/metrics/posterior_contraction.py 0.00% 3 Missing ⚠️
...low/diagnostics/metrics/root_mean_squared_error.py 0.00% 3 Missing ⚠️
... and 2 more

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Files with missing lines Coverage Δ
bayesflow/workflows/basic_workflow.py 24.26% <ø> (ø)
bayesflow/utils/plot_utils.py 20.77% <0.00%> (ø)
bayesflow/diagnostics/plots/mc_calibration.py 22.72% <0.00%> (ø)
...sflow/diagnostics/metrics/posterior_contraction.py 40.00% <0.00%> (ø)
...low/diagnostics/metrics/root_mean_squared_error.py 33.33% <0.00%> (ø)
bayesflow/diagnostics/metrics/calibration_error.py 22.22% <0.00%> (ø)
bayesflow/diagnostics/plots/calibration_ecdf.py 14.89% <0.00%> (ø)
...yesflow/diagnostics/plots/calibration_histogram.py 24.24% <0.00%> (ø)
bayesflow/diagnostics/plots/z_score_contraction.py 21.73% <0.00%> (ø)
bayesflow/diagnostics/plots/recovery.py 22.22% <0.00%> (ø)
... and 3 more

... and 1 file with indirect coverage changes

@paul-buerkner
Copy link
Contributor Author

I have made some progress in making the diagnostic plots prettier and easier to use. Specifically, I added back the filter_keys. We should also discuss the name of this but I will make more edits first.

I know it's not super clean that I will put all of this in one PR but I don't have the time to make both the starter notebook pretty and fix all the plots at the same time if they are not on the same branch.

@paul-buerkner
Copy link
Contributor Author

More updates. I have now cleaned up the interface of the diagnostics and some of the related backend code.

In particular, pairs_posterior is now much less awkward to use. For example, previously we would have had to do something like:

draws_stacked = bf.utils.dict_utils.dicts_to_arrays(
    post_draws, val_sims,
)

f = bf.diagnostics.plots.pairs_posterior(
    post_samples = draws_stacked["targets"][0], 
    true_params = draws_stacked["references"][0],
)

which also required users to know the deeply hidden function bf.utils.dict_utils.dicts_to_arrays. Now, we can just write:

f = bf.diagnostics.plots.pairs_posterior(
    post_draws, val_sims,
    dataset_id=0,
)

So the interface of pairs_posteriors is now the same as that of the other diagnostics for NPE except that is does require the dataset_id argument because it can only handle the posterior of a single dataset at once.

Tomorrow, I will work on the renamining. Here is the summary of what I propose to rename, must of which we discussed already:

  • targets -> estimates
  • references -> targets
  • filter_keys -> variable_keys
  • pretty naming will be continue to done via variable_names

You can check all the current diagnostics at work in the updated linear regression notebook on this branch.

I would love to hear your thoughts and comments :-)

@paul-buerkner
Copy link
Contributor Author

This PR is now ready for review. It closes #307, #308, and #309.

@stefanradev93 would you mind taking a look and merging if you are happy?

I left some remaining ToDos in the code, which would further improve the plots but are not essential at the moment. I will take care of them at a later point.

@LarsKue
Copy link
Contributor

LarsKue commented Feb 14, 2025

Made some adjustments with feedback from @paul-buerkner. IMO, the notebook looks great. Ready to merge from my side @stefanradev93.

@stefanradev93 stefanradev93 merged commit 3ff6d0d into dev Feb 14, 2025
13 checks passed
@stefanradev93 stefanradev93 deleted the lin-reg-quickstart branch February 14, 2025 13:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants