ENH: add ability to handle multi-dimensional thresholds #788
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff            @@
##            master      #788   +/-   ##
==========================================
  Coverage   100.00%   100.00%
==========================================
  Files           56        56
  Lines         6325      6547   +222
  Branches       360       378    +18
==========================================
+ Hits          6325      6547   +222
    "fpr" for false positive rate.
    - A custom instance of BinaryClassificationRisk object
    predict_params : NDArray, default=np.linspace(0, 0.99, 100)
Two questions:
- Do we want the argument to be called predict_params? Couldn't it be something like 'list-thresholds'?
- Can we imagine a case where the argument is a function/generator for optimal exploration?
- I agree that the name (following the existing `_predict_params` name) is confusing. I think `params` is the right term in the general setting (they are also called parameters in the LTT paper); it is only in the one-dimensional case that it is defined as a threshold. We have to decide before merging, as it is a user-facing argument that we cannot change later. Maybe externally we can define the arguments `list_multi_dimensional_parameters` for the multi-dimensional case and `list_thresholds` for the one-dimensional case?
- I think if the user has a function/generator, they can just format its output and pass it in `predict_params`.
I think we should go with list_params, specifying in the docstring that it is for multi-dimensional cases. We probably want to keep the name short for the users.
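To make the point about generators concrete, here is a minimal sketch of how a user might prepare such an argument. It assumes (this is not confirmed by the thread) that multi-dimensional parameters are passed as a 2-D array with one row per parameter combination; `candidate_params`, `lambda_1` and `lambda_2` are hypothetical names used only for illustration.

```python
from itertools import product

import numpy as np

# One-dimensional case: a plain array of thresholds (the default quoted in the diff).
thresholds = np.linspace(0, 0.99, 100)

# Multi-dimensional case (assumed layout): one row per parameter combination.
lambda_1 = np.linspace(0.1, 0.9, 5)
lambda_2 = np.linspace(0.2, 0.8, 4)
params_grid = np.array(list(product(lambda_1, lambda_2)))  # shape (20, 2)


def candidate_params():
    """Hypothetical user-side generator yielding (lambda_1, lambda_2) pairs."""
    for l1 in lambda_1:
        for l2 in lambda_2:
            if l1 >= l2:  # example of a custom exploration rule
                yield (l1, l2)


# Materialise the generator into an array before passing it on.
params_from_generator = np.array(list(candidate_params()))
```

Either array could then be given as the parameter argument, whatever name is finally chosen for it.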
        y_pred = y_pred.astype(int)
    else:
        try:
            predictions_proba = self._predict_function(X)[:, 1]
Is there a reason not to put the multi-parameter prediction function call inside the try as well?
Yes, because it is not the same function call and the second error is not appropriate (we could perhaps adapt it so that the first error is still tested in the multi-dimensional case).
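The distinction discussed in this thread can be sketched as follows. This is an illustration under assumptions, not the code in the PR: the helper names, the caught exception types and the error message are all hypothetical. Only the `predict_function(X)[:, 1]` call inside a `try` comes from the diff.

```python
import numpy as np


def positive_class_proba(predict_function, X):
    """Hypothetical helper for the one-dimensional (threshold) case."""
    try:
        # Same call shape as in the diff: keep the positive-class column.
        return predict_function(X)[:, 1]
    except (TypeError, IndexError) as exc:
        # Assumed error handling; the messages used in the PR are not shown here.
        raise ValueError(
            "predict_function should return an array of shape (n_samples, 2), "
            "like a scikit-learn predict_proba."
        ) from exc


def multi_param_predictions(predict_function, X, param):
    """Hypothetical helper for the multi-dimensional case.

    The call signature differs (parameters are passed explicitly), so the
    column-based error above would not describe a failure here.
    """
    return np.asarray(predict_function(X, *param)).astype(int)
```

Keeping the two calls separate lets each path raise an error that matches what the user actually provided.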
Description
Add the ability to handle multi-dimensional thresholds (lambdas)
- `predict_params` is now an argument (even when there is only a one-dimensional lambda), and the docstring should be clearer
- `best_predict_param` is a tuple for multi-dimensional parameters
- [...] `predict_params` dimension)
- `_get_predictions_per_param` handles general `predict_function`s, but will process all parameter values sequentially (I don't know how to do it easily in parallel).
- `get_predictions_per_param` will check the prediction values in the calibration step (using a new argument `is_calibration_step`). I don't want to check at test time because it is not necessary, and because a single predicted probability, which can occur at test time, may be exactly 0 or 1 and would then raise a warning.
- [...] `test_error_multi_dim_params_dim_mismatch`)

To manage the two types of predict functions in `__init__` there are a few options:
- [...] `BinaryClassificationController`
- [...] `predict_function_general` and the user has to provide at least this or the original `predict_function`.

I will go with option 4.
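The sequential processing and the calibration-only check described above could look roughly like the sketch below. It is written under stated assumptions and is not the PR's implementation: the function is a standalone stand-in, the `>=` threshold comparison is a guess, and the check shown (warn when every probability is exactly 0 or 1) is only one plausible reading of the description.

```python
import warnings

import numpy as np


def get_predictions_per_param(predict_function, X, params, is_calibration_step=False):
    """Return an int array of shape (n_params, n_samples).

    Illustrative stand-in; names and behaviour are assumptions.
    """
    params = np.asarray(params, dtype=float)

    if params.ndim == 1:
        # One-dimensional case: predict_function returns P(y=1) per sample.
        proba = np.asarray(predict_function(X), dtype=float)
        if is_calibration_step and np.all(np.isin(proba, (0, 1))):
            # Assumed check: skipped at test time, where a single probability
            # may legitimately be exactly 0 or 1.
            warnings.warn("All predicted probabilities are 0 or 1.")
        # Compare every probability with every threshold at once.
        return (proba[np.newaxis, :] >= params[:, np.newaxis]).astype(int)

    # Multi-dimensional case: a general predict_function(X, *param) is called
    # once per parameter combination, sequentially.
    return np.stack(
        [np.asarray(predict_function(X, *param)).astype(int) for param in params]
    )
```

The one-dimensional branch can be vectorised because it only compares probabilities with thresholds. The general branch cannot assume anything about how the parameters are used, which is why it falls back to a loop.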