🚀 Feature
Taken from this colab by @EricZimmermann
Description
Problem
For large tensors, computing AUC metrics over many thresholds is expensive and slow. For a sufficiently large dataset, caching or saving the raw outputs is too expensive and has to be done in post-processing.
Solution
Assuming the distribution of predicted values is known and bounded, approximate the integral via a Riemann sum over a set of fixed or variable step sizes.
Let a set of monotonically increasing thresholds partition the domain of predictions.
Place the calculation on the GPU for speed, removing the CPU bottleneck at each iteration (see the sketch below).
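A minimal sketch of the idea, assuming predictions are probabilities in [0, 1] and binary targets; the function name `binned_auroc` and the fixed `torch.linspace` grid are illustrative assumptions, not the proposed API:

```python
import torch

def binned_auroc(preds: torch.Tensor, target: torch.Tensor, num_thresholds: int = 512) -> torch.Tensor:
    """Approximate ROC AUC with a fixed threshold grid, entirely on the GPU.

    preds: predicted probabilities in [0, 1], any shape (e.g. a voxel volume).
    target: binary labels with the same shape.
    """
    preds = preds.flatten()
    target = target.flatten().bool()

    # Monotonically increasing thresholds over the (assumed bounded) domain.
    thresholds = torch.linspace(0.0, 1.0, num_thresholds, device=preds.device)

    # Compare every sample against every threshold: shape (num_thresholds, num_samples).
    # For very large tensors this comparison can be chunked over thresholds to bound memory.
    pred_pos = preds.unsqueeze(0) >= thresholds.unsqueeze(1)

    # Per-threshold confusion counts, reduced on the GPU.
    tp = (pred_pos & target).sum(dim=1).float()
    fp = (pred_pos & ~target).sum(dim=1).float()
    pos = target.sum().float()
    neg = (~target).sum().float()

    tpr = tp / pos.clamp(min=1)
    fpr = fp / neg.clamp(min=1)

    # Riemann/trapezoidal sum over the ROC curve; FPR decreases as the threshold
    # increases, so flip both axes to integrate left to right.
    return torch.trapz(torch.flip(tpr, dims=[0]), torch.flip(fpr, dims=[0]))
```

The key point is that only the per-threshold counts are kept, so the raw outputs never need to be cached or post-processed.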
Applications
Optimizing a model for voxel-level PR AUC / ROC AUC (each voxel treated as an independent sample), e.g. semantic pathology segmentation.
Enabling the user to validate a model at its best operating point (e.g. F1 at a threshold other than 0.5); a short sketch follows below.
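A minimal sketch of the operating-point application, reusing the same per-threshold confusion counts; `best_f1_threshold` is a hypothetical helper, not an existing API:

```python
import torch

def best_f1_threshold(preds: torch.Tensor, target: torch.Tensor, num_thresholds: int = 512):
    """Return the threshold with the highest F1 score, and that score."""
    preds = preds.flatten()
    target = target.flatten().bool()
    thresholds = torch.linspace(0.0, 1.0, num_thresholds, device=preds.device)

    pred_pos = preds.unsqueeze(0) >= thresholds.unsqueeze(1)
    tp = (pred_pos & target).sum(dim=1).float()
    fp = (pred_pos & ~target).sum(dim=1).float()
    fn = ((~pred_pos) & target).sum(dim=1).float()

    # F1 at every threshold; the argmax gives the best operating point.
    f1 = 2 * tp / (2 * tp + fp + fn).clamp(min=1)
    best = torch.argmax(f1)
    return thresholds[best], f1[best]
```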
Limitations
The domain (as thresholds) must be known ahead of time. This can be accounted for by placing many thresholds over a very wide range, where num_thresholds << num_voxels in the output tensor.
Future work could use a heuristic to widen or narrow the threshold range based on previous iterations.
Context:
Code: