LR Finder: compute gradients w.r.t. log10(lr) in exponential mode (fix torch.gradient spacing) #21171
Conversation
Pull Request Overview
This PR fixes the LR Finder's gradient computation to use the correct sample spacing for torch.gradient, ensuring mathematical correctness when suggesting learning rates. The key change is differentiating with respect to log10(lr) in exponential mode rather than treating log-spaced learning rates as linear coordinates.
- Updates gradient computation to use log10(lr) spacing for exponential mode and linear LR spacing for linear mode (see the sketch after this list)
- Fixes test to use torch.logspace for exponential mode testing and validates the suggestion against expected gradient-based selection
- Removes brittle test assertion comparing gradients with and without spacing
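As an illustration of the change described above, here is a minimal sketch of the mode-dependent spacing choice. This is not the actual `lr_finder.py` code; `mode`, `lrs`, `losses`, and `suggested_lr` are hypothetical stand-ins for the values the LR Finder records during a sweep.

```python
import torch

# Hypothetical sketch (not the actual Lightning implementation) of choosing the
# torch.gradient spacing based on the sweep mode.
mode = "exponential"
lrs = torch.logspace(-6, 0, steps=100)   # exponential mode samples LRs log-uniformly
losses = torch.rand(100)                 # placeholder for the recorded losses

# Differentiate w.r.t. the coordinate the sweep is actually uniform in:
# log10(lr) for exponential mode, the LR itself for linear mode.
coords = torch.log10(lrs) if mode == "exponential" else lrs
(gradient,) = torch.gradient(losses, spacing=(coords,))

# The suggestion is conventionally taken at the steepest descent of the loss.
suggested_lr = lrs[gradient.argmin()]
```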
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/lightning/pytorch/tuner/lr_finder.py | Updates the suggestion method to use the correct spacing parameter based on mode |
| tests/tests_pytorch/tuner/test_lr_finder.py | Fixes the test to use log-spaced LRs and validates gradient computation correctness |
@KAVYANSHTYAGI thanks for the PR. Does this solve the problems mentioned in issue #21141?
It mitigates the problem from #21141 but does not fully implement the "step-based gradient" behavior some commenters requested.
What this PR fixes
- In mode="exponential", gradients are now computed w.r.t. log10(lr) (i.e., the actual sampling coordinate), not the linear LR.
- This removes the non-uniform-grid pathology introduced in ≥2.5.3, which amplified early-step noise and biased suggestions toward tiny LRs (see the toy sketch at the end of this comment).
What it does not change
- It does not switch the derivative to training-step index (uniform step spacing). Linear mode remains unchanged.
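To make the pathology concrete, here is a hedged toy reproduction with synthetic data (not taken from the linked issue): the same noisy sweep loss is differentiated against linear LR coordinates (pre-fix behavior) and against log10(lr) coordinates (this PR).

```python
import torch

torch.manual_seed(0)
log_lrs = torch.linspace(-6, 0, 200)   # exponential sweep, uniform in log10(lr)
lrs = 10.0 ** log_lrs
# Toy loss curve: a drop around lr ~ 1e-3 plus small per-step noise.
losses = torch.sigmoid(-5 * (log_lrs + 3)) + 0.005 * torch.randn(200)

(grad_linear,) = torch.gradient(losses, spacing=(lrs,))      # pre-fix: linear LR coordinates
(grad_log,) = torch.gradient(losses, spacing=(log_lrs,))     # this PR: log10(lr) coordinates

# With linear coordinates, the 1/lr factor inflates the noise at tiny LRs, so the
# steepest-descent index typically lands far to the left; with log10 coordinates it
# typically lands near the genuine drop around 1e-3.
print("linear spacing ->", lrs[grad_linear.argmin()].item())
print("log10 spacing  ->", lrs[grad_log.argmin()].item())
```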
Summary
Motivation
`torch.gradient`'s `spacing` argument expects the coordinates of the samples. With log-spaced LRs, the intended derivative is ∂loss/∂log10(lr). The previous behavior used linear LR coordinates, which is inconsistent with the sampling and can skew the suggested LR.
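For reference, the relationship between the two coordinate choices follows from the chain rule (standard calculus, not quoted from the diff):

$$
\frac{\partial\,\mathrm{loss}}{\partial\,\mathrm{lr}}
= \frac{1}{\mathrm{lr}\,\ln 10}\,\frac{\partial\,\mathrm{loss}}{\partial \log_{10}(\mathrm{lr})}
$$

so differentiating against linear LR coordinates rescales the log-space gradient by 1/lr, which overweights the smallest learning rates in an exponential sweep and produces the bias described above.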
Tests
Impact
📚 Documentation preview 📚: https://pytorch-lightning--21171.org.readthedocs.build/en/21171/