Gradient Sign Bug #1
Under some circumstances the sign of the gradient seems to be flipped in the wrong direction. Flipping the sign leads to proper convergence.
Commit 15d5ac0 demonstrates the bug.
Commit 3d62745 shows one way to flip the sign back, which leads to convergence. This is not a fix; it simply demonstrates the bug.
However, fitting only one of the two parameters while keeping the other constant should work fine. It does work as expected for absorption, but for scattering the gradient points in the wrong direction, which is completely unexpected behavior. Flipping the sign as demonstrated in commit 3d62745 leads to proper convergence, which makes no sense. The bug also arises under other circumstances, but this specific example will serve as the basis for a minimal working example to nail the bug down.
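For clarity, a minimal sketch of what fitting only one of the two coefficients could look like in PyTorch (the names and the stand-in loss below are assumptions for illustration, not the repository code): only the fitted parameter tracks gradients, the other is treated as a constant.

```python
import torch

def run_simulation(mu_s, mu_a):
    # Stand-in for the real forward simulation and loss: anything
    # differentiable in mu_s will do for this illustration.
    return (mu_s * torch.exp(-mu_a) - 1.0) ** 2

mu_s = torch.tensor(2.0, requires_grad=True)   # fitted parameter (scattering)
mu_a = torch.tensor(0.5)                       # held constant (absorption)
optimizer = torch.optim.Adam([mu_s], lr=1e-2)

for _ in range(100):
    optimizer.zero_grad()
    loss = run_simulation(mu_s, mu_a)
    loss.backward()
    optimizer.step()   # in the reported bug, this moves mu_s the wrong way

print(mu_s.item())
```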
Attempts to replicate the bug in a simplified one-dimensional simulation (minimal_1d.py) have been unsuccessful so far.
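For reference, this is roughly the kind of simplified 1D setup meant above (an illustrative sketch with assumed names and details, not the contents of minimal_1d.py): exponentially distributed free paths with continuous absorption weighting, differentiable with respect to both coefficients via reparameterised sampling.

```python
import torch

def simulate_1d(mu_s, mu_a, n_particles=1000):
    # Sample one free path length per particle; the reparameterised
    # exponential keeps the path differentiable with respect to mu_s.
    u = torch.rand(n_particles)
    path = -torch.log(u) / mu_s
    # Continuous absorption along the path gives a survival weight.
    weight = torch.exp(-mu_a * path)
    return path, weight

torch.manual_seed(0)
mu_s = torch.tensor(2.0, requires_grad=True)   # scattering coefficient
mu_a = torch.tensor(0.5, requires_grad=True)   # absorption coefficient

path, weight = simulate_1d(mu_s, mu_a)
# Arbitrary scalar loss, only used to check which way the gradients point.
loss = ((weight * path).mean() - 0.3) ** 2
loss.backward()
print(mu_s.grad, mu_a.grad)
```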
Changing the scattering function to something else, for example only flipping the direction vectors into exactly the opposite direction with some given probability as demonstrated in 3fa66c6, makes the bug disappear. So one might assume the bug can be nailed down to the scattering. But changing the loss function to something else, for example a loss based on the moments of the final positions (which would not be possible on real data) as done in c535128, while keeping the original scattering, also "resolves" the bug. Maybe there are multiple independent issues here that flip the gradient for some reason?
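For illustration, a sketch of the kind of moment-based loss meant here (my own reading of the idea, not the code in c535128): instead of a per-detector comparison, match the first two moments of the simulated final positions against a reference set, which presupposes that reference positions are available at all.

```python
import torch

def moment_loss(sim_positions, ref_positions):
    # Compare mean and variance of the final positions. This needs the
    # reference positions themselves, which is why it would not work on
    # real measured data.
    mean_term = (sim_positions.mean(dim=0) - ref_positions.mean(dim=0)) ** 2
    var_term = (sim_positions.var(dim=0) - ref_positions.var(dim=0)) ** 2
    return mean_term.sum() + var_term.sum()

# Example with dummy data: 1000 particles with 3D final positions.
sim = torch.randn(1000, 3, requires_grad=True)
ref = 1.2 * torch.randn(1000, 3)
loss = moment_loss(sim, ref)
loss.backward()
```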
We found the underlying issue. The bug is not actually a bug, but expected behavior; it is demonstrated in bug_test.py. The problem is that there is no gradient with respect to the number of loop iterations, even though that number is crucial for correct behavior. Convergence can still happen by chance if the gradient derived from the loop body happens to point in the right direction, which explains the weird behavior. More on this can be read here. On p. 15 it explicitly says "This means that we assume that [...]". So how do we solve this, or work around it? It seems like tricking the gradient into pointing in the right direction is not a safe option in higher dimensions.
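To make that concrete, here is a toy reproduction of the effect (not the actual bug_test.py, just a sketch): a while loop whose iteration count depends on a parameter. Autograd differentiates only the loop body for the realised number of iterations; the dependence of the iteration count itself on the parameter contributes nothing.

```python
import torch

p = torch.tensor(0.3, requires_grad=True)  # step size parameter

x = torch.tensor(0.0)
n_steps = 0
# Walk until we cross 1.0. How many iterations run depends on p,
# but that dependence is discrete and invisible to autograd.
while x < 1.0:
    x = x + p          # the loop body itself is differentiable
    n_steps += 1

loss = (x - 1.0) ** 2
loss.backward()
# p.grad equals 2 * (x - 1) * n_steps: the gradient through the body with
# the iteration count treated as a constant. Whatever the loss would do if
# the loop ran a different number of times is simply missing, so the
# overall gradient can point in the wrong direction.
print(n_steps, p.grad)
```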