Gradient Sign Bug #1

Open
AlexHarn opened this issue Apr 19, 2018 · 6 comments
Labels: bug (Something isn't working)

@AlexHarn (Owner)

Under some circumstances, the sign of the gradient seems to be flipped, so it points in the wrong direction.

AlexHarn added the bug label on Apr 19, 2018
AlexHarn self-assigned this on Apr 19, 2018
AlexHarn added a commit that referenced this issue on Apr 19, 2018: "…in the wrong direction. Flipping the sign leads to proper convergence."

@AlexHarn (Owner, Author)

15d5ac0 demonstrates the bug.

@AlexHarn (Owner, Author)

3d62745 shows one way to flip the sign back, which leads to convergence. This is not a fix; it simply demonstrates the bug.

@AlexHarn (Owner, Author) commented Apr 19, 2018

[Figure "grid": scan of the loss function over the absorption and scattering parameters]
This is a visual representation of the loss function with the "true" parameters in the middle (100 m for absorption and 25 m for scattering). As expected, there is a strong negative correlation between absorption and scattering when using only total hit counts without time information.

However, fitting only one of the two parameters while keeping the other constant should work fine. It does work as expected for absorption, but for scattering the gradient points in the wrong direction, which is completely unexpected. Flipping the sign as demonstrated in commit 3d62745 leads to proper convergence, which makes no sense.
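
For illustration, this is a minimal sketch of what "flipping the sign" amounts to, assuming a TF 1.x-style optimizer. The variable name and the placeholder loss are made up; this is not the code from commit 3d62745:

```python
import tensorflow as tf

# Hypothetical stand-ins for the real simulation (not the code from 3d62745):
scattering_length = tf.Variable(20.0, name="scattering_length")
loss = tf.square(scattering_length - 25.0)  # placeholder loss for illustration

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)
grads_and_vars = optimizer.compute_gradients(loss, var_list=[scattering_length])

# The workaround: negate the gradient before applying the update, so the
# descent direction is reversed for this parameter.
flipped = [(-g, v) for g, v in grads_and_vars]
train_step = optimizer.apply_gradients(flipped)
```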

The bug also arises under other circumstances, but this specific example will be used as the basis for a minimal working example to nail the bug down.

@AlexHarn (Owner, Author) commented Apr 20, 2018

Attempts to replicate the bug in a simplified one-dimensional simulation (minimal_1d.py) have been unsuccessful so far.

@AlexHarn (Owner, Author) commented Apr 20, 2018

Changing the scattering function to something else, for example reversing the direction vectors into exactly the opposite direction with some given probability as demonstrated in 3fa66c6, makes the bug disappear.
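
Roughly, that simplified scattering could look like the following sketch (hypothetical names and signature; the actual implementation in 3fa66c6 may differ): each photon's direction vector is reversed with probability flip_prob and left unchanged otherwise.

```python
import tensorflow as tf

def simplified_scatter(directions, flip_prob):
    """Hypothetical sketch, not the code from 3fa66c6.

    directions: (N, 3) tensor of unit direction vectors.
    flip_prob:  scalar probability of reversing a photon's direction.
    """
    # Draw one uniform number per photon and reverse the rows where it falls
    # below flip_prob; all other directions stay unchanged.
    flip = tf.random_uniform([tf.shape(directions)[0]]) < flip_prob
    return tf.where(flip, -directions, directions)
```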

So one might assume the bug can be pinned down to the scattering. But changing the loss function to something else while keeping the original scattering, for example a loss based on the moments of the final positions (which would not be possible on real data) as I did in c535128, also "resolves" the bug.
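
In the same spirit, a moment-based loss compares low-order moments of the simulated final positions to the same moments computed from the "true" simulation. A hedged sketch (hypothetical names, not the code from c535128):

```python
import tensorflow as tf

def moment_loss(final_positions, data_moments):
    """Hypothetical sketch of a moment-based loss (not the code from c535128).

    final_positions: (N, 3) simulated photon end points.
    data_moments:    (6,) per-axis mean and variance from the "true" simulation.
    """
    mean = tf.reduce_mean(final_positions, axis=0)
    var = tf.reduce_mean(tf.square(final_positions - mean), axis=0)
    sim_moments = tf.concat([mean, var], axis=0)
    # Squared difference between simulated and "data" moments.
    return tf.reduce_sum(tf.square(sim_moments - data_moments))
```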

Maybe there are multiple independent issues here which flip the gradient for some reason?

@AlexHarn (Owner, Author) commented Apr 26, 2018

We found the underlying issue. The bug is not actually a bug, but expected behavior. It is demonstrated in bug_test.py.

The problem is that there is no gradient with respect to the number of loop iterations, even though that dependence is crucial for correct behavior. Convergence can still happen by chance, if the gradient derived from the loop body happens to point in the right direction, which explains the weird behavior. More on this can be read here. On p. 15 it explicitly says "This means that we assume that pred is not trainable.", where pred indirectly refers to the number of loop iterations.
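
A minimal sketch of the effect, in the spirit of bug_test.py but not identical to it, written against the TF 1.x API: the result depends on the trainable variable only through the loop condition (pred), so tf.gradients sees no differentiable path at all.

```python
import tensorflow as tf  # TF 1.x API

x = tf.Variable(3.0)

def cond(i, acc):
    # The number of iterations depends on x only through this condition...
    return tf.less(tf.cast(i, tf.float32), x)

def body(i, acc):
    # ...while the body itself does not use x at all.
    return [i + 1, acc + 1.0]

_, total = tf.while_loop(cond, body, [tf.constant(0), tf.constant(0.0)])

# tf.while_loop treats the condition (pred) as non-trainable, so no gradient
# flows through the iteration count.
grad = tf.gradients(total, x)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(total))  # 3.0 -- the value clearly depends on x
    print(grad)             # [None] -- but the gradient does not see it
```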

So how do we solve or work around this? Tricking the gradient into pointing in the right direction does not seem like a safe option in higher dimensions.
