Fix double-tracing in SpecPropPass #15485
Conversation
🔗 Helpful links: see artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/15485. Note: links to docs will display an error until the docs builds have completed.
❌ 2 new failures as of commit 1996324 with merge base 9981e41.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@GregoryComer has exported this pull request. If you are a Meta employee, you can view the originating Diff in D85913581.
One additional note: the aliasing analysis feels pretty fragile as-is. I fixed several subtle issues (luckily caught by CI) where my changes were accidentally planning two separate tensors when they should alias / share one TensorSpec. I'm wondering if we should rewrite this pass to either rely on ProxyValue reference equality or otherwise introduce proper aliasing analysis, as opposed to hard-coding that getitem and output, for example, always alias their argument. The hard-coded approach seems like it could get messy with non-functional custom ops, or with defunctionalization in general. @JacobSzwejbka @angelayi, what are your thoughts on this?
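To make the contrast concrete, here is a minimal, hypothetical sketch of the two approaches the comment weighs; the names (ALIASING_OPS, spec_for, make_spec) are invented for illustration and are not ExecuTorch APIs:

```python
# Approach 1 (roughly the status quo): hard-coded per-op aliasing rules.
ALIASING_OPS = {"getitem", "output"}  # assumed: these reuse their argument's spec

# Approach 2: derive aliasing from value identity. If two nodes carry the
# same underlying meta value object, they share one spec.
def spec_for(node, spec_table, make_spec):
    val = node.meta["val"]
    if id(val) in spec_table:
        return spec_table[id(val)]  # alias: reuse the existing TensorSpec
    spec = make_spec(val)           # make_spec is a hypothetical constructor
    spec_table[id(val)] = spec
    return spec
```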
@GregoryComer has imported this pull request. If you are a Meta employee, you can view this in D85913581.
Note that the moshi and zephyr size test failures are pre-existing.
Our current SpecPropPass doesn't properly capture the effect of guards in the shape environment due to double-tracing certain ops. The problem looks like this:
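Roughly (an illustrative toy reproduction, not the pass's actual code; the module and shapes are made up): an op with a data-dependent output shape mints an unbacked symint the first time it is traced, and guards are recorded against that symbol. Re-running the meta kernel mints a fresh symbol that those guards do not constrain.

```python
import torch

class M(torch.nn.Module):
    def forward(self, x):
        idx = torch.nonzero(x)                    # output shape holds unbacked u0
        torch._check(idx.shape[0] <= x.numel())   # guard recorded against u0
        return idx

ep = torch.export.export(M(), (torch.randn(4),))

# If a pass later re-runs nonzero's meta kernel to build tensor specs, the
# second run mints a fresh unbacked symint (u1). The guard above constrains
# u0, not u1, so the spec loses the bound established during export.
```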
To resolve this, I've updated the SpecPropPass to re-trace the graph and then generate specs based on the meta values, not the traced ProxyValues (thanks @angelayi for the suggestion). This resolves the issue.
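A minimal sketch of that idea, assuming each node carries a FakeTensor in node.meta["val"] and a TensorSpec.from_tensor constructor like the one in exir.tensor (the helper name build_specs_from_meta is invented here):

```python
import torch
from executorch.exir.tensor import TensorSpec

def build_specs_from_meta(graph_module: torch.fx.GraphModule) -> None:
    for node in graph_module.graph.nodes:
        val = node.meta.get("val")
        if isinstance(val, torch.Tensor):
            # Build the spec from the meta value recorded during the single
            # export-time trace, not from a second run of the op's meta kernel.
            node.meta["spec"] = TensorSpec.from_tensor(val)
```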
I originally saw this issue with the NMS torchvision op, but to avoid adding a new dep to the core EXIR tests, I've written a test with a custom op that uses an unbacked symint in the meta kernel output shape to replicate the bug in the same way.
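For reference, a custom op of that shape might be declared like this (a hedged sketch with made-up names, not the test's actual code); new_dynamic_size() is what puts the unbacked symint into the fake kernel's output shape:

```python
import torch

@torch.library.custom_op("exir_test::unbacked_out", mutates_args=())
def unbacked_out(x: torch.Tensor) -> torch.Tensor:
    return x[x > 0]  # eager kernel: data-dependent output length

@unbacked_out.register_fake
def _(x):
    ctx = torch.library.get_ctx()
    n = ctx.new_dynamic_size()  # mints an unbacked symint for the output shape
    return x.new_empty(n)
```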
Differential Revision: D85913581