-
Notifications
You must be signed in to change notification settings - Fork 66
Update opset imports in version_converter #2295
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
❌ 8 Tests Failed:
View the top 3 failed test(s) by shortest run time
To view more test analytics, go to the Test Analytics Dashboard |
if domain == "": | ||
model_or_function.opset_imports[domain] = max(version, current_version) | ||
return | ||
elif domain == "ai.onnx": | ||
model_or_function.opset_imports[domain] = max(version, current_version) | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since ""
and "ai.onnx"
is the same, they should share the same opset import. In practice it may be good to assert that we don't have any "ai.onnx"
nodes if handling both domains over complicates the logic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#2283 should handle this, pre-conversion? Then it makes sense just to check for ""
opset imports
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm ... I guess #2283 does NOT do it for opset-imports, that remains a TODO, is that right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah we need to create a new data structure for opset imports to do this, which is a todo.
@@ -320,6 +352,8 @@ def visit_model(self, model: ir.Model) -> None: | |||
return None | |||
self.model_version = model_version | |||
self.visit_graph(model.graph) | |||
# Finally, update the opset imports for the model | |||
self._update_opset_imports(model) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this logic, if a user request opset 22, but all nodes were updated to opset 21 max, is it still possible for the model to be set at 22? Would it make sense to set the opset import filed directly to the target opset, if the conversion succeeds?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, all nodes are upconverted to target_version. If no adapter is registered for a particular node, only the node.version is updated to be =target_version.
Now, the real issue would arise, in situations in which up_conversion for a node is not possible, let's take the example for GroupNorm with no bias, if no bias was provided the up conversion from 20->21 would not possible, meaning we would have a model, where GroupNorm is version 20, but all other ops are 21.
I think we need a rethink of the version conversion logic, instead of iterating (each node, then up-converting verA->verB->verC) we would have to switch the loop order to be (verA->verB, iterate each node, verB->verC, iterate each node, and so on), maintaining a copy of the original graph pre-conversion for that opset during each up conversion from opset -> opset +1, if any conversion fails, we fall back to the graph from before that up conversion, so all the nodes are of a particular opset. Once this process is done, we assign the model_imports to be whatever the last successful up conversion was.
Marking the PR as draft as I flesh out/try to implement this new logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update. I think it will also help to verify/document the global-strategy. I think some of the earlier complications stem back to how onnxscript functions are registered for torchlib ops. Suppose I register an onnxscript function for torch.gelu targetting onnx opset 20. Suppose the user requests onnx opset 21. If we use the abovementioned onnxscript function, will it generate the decomposition with calls to opset 20 or 21? I assume that they will be 20? And will need to be version-converted to 21?
So, in general: the assumption is that the model we start with (before version-conversion) could contain calls to several different onnx opsets (say both 20 and 21), with the assumption that they will be typically <= the target version.
What is the self.model_version's initial value in your loop? How is it determined?
1ca1bd2
to
13e7f05
Compare
890a6de
to
3950910
Compare
# Return non-converted graph if any node fails to convert. | ||
for node in graph: | ||
up_conversion = True | ||
if node.version is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should check if it is an onnx domain op first: else skip, and don't do anything (even updating node.version).
|
||
# TODO(shubhambhokare1): Support down-conversion | ||
while self.model_version < self.target_version: | ||
pre_conversion_graph = copy.copy(graph) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@justinchuby : how expensive is copy? Specifically, what does it do for large tensors?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't copy graphs yet. This won't work (sorry)
I intend to implement graph.duplicate()
. But I have not done it. cc @titaiwangms who has a similar need.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@justinchuby What would be the correct way to store pre-conversion state graphs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe implement a proper graph.duplicate()? Before that maybe aborting on failures is acceptable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is this:
Hence:
The goal of producing a model with same opset-version through-out will require extra work anyway. @justinchuby : does this sound reasonable? |
That sounds reasonable to me! |
Redo of PR #2295 as discussed there. * Ensure opset_imports is updated when version converter is applied TODO (in a separate PR): * Cleanup error status API (and return value) --------- Signed-off-by: Ganesan Ramalingam <[email protected]>
Replaced by #2318 |
Uh oh!
There was an error while loading. Please reload this page.