What's the correct way to go about static quantization of models in timm? #643
-
Right now I'm working on quantizing efficientnet. I'm rerunning my code over and over, hunting down the errors one by one and patching the relevant sections of code. They mainly consist of
I'm over an hour in and the end is still not in sight, so I was wondering if I'm missing something built in that could help with this.
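For reference, this is roughly the eager-mode static quantization recipe I'm following (a minimal sketch; the wrapper class and the random calibration data are just illustrative):

```python
import torch
import torch.nn as nn
import timm

class QuantWrapper(nn.Module):
    """Wrap the model with quant/dequant stubs so eager-mode quantization
    knows where tensors enter and leave the int8 domain."""
    def __init__(self, model):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.dequant = torch.quantization.DeQuantStub()
        self.model = model

    def forward(self, x):
        return self.dequant(self.model(self.quant(x)))

model = QuantWrapper(timm.create_model('efficientnet_b0', pretrained=True).eval())
model.qconfig = torch.quantization.get_default_qconfig('fbgemm')
torch.quantization.prepare(model, inplace=True)   # insert observers

with torch.no_grad():                             # calibration (real data in practice)
    model(torch.randn(1, 3, 224, 224))

torch.quantization.convert(model, inplace=True)

# Errors mostly surface when running the converted model: functional residual
# adds, SiLU/Swish activations, SAME-padding convs, etc. each need manual patching.
```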
-
@alexander-soare I'd look at FX-based quantization. It's pretty new, but I think there are some examples out there. The previous approach of manually replacing functions didn't seem like a good solution. FX-based quantization operates using FX transforms on the traced model IR, so it doesn't matter what functions are used to create the structure of the model...
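Rough shape of the FX graph-mode flow, as a sketch (untested here, and the prepare_fx/convert_fx signatures have shifted between PyTorch versions, e.g. newer releases also expect an example_inputs argument):

```python
import torch
import timm
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx, convert_fx

model = timm.create_model('efficientnet_b0', pretrained=True).eval()

# "" sets the default qconfig for the whole model; per-module overrides are possible
qconfig_dict = {"": get_default_qconfig("fbgemm")}
prepared = prepare_fx(model, qconfig_dict)   # symbolic trace + observer insertion

with torch.no_grad():                        # calibrate on representative batches
    for _ in range(10):
        prepared(torch.randn(8, 3, 224, 224))

quantized = convert_fx(prepared)             # lower observed graph to int8 kernels
```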
-
Assuming the trace for FX quantization is no different from other forms of tracing, there may need to be another fun workaround for the SAME padding (it's annoying, but there's no other way to support TensorFlow-like SAME padding properly unless it gets implemented in the core of PyTorch someday).
See the ONNX export code I have in a different project here: https://github.com/rwightman/gen-efficientnet-pytorch/blob/master/onnx_export.py#L77-L102
It replaces the dynamic SAME-padding conv2d with a static (run once and then export) alternative (you lose resolution flexibility): https://github.com/rwightman/gen-efficientnet-pytorch/blob/master/geffnet/conv2d_layers.py#L88-L113
I can bring that layer over here if it's needed.
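A stripped-down sketch of what that static alternative looks like (loosely modeled on the linked Conv2dSameExport; the class and function names here are illustrative, not the actual layer):

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

def same_pad_amount(i, k, s, d):
    """Total TF-style SAME padding needed along one spatial dim of size i."""
    return max((math.ceil(i / s) - 1) * s + (k - 1) * d + 1 - i, 0)

class Conv2dStaticSame(nn.Conv2d):
    """SAME-padding conv with the padding baked in for a fixed input size,
    so no data-dependent padding logic is left in the traced graph."""
    def __init__(self, in_ch, out_ch, kernel_size, stride=1, dilation=1,
                 input_size=(224, 224), **kwargs):
        super().__init__(in_ch, out_ch, kernel_size, stride=stride,
                         dilation=dilation, padding=0, **kwargs)
        ph = same_pad_amount(input_size[0], self.kernel_size[0], self.stride[0], self.dilation[0])
        pw = same_pad_amount(input_size[1], self.kernel_size[1], self.stride[1], self.dilation[1])
        # F.pad ordering is (left, right, top, bottom); SAME pads asymmetrically
        self.static_pad = (pw // 2, pw - pw // 2, ph // 2, ph - ph // 2)

    def forward(self, x):
        x = F.pad(x, self.static_pad)
        return F.conv2d(x, self.weight, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)
```

A conversion pass would then swap each dynamic SAME-padding conv for this variant at the target resolution (that's the trade: the model is only correct at that input size), after which tracing should go through.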