XPU backend support 8bit optimizer #1565
base: multi-backend-refactor
Conversation
Now that I have verified it on IPEX 2.7, we can add XPU tests in test_optim.
Thanks! Optimizer support isn't addressed yet in the new custom ops interface that we've mainlined, but we can keep developing it here in this branch until that's ready. Is there a plan to support any other optimizers? Completely understandable if not; just curious!
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
# Dispatch blockwise dequantization to the IPEX kernel that matches the output dtype.
if out.dtype == torch.float16:
    ipex.xpu.bitsandbytes.cdequantize_blockwise_fp16(code, A, absmax, out, blocksize, A.numel())
elif out.dtype == torch.bfloat16:
    ipex.xpu.bitsandbytes.cdequantize_blockwise_bf16(code, A, absmax, out, blocksize, A.numel())
elif out.dtype == torch.float32:
    ipex.xpu.bitsandbytes.cdequantize_blockwise_fp32(code, A, absmax, out, blocksize, A.numel())
else:
    raise ValueError(f"Blockwise quantization only supports 16/32-bit floats, but got {out.dtype}")
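
For context, a minimal round trip through the public blockwise API that exercises this dispatch. This is a sketch only; it assumes an XPU-enabled bitsandbytes build and an available XPU device.

import torch
import bitsandbytes.functional as F

# Assumes device="xpu" is available and bitsandbytes dispatches to the IPEX kernels above.
x = torch.randn(4096, dtype=torch.float16, device="xpu")
q, state = F.quantize_blockwise(x)    # 8-bit codes plus per-block absmax
y = F.dequantize_blockwise(q, state)  # lands in the fp16 branch above
assert torch.allclose(x.float(), y.float(), atol=1e-2)
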
This dtype dispatch will be useful when porting over to the new custom ops, as an implementation for bitsandbytes::dequantize_blockwise.out(Tensor A, Tensor absmax, Tensor code, int blocksize, ScalarType dtype, Tensor! out) -> ()
This is hard to understand. Could you please supply more details or instructions? Thanks!
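
For illustration only, a minimal sketch of what such a port could look like using torch.library. The schema string is the one quoted above; the registration pattern and the assumption that the bitsandbytes namespace isn't already defined are mine, and the actual bitsandbytes implementation may differ.

import torch
import intel_extension_for_pytorch as ipex  # provides the ipex.xpu.bitsandbytes kernels

# Define the op schema quoted in the review comment (assumes the
# "bitsandbytes" namespace has not been defined elsewhere).
lib = torch.library.Library("bitsandbytes", "DEF")
lib.define(
    "dequantize_blockwise.out(Tensor A, Tensor absmax, Tensor code, "
    "int blocksize, ScalarType dtype, Tensor! out) -> ()"
)

def _dequantize_blockwise_xpu(A, absmax, code, blocksize, dtype, out):
    # Same dtype dispatch as in the diff above, keyed on the requested dtype.
    if dtype == torch.float16:
        ipex.xpu.bitsandbytes.cdequantize_blockwise_fp16(code, A, absmax, out, blocksize, A.numel())
    elif dtype == torch.bfloat16:
        ipex.xpu.bitsandbytes.cdequantize_blockwise_bf16(code, A, absmax, out, blocksize, A.numel())
    elif dtype == torch.float32:
        ipex.xpu.bitsandbytes.cdequantize_blockwise_fp32(code, A, absmax, out, blocksize, A.numel())
    else:
        raise ValueError(f"Blockwise quantization only supports 16/32-bit floats, but got {dtype}")

# Attach the IPEX kernels as the XPU implementation of the op.
lib.impl("dequantize_blockwise.out", _dequantize_blockwise_xpu, "XPU")
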
Currently there is no plan to enable other optimizers.
This PR adds 8-bit optimizer support for the XPU backend.
The backend kernels are now integrated in Intel Extension for PyTorch (IPEX).
We have verified end-to-end accuracy with the blockwise 8-bit Adam optimizer.
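
For reference, a minimal training-step sketch on XPU (the layer size and learning rate here are illustrative, not from this PR):

import torch
import bitsandbytes as bnb

model = torch.nn.Linear(1024, 1024).to("xpu")
optimizer = bnb.optim.Adam8bit(model.parameters(), lr=1e-3)  # blockwise 8-bit optimizer states

loss = model(torch.randn(8, 1024, device="xpu")).sum()
loss.backward()
optimizer.step()       # runs the 8-bit Adam update via the XPU kernels
optimizer.zero_grad()
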
It also adds a device synchronize function to every backend class so that CUDA synchronization is no longer hardcoded.
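
A minimal sketch of that shape (class and method names here are hypothetical, not the exact ones in this PR):

import torch

class Backend:
    # Each backend overrides this so callers never hardcode torch.cuda.synchronize().
    def device_synchronize(self):
        raise NotImplementedError

class CUDABackend(Backend):
    def device_synchronize(self):
        torch.cuda.synchronize()

class XPUBackend(Backend):
    def device_synchronize(self):
        torch.xpu.synchronize()  # assumes torch.xpu is available (PyTorch 2.4+ or via IPEX)
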
@jiqing-feng @matthewdouglas @Titus-von-Koeller