Add IntxUnpackedTensor #2732
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2732
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit 93948a4 with merge base 6cfa477
This comment was automatically generated by Dr. CI and updates every 15 minutes.
block_size: the block size for quantization, representing the granularity, for example groupwise quantization will have block_size (1, group_size)
"""

tensor_data_attrs = ["int_data", "scale", "zero_point"]
btw if you update these to tensor_data_names and tensor_attribute_names you'll be able to remove some of the implementations, see docs in https://github.com/pytorch/ao/pull/2710/files#diff-d2a11602a79e83305208472f1abe6a4106f02ce62a7f9524007181813863fcf6R687, example: #2738
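A minimal sketch of what that could look like, assuming TorchAOBaseTensor consumes these class-level name lists as described in #2710; the tensor_attribute_names entries shown here (bit_width, block_size) are illustrative, not necessarily the PR's actual fields:

```python
import torch
from torchao.utils import TorchAOBaseTensor


class IntxUnpackedTensor(TorchAOBaseTensor):
    # Tensor components; declaring them here lets the base class derive
    # flatten/unflatten and the default device-only _to_copy handling.
    tensor_data_names = ["int_data", "scale", "zero_point"]
    # Non-tensor metadata carried along when the tensor is reconstructed
    # (names here are illustrative).
    tensor_attribute_names = ["bit_width", "block_size"]
```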
I can still override the behavior in TorchAOBaseTensor, right?
For example, it looks like aten._to_copy.default gets auto-populated, but I want to define its dtype variant in addition to the device variant.
this should work; I haven't actively tested this behavior though, so I'll try to add a test for it
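Continuing the class sketch above, an override registered through implements should take precedence over the auto-generated op. This is only an assumed shape of such an override; the _apply_fn_to_data helper is used here for illustration:

```python
import torch
from torch.utils._python_dispatch import return_and_correct_aliasing

aten = torch.ops.aten
implements = IntxUnpackedTensor.implements


@implements(aten._to_copy.default)
def _(func, types, args, kwargs):
    self = args[0]
    device = kwargs.get("device", None)
    dtype = kwargs.get("dtype", None)
    out = self
    if device is not None:
        # device variant: move every tensor component
        out = out._apply_fn_to_data(lambda t: t.to(device=device))
    if dtype is not None:
        # dtype variant: only floating-point components (e.g. scale) change;
        # the int8-stored quantized data stays integer
        out = out._apply_fn_to_data(
            lambda t: t.to(dtype=dtype) if t.is_floating_point() else t
        )
    return return_and_correct_aliasing(func, args, kwargs, out)
```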
Changed
)

@classmethod
def from_float(
nit: we are standardizing on from_hp now
What does hp stand for?
high precision
@@ -30,3 +30,8 @@ class PackingFormat(str, Enum):
"""
preshuffled is referring to the preshuffled format used by fbgemm kernels
"""
PRESHUFFLED = "preshuffled"

"""
Unpacked means the subbyte quantized data is stored as int8
is this int only? we could be more specific and say UnpackedToInt8
Sure, I can make the format UNPACKED_TO_INT8
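For reference, a sketch of how the enum might read after that rename; the string value "unpacked_to_int8" is an assumption, and the existing members are abbreviated:

```python
from enum import Enum


class PackingFormat(str, Enum):
    """
    preshuffled is referring to the preshuffled format used by fbgemm kernels
    """

    PRESHUFFLED = "preshuffled"

    """
    Unpacked means the subbyte quantized data is stored as int8
    """

    UNPACKED_TO_INT8 = "unpacked_to_int8"
```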
torchao/quantization/quant_api.py
Outdated
@@ -2060,6 +2061,8 @@ class IntxWeightOnlyConfig(AOBaseConfig):
mapping_type: MappingType = MappingType.SYMMETRIC
scale_dtype: Optional[torch.dtype] = None
layout: Layout = QDQLayout()
packing_format: PackingFormat = PackingFormat.UNPACKED
VERSION: int = 1
nit: we updated the name to version
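As a usage sketch (not from this PR), the config might then be selected roughly like this; the PackingFormat export location, the weight_dtype/granularity arguments, the renamed enum value, and the lowercase version field are all assumptions:

```python
import torch
# PackingFormat export path is assumed; the diff above does not show it
from torchao.quantization import quantize_, IntxWeightOnlyConfig, PackingFormat
from torchao.quantization.granularity import PerGroup

model = torch.nn.Sequential(torch.nn.Linear(128, 256))
quantize_(
    model,
    IntxWeightOnlyConfig(
        weight_dtype=torch.int4,          # target sub-byte dtype
        granularity=PerGroup(32),         # groupwise, block_size (1, 32)
        packing_format=PackingFormat.UNPACKED_TO_INT8,  # renamed value from the discussion
        version=1,                        # lowercase per the comment above
    ),
)
```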