Open
Labels
Perf (for performance improvements), module: transforms, needs discussion, prototype
Description
All the performance benchmarks we have run so far for transforms v1 vs. v2 used contiguous inputs. However, a few kernels leave their output in a noncontiguous state:
affine_image_tensor
in case image.numel() > 0 and image.ndim == 4 and fill is not None
convert_color_space
in case we only strip the alpha channel, i.e. RGB_ALPHA -> RGB and GRAY_ALPHA -> GRAY
rotate_image_tensor
in case image.numel() > 0 and image.ndim == 4 and fill is not None
crop_image_tensor
center_crop_image_tensor
five_crop_image_tensor
ten_crop_image_tensor
If applicable, the same also holds for the *_mask and *_video kernels, since they are thin wrappers around the *_image_tensor ones.
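To illustrate how such outputs come about, here is a minimal sketch that does not call the actual kernels. It assumes that cropping and alpha stripping boil down to slicing a tensor, which returns a view into the original storage. The tensor shape and crop offsets are made up for illustration.

```python
import torch

# Hypothetical batch of RGBA images in (N, C, H, W) layout.
image = torch.rand(2, 4, 32, 32)

# Cropping by slicing the spatial dimensions returns a view, not a copy.
cropped = image[..., 4:28, 4:28]

# Dropping the alpha channel by slicing the channel dimension is also a view.
rgb = image[:, :3, :, :]

print(image.is_contiguous())    # True
print(cropped.is_contiguous())  # False
print(rgb.is_contiguous())      # False

# Enforcing a contiguous output costs one extra copy.
print(cropped.contiguous().is_contiguous())  # True
```

Note that with a batch size of 1 the channel slice would still be reported as contiguous, since dimensions of size 1 are ignored in the contiguity check.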
We should benchmark, for at least a few kernels, whether noncontiguous inputs cause a performance degradation that is larger than the cost of enforcing contiguous outputs on the kernels above. If so, we should probably enforce contiguous outputs for our kernels.
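A rough sketch of what such a comparison could look like, using torch.utils.benchmark. The downstream operation (a horizontal flip), the tensor sizes, and the number of runs are placeholders chosen for illustration; a real benchmark would use the actual v2 kernels and realistic inputs.

```python
import torch
from torch.utils import benchmark


def downstream_op(x):
    # Stand-in for whatever kernel runs next; chosen arbitrarily.
    return x.flip(-1)


image = torch.rand(8, 3, 256, 256)
noncontiguous = image[..., 16:240, 16:240]  # crop-like view
contiguous = noncontiguous.contiguous()

statements = {
    "op on noncontiguous input": ("op(x)", noncontiguous),
    "op on contiguous input": ("op(x)", contiguous),
    "enforcing contiguity": ("x.contiguous()", noncontiguous),
}

results = {}
for label, (stmt, x) in statements.items():
    timer = benchmark.Timer(stmt=stmt, globals={"op": downstream_op, "x": x})
    results[label] = timer.timeit(20)

for label, measurement in results.items():
    print(f"{label}: {measurement.median * 1e6:.1f} us")
```

Enforcing contiguity pays off if the first number exceeds the sum of the second and third. Timings will vary with device, dtype, and input size, so this should be repeated across the configurations we care about.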