Accelerate could come with a prepared script that tunes TunableOp the first time it's configured. Right now I had to run a tuning training manually, setting several TunableOp environment variables and using a small dataset, and it roughly doubled my training speed (for reference, it cut a 1024x1024, batch size 8 SDXL LoRA training from 7.5 s/it to 3.2 s/it, all thanks to the TunableOp tuning). A sketch of that manual run is below.
Could be extremely useful for RDNA3 owners!
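For reference, a minimal sketch of the kind of manual tuning run described above, using the TunableOp environment variables from the PyTorch docs linked below. The matmul loop is just a stand-in for whatever small warm-up workload you actually train on:

```python
import os

# TunableOp must be configured before the first CUDA/HIP kernel runs,
# so set the environment variables before importing torch.
os.environ["PYTORCH_TUNABLEOP_ENABLED"] = "1"   # turn TunableOp on
os.environ["PYTORCH_TUNABLEOP_TUNING"] = "1"    # search for the best kernels, not just replay
os.environ["PYTORCH_TUNABLEOP_FILENAME"] = "tunableop_results.csv"  # where tuning results are cached

import torch

# Stand-in for a short tuning run: a few matmuls at the shapes your real
# training uses, so TunableOp records the fastest GEMM variants for them.
a = torch.randn(1024, 1024, device="cuda", dtype=torch.float16)
b = torch.randn(1024, 1024, device="cuda", dtype=torch.float16)
for _ in range(10):
    (a @ b).sum().item()

# Later training runs can reuse the cached results with tuning off:
# PYTORCH_TUNABLEOP_ENABLED=1 PYTORCH_TUNABLEOP_TUNING=0
```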
As suggested, this is to accelerate training on AMD cards that support TunableOp (such as the 7900 XTX):
https://pytorch.org/docs/stable/cuda.tunable.html
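The linked page also documents a programmatic API (`torch.cuda.tunable`) that a prepared Accelerate script could call instead of relying on environment variables; a sketch, with the workload again a placeholder:

```python
import torch
import torch.cuda.tunable as tunable

# Programmatic equivalent of the environment variables: an Accelerate
# setup script could enable tuning, run a short representative workload,
# then persist the tuned GEMM selections for later training runs.
tunable.enable(True)          # same effect as PYTORCH_TUNABLEOP_ENABLED=1
tunable.tuning_enable(True)   # same effect as PYTORCH_TUNABLEOP_TUNING=1

a = torch.randn(2048, 2048, device="cuda", dtype=torch.float16)
for _ in range(5):
    (a @ a).sum().item()

tunable.write_file()          # save results to the TunableOp results CSV
```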