
AMD Suggestion, TunableOP tuning script #3377

Open
Charmandrigo opened this issue Feb 3, 2025 · 0 comments

Comments


Charmandrigo commented Feb 3, 2025

As suggested, this is a proposal to accelerate training on AMD cards that support TunableOp (such as the 7900 XTX):
https://pytorch.org/docs/stable/cuda.tunable.html

Accelerate could ship with a prepared script that tunes itself the first time it is configured. Right now I had to run a tuning training manually, setting multiple TunableOp environment variables and using a small dataset, and it roughly halved my step times (for reference, it cut a 1024x1024, batch 8 SDXL LoRA training from 7.5 s/it to 3.2 s/it, all thanks to the TunableOp tuning).
This could be extremely useful for RDNA3 owners!
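For reference, a minimal sketch of what such a tuning script might look like, using the public `torch.cuda.tunable` API from the linked docs (the same knobs can also be set through the `PYTORCH_TUNABLEOP_*` environment variables). The model, shapes, and step count below are hypothetical placeholders; a real script would instead run a few steps of the actual training loop so tuning sees the real GEMM shapes:

```python
import torch
import torch.cuda.tunable as tunable

# Turn TunableOp on and allow it to search for new GEMM solutions.
tunable.enable(True)
tunable.tuning_enable(True)
# File where the tuned solutions are persisted for later runs.
tunable.set_filename("tunableop_results.csv")

# Hypothetical stand-in for the real model and batch; tuning only covers
# the operation shapes it actually sees, so these should match the workload.
model = torch.nn.Linear(4096, 4096, dtype=torch.float16, device="cuda")
x = torch.randn(8, 4096, dtype=torch.float16, device="cuda")

# A short warm-up loop is enough to trigger tuning for these shapes.
for _ in range(10):
    model(x).sum().backward()

# Write the results out so subsequent runs can reuse them.
tunable.write_file()
```

On later runs the saved results can be loaded again (or read back with `tunable.read_file()`), and setting `tunable.tuning_enable(False)` skips the search so the tuned solutions are reused without paying the tuning cost again.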
