[ENHANCEMENT]Analysis Tool #488

Closed
yxyOo opened this issue Sep 6, 2023 · 3 comments
Labels
stale No activity in 60 days on issue or PR

Comments

yxyOo commented Sep 6, 2023

I'm always frustrated that I can't estimate how many resources a model will consume when training large language models, or tell in advance whether my training configuration will lead to an out-of-memory error. It's equally frustrating not knowing the minimum number of GPUs needed, which prevents appropriate resource allocation. Running the model to answer these questions is both time-consuming and inefficient. Moreover, I would like more detailed information about the training process, such as communication details and the mapping between GPUs and the model.

To tackle these issues, I've developed the Analysis Tool for offline analysis of memory requirements and communication data during Megatron-LM GPTModel training under hybrid parallel strategies.
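To give a sense of what an offline memory estimate can look like, here is a rough back-of-envelope sketch. It is not taken from the Analysis Tool, and every function name in it is hypothetical. It assumes the standard GPT parameter-count approximation, mixed-precision Adam at roughly 16 bytes of model state per parameter, and parameters split evenly across tensor- and pipeline-parallel ranks. Activations, communication buffers, and framework overhead are ignored, so real usage will be higher.

```python
import math

# Assumption: mixed-precision Adam keeps about 16 bytes of state per parameter
# (2 fp16 weight + 2 fp16 gradient + 12 for fp32 master weight, momentum, variance).
BYTES_PER_PARAM = 16


def gpt_param_count(num_layers, hidden_size, vocab_size, seq_length):
    """Approximate parameter count of a GPT-style transformer (hypothetical helper)."""
    # Per layer: attention (4h^2 + 4h) + MLP (8h^2 + 5h) + two layer norms (4h).
    per_layer = 12 * hidden_size**2 + 13 * hidden_size
    # Token and position embeddings.
    embeddings = (vocab_size + seq_length) * hidden_size
    return num_layers * per_layer + embeddings


def model_state_gib_per_gpu(num_params, tensor_parallel, pipeline_parallel):
    """Model-state memory per GPU in GiB, assuming an even split across TP x PP ranks."""
    params_per_gpu = num_params / (tensor_parallel * pipeline_parallel)
    return params_per_gpu * BYTES_PER_PARAM / 2**30


def min_model_parallel_gpus(num_params, gpu_memory_gib):
    """Lower bound on model-parallel GPUs needed just to hold the model states."""
    total_gib = num_params * BYTES_PER_PARAM / 2**30
    return math.ceil(total_gib / gpu_memory_gib)


if __name__ == "__main__":
    # Illustrative configuration only; the sizes are made up for the example.
    params = gpt_param_count(num_layers=32, hidden_size=4096,
                             vocab_size=50257, seq_length=2048)
    print(f"parameters: {params:,}")
    print(f"GiB per GPU (TP=2, PP=2): {model_state_gib_per_gpu(params, 2, 2):.1f}")
    print(f"min GPUs at 80 GiB each: {min_model_parallel_gpus(params, 80)}")
```

Plain data parallelism does not appear in the estimate because it replicates the model states on every replica rather than splitting them.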

What do you think of this tool?

@yxyOo yxyOo changed the title [ENHANCEMENT] [ENHANCEMENT]Analysis Tool Sep 6, 2023

github-actions bot commented Nov 5, 2023

Marking as stale. No activity in 60 days.

@github-actions github-actions bot added the stale No activity in 60 days on issue or PR label Nov 5, 2023

Jack47 commented Nov 6, 2023

This tool is awesome. I really need this analysis tool.

@github-actions github-actions bot removed the stale No activity in 60 days on issue or PR label Nov 6, 2023

github-actions bot commented Jan 5, 2024

Marking as stale. No activity in 60 days.

@github-actions github-actions bot added the stale No activity in 60 days on issue or PR label Jan 5, 2024
3 participants