[QUESTION] Hello, how to accurately calculate memory usage based on operating parameters? #597

13416157913 · 2023-11-20T01:20:47Z

Hello, how can I accurately calculate memory usage from the training parameters?
For example, with this configuration:
TP_SIZE=1
PP_SIZE=1
WORLD_SIZE=8
MICRO_BATCH_SIZE=2
GLOBAL_BATCH_SIZE=128
--finetune
--sequence-parallel
--num-layers 32
--hidden-size 4096
--num-attention-heads 32
--seq-length 4096
--max-position-embeddings 4096
--no-position-embedding
--use-rotary-position-embeddings
--swiglu
--ffn-hidden-size 11008
--disable-bias-linear
--RMSNorm
--layernorm-epsilon 1e-6
--causal-lm
--distributed-optimizer
--use-flash-attn

hwdef · 2023-11-20T02:05:06Z

#482
Please check this

13416157913 · 2023-11-20T11:48:08Z

#482 Please check this

Thank you very much.

deepakn94 · 2023-11-28T23:56:27Z

We now also have a report_theoretical_memory.py script that should take the same set of arguments as pretrain_gpt.py.

You can use it like this:

CUDA_DEVICE_MAX_CONNECTIONS=1 WORLD_SIZE=<WORLD_SIZE> python -u report_theoretical_memory.py ${options}
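
For a rough cross-check by hand, the static part of the footprint (weights, gradients, optimizer state) can be sketched as below. This is not the script's code, only a back-of-the-envelope estimate under stated assumptions: the vocabulary size (32000) is hypothetical because the question does not give one, input and output embeddings are assumed untied, and the per-parameter byte counts assume mixed-precision Adam (2 bytes for the bf16 weight, 4 for an fp32 gradient, 12 for the fp32 master weight and two Adam moments, with those 12 sharded across data-parallel ranks when --distributed-optimizer is set). Activation memory is not included, and dividing the parameter count evenly by TP_SIZE * PP_SIZE is only an approximation when either is larger than 1.

```python
def count_parameters(num_layers, hidden_size, ffn_hidden_size, vocab_size,
                     untied_embeddings=True):
    """Parameter count for a bias-free transformer with SwiGLU and RMSNorm."""
    attention = 4 * hidden_size * hidden_size   # Q, K, V and output projections
    mlp = 3 * hidden_size * ffn_hidden_size     # SwiGLU: gate, up and down projections
    norms = 2 * hidden_size                     # two RMSNorm weight vectors per layer
    per_layer = attention + mlp + norms
    # Assumption: separate input embedding and output projection when untied.
    embeddings = vocab_size * hidden_size * (2 if untied_embeddings else 1)
    final_norm = hidden_size
    return num_layers * per_layer + final_norm + embeddings


def bytes_per_parameter(data_parallel_size, distributed_optimizer):
    """Assumed mixed-precision Adam accounting (see the paragraph above)."""
    if distributed_optimizer:
        # 12 bytes of optimizer state are sharded across data-parallel ranks.
        return 6 + 12 / data_parallel_size
    return 18


def estimate_static_memory_gib(num_params, tp_size, pp_size, world_size,
                               distributed_optimizer=True):
    """Approximate per-GPU memory for weights, gradients and optimizer state."""
    data_parallel_size = world_size // (tp_size * pp_size)
    params_per_gpu = num_params / (tp_size * pp_size)  # rough even split
    per_param = bytes_per_parameter(data_parallel_size, distributed_optimizer)
    return params_per_gpu * per_param / 2**30


if __name__ == "__main__":
    # Values from the question; vocab_size=32000 is a hypothetical placeholder.
    n = count_parameters(num_layers=32, hidden_size=4096,
                         ffn_hidden_size=11008, vocab_size=32000)
    gib = estimate_static_memory_gib(n, tp_size=1, pp_size=1, world_size=8)
    print(f"parameters: {n:,}")
    print(f"weights + gradients + optimizer state per GPU: {gib:.1f} GiB")
```

Treat the result as a lower bound: activations (which depend on MICRO_BATCH_SIZE, the sequence length, and whether recomputation is enabled), temporary buffers, and allocator fragmentation all come on top of it.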

github-actions · 2024-01-28T18:19:50Z

Marking as stale. No activity in 60 days.

deepakn94 · 2024-01-28T18:21:55Z

Going to close this. Feel free to re-open if you are still running into issues.

github-actions bot added the stale label (No activity in 60 days on issue or PR) on Jan 28, 2024

deepakn94 closed this as completed Jan 28, 2024
