script to run mlperf llama2 70b on gpu #1373

jwyang-google · 2025-03-10T23:19:31Z

Description

Start with a short description of what the PR does and how this is a change from
the past.

The rest of the description includes relevant details and context, examples:

why is this change being made,
the problem being solved and any relevant context,
why this is a good solution,
some information about the specific implementation,
shortcomings of the solution and possible future improvements.

If the change fixes a bug or a Github issue, please include a link, e.g.,:
FIXES: b/123456
FIXES: #123456

Notice 1: Once all tests pass, the "pull ready" label will automatically be assigned.
This label is used for administrative purposes. Please do not add it manually.

Notice 2: For external contributions, our settings currently require an approval from a MaxText maintainer to trigger CI tests.

Tests

Please describe how you tested this change, and include any instructions and/or
commands to reproduce.

Checklist

Before submitting this PR, please make sure (put X in square brackets):

I have performed a self-review of my code.
I have necessary comments in my code, particularly in hard-to-understand areas.
I have run end-to-end tests and provided workload links above if applicable.
I have made or will make corresponding changes to the doc if needed.

vipannalla

looks good

bvandermoon · 2025-03-11T23:10:48Z

MaxText/maxengine.py

+  if base_engine.model.quant:
+    engine.model.quant.quant_mode = base_engine.model.quant.quant_mode


Why is this piece needed?

If quantization is not enabled, quant is None, and setting quant.quant_mode on it raises an error.

Got it, thanks. It won't cause issues that it's no longer set in other cases, right?

yeah, right
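
For readers following this thread, here is a minimal sketch of the failure the guard avoids. The `Quant`, `Model`, and `Engine` classes and both helper functions are hypothetical stand-ins written for illustration; they are not MaxText's actual classes, and only the two guarded lines mirror the diff above. The sketch also assumes that both engines either have a quant object or both lack one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Quant:
  """Hypothetical stand-in for a quantization config object."""
  quant_mode: str


@dataclass
class Model:
  """Hypothetical stand-in for a model; quant is None when quantization is off."""
  quant: Optional[Quant] = None


@dataclass
class Engine:
  """Hypothetical stand-in for an engine that wraps a model."""
  model: Model


def copy_quant_mode(base_engine: Engine, engine: Engine) -> None:
  """Copies quant_mode only when the base engine has a quant object."""
  # Same guard as in the diff: skip the assignment when quant is None.
  if base_engine.model.quant:
    engine.model.quant.quant_mode = base_engine.model.quant.quant_mode


def copy_quant_mode_unguarded(base_engine: Engine, engine: Engine) -> None:
  """Unguarded version, shown only to demonstrate the error."""
  # With quant=None, reading .quant_mode raises AttributeError.
  engine.model.quant.quant_mode = base_engine.model.quant.quant_mode
```

With quantization disabled, `copy_quant_mode` is a no-op, while the unguarded version raises `AttributeError` because `None` has no `quant_mode` attribute.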

bvandermoon · 2025-03-12T00:16:04Z

MaxText/maxengine.py

+  if base_engine.model.quant:
+    engine.model.quant.quant_mode = base_engine.model.quant.quant_mode


Got it, thanks. It won't cause issues that it's no longer set in other cases, right?

bvandermoon · 2025-03-12T00:20:23Z

MaxText/inference_mlperf/gpu/benchmarks_llama2-70b-h100_8.sh

+fi
+
+if [[ -z ${CHECKPOINT} ]] ; then
+  export CHECKPOINT="gs://jwyang/maxtext/direct_generate_param_only_checkpoint_llama2_70b_chat/checkpoints/0/items"


Can this bucket be an input argument instead? Or can we at least use a general bucket instead of a personal one?

just changed it to a general bucket the inference team is using

bvandermoon

LGTM pending the unit test failure

jwyang-google requested review from gobbleturk, khatwanimohit, bvandermoon, vipannalla, RissyRan, richjames0, rni418, gagika, shralex, yangyuwei, SurbhiJainUSC, hengtaoguo and A9isha as code owners March 10, 2025 23:19

tohaowu approved these changes Mar 11, 2025

jwyang-google force-pushed the gpu_mlperf_llama2 branch from ade7120 to ffb234e Compare March 11, 2025 18:56

vipannalla approved these changes Mar 11, 2025

jwyang-google force-pushed the gpu_mlperf_llama2 branch from ffb234e to c0d68ba Compare March 11, 2025 20:39

bvandermoon reviewed Mar 11, 2025

bvandermoon reviewed Mar 12, 2025

jwyang-google force-pushed the gpu_mlperf_llama2 branch from c0d68ba to 78d5c79 Compare March 12, 2025 16:40

bvandermoon approved these changes Mar 12, 2025

script to run mlperf llama2 70b on gpu

05e9667

jwyang-google force-pushed the gpu_mlperf_llama2 branch from 78d5c79 to 05e9667 Compare March 12, 2025 18:34

github-actions bot added the pull ready label Mar 12, 2025

parambole requested review from mitalisi, gpolovets1, mailvijayasingh, jrplatin, patemotter and Lumosis as code owners July 11, 2025 18:58

parambole requested a review from aireenmei as a code owner July 11, 2025 18:58
