# Gemma

Gemma is a family of lightweight, state-of-the-art open models built from the research and technology used to create the Gemini models.

Follow the instructions on Kaggle to download the Gemma model weights. You will need to consent to the Gemma license with your Kaggle account and use your Kaggle API credentials for the download.
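As a minimal sketch, the Kaggle CLI and API look for a `kaggle.json` credentials file (downloadable from your Kaggle account settings) in `~/.kaggle`. The username and key below are placeholders:

```shell
# Place Kaggle API credentials where the Kaggle CLI/API expects them.
mkdir -p ~/.kaggle
# Placeholder credentials -- substitute the kaggle.json from your account settings.
printf '{"username":"YOUR_USER","key":"YOUR_KEY"}\n' > ~/.kaggle/kaggle.json
# The Kaggle client refuses world-readable credential files.
chmod 600 ~/.kaggle/kaggle.json
```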

After downloading the weights, run convert_gemma_chkpt.py, which converts the checkpoint to a MaxText-compatible format and uploads it to a GCS bucket. You can then run decoding and fine-tuning following the instructions in the test scripts under end_to_end/tpu/gemma.
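As a rough sketch of the conversion step (the script path and flag names below are assumptions, not the confirmed interface; check the script's `--help` and the end_to_end/tpu/gemma test scripts for the real invocation), the shell below only assembles and prints the command for review rather than running it:

```shell
# Hypothetical paths -- substitute your local download and your own GCS bucket.
CKPT_PATH="$HOME/gemma-2b"            # local Kaggle checkpoint download
GCS_BUCKET="gs://my-bucket/gemma"     # hypothetical destination bucket

# Print the candidate command; flag names are illustrative assumptions.
echo python3 MaxText/convert_gemma_chkpt.py \
  --base_model_path "${CKPT_PATH}" \
  --maxtext_model_path "${GCS_BUCKET}" \
  --model_size 2b
```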

MaxText supports pretraining and fine-tuning of Gemma with high performance.

Model FLOP utilization (MFU) for training on v5e and v5p TPUs:

| Model    | v5e-256 (bf16) | v5p-128 (bf16) | v5e-256 (int8) | v5p-128 (int8) |
|----------|----------------|----------------|----------------|----------------|
| Gemma-2b | 58%            | 55%            | 64%            | 68%            |
| Gemma-7b | 58%            | 60%            | 70%            | 70%            |
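To make the table concrete, MFU is simply the model FLOPs per second a training run sustains divided by the hardware's peak FLOPs per second. A small sketch (the achieved-throughput figure is hypothetical; 197 TFLOP/s is the published bf16 peak per TPU v5e chip):

```python
def mfu(achieved_tflops_per_chip: float, peak_tflops_per_chip: float) -> float:
    """Model FLOP utilization: achieved model FLOP/s over hardware peak FLOP/s."""
    return achieved_tflops_per_chip / peak_tflops_per_chip

# Hypothetical example: sustaining ~114 model TFLOP/s per chip against the
# v5e bf16 peak of 197 TFLOP/s gives roughly 58% MFU.
print(round(mfu(114.3, 197.0), 2))  # → 0.58
```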