[exp] nnx copy #2456

hengtaoguo · 2025-10-06T20:43:28Z

Description

Start with a short description of what the PR does and how this is a change from
the past.

The rest of the description includes relevant details and context, examples:

why is this change being made,
the problem being solved and any relevant context,
why this is a good solution,
some information about the specific implementation,
shortcomings of the solution and possible future improvements.

If the change fixes a bug or a GitHub issue, please include a link, e.g.:
FIXES: b/123456
FIXES: #123456

Notice 1: Once all tests pass, the "pull ready" label will automatically be assigned.
This label is used for administrative purposes. Please do not add it manually.

Notice 2: For external contributions, our settings currently require an approval from a MaxText maintainer to trigger CI tests.

Tests

Please describe how you tested this change, and include any instructions and/or
commands to reproduce.

Checklist

Before submitting this PR, please make sure (put X in square brackets):

I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
I have necessary comments in my code, particularly in hard-to-understand areas.
I have run end-to-end tests and provided workload links above if applicable.
I have made or will make corresponding changes to the doc if needed.

PiperOrigin-RevId: 810134577

update refactor update Add optional config Explicitly shard input tensors across mesh devices Run on 0.7.2 candidate image Fix typo in image tag Revert to use latest tag update test for new jax version Remove sharding rules for q_lora and kv_lora from base.yml update with configs clean up update

Removed tunix from requirements files
Install tunix if device=tpu
Mark tunix-based tests as tpu-only
Ignore sft hooks test for gpu tests

PiperOrigin-RevId: 811553039

This commit introduces a fully-featured, OpenAI-compatible RESTful API server for serving MaxText models. The server is built with FastAPI, supports multi-host inference on TPUs, and is designed for both interactive use and large-scale benchmarking.

Key features and additions:

1. **Core Server Implementation:**
   - Adds `maxtext_server.py`, a FastAPI application that serves `/v1/completions` and `/v1/chat/completions` endpoints.
   - Implements dynamic request batching to efficiently utilize underlying hardware.
   - Uses `maxtext_generator.py` to encapsulate the MaxText inference engine, handling model loading, tokenization, and the generation loop.
   - Includes Pydantic models in `server_models.py` for robust, OpenAI-compliant request and response validation.

2. **Deployment and Utilities:**
   - Provides `start_server.sh` to simplify launching the server from the project root.
   - Adds `port_forward_xpk.sh`, a utility script to automatically find and connect to a server running on a GKE cluster via `xpk`, supporting custom namespaces.
   - Isolates server-specific dependencies in `benchmarks/api_server/requirements.txt` (`uvicorn`, `fastapi`, `openai-harmony`).

3. **Comprehensive Documentation:**
   - A new `README.md` in the `api_server` directory offers a complete guide covering:
     - Installation and environment setup.
     - Launching the server in both single-pod and multi-pod GKE environments.
     - Detailed examples for interacting with the API using `curl` and the `openai` Python client.
     - Step-by-step instructions for running benchmarks with `lm-evaluation-harness` and `evalchemy` for both log-likelihood and generative tasks.
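As a rough illustration of what a client call to the `/v1/completions` endpoint described above might look like, the sketch below builds (but does not send) an OpenAI-style request using only the Python standard library. The base URL, port, and model name are assumptions made for the example; the commit message does not state them, and the actual request schema is defined by `server_models.py`, which is not shown here.

```python
import json
import urllib.request

# Assumed values for illustration only; the commit message does not specify them.
BASE_URL = "http://localhost:8000"  # hypothetical host and port
MODEL_NAME = "my-maxtext-model"     # hypothetical model identifier


def build_completion_request(prompt, max_tokens=32, temperature=0.0):
    """Build (without sending) a POST request for the /v1/completions endpoint.

    The field names follow the public OpenAI completions format; whether the
    server accepts exactly these fields is an assumption.
    """
    payload = {
        "model": MODEL_NAME,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_completion_request("The capital of France is")
    # Inspect the request locally; nothing is sent over the network here.
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

With a server actually running at the assumed address, the request could be sent with `urllib.request.urlopen(request)` and the JSON response decoded from the returned body.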

PiperOrigin-RevId: 812337661

Revert "fix setup.sh for MODE=nightly" This reverts commit 91dcc69.
Fix bugs in uploading safetensors to GCS
fix setup.sh for MODE=nightly
fix readme
Migrate GemmaDecoderLayer and Gemma2DecoderLayer to NNX
Disable setting specific profiler options on Pathways backend. PiperOrigin-RevId: 815770690
fix preprocessor
Llama4Vision NNX
clean up
remove unused import

NicoGrande and others added 30 commits September 25, 2025 18:50

adding llama4 image tiling and basic batching for mm SFT.

2fcff3f

single image tiling support for decode.

c3a9001

adding llama4 tiling changes to decode.

9d59fb9

small typo in decode.

da421cf

adding missing params for image masks.

d422caf

Mainly internal tokenizer path fix

75f44d7

PiperOrigin-RevId: 810134577

Continue pinning Tunix dependency for nightly builds

8d671b4

Add documentation to run SFT with Deepseek-V3 model

87b9a17

Add Gemini CLI for PR review

ca75fe6

GPT-OSS: Add user guide and tests

3fa7c90

Moved tunix to setup.sh

a818e1e

Removed tunix from requirements files
Install tunix if device=tpu
Mark tunix-based tests as tpu-only
Ignore sft hooks test for gpu tests

Update typo for pyconfig to fix the breakage on the head

1899214

PiperOrigin-RevId: 811553039

Update pinned commit for Tunix

af3803c

remove redundant code block

1a4fcfb

add vocab tiling

5dc1cd7

All-in-one commit for new pw_recipe modularization

4160978

Update Tunix commit in extra_deps_from_github

3de8da1

Add data pipeline perf in explanations

ef24250

Add codeowners for model bring-up

1d9ef55

[src/MaxText] pytype + pylint + pyink + codespell

1a33123

Add mtc_data_parallelism config for multi-tier checkpointing.

48eb50d

PiperOrigin-RevId: 812337661

adding llama4 image tiling and basic batching for mm SFT.

d447f53

Merge branch 'main' into nicogrande/sft-llama4-tiling

e8fa373

Merge branch 'main' into nicogrande/sft-llama4-tiling

23d3a2e

fixing _apply_embedding call.

b74da04

removing need for llama4 input changes.

161f338

Merge branch 'main' into nicogrande/sft-llama4-tiling

99b2b5c

fixing pad tile normalization.

cbcd022

NicoGrande and others added 7 commits October 3, 2025 02:18

fixing merge mm embeddings.

3eff680

Merge branch 'main' into nicogrande/sft-llama4-tiling

1bd9f00

linting fixes.

f27daa6

pyink linting fixes.

0b79904

Merge branch 'main' into nicogrande/sft-llama4-tiling

be95c02

adding missing dim.

e57680b

hengtaoguo changed the title ~~[exp] Hengtaoguo nnx copy~~ [exp] nnx copy Oct 6, 2025

17 participants