
Record: 0.9076 BPB — 10L + N-gram Backoff + Matrix LR 0.03 #828

Closed
bigbag wants to merge 2 commits into openai:main from bigbag:submission/10L-ngram-lr03-0.9076

Conversation

@bigbag bigbag commented Mar 26, 2026

Summary

val_bpb = 0.9074 (3-seed mean, std 0.0002) | 15.26-15.46 MB | 8xH100 SXM, 600s

Single change from PR #802: MATRIX_LR=0.03 (was 0.02). Discovered through systematic hyperparameter screening (74 experiments across steps 10-12).

Results

| Seed | Steps | ms/step | Pre-quant BPB | N-gram BPB | Artifact (bytes) |
|---|---|---|---|---|---|
| 42 | 6,693 | 89.6 | 1.1528 | 0.9076 | 15,320,749 |
| 1337 | 6,605 | 90.9 | 1.1521 | 0.9072 | 15,261,004 |
| 2024 | 6,607 | 90.8 | 1.1520 | 0.9074 | 15,457,538 |
| **Mean** | | | | **0.9074 ± 0.0002** | |

Key Change

MATRIX_LR=0.03 vs PR #802's default 0.02.
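In speedrun-style training scripts, an env var like this usually selects the learning rate for a dedicated optimizer group holding the 2-D weight matrices. A minimal sketch of how `MATRIX_LR` might be wired in — the grouping rule and the `other_lr` default are guesses for illustration, not taken from `train_gpt.py`:

```python
import os

# The PR's single change: MATRIX_LR=0.03 (PR #802 used 0.02).
MATRIX_LR = float(os.environ.get("MATRIX_LR", "0.02"))

def split_param_groups(named_params, matrix_lr=MATRIX_LR, other_lr=3e-4):
    """Route 2-D weight matrices into their own LR group; everything else
    (biases, gains, 1-D params) keeps the base LR. This grouping rule is
    an assumption, not the PR's actual code."""
    matrix, other = [], []
    for name, p in named_params:
        (matrix if getattr(p, "ndim", 0) == 2 else other).append(name)
    return [{"params": matrix, "lr": matrix_lr},
            {"params": other, "lr": other_lr}]
```

With a split like this, bumping `MATRIX_LR` changes only the matrix group, leaving every other hyperparameter identical to the baseline — which is what makes the comparison to PR #802 a clean single-variable test.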

Architecture

  • 10L, 512d, GQA 8H/4KV, MLP 3x LeakyReLU(0.5)²
  • BigramHash(4096), SmearGate, Value Residual, Gated Attention
  • Mixed int5-MLP/int6-attn + zstd-22, EMA(0.997)
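Of these components, the hashed bigram table is the easiest to sketch: each (previous, current) token pair is hashed into a fixed-size table of learned vectors that are added to the residual stream. Everything below — the hash function, init scale, and lookup shape — is an illustrative guess, not the PR's implementation:

```python
import random

TABLE_SIZE = 4096   # matches BigramHash(4096) in the summary
D_MODEL = 512

# Toy embedding table; the real init scheme is not specified in the PR.
random.seed(0)
bigram_table = [[random.gauss(0.0, 0.02) for _ in range(D_MODEL)]
                for _ in range(TABLE_SIZE)]

def bigram_hash(prev_tok, cur_tok, table_size=TABLE_SIZE):
    # Arbitrary odd-multiplier mixing hash; the PR's exact hash is unknown.
    return ((prev_tok * 1000003) ^ cur_tok) % table_size

def bigram_features(tokens):
    """One hashed-bigram embedding per position (zeros at position 0),
    intended to be added alongside the ordinary token embeddings."""
    feats = [[0.0] * D_MODEL]
    for i in range(1, len(tokens)):
        feats.append(bigram_table[bigram_hash(tokens[i - 1], tokens[i])])
    return feats
```

Hash collisions are accepted by design: 4096 slots cannot represent all bigrams distinctly, but the table stays tiny, which matters under a 16 MB artifact budget.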

Eval: Multi-Order N-gram Backoff (from PR #802)

  • Score-first backward-looking n-gram cache (orders 2-7)
  • Entropy-adaptive alpha mixing
  • 133-156s eval time (well within 600s budget)
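The bullets above can be sketched as follows. This is a hedged reconstruction of the general technique — backoff from the longest matching context, then entropy-gated blending with the LM's distribution — where the backoff rule, order range handling, and alpha schedule are illustrative guesses, not PR #802's actual code:

```python
import math

def ngram_backoff_probs(history, counts, max_order=7, min_order=2):
    """Back off from the longest matching context to shorter ones.
    `counts[n][ctx][tok]` holds n-gram counts over already-scored tokens."""
    for n in range(max_order, min_order - 1, -1):
        ctx = tuple(history[-(n - 1):])
        if len(history) >= n - 1 and ctx in counts.get(n, {}):
            c = counts[n][ctx]
            total = sum(c.values())
            return {t: v / total for t, v in c.items()}
    return None  # no context matched at any order

def entropy_bits(p):
    return -sum(v * math.log2(v) for v in p.values() if v > 0)

def adaptive_mix(p_model, p_ngram, vocab_size, max_alpha=0.5):
    """Blend the LM and the cache, trusting the cache more when its
    distribution is low-entropy. The alpha schedule is an assumption."""
    if p_ngram is None:
        return p_model
    h = entropy_bits(p_ngram)
    alpha = max_alpha * max(0.0, 1.0 - h / math.log2(vocab_size))
    toks = set(p_model) | set(p_ngram)
    return {t: (1 - alpha) * p_model.get(t, 0.0) + alpha * p_ngram.get(t, 0.0)
            for t in toks}
```

Note that schemes in this family are exactly what the maintainer response below objects to: unless the mixture is renormalized against the full vocabulary and built strictly from past tokens, it can leak eval information.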

Reproduction

```
MATRIX_LR=0.03 torchrun --standalone --nproc_per_node=8 train_gpt.py
```

Test plan

  • 8xH100 SXM, seed 42: 0.9076 BPB
  • 8xH100 SXM, seed 1337: 0.9072 BPB
  • 8xH100 SXM, seed 2024: 0.9074 BPB
  • 3-seed mean: 0.9074 ± 0.0002
  • All artifacts ≤ 16MB (15.26-15.46 MB)
  • Training ≤ 600s
  • Eval ≤ 600s (133-156s)

Based On

PR #802.

🤖 Generated with Claude Code

Single change from PR openai#802: MATRIX_LR=0.03 (was 0.02).
Discovered through systematic screening (74 experiments, steps 10-12).

- 10L, 512d, GQA 8/4, LeakyReLU(0.5)², BigramHash 4096
- Multi-order n-gram backoff eval cache (orders 2-7)
- Entropy-adaptive alpha mixing (score-first, legal)
- 8xH100 SXM, 600s training, 138s eval
- Artifact: 15.32 MB

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@MatoTeziTanka

Nice result — the systematic hyperparameter screening (74 experiments) is a solid approach, and the MATRIX_LR finding is a clean single-variable improvement.

Heads up: the submission currently has 1 seed. The leaderboard requires 3-seed validation with statistical significance for record claims. Totally understand if you're waiting on compute before running the remaining seeds — just flagging so it doesn't get passed over during review.

greqone pushed a commit to greqone/parameter-golf that referenced this pull request Mar 26, 2026
… proxy)

10L + Multi-Order N-gram Backoff with entropy-adaptive alpha.
Validated on 1xH100 SXM (876 steps, 59% eval coverage).
Pending 8xH100 SXM verification for official record submission.

Based on PR openai#828 approach with MATRIX_LR=0.03.
Architecture: 10L, 512d, MLP 3x LeakyReLU(0.5)², XSA-4, VRL, BigramHash, SmearGate.
Artifact: 15.18 MB (under 16 MB limit).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Seeds 42 (0.9076), 1337 (0.9072), 2024 (0.9074).
All artifacts under 16MB (15.26-15.46 MB).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@valerio-oai
Contributor

Thanks for your submission! Unfortunately, it's disallowed due to the use of hashed n-gram caches: they do not correctly renormalize and reweight the LM's token distribution, and they look ahead to the target token when mixing probabilities, which leaks eval tokens. Please refer to the long discussion about this under the issues tab for more details, and please submit more runs in the future!
