
Add 11L XSA11 + BigramHash3072 + AdamW Legal TTT submission#841

Open
someone114514 wants to merge 4 commits into openai:main from someone114514:xsa11-bigram3072-adamw-legal-ttt

Conversation

@someone114514

Summary

Adds a new 16 MB submission folder with:

  • 11-layer 512d transformer
  • XSA on all 11 layers
  • BigramHash with 3072 buckets and dim 112
  • Parameter Banking + Parallel Muon
  • score-first legal TTT with AdamW
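The BigramHash component above can be sketched as a hashed embedding lookup over token pairs. Only the bucket count (3072) and embedding dim (112) come from the PR; the mixing constants, the `bigram_bucket` helper name, and the BOS handling below are illustrative assumptions, not the actual scheme in `train_gpt.py`.

```python
# Sketch of a hashed bigram embedding index (assumed scheme).
BIGRAM_VOCAB = 3072   # number of hash buckets (from the PR summary)
BIGRAM_DIM = 112      # embedding width per bucket (from the PR summary)

def bigram_bucket(prev_tok: int, cur_tok: int, n_buckets: int = BIGRAM_VOCAB) -> int:
    """Map a (prev, cur) token pair to one of n_buckets via a cheap mix hash."""
    h = (prev_tok * 1000003 + cur_tok) & 0xFFFFFFFF
    h ^= h >> 16
    return h % n_buckets

def bigram_buckets(tokens):
    """Bucket index per position; position 0 is paired with an assumed BOS id 0."""
    prev = 0
    out = []
    for t in tokens:
        out.append(bigram_bucket(prev, t))
        prev = t
    return out
```

Each position's bucket index would then select a 112-dim row from a `3072 × 112` embedding table that is added to (or concatenated with) the token embedding.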

Included Files

  • README.md
  • submission.json
  • train_seed1337.log
  • train_gpt.py

Result

  • legal_ttt_exact val_bpb: 1.11565196
  • artifact size: 15,983,339 bytes

Notes

  • sliding window stride: 64
  • TTT chunk size: 131072
  • TTT epochs: 3
  • TTT freeze blocks: 8
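The score-first legal TTT flow implied by these settings can be sketched as follows: each 131072-token eval chunk is scored with the current weights *before* AdamW adapts the unfrozen blocks to it, so no chunk's loss is ever measured after the model has trained on that chunk. The constants come from the Notes above; the hook names and loop structure are hypothetical.

```python
TTT_CHUNK = 131072    # TTT chunk size (from Notes)
TTT_EPOCHS = 3        # TTT epochs (from Notes)
FREEZE_BLOCKS = 8     # first 8 of the 11 blocks stay frozen (from Notes)

def ttt_schedule(n_tokens: int, chunk: int = TTT_CHUNK):
    """Yield (start, end) chunk boundaries covering the eval stream in order."""
    for start in range(0, n_tokens, chunk):
        yield start, min(start + chunk, n_tokens)

def run_legal_ttt(n_tokens, score_chunk, adapt_chunk):
    """Score-first loop; score_chunk/adapt_chunk are caller-supplied hooks
    (hypothetical names -- e.g. a loss eval and an AdamW step over the chunk
    with blocks 0..FREEZE_BLOCKS-1 excluded from the optimizer)."""
    total_loss = 0.0
    for start, end in ttt_schedule(n_tokens):
        total_loss += score_chunk(start, end)   # score first...
        for _ in range(TTT_EPOCHS):             # ...then adapt with AdamW
            adapt_chunk(start, end)
    return total_loss
```

The ordering is what makes the TTT "legal": the reported `legal_ttt_exact val_bpb` is accumulated only from pre-adaptation scores.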

FlashyFlash3011 added a commit to FlashyFlash3011/parameter-golf that referenced this pull request Mar 26, 2026
Combines NewTest (PR openai#841 base) with SOTA experiments that achieved ~1.12 BPB:
- train_seq_len/eval_seq_len: 2048 → 4096 (long context from user's SOTA exps)
- bigram_vocab_size: 3072 → 2048, bigram_dim: 112 → 128 (proven SOTA settings)
- xsa_last_n: 11 → 4 (from user's best experiments)
- gated_attention + value_residual: enabled by default (PR openai#824/838 show ~0.018 BPB improvement)
- Bank QAT: symmetric int6 STE fake-quant on all weight banks during warmdown
- Fix: CastedLinear QAT clip range (-32,31) → (-31,31) to match export format
- Compression: lzma-6 → zstd-22 (PR openai#824/838: 14.9MB vs ~16MB, critical for fitting under limit)
- Fix: target_mb budget uses decimal MB (1e6) not MiB (1024^2) matching competition rules
- Budget-aware ±1 weight pruning retained from NewTest
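The symmetric int6 fake-quant and the clip-range fix from the commit message can be sketched per weight as below. The `(-31, 31)` range and int6 grid come from the commit; the scale handling and function name are assumptions, and the STE part (passing gradients through the rounding at train time) is omitted since it only matters inside autograd.

```python
QMAX = 31  # symmetric int6: representable levels -31..31

def fake_quant_int6(w: float, scale: float) -> float:
    """Quantize-dequantize one weight: round to the int6 grid, clip, rescale."""
    q = round(w / scale)
    q = max(-QMAX, min(QMAX, q))  # symmetric clip; the earlier (-32, 31) range
                                  # allowed a level the export format can't store
    return q * scale
```

Training against this quantize-dequantize view during warmdown means the exported int6 weights match what the model actually optimized for.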
