From d5cf7c42896645bad0b73c48641bf68085b62e0a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?X=CE=BBRI-U5?= <b3f0cus@icloud.com>
Date: Mon, 8 Jul 2024 17:07:18 +0700
Subject: [PATCH] Update README.md

---
 examples/mup/README.md | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/examples/mup/README.md b/examples/mup/README.md
index c86850ca..ed94c1fb 100644
--- a/examples/mup/README.md
+++ b/examples/mup/README.md
@@ -32,3 +32,8 @@ We trained a 350m model with spectral µTransfer and standard parametrization us
 Please check the directory [[./examples/mup/configs]](/examples/mup/configs) for the configurations we used to reproduce the experiments.
 
 ![LLaMA](./assets/llama.png)
+
+
+#### Thoughts
+
+We have only run spectral μTransfer on an MLP [link] and a 300m LLaMA [link] (the experiment configs are linked earlier in this README). When we tested it on 1B/8B models, as far as we recall, the loss blew up for reasons we have not identified. We therefore recommend using μTransfer rather than spectral μTransfer.