
Commit 78374ad

danielhanchen, everythingisc00l, SethHWeidman, NinoRisteski, and Erland366 authored
Gemma 3 (#1986)
* Update llama.py * GRPO optimized * Update rl.py * Update rl_replacements.py * Update rl_replacements.py * Update rl.py * Update rl.py * Update rl.py * Update rl.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Selective Log softmax * Fix GRPO bsz * Update rl.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Fix TRL * Metrics GRPO * Update rl_replacements.py * Update rl_replacements.py * No compile * Update rl.py * Remove docs * Update rl.py * Update rl.py * Update rl.py * Update rl.py * Update rl_replacements.py * Update rl.py * Update rl.py * Update rl_replacements.py * Update rl_replacements.py * llama-quantize on WINDOWS WSL error fix - edit save.py (gguf saving breaks) (#1649) * edit save.py to fix gguf saving breaks. * add check for .exe or not exe file extension for linux and windows * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update llama.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update llama.py * Update llama.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl.py * Update rl.py * Update rl_replacements.py * Update rl.py * Update rl.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * unsloth_num_chunks * Update rl.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl.py * Update rl.py * Update rl.py * Update rl.py * Update rl.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py (#1754) Fix typo in comment: know -> now. This was printed when running the Llama3.1_(8B)-GRPO.ipynb example notebook, so I'd expect others to run into it as well. 
* Optional logits * Update rl.py * Update rl.py * Update rl.py * Update rl.py * Update rl.py * Update rl.py * Update rl.py * Update rl.py * Update rl_replacements.py * Update rl.py * Update rl.py * Update rl.py * Update rl.py * fix an import error (#1767) * fix an import error * Delete .gitignore * Update loader.py * Update save.py --------- Co-authored-by: Daniel Han <[email protected]> * SamplingParams * Convert mask to float (#1762) * [Windows Support] Add latest `xformers` wheels to pyproject.toml (#1753) * Add latest xformers * Add a couple of lines to docs * vLLMSamplingParams * Update __init__.py * default num_chunks == -1 * Versioning * Update llama.py * Update llama.py * Update llama.py * Update llama.py * Update llama.py * Update _utils.py * Update rl_replacements.py * Update rl_replacements.py * Update pyproject.toml * Update pyproject.toml * Export Model to ollama.com (#1648) * Ollama Export Model to ollama.com Signed-off-by: Jyotin Goel <[email protected]> * Check for model_name Signed-off-by: Jyotin Goel <[email protected]> * subprocess use instead of requests | added check for ollama server Signed-off-by: Jyotin Goel <[email protected]> * create_ollama_model Signed-off-by: Jyotin Goel <[email protected]> * create_ollama_model | fix Signed-off-by: Jyotin Goel <[email protected]> * Push to Ollama Signed-off-by: Jyotin Goel <[email protected]> --------- Signed-off-by: Jyotin Goel <[email protected]> * Update cross_entropy_loss.py * torch_cuda_device * Update utils.py * Update utils.py * Update utils.py * device * device * Update loader.py * Update llama.py * Update README.md * Update llama.py * Update llama.py * Update _utils.py * Update utils.py * Update utils.py * Update utils.py * Update utils.py * Update utils.py * Update llama.py * Update llama.py * Update llama.py * Update llama.py * Update llama.py * Update utils.py * Update utils.py * Update utils.py * Update utils.py * __version__ * Update rl.py * Bug fixes * Bug fixes * Update llama.py * Update _utils.py * _wrap_fast_inference * Update llama.py * Update llama.py * Update llama.py * Update llama.py * Update llama.py * Update llama.py * Update llama.py * Update llama.py * Update llama.py * Update llama.py * Update llama.py * Update _utils.py * SFT dataset prepare * Update pyproject.toml * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl.py * Update llama.py * Update llama.py * Update utils.py * bug fix * Update llama.py * Update llama.py * Update llama.py * Update llama.py * Update llama.py * Update __init__.py * Update _utils.py * Update _utils.py * Update _utils.py * Update _utils.py * Update _utils.py * Update rl.py * Update rl.py * Update rl.py * Update _utils.py * Update __init__.py * Update _utils.py * Version * versioning * Update _utils.py * Update llama.py * Update llama.py * Bug fixes * FastModel * __doc__ * Update vision.py * Update loader.py * Update loader.py * Update loader.py * version * move use_modelscope to _utils (#1938) * move use_modelscope to _utils * Update _utils.py * Update loader.py --------- Co-authored-by: Daniel Han <[email protected]> * Don't use revision when loading model_config and is_peft=True (#1949) * More syntax warnings (#1944) * move use_modelscope to _utils * fix * Update _utils.py * Update loader.py --------- Co-authored-by: Daniel Han <[email protected]> * Update loader.py * Full finetuning and other fixes * UNSLOTH_ENABLE_FULL_FINETUNING * Update loader.py * Update loader.py * Update loader.py * Update vision.py * Update vision.py * 
full finetuning * Update loader.py * Update loader.py * Update loader.py * Update _utils.py * max_seq_length * Update rl.py * Update rl.py * Update rl.py * Update pyproject.toml * AutoModelForImageTextToText * Update mapper.py * Update pyproject.toml * Update _utils.py * Update _utils.py * Update _utils.py * Batch samples * Update loader.py * Update loader.py * Update loader.py * Update loader.py * Update _utils.py * Update loader.py * Update vision.py * Update loader.py * Update vision.py * Update vision.py * Update vision.py * Update mapper.py --------- Signed-off-by: Jyotin Goel <[email protected]> Co-authored-by: Gennadii Manzhos <[email protected]> Co-authored-by: Seth Weidman <[email protected]> Co-authored-by: Nino Risteski <[email protected]> Co-authored-by: Edd <[email protected]> Co-authored-by: Ben <[email protected]> Co-authored-by: Jyotin Goel <[email protected]> Co-authored-by: Kareem <[email protected]> Co-authored-by: Wilson Wu <[email protected]>
1 parent 2b5d81d commit 78374ad

9 files changed (+329 -163 lines changed)

pyproject.toml

+2 -2
@@ -40,7 +40,7 @@ triton = [
 ]
 
 huggingface = [
-    "unsloth_zoo>=2025.3.8",
+    "unsloth_zoo>=2025.3.9",
     "packaging",
     "tyro",
     "transformers>=4.46.1,!=4.47.0",
@@ -354,7 +354,7 @@ colab-ampere-torch220 = [
     "flash-attn>=2.6.3",
 ]
 colab-new = [
-    "unsloth_zoo>=2025.3.8",
+    "unsloth_zoo>=2025.3.9",
     "packaging",
     "tyro",
     "transformers>=4.46.1,!=4.47.0",

unsloth/__init__.py

+11 -6
@@ -198,14 +198,19 @@ def is_bf16_supported(): return SUPPORTS_BFLOAT16
 # Check for unsloth_zoo
 try:
     unsloth_zoo_version = importlib_version("unsloth_zoo")
-    if Version(unsloth_zoo_version) < Version("2025.3.8"):
-        try:
-            os.system("pip install --upgrade --no-cache-dir --no-deps unsloth_zoo")
-        except:
+    if Version(unsloth_zoo_version) < Version("2025.3.9"):
+        print(
+            "Unsloth: Updating Unsloth-Zoo utilies to the latest version.\n"\
+            "To disable this, set os.environ['UNSLOTH_DISABLE_AUTO_UPDATES'] = '1'"
+        )
+        if os.environ.get("UNSLOTH_DISABLE_AUTO_UPDATES", "0") == "0":
             try:
-                os.system("pip install --upgrade --no-cache-dir --no-deps --user unsloth_zoo")
+                os.system("pip install --upgrade --no-cache-dir --no-deps unsloth_zoo")
             except:
-                raise ImportError("Unsloth: Please update unsloth_zoo via `pip install --upgrade --no-cache-dir --no-deps unsloth_zoo`")
+                try:
+                    os.system("pip install --upgrade --no-cache-dir --no-deps --user unsloth_zoo")
+                except:
+                    raise ImportError("Unsloth: Please update unsloth_zoo via `pip install --upgrade --no-cache-dir --no-deps unsloth_zoo`")
     import unsloth_zoo
 except:
     raise ImportError("Unsloth: Please install unsloth_zoo via `pip install unsloth_zoo`")

unsloth/models/_utils.py

+29 -101
@@ -12,7 +12,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-__version__ = "2025.3.9"
+__version__ = "2025.3.10"
 
 __all__ = [
     "SUPPORTS_BFLOAT16",
@@ -25,6 +25,7 @@
     "__version__",
     "HAS_FLASH_ATTENTION",
     "HAS_FLASH_ATTENTION_SOFTCAPPING",
+    "USE_MODELSCOPE",
     "platform_system",
     "patch_tokenizer",
     "get_statistics",
@@ -100,6 +101,7 @@
 from unsloth_zoo.loss_utils import (
     HAS_CUT_CROSS_ENTROPY,
     fused_linear_cross_entropy,
+    _unsloth_get_batch_samples,
 )
 from unsloth_zoo.vision_utils import (
     process_vision_info,
@@ -108,6 +110,9 @@
     get_transformers_model_type,
     unsloth_compile_transformers as _unsloth_compile_transformers,
 )
+from unsloth_zoo.training_utils import (
+    prepare_model_for_training,
+)
 
 # =============================================
 # Disable some warnings which can get annoying
@@ -508,67 +513,16 @@ def prepare_model_for_kbit_training(
     use_gradient_checkpointing : Optional = True,
     use_reentrant : Optional[bool] = True,
 ) -> Any:
-    """
-    Calculates where to place the gradient checkpoints given n_layers.
-    We also freeze all other layers's gradients
-
-    Args:
-        model: Any LlamaModel with layers.
-        use_gradient_checkpointing (`bool`, *optional*):
-            Default enabled. Provides memory savings by not saving all activations,
-            but only some.
-        use_reentrant (`bool`, *optional*):
-            https://github.com/pytorch/pytorch/blob/main/torch/utils/checkpoint.py#L354
-            Optimal gradient checkpointing algorithm which will be the default in
-            future Pytorch versions.
-    """
-
-    # Freeze all parameters except LoRA
-    with torch.no_grad():
-        for name, param in model.named_parameters():
-            if ".lora_A." in name or ".lora_B." in name or ".lora_magnitude_vector" in name:
-                param.requires_grad_(True)
-                # Also must be in float32!
-                if param.dtype != torch.float32:
-                    name = name.replace("base_model", "model", 1)
-                    layer_number = re.search(r"\.[\d]{1,}\.", name).group(0)
-                    name = name.replace(layer_number, f"[{layer_number[1:-1]}].")
-                    name = name.replace(".weight", "", 1)
-                    exec(f"{name}.to(torch.float32)")
-                pass
-            else:
-                param.requires_grad_(False)
-            pass
-        pass
-
-    # Gradient checkpointing!
-    if use_gradient_checkpointing == "unsloth":
-
-        # Saves VRAM!
-        original_model = model
-        while hasattr(original_model, "model"):
-            original_model._offloaded_gradient_checkpointing = True
-            original_model = original_model.model
-        pass
-        original_model._offloaded_gradient_checkpointing = True
-
-        model.gradient_checkpointing_enable()
-
-    elif use_gradient_checkpointing == True:
-        model.gradient_checkpointing_enable()
-    pass
-
-    # If use_reentrant = True which is the Pytorch default, we just make the input requires_grad.
-    if use_reentrant:
-        if hasattr(model, "enable_input_require_grads"):
-            model.enable_input_require_grads()
-        else:
-            def make_inputs_require_grad(module, input, output):
-                output.requires_grad_(True)
-            model.get_input_embeddings().register_forward_hook(make_inputs_require_grad)
-    pass
-
-    return model
+    return prepare_model_for_training(
+        model = model,
+        use_gradient_checkpointing = use_gradient_checkpointing,
+        use_reentrant = use_reentrant,
+        full_finetuning = False,
+        train_layernorms = False,
+        train_embedding = False,
+        train_lm_head = False,
+        float32_mixed_precision = True,
+    )
 pass
 
 # =============================================
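prepare_model_for_kbit_training now simply forwards to unsloth_zoo's prepare_model_for_training, the shared helper that also backs the new full-finetuning path introduced elsewhere in this commit. A minimal sketch of calling the shared helper directly with the arguments the wrapper forwards (keyword names are taken from this diff; the gradient-checkpointing value is illustrative):

from unsloth_zoo.training_utils import prepare_model_for_training

# LoRA / k-bit path: freeze everything except the adapters and enable checkpointing.
model = prepare_model_for_training(
    model                      = model,
    use_gradient_checkpointing = "unsloth",
    use_reentrant              = True,
    full_finetuning            = False,
    train_layernorms           = False,
    train_embedding            = False,
    train_lm_head              = False,
    float32_mixed_precision    = True,
)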
@@ -999,44 +953,6 @@ def test_mask_creation():
 pass
 
 
-def _unsloth_get_batch_samples(self, epoch_iterator, num_batches):
-    batch_samples = []
-    num_items_in_batch = None
-
-    # Check if model allows **kwargs
-    model = self.model
-    f = model.base_model.model.forward if hasattr(model, "base_model") else model.forward
-    has_kwargs = tuple(inspect.signature(f).parameters.values())[-1].kind == inspect._VAR_KEYWORD
-
-    # Iterate to find all batches
-    for _ in range(num_batches):
-        try:
-            batch_samples += [next(epoch_iterator)]
-        except StopIteration:
-            break
-    pass
-
-    # Get num_items_in_batch
-    if has_kwargs and len(batch_samples) > 0 and "labels" in batch_samples[0]:
-        try:
-            num_items_in_batch = sum(
-                [(x["labels"][..., 1:] != -100).sum() for x in batch_samples]
-            )
-
-            if self.args.average_tokens_across_devices:
-                num_items_in_batch = self.accelerator.gather(num_items_in_batch).sum().item()
-
-            if torch.is_tensor(num_items_in_batch):
-                num_items_in_batch = num_items_in_batch.item()
-
-        except Exception as exception:
-            logger.warning_once(exception)
-    pass
-
-    return batch_samples, num_items_in_batch
-pass
-
-
 def _unsloth_pre_compute_loss(self, model, inputs, *args, **kwargs):
     num_items_in_batch = None
 
@@ -1053,7 +969,12 @@ def _unsloth_pre_compute_loss(self, model, inputs, *args, **kwargs):
     # Get gradient accumulation steps if possible
     if num_items_in_batch is None and \
         getattr(getattr(self, "args", self), "gradient_accumulation_steps", 1) != 1:
-        name = (model.base_model.model if hasattr(model, "base_model") else model).__class__.__name__
+
+        inner_model = model
+        if hasattr(inner_model, "base_model"): inner_model = inner_model.base_model
+        if hasattr(inner_model, "model"): inner_model = inner_model.model
+        name = inner_model.__class__.__name__
+
         logger.warning_once(
             f"Unsloth: Not an error, but {name} does not accept `num_items_in_batch`.\n"\
             "Using gradient accumulation will be very slightly less accurate.\n"\
@@ -1271,3 +1192,10 @@ def __str__ (self): return LOGITS_ERROR_STRING
     try:    exec(f"EMPTY_LOGITS.{function} = raise_{j}", globals(), locals())
     except: continue
 pass
+
+USE_MODELSCOPE = os.environ.get("UNSLOTH_USE_MODELSCOPE", "0") == "1"
+if USE_MODELSCOPE:
+    if importlib.util.find_spec("modelscope") is None:
+        raise ImportError(f'You are using the modelscope hub, please install modelscope by `pip install modelscope -U`')
+    pass
+pass
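USE_MODELSCOPE is evaluated once when unsloth.models._utils is imported, so ModelScope support has to be switched on through the environment before Unsloth is imported. A minimal sketch (only the UNSLOTH_USE_MODELSCOPE variable and the `pip install modelscope -U` requirement come from this diff; the model id is a placeholder):

import os

os.environ["UNSLOTH_USE_MODELSCOPE"] = "1"  # requires `pip install modelscope -U`

from unsloth import FastLanguageModel

# With the flag on, model names are resolved against the ModelScope hub rather
# than the Hugging Face Hub (placeholder model id below).
model, tokenizer = FastLanguageModel.from_pretrained("your-namespace/your-model")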

unsloth/models/llama.py

+9 -5
@@ -1913,12 +1913,12 @@ def from_pretrained(
 
         # Save max_seq_length
         model.max_seq_length = max_seq_length
-        internal_model = model
-        while hasattr(internal_model, "model"):
-            internal_model.max_seq_length = max_seq_length
-            internal_model = internal_model.model
+        m = model
+        while hasattr(m, "model"):
+            m.max_seq_length = max_seq_length
+            m = m.model
         pass
-        internal_model.max_seq_length = max_seq_length
+        m.max_seq_length = max_seq_length
 
         # We check the tokenizer first for errors
         if fix_tokenizer:
@@ -2016,6 +2016,10 @@ def get_peft_model(
         temporary_location = "_unsloth_temporary_saved_buffers",
         **kwargs,
     ):
+        if os.environ.get("UNSLOTH_ENABLE_FULL_FINETUNING", "0") == "1":
+            print("Unsloth: Full finetuning is enabled, so .get_peft_model has no effect")
+            return model
+        pass
         transformers_set_seed(random_state)
 
         if use_gradient_checkpointing == "unsloth":
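With UNSLOTH_ENABLE_FULL_FINETUNING set to "1", get_peft_model becomes a no-op that returns the model unchanged, so an existing LoRA script keeps working when switched to full finetuning. A hedged sketch of the resulting user-side flow (the environment variable, print message, and early return come from this diff; the model id and LoRA arguments are illustrative):

import os

os.environ["UNSLOTH_ENABLE_FULL_FINETUNING"] = "1"

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained("your-org/your-model")

# Prints "Unsloth: Full finetuning is enabled, so .get_peft_model has no effect"
# and returns `model` as-is instead of attaching LoRA adapters.
model = FastLanguageModel.get_peft_model(model, r = 16)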

0 commit comments