
Conversation

@BenjaminBossan (Member) commented on Nov 4, 2025

Resolves #2889

Description

The reported bug is this: when the base model is quantized with 4-bit bitsandbytes, the adapter weights were cast to float32 even if autocast_adapter_dtype=False was passed. This happened because the dtype of the base layer was not determined correctly in that case. This PR now determines the dtype correctly.
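
To illustrate the underlying problem, here is a minimal sketch (not the actual PEFT code; the helper name and structure are my own) of how the effective dtype of a bnb 4-bit base layer can be resolved: the packed weight reports torch.uint8, so the quant state or compute dtype has to be consulted instead.

```python
def get_base_layer_dtype(base_layer):
    # For a bitsandbytes 4-bit layer, weight.dtype is the packed storage dtype
    # (torch.uint8), so it cannot be used directly as the adapter dtype.
    weight = base_layer.weight
    quant_state = getattr(weight, "quant_state", None)
    if quant_state is not None and getattr(quant_state, "dtype", None) is not None:
        return quant_state.dtype  # dtype of the original, unquantized weight
    compute_dtype = getattr(base_layer, "compute_dtype", None)
    if compute_dtype is not None:
        return compute_dtype
    # non-quantized layers: the weight dtype itself is the answer
    return weight.dtype
```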

While working on this, I noticed that the peft_model.add_adapter method lacked an option to disable autocasting. This option has now been added and is covered by tests. I also updated some of the corresponding docstrings.
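
For reference, a minimal usage sketch (model id and LoRA config are just placeholders; the autocast_adapter_dtype argument on add_adapter is the one added in this PR, so it requires these changes):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m", quantization_config=bnb_config)

# keep the adapter weights in the compute dtype instead of upcasting to float32
peft_model = get_peft_model(base_model, LoraConfig(task_type="CAUSAL_LM"), autocast_adapter_dtype=False)

# with this PR, add_adapter accepts the same flag for additional adapters
peft_model.add_adapter("other", LoraConfig(task_type="CAUSAL_LM"), autocast_adapter_dtype=False)
```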

Tangential changes

An unrelated issue I noticed while debugging: at one point, OSF calls if not hasattr(module, "osf_svd_params"). This would error when the module was a ModulesToSaveWrapper, because ModulesToSaveWrapper._hasattr_wrapped did not account for the case where there is no active adapter. This is now fixed too.
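
As a rough illustration of the fixed behavior (a simplified stand-in, not PEFT's actual ModulesToSaveWrapper implementation), attribute lookup on the wrapper needs a fallback when no adapter is active, so that hasattr() returns a result instead of raising:

```python
import torch.nn as nn

class WrapperSketch(nn.Module):
    """Simplified stand-in for a module wrapper holding per-adapter copies."""

    def __init__(self, original: nn.Module):
        super().__init__()
        self.original_module = original
        self.modules_to_save = nn.ModuleDict()
        self._active_adapter = None  # may be unset, e.g. right after wrapping

    def __getattr__(self, name):
        try:
            return super().__getattr__(name)
        except AttributeError:
            if self._active_adapter is None:
                # previously unhandled: without an active adapter, delegate to
                # the original module so hasattr(wrapper, ...) does not raise
                return getattr(self.original_module, name)
            return getattr(self.modules_to_save[self._active_adapter], name)
```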

Moreover, OSF implemented its own _cast_adapter_dtype. This effectively bypassed the upcasting of the OSF weights to float32 when the base model is loaded in lower precision. However, unless the user explicitly passes autocast_adapter_dtype=False, the default in PEFT is to upcast the adapter weights to float32. With the changes in this PR, this upcasting is now performed. To make this work in the forward pass, x is cast to the dtype of the weight. We assume that the output dtype should be the same as the original dtype of x.
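
The casting pattern in the forward pass roughly looks like this (a minimal sketch, not the actual OSF code; the function and variable names are illustrative):

```python
import torch
import torch.nn.functional as F

def forward_with_dtype_cast(x: torch.Tensor, weight: torch.Tensor) -> torch.Tensor:
    # The adapter weight may have been upcast to float32 while x arrives in the
    # base model's lower precision; cast x up for the computation, then cast the
    # result back so the output dtype matches the original dtype of x.
    orig_dtype = x.dtype
    out = F.linear(x.to(weight.dtype), weight)
    return out.to(orig_dtype)
```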

TODOs

There is still an issue left with 8-bit bnb weights. They don't have a compute dtype, so at the layer level it is not possible to determine what the dtype of the PEFT adapter should be (of course, it cannot be int8). Therefore, the corresponding tests for 8-bit bnb are x-failing for now. One possible solution would be to pass down the dtype of the base model (if any) and use that as a fallback. This could be implemented in a later PR.
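
A sketch of that fallback idea (hypothetical helper and parameter, not part of this PR): if the layer only exposes int8 storage, use a dtype passed down from the model level, otherwise fall back to float32.

```python
import torch

def resolve_adapter_dtype(base_layer, model_dtype=None):
    # 8-bit bnb layers store int8 weights and expose no compute dtype, so the
    # layer alone cannot tell us which dtype the adapter should use.
    weight_dtype = base_layer.weight.dtype
    if weight_dtype in (torch.int8, torch.uint8):
        return model_dtype if model_dtype is not None else torch.float32
    return weight_dtype
```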

Resolves huggingface#2889

The reported bug is this: When the base model is quantized with 4bit
bitsandbytes, the adapter weights would be cast to float32, even if
autocast_adapter_dtype=False was passed. This is because the dtype of
the base layer was not correctly determined in that case. This PR now
correctly determines the dtype.

While working on this, I noticed that the peft_model.add_adapter method
was lacking the option to disable autocasting. This was added now and
the tests cover it as well.

TODOs

For LNTuning and OSF, I found that the dtype is not correctly being
applied when it is float16 or bfloat16. As I didn't want to blow up this
PR even more, I decided to skip those methods for now and leave the fix
for those for another time.

Moreover, there is still an issue left with 8bit bnb weights. They don't
have a compute dtype, so at a layer level, it is not possible to
determine what the dtype of the PEFT adapter should be (of course, it
cannot be int8). Therefore, the corresponding tests for 8bit bnb are
x-failing for now. One possible solution could be to pass down the dtype
of the base model (if any) and use that as a fallback. This could be
implemented in a later PR.
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@BenjaminBossan (Member, Author) commented

@NikhilNayak-debug this PR contains some updates to OSF. Could you please check whether they make sense? See the PR description above for the reasoning behind the changes.
