38 commits
5b8d30e
Added BD-LoRA support
Conzel Oct 22, 2025
4e57924
Changed BD-LoRA weights to be 2d instead of 3d
Conzel Oct 22, 2025
872b91d
Fixed formatting and style
Conzel Nov 3, 2025
08c4a98
Add docstrings and updated comments
Conzel Nov 3, 2025
1969d95
Enable mixed_compatible
Conzel Nov 3, 2025
fd768d7
Fixed failing tests due to previous refactors
Conzel Nov 3, 2025
43782b5
Fixed shaping issue with BD-LoRA
Conzel Nov 3, 2025
069302d
Add usage example for bdlora
Conzel Nov 3, 2025
3a96bc3
Created experiment results
Conzel Nov 4, 2025
9404334
Merge branch 'huggingface:main' into bdlora
Conzel Nov 4, 2025
797c7af
Updated notebook with clarification on sharding and performance
Conzel Nov 4, 2025
d5f7b28
Moved BD-LoRA implementation to a LoRA variant
Conzel Nov 10, 2025
f8e4e89
Removed BD-LoRA from peft types
Conzel Nov 18, 2025
d6f1f1d
Renamed attributes in BD-LoRA Config
Conzel Nov 18, 2025
d9a5edd
Add post-init check to ensure no overlap in BD-modules
Conzel Nov 18, 2025
83fd06d
Improved error message for BlockDiagonalLinear if bias is supplemented
Conzel Nov 18, 2025
9a0bf5d
Fixed adapter merging in BD-LoRA and added tests
Conzel Nov 18, 2025
7657cb7
Moved tests to custom model matrix
Conzel Nov 18, 2025
e49ff77
Fixed issues discovered through new tests
Conzel Nov 18, 2025
55460f7
Updated notebook to new BD-LoRA variant implementation
Conzel Nov 18, 2025
c0b599d
Add check that features are divisible by blocks
Conzel Nov 18, 2025
639578a
Fixed wrong default argument
Conzel Nov 18, 2025
6f9a496
Add more information in the BdLoraConfig Docstring
Conzel Nov 18, 2025
fac0291
Minor refactors
Conzel Nov 20, 2025
4e15182
Added match_strict option to BD-LoRA
Conzel Nov 20, 2025
cf459d9
Removed kaiming_uniform argument for a, not needed
Conzel Nov 20, 2025
fec4643
Shortened docstrings
Conzel Nov 20, 2025
9cc8f03
Changed BD-LoRA experiments
Conzel Nov 20, 2025
362c5c0
Add example with vLLM integration
Conzel Nov 20, 2025
a6e7819
Added tests for BD-LoRA initialization
Conzel Nov 21, 2025
2fce6c2
Improved BD-LoRA finetuning example
Conzel Nov 24, 2025
b7f1a9e
Cleaned up asserts and spelling in BD-LoRA implementation
Conzel Nov 24, 2025
9b8e965
Cleaned up tests and removed bias assertion
Conzel Nov 24, 2025
86913b6
Update examples/bdlora_finetuning/chat.py
Conzel Nov 25, 2025
dc0c8e9
Merge branch 'main' of https://github.com/huggingface/peft into bdlora
Conzel Nov 25, 2025
512aef9
Removed all BD-LoRA experiments besides the one for rank 14
Conzel Nov 25, 2025
9dd8166
Moved target name to end of argument list
Conzel Nov 25, 2025
84d6b4c
Fixed shape error with float16 forward pass on compatible devices
Conzel Dec 3, 2025
15 changes: 15 additions & 0 deletions examples/bdlora_finetuning/README.md
@@ -0,0 +1,15 @@
# BD-LoRA Finetuning

Block-Diagonal LoRA (BD-LoRA) is a LoRA variant in which some LoRA factors are constrained to be block-diagonal.
This enables faster serving by eliminating communication overhead when running inference across multiple GPUs, while matching the finetuning performance of vanilla LoRA.
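
A minimal sketch of the block-diagonal constraint in plain PyTorch (illustrative dimensions only, not this PR's API; here the up-projection B is taken as the block-diagonal factor for illustration). Because B has no cross-block entries, each tensor-parallel rank can apply its own block locally, avoiding a cross-GPU gather on the LoRA path:

```python
import torch

# Illustrative dimensions only: d features, LoRA rank r, two diagonal blocks.
d, r, blocks = 16, 4, 2
x = torch.randn(1, d)

A = torch.randn(r, d)  # down-projection, dense as in vanilla LoRA
# Block-diagonal up-projection: `blocks` independent blocks of shape (d/blocks, r/blocks).
B_blocks = torch.randn(blocks, d // blocks, r // blocks)

# Materialize B as one (d, r) matrix for clarity; a sharded implementation would
# keep the blocks separate so each tensor-parallel rank only touches its own block.
B = torch.block_diag(*B_blocks)

delta = x @ A.T @ B.T  # the LoRA update path: x -> A -> block-diagonal B
print(delta.shape)  # torch.Size([1, 16])
```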

For an overview of how to use BD-LoRA, see the Python notebook at `peft/examples/bdlora_finetuning/bdlora_peft_demo.ipynb`.

To benefit from the inference speed-ups, you need an inference engine that supports BD-LoRA. At the moment, an experimental PR at https://github.com/vllm-project/vllm/pull/28136 adds BD-LoRA support to vLLM. If you find this work useful, consider leaving a comment there.

To install it, clone the fork at https://github.com/Conzel/vllm/tree/bdlora-bk, then install vLLM following the usual instructions: https://docs.vllm.ai/en/stable/getting_started/installation/. We assume a hardware setup with at least 2 available GPUs.

This example folder contains three files:
- `bdlora_peft_demo.ipynb` Showcases how to instantiate a BD-LoRA model, train it, and save/reload the weights.
- `vllm_server.bash` Spins up a BD-LoRA-compatible vLLM server. To use it, you need to run the notebook once to create adapters in the correct format.
- `chat.py` Queries the vLLM server after it has finished booting up. Usage example: `python3 chat.py --target lora1`; a sketch of such a query is shown below.
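
For reference, a hedged sketch of what such a query could look like via vLLM's OpenAI-compatible API, assuming the server listens on the default port 8000 and the adapter was registered under the name `lora1` (the actual `chat.py` in this folder may differ):

```python
# Sketch only: server address, port, and adapter name "lora1" are assumptions
# based on vLLM's default OpenAI-compatible serving setup.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
response = client.chat.completions.create(
    model="lora1",  # select the BD-LoRA adapter by its served name
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```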
Binary file added examples/bdlora_finetuning/bdlora-sharding.png