Commit d339f36

Authored by younesbelkada, stas00, stevhliu, and TimDettmers
Blog post on bitsandbytes integration on Hugging Face (#463)
* first commit
* add new thumbnails
* add more content
* add new gif
* Update _blog.yml
* rename files
* Apply suggestions from code review (Co-authored-by: Stas Bekman <[email protected]>)
* Apply suggestions from code review
* change content a bit - add more details and adapt from stas suggestions
* re-write text: part 1
* few modifs - add credits - add image - modify a bit the content
* modify a bit
* add more content
* add image
* paraphrase a bit
* add more content
* add more content
* some improvements
* add thumbnail
* add more text + fix table
* fix table
* fix tables
* add stas as author
* add a last sentence
* edit some more
* few modifs
* modify thumbail
* add thumbnail
* add removed comment
* add photos
* add more infos
* Apply suggestions from code review (Co-authored-by: Steven Liu <[email protected]>)
* Apply suggestions from code review (Co-authored-by: Steven Liu <[email protected]>)
* Add files via upload
* add steven to the credits!
* edits
* edits
* edits
* edits
* add script
* change to std err
* refactor a bit the tables
* add Tim's comments
* remove separators
* explain why it is slow
* Update hf-bitsandbytes-integration.md (Co-authored-by: Stas Bekman <[email protected]>)
* Add links to paper
* delete dummy file
* add correct link to paper
* add more explanation on speed
* update figure
* replace authors by we
* add freezed image
* remove old table
* Update hf-bitsandbytes-integration.md - Some slight edits.
* Apply suggestions from code review
* Apply suggestions from code review (Co-authored-by: Stas Bekman <[email protected]>)

Co-authored-by: Stas Bekman <[email protected]>
Co-authored-by: Steven Liu <[email protected]>
Co-authored-by: Tim Dettmers <[email protected]>
1 parent 822183c commit d339f36

23 files changed: +642 −1 lines

_blog.yml — 10 additions & 1 deletion

```diff
@@ -1120,7 +1120,6 @@
     - guide
 
 
-
 - local: skops
   title: Introducing Skops
   author: merve
@@ -1132,3 +1131,13 @@
     - announcement
     - guide
 
+
+- local: hf-bitsandbytes-integration
+  title: "A Gentle Introduction to 8-bit Matrix Multiplication for transformers at scale using transformers, accelerate and bitsandbytes"
+  author: ybelkada
+  thumbnail: /blog/assets/96_hf_bitsandbytes_integration/thumbnail_blue.png
+  date: August 17, 2022
+  tags:
+  - nlp
+  - llm
+  - quantization
```
Binary image assets added (thumbnails, figures, and a gif; eight files ranging from 31.8 KB to 1.82 MB).
Example script — 41 additions & 0 deletions

```python
import torch
import torch.nn as nn

from bitsandbytes.nn import Linear8bitLt

# Utility function
def get_model_memory_footprint(model):
    r"""
    Partially copied and inspired from: https://discuss.pytorch.org/t/gpu-memory-that-model-uses/56822/2
    """
    return sum([param.nelement() * param.element_size() for param in model.parameters()])

# Main script
fp16_model = nn.Sequential(
    nn.Linear(64, 64),
    nn.Linear(64, 64),
).to(torch.float16)

# Train and save your model!
torch.save(fp16_model.state_dict(), "model.pt")

# Define your int8 model!
int8_model = nn.Sequential(
    Linear8bitLt(64, 64, has_fp16_weights=False),
    Linear8bitLt(64, 64, has_fp16_weights=False),
)

int8_model.load_state_dict(torch.load("model.pt"))
int8_model = int8_model.to(0)  # Quantization happens here

input_ = torch.randn(8, 64, dtype=torch.float16)
hidden_states = int8_model(input_.to(0))

mem_int8 = get_model_memory_footprint(int8_model)
mem_fp16 = get_model_memory_footprint(fp16_model)

print(f"Relative difference: {mem_fp16/mem_int8}")
```
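The ratio the script prints can be checked by hand. A back-of-the-envelope sketch, assuming the int8 layers store each weight in 1 byte while biases stay in fp16 (2 bytes) — these byte sizes are assumptions for illustration, not taken from the commit:

```python
in_features = out_features = 64
n_layers = 2

# fp16 model: weights and biases both take 2 bytes per element
fp16_bytes = n_layers * (in_features * out_features + out_features) * 2

# int8 model: 1 byte per weight; biases assumed to remain fp16 (2 bytes)
int8_bytes = n_layers * (in_features * out_features * 1 + out_features * 2)

ratio = fp16_bytes / int8_bytes
print(f"fp16: {fp16_bytes} B, int8: {int8_bytes} B, ratio: {ratio:.2f}")
```

Under these assumptions the ratio lands just under 2x, because the fp16 biases are not compressed.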

assets/96_hf_bitsandbytes_integration/mantissa.svg — 129 additions & 0 deletions (SVG figure)

Two further binary images added (29.6 KB and 60.2 KB).

hf-bitsandbytes-integration.md — 462 additions & 0 deletions (large diff not rendered by default)
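The post this commit adds is about 8-bit matrix multiplication at scale; one of its basic ingredients is absmax int8 quantization. A minimal sketch of that idea — function and variable names here are illustrative, not taken from the post's diff:

```python
def absmax_quantize(values):
    """Scale floats into the int8 range [-127, 127] and back.

    Absmax quantization: divide by the largest absolute value,
    multiply by 127, and round to the nearest integer; dequantize
    by dividing the integers by the same scale.
    """
    scale = 127.0 / max(abs(v) for v in values)
    quantized = [round(v * scale) for v in values]
    dequantized = [q / scale for q in quantized]
    return quantized, dequantized

q, dq = absmax_quantize([1.0, -0.5, 2.0])
print(q)   # integers within the int8 range
print(dq)  # approximate reconstruction of the inputs
```

The element with the largest magnitude maps exactly to ±127 and survives the round trip; the others pick up a small rounding error, which is the precision cost the post's mixed-precision decomposition is designed to control.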
