1 parent 52245b3 commit c4ec631
examples/llama7b_sparse_quantized/README.md
@@ -73,10 +73,12 @@ run the following in the same Python instance as the previous steps.

 ```python
 import torch
+import os
 from sparseml.transformers import SparseAutoModelForCausalLM

 compressed_output_dir = "output_llama7b_2:4_w4a16_channel_compressed"
-model = SparseAutoModelForCausalLM.from_pretrained(output_dir, torch_dtype=torch.bfloat16)
+uncompressed_path = os.path.join(output_dir, "stage_quantization")
+model = SparseAutoModelForCausalLM.from_pretrained(uncompressed_path, torch_dtype=torch.bfloat16)
 model.save_pretrained(compressed_output_dir, save_compressed=True)
 ```
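The change points `from_pretrained` at the intermediate `stage_quantization` checkpoint inside `output_dir` rather than at `output_dir` itself. A minimal sketch of just the path construction, assuming `output_dir` was set in the README's earlier steps (the directory name below is a hypothetical placeholder, not taken from the diff):

```python
import os

# Hypothetical value; in the README, output_dir is defined in an earlier step.
output_dir = "output_llama7b_2:4_w4a16_channel"

# The uncompressed quantized model lives in the "stage_quantization" subdirectory.
uncompressed_path = os.path.join(output_dir, "stage_quantization")
print(uncompressed_path)
```

The loaded model is then re-saved with `save_compressed=True` to produce the compressed checkpoint.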