Commit
fix dependency issue with --load_quantized_model_with_autoawq
schoi-habana committed Feb 8, 2025
1 parent 9e882f2 · commit cefc5a8
Showing 3 changed files with 4 additions and 2 deletions.
Makefile: 2 changes (1 addition & 1 deletion)

@@ -107,7 +107,7 @@ slow_tests_diffusers: test_installs

# Run text-generation non-regression tests
slow_tests_text_generation_example: test_installs
- python -m pip install triton==3.1.0 autoawq
+ python -m pip install -r requirements_awq.txt
BUILD_CUDA_EXT=0 python -m pip install -vvv --no-build-isolation git+https://github.com/HabanaAI/AutoGPTQ.git
python -m pip install git+https://github.com/HabanaAI/[email protected]
python -m pytest tests/test_text_generation_example.py tests/test_encoder_decoder.py -v -s --token $(TOKEN)
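
For reference, this target would be invoked through make with the Hugging Face token that the final pytest command consumes via `--token $(TOKEN)` (the token value below is a placeholder, not from this diff):

```bash
# Hypothetical invocation; substitute a valid Hugging Face access token.
make slow_tests_text_generation_example TOKEN=<hf_token>
```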
examples/text-generation/README.md: 2 changes (1 addition & 1 deletion)

@@ -735,7 +735,7 @@ Currently, this support is limited to UINT4 inference of pre-quantized models on

Please run the following command to install AutoAWQ:
```bash
- pip install triton==3.1.0 autoawq
+ pip install -r requirements_awq.txt
```

You can run a *UINT4 weight quantized* model using AutoAWQ by including the argument `--load_quantized_model_with_autoawq`.
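
As a sketch of such a run: the command below assumes the `run_generation.py` entry point of this examples directory, and the model name and remaining flags are illustrative assumptions rather than part of this diff:

```bash
# Illustrative sketch only: run a pre-quantized UINT4 AWQ checkpoint
# with AutoAWQ loading enabled. The model name and generation flags
# are assumptions for the example.
python run_generation.py \
  --model_name_or_path TheBloke/Llama-2-7B-Chat-AWQ \
  --use_hpu_graphs \
  --use_kv_cache \
  --max_new_tokens 100 \
  --bf16 \
  --load_quantized_model_with_autoawq
```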
examples/text-generation/requirements_awq.txt: 2 changes (2 additions & 0 deletions)

@@ -0,0 +1,2 @@
+ triton==3.1.0
+ autoawq
