From 55dc91d7c8aed4be3ed8c89213f59f45485a9a10 Mon Sep 17 00:00:00 2001
From: Qubitium-ModelCloud
Date: Fri, 17 Jan 2025 09:27:46 +0800
Subject: [PATCH] 1.7.0 release (#1085)

* prepare for v1.7.0 release
* Update version.py
* Update README.md
---
 README.md            | 1 +
 gptqmodel/version.py | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 1c9224f32..2ecc4af2d 100644
--- a/README.md
+++ b/README.md
@@ -9,6 +9,7 @@

 ## News
+* 01/17/2025 [1.7.0](https://github.com/ModelCloud/GPTQModel/releases/tag/v1.7.0): 🎉🎉 `backend.MLX` added for runtime conversion and execution of GPTQ models via Apple's `MLX` framework on Apple Silicon. Export of `gptq` models to `mlx` is also now possible. We have added `mlx`-exported models to [huggingface.co/ModelCloud](https://huggingface.co/collections/ModelCloud/vortex-673743382af0a52b2a8b9fe2). `lm_head` quantization is now fully supported by GPTQModel without any external pkg dependency.
 * 01/07/2025 [1.6.1](https://github.com/ModelCloud/GPTQModel/releases/tag/v1.6.1): 🎉 New OpenAI api compatible end-point via `model.serve(host, port)`. Auto-enable flash-attention2 for inference. Fixed `sym=False` loading regression.
 * 01/06/2025 [1.6.0](https://github.com/ModelCloud/GPTQModel/releases/tag/v1.6.0): ⚡ 25% faster quantization. 35% reduction in vram usage vs v1.5. 👀 AMD ROCm (6.2+) support added and validated for 7900XT+ GPU. Auto-tokenizer loader via `load()` api. For most models you no longer need to manually init a tokenizer for both inference and quantization.
 * 01/01/2025 [1.5.1](https://github.com/ModelCloud/GPTQModel/releases/tag/v1.5.1): 🎉 2025! Added `QuantizeConfig.device` to clearly define which device is used for quantization: default = `auto`. Non-quantized models are always loaded on cpu by-default and each layer is moved to `QuantizeConfig.device` during quantization to minimize vram usage. Compatibility fixes for `attn_implementation_autoset` in latest transformers.

diff --git a/gptqmodel/version.py b/gptqmodel/version.py
index c2e8226cb..3eb3d3b4d 100644
--- a/gptqmodel/version.py
+++ b/gptqmodel/version.py
@@ -13,4 +13,4 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-__version__ = "1.7.0-dev"
+__version__ = "1.7.0"
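
The changelog entries above announce the new `backend.MLX` path and the auto-tokenizer `load()` API. A minimal usage sketch of how these might combine, assuming `gptqmodel` >= 1.7.0 is installed on an Apple Silicon machine with `mlx` available; the model id is a placeholder, and the exact call shapes (`GPTQModel.load`, `BACKEND.MLX`, `model.serve`) are taken from the release notes rather than verified against the API:

```python
# Sketch only: requires `pip install gptqmodel mlx` on Apple Silicon.
from gptqmodel import GPTQModel, BACKEND

# Since v1.6.0, load() auto-loads the matching tokenizer, so no separate
# tokenizer init is needed. BACKEND.MLX (new in v1.7.0) runtime-converts
# the GPTQ weights for execution on Apple's MLX framework.
model = GPTQModel.load(
    "ModelCloud/SomeModel-gptq-4bit",  # placeholder model id
    backend=BACKEND.MLX,
)

# v1.6.1 added an OpenAI-compatible endpoint directly on the model object.
model.serve(host="127.0.0.1", port=8000)
```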