Skip to content

Releases: huggingface/text-generation-inference

v3.3.2

30 May 14:20
Compare
Choose a tag to compare

Gaudi improvements.

What's Changed

Full Changelog: v3.3.1...v3.3.2

v3.3.1

22 May 07:49
Compare
Choose a tag to compare

This release updates TGI to Torch 2.7 and CUDA 12.8.

What's Changed

New Contributors

Full Changelog: v3.3.0...v3.3.1

v3.3.0

09 May 13:57
Compare
Choose a tag to compare

Notable changes

  • Prefill chunking for VLMs.

What's Changed

New Contributors

Full Changelog: v3.2.3...v3.3.0

v3.2.3

08 Apr 08:18
a1f3ebe
Compare
Choose a tag to compare

Main changes

  • Patching Llama 4

What's Changed

Full Changelog: v3.2.2...v3.2.3

v3.2.2

06 Apr 09:41
c67546f
Compare
Choose a tag to compare

What's Changed

New Contributors

Full Changelog: v3.2.1...v3.2.2

v3.2.1

18 Mar 14:28
4d28897
Compare
Choose a tag to compare

What's Changed

Full Changelog: v3.2.0...v3.2.1

v3.2.0

12 Mar 10:17
411a282
Compare
Choose a tag to compare

Important changes

  • BREAKING CHANGE: Lots of modifications around tool calling. Tool calling now respects fully OpenAI return results (arguments return type is a string instead of a real JSON object). Lots of improvements around the tool calling and side effects fixed.

  • Added Gemma 3 support.

What's Changed

New Contributors

Full Changelog: v3.1.1...v3.2.0

v3.1.1

04 Mar 17:15
c34bd9d
Compare
Choose a tag to compare

What's Changed

New Contributors

Full Changelog: v3.1.0...v3.1.1

v3.1.0

31 Jan 13:26
463228e
Compare
Choose a tag to compare

Important changes

Deepseek R1 is fully supported on both AMD and Nvidia !

docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data \
    ghcr.io/huggingface/text-generation-inference:3.1.0 --model-id deepseek-ai/DeepSeek-R1

What's Changed

Full Changelog: v3.0.2...v3.1.0

v3.0.2

24 Jan 11:16
b70f29d
Compare
Choose a tag to compare

Tl;dr

New transformers backend supporting flashattention at roughly same performance as pure TGI for all non officially supported models directly in TGI. Congrats @Cyrilvallez

New models unlocked: Cohere2, olmo, olmo2, helium.

What's Changed

New Contributors

Full Changelog: v3.0.1...v3.0.2