[Docs]: Add Release Documentation for Version 1.20.0 #501
Open
abukhoy wants to merge 36 commits into quic:main from abukhoy:docs-update
+91
−9
Commits
6983a7a Main Readme updating for latest news (abukhoy)
84c7a43 Main Readme updating for latest news (abukhoy)
b172f89 Merge branch 'main' into docs-update (abukhoy)
bd1000c docs modified (abukhoy)
195740e Merge branch 'main' into docs-update (abukhoy)
de93706 Merge branch 'main' into docs-update (abukhoy)
dc7ae55 Readme update and validate (abukhoy)
aa5878b Merge branch 'main' into docs-update (abukhoy)
4cbc841 Merge branch 'main' into docs-update (abukhoy)
50302ab supported features updated (abukhoy)
627f7a2 Merge branch 'main' into docs-update (abukhoy)
c2280ba Merge branch 'main' into docs-update (abukhoy)
0ca718e CB, single and dual qpc column added in validation doc (abukhoy)
2353a76 CB, single and dual qpc column added in validation doc (abukhoy)
2c42d36 source/introduction modified (abukhoy)
8b3c362 source/validate modified (abukhoy)
dfda020 Merge branch 'main' into docs-update (abukhoy)
3e3656e Comments are addressed (abukhoy)
d86b836 Comments are addressed (abukhoy)
56f56a9 comments are adressed (abukhoy)
8352e14 Merge branch 'quic:main' into docs-update (abukhoy)
b88d970 release docs added and granite MOE removed from validate list (abukhoy)
7e46180 release dcos modified (abukhoy)
50db4bc release docs added for 1.20 (abukhoy)
d16eeb3 Merge branch 'main' into docs-update (abukhoy)
640a61a comments are adrressed (abukhoy)
03ccbb8 Merge branch 'main' into docs-update (abukhoy)
cb566e8 granite vision removed from docs (abukhoy)
271e623 granite vision removed from docs (abukhoy)
effac64 Comments Addressed (abukhoy)
aa77cc8 Merge branch 'main' into docs-update (abukhoy)
01a07fa Comments Addressed (abukhoy)
cba26d3 Comments Addressed (abukhoy)
2467cde Comments Addressed (abukhoy)
9cd323c Merge branch 'main' into docs-update (abukhoy)
fa848c8 Merge branch 'main' into docs-update (abukhoy)
@@ -0,0 +1,62 @@
# 🚀 Efficient Transformer Library - Release 1.20.0 (Beta)

Welcome to the official release of **Efficient Transformer Library v1.20.0**! This release brings a host of new model integrations, performance enhancements, and fine-tuning capabilities to accelerate your AI development.

> ✅ All features and models listed below are available on the `release/1.20.0` branch and `mainline`.

---

## 🧠 Newly Supported Models

- **Llama-4-Scout-17B-16E-Instruct**
  - Text & Image+Text support
  - Chunk attention, Single/Dual QPC support
  - Multi-image prompts enabled via the vLLM interface
  - [Llama4 Example Script](https://github.com/quic/efficient-transformers/blob/main/examples/llama4_example.py)

- **Grok-1**
  - Executable via `QEffAutoModelForCausalLM`

- **Gemma3**
  - Text & Image+Text support
  - Sliding window support
  - [Gemma3 Example Script](https://github.com/quic/efficient-transformers/blob/main/examples/gemma3_example/gemma3_mm.py)

- **SwiftKV (Llama-3.1-SwiftKV-8B-Instruct)**
  - Supports both continuous and non-continuous batching
  - Executable via `QEffAutoModelForCausalLM`

- **GGUF Models**
  - Execution support (non-quantized)
  - [Example Script](https://github.com/quic/efficient-transformers/blob/main/examples/basic_gguf_models.py)

- **FP8 Compressed Quantization**
  - Support for `Llama-3.3-70B-Instruct-FP8-Dynamic`

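For readers unfamiliar with the two attention variants named above (chunk attention for Llama 4, sliding window for Gemma3), the sketch below builds both masks in plain Python. It is only an illustration of the general idea, not the library's implementation: the function names are hypothetical, and the convention that the window includes the current token is an assumption.

```python
def causal_mask(seq_len):
    """Full causal mask: query i may attend to every key j <= i."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


def sliding_window_mask(seq_len, window):
    """Causal mask limited to the `window` most recent positions.

    Assumption: the window counts the current token, so query i
    sees keys i - window + 1 .. i.
    """
    return [[i - window < j <= i for j in range(seq_len)]
            for i in range(seq_len)]


def chunked_mask(seq_len, chunk):
    """Causal mask limited to keys in the same fixed-size chunk as the query."""
    return [[j <= i and j // chunk == i // chunk for j in range(seq_len)]
            for i in range(seq_len)]


def render(mask):
    """Draw a mask as text: 'x' = may attend, '.' = masked out."""
    return "\n".join("".join("x" if allowed else "." for allowed in row)
                     for row in mask)


if __name__ == "__main__":
    print("sliding window (window=3):")
    print(render(sliding_window_mask(6, 3)))
    print("chunked (chunk=3):")
    print(render(chunked_mask(6, 3)))
```

Both masks bound how many keys each query can see, which keeps the per-token attention cost from growing with the full sequence length. The difference is that the sliding window moves with the query, while chunk boundaries are fixed.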

---

## ✨ Key Features & Enhancements

- **Transformers Upgrade**: Now using `transformers` version `4.51.3`
- **SpD & Multi-Projection Heads**: Token speculation via post-attention projections
- **I/O Encryption**: `--io-encrypt` flag support in compile/infer APIs
- **Separate Prefill/Decode Compilation**: For disaggregated serving
- **On-Device Sampling**: Supported through vLLM; sampling on the device reduces host-device latency for CausalLM models

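To make the SpD item more concrete, here is a minimal sketch of greedy verification, the acceptance step commonly used in speculative decoding. The release notes do not describe the acceptance rule used in this release, so treat this as a generic illustration: `accept_speculated` and its arguments are hypothetical names, and the token IDs are made up.

```python
def accept_speculated(draft_tokens, target_tokens):
    """Greedy verification of speculated tokens.

    draft_tokens:  k tokens proposed by the speculation heads.
    target_tokens: k + 1 greedy picks from the target model, where
                   target_tokens[i] is its choice after seeing the
                   prefix followed by draft_tokens[:i].

    Returns the tokens to emit: the longest draft prefix the target
    agrees with, followed by one token from the target itself.
    """
    if len(target_tokens) != len(draft_tokens) + 1:
        raise ValueError("expected one more target token than draft tokens")

    accepted = []
    for draft, target in zip(draft_tokens, target_tokens):
        if draft != target:
            # First disagreement: keep the target's token and stop.
            accepted.append(target)
            return accepted
        accepted.append(draft)

    # Every draft token matched, so the extra target token is a bonus.
    accepted.append(target_tokens[-1])
    return accepted


if __name__ == "__main__":
    # Third draft token is rejected and replaced by the target's choice.
    print(accept_speculated([5, 7, 9], [5, 7, 2, 4]))
    # All draft tokens accepted, plus one bonus token.
    print(accept_speculated([5, 7, 9], [5, 7, 9, 4]))
```

Under this rule each verification pass emits at least one token, so decoding is never slower in tokens per pass than plain greedy decoding, and the output matches what greedy decoding of the target model would produce.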

---

## 🔍 Embedding Model Upgrades

- **Flexible Pooling**: Choose from standard or custom strategies
- **Sentence Embedding**: Now runs directly on AI100
- **Multi-Seq Length Compilation**: Auto-selects optimal graph at runtime

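The pooling and multi-sequence-length items above can be pictured together with a small sketch: pick the smallest compiled sequence length that fits the input, pad to it, then pool while ignoring the padding. Everything here is an assumption made for illustration (the helper names, the `"mean"`/`"cls"` strategy keys, and plain Python lists standing in for the model's per-token output vectors); it does not show the library's API or its actual selection rule.

```python
from bisect import bisect_left


def pick_seq_len(compiled_seq_lens, n_tokens):
    """Return the smallest compiled sequence length that fits n_tokens."""
    lens = sorted(compiled_seq_lens)
    idx = bisect_left(lens, n_tokens)
    if idx == len(lens):
        raise ValueError(
            f"{n_tokens} tokens exceed the largest compiled length {lens[-1]}")
    return lens[idx]


def pad_to(rows, seq_len):
    """Pad per-token vectors with zeros; return (padded rows, attention mask)."""
    dim = len(rows[0])
    pad = seq_len - len(rows)
    padded = rows + [[0.0] * dim for _ in range(pad)]
    mask = [1] * len(rows) + [0] * pad
    return padded, mask


def mean_pool(rows, mask):
    """Average only the real (mask == 1) token vectors."""
    count = sum(mask)
    dim = len(rows[0])
    return [sum(r[d] for r, m in zip(rows, mask) if m) / count
            for d in range(dim)]


def cls_pool(rows, mask):
    """Use the first token's vector as the sentence embedding."""
    return list(rows[0])


POOLERS = {"mean": mean_pool, "cls": cls_pool}


def embed(rows, compiled_seq_lens, pooling="mean"):
    """`pooling` is a key of POOLERS or a custom callable(rows, mask).

    Assumes `rows` holds at least one token vector.
    """
    pooler = POOLERS[pooling] if isinstance(pooling, str) else pooling
    seq_len = pick_seq_len(compiled_seq_lens, len(rows))
    padded, mask = pad_to(rows, seq_len)
    return seq_len, pooler(padded, mask)


if __name__ == "__main__":
    tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(embed(tokens, [8, 4, 16]))           # mean pooling
    print(embed(tokens, [8, 4, 16], "cls"))    # first-token pooling
```

Passing a callable as `pooling` is how a custom strategy would plug into this sketch. The attention mask matters for mean pooling in particular: without it, the zero padding added to reach the compiled length would drag the average toward zero.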

---

## 🛠️ Fine-Tuning Support

- BERT fine-tuning support with templates and documentation
- Gradient checkpointing, device-aware `GradScaler`, and CLI `--help` added

---