This repository implements a collection of Flash-Linear-Attention models that extend beyond language, supporting vision, video, and more.
- $\texttt{[2025-01-25]}$: This repo is created, with initial support for vision models.
- **vision**: `fla-zoo` currently supports vision models. Simple documentation is available here. TL;DR: use a hybrid model and random scan for better performance and efficiency. "Vision" here refers to image classification tasks.
- **video**: `fla-zoo` currently supports certain video models. Documentation is in progress.
Requirements:
- All the dependencies listed here
- torchvision
For example, you can install all the dependencies with the following commands:

```sh
conda create -n flazoo python=3.12
conda activate flazoo
pip install torch torchvision accelerate
pip install transformers datasets evaluate causal_conv1d einops scikit-learn wandb
pip install flash-attn --no-build-isolation
```

Note that the scikit-learn package must be installed as `scikit-learn`; the old `sklearn` name on PyPI is deprecated and no longer installable.
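After installing, a quick sanity check can confirm the core dependencies are importable. This is a minimal illustrative sketch, not an official `fla-zoo` utility; the module list below is an assumption based on the install commands above.

```python
# Minimal installation sanity check (illustrative; not part of fla-zoo itself).
# Verifies that the core dependencies listed above can be found by Python
# without actually importing them (find_spec only locates the module).
import importlib.util

CORE_DEPS = ["torch", "torchvision", "transformers", "datasets", "einops"]

def find_missing(deps):
    """Return the subset of `deps` that is not importable."""
    return [d for d in deps if importlib.util.find_spec(d) is None]

if __name__ == "__main__":
    missing = find_missing(CORE_DEPS)
    if missing:
        print("Missing dependencies:", ", ".join(missing))
    else:
        print("All core dependencies found.")
```

Using `find_spec` instead of a bare `import` keeps the check cheap: heavy packages like `torch` are located on disk but not loaded.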
Now we can start cooking! 🚀
Note that `fla-zoo` is an actively developed repo and currently provides no released packages.
Roadmap:
- Write documentation for vision models.
- Write documentation for video models.
- Release training scripts for vision models.