TabFlash

Official implementation of TabFlash: Efficient Table Understanding with Progressive Question Conditioning and Token Focusing, in collaboration with Google Cloud AI, accepted at AAAI 2026 (Main Technical Track).

Overview

πŸ€– TabFlash is an efficient and accurate multimodal LLM for table understanding, achieving state-of-the-art performance: it outperforms GPT-4o and Gemini 2.5 Pro at exceptionally low computational cost.

πŸš€ TabFlash (3B) achieves state-of-the-art performance while reducing FLOPs by 27% and memory usage by 30% compared to the second-best MLLM.

⚑ TabFlash (1B) outperforms most MLLMs with exceptionally low TFLOPs and just 11.2 GB peak memory, enabling deployment on low-memory GPUs.

Figure: Accuracy vs. TFLOPs plot (assets/acc_tflops_plot.png)

Setup

This code is tested with Python 3.9, CUDA 12.4, PyTorch 2.4.1, and FlashAttention 2.7.3.

1. Create Conda Environment

conda create -n tabflash python=3.9 -y
conda activate tabflash

2. Install InternVL-2.5

Follow the official guide.

cd InternVL
pip install -r requirements.txt

3. Install PyTorch

pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu124

4. Install Flash Attention v2.7.3

git clone --branch v2.7.3 --single-branch https://github.com/Dao-AILab/flash-attention.git
cd flash-attention
python setup.py install
cd ..

5. Install Additional Dependencies

pip install wandb sacrebleu distance apted bitsandbytes --upgrade
pip install datasets==2.18.0
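
Optionally, a quick sanity check confirms that the versions listed at the top of this section are in place:

# Print the installed PyTorch, CUDA, and FlashAttention versions; they should
# match the tested versions (PyTorch 2.4.1, CUDA 12.4, FlashAttention 2.7.3).
python -c "import torch, flash_attn; print(torch.__version__, torch.version.cuda, flash_attn.__version__)"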

Dataset Preparation

TabFlash uses the MMTab dataset from Table-LLaVA.

1. Download MMTab-pre (Pretraining)

  1. Download MMTab-instruct_table_images_82K.zip and MMTab-pre_table_images_part_2_16K.zip.
  2. Place them under data/LLaVA-Pretrain/images and unzip them. Rename the extracted IID_train_image directory to table_pretrain_part_1 (see the sketch after this list).
  3. Download table_only_pretrain_data_with_length.jsonl. Place it under data/LLaVA-Pretrain.
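
For concreteness, a minimal sketch of steps 2, run from the repository root and assuming both archives were already downloaded into data/LLaVA-Pretrain/images and that the second archive extracts to table_pretrain_part_2/ (matching the final file structure below):

cd data/LLaVA-Pretrain/images
unzip MMTab-instruct_table_images_82K.zip    # extracts to IID_train_image/
unzip MMTab-pre_table_images_part_2_16K.zip  # assumed to extract to table_pretrain_part_2/
mv IID_train_image table_pretrain_part_1
cd ../../..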

2. Download MMTab-instruct (Instruction Fine-tuning)

  1. Download MMTab-instruct_table_images_82K.zip.
  2. Place it under data/LLaVA-Finetune/images/table_instructV and unzip it. Rename the resulting IID_train_image directory to images (see the sketch after this list).
  3. Download table_only_sft_data_with_length.jsonl. Place it under data/LLaVA-Finetune.
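
Analogously, a sketch run from the repository root, assuming the archive was already downloaded into data/LLaVA-Finetune/images/table_instructV:

cd data/LLaVA-Finetune/images/table_instructV
unzip MMTab-instruct_table_images_82K.zip  # extracts to IID_train_image/
mv IID_train_image images
cd ../../../..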

3. Download MMTab-eval (Inference)

  1. Download MMTab-eval_test_data_49K_llava_jsonl_format.jsonl and MMTab-eval_table_images_23K.zip.
  2. Place them under data/LLaVA-Inference and unzip the image archive (see the sketch after this list).
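
A sketch run from the repository root, assuming both files were already downloaded into data/LLaVA-Inference and that the archive extracts to all_test_image/ (matching the final file structure below):

cd data/LLaVA-Inference
unzip MMTab-eval_table_images_23K.zip  # assumed to extract to all_test_image/
cd ../..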

4. Download files for evaluation

  1. Download MMTab-eval_test_data_49K.json and MMTab-eval_test_tables_23K.json.
  2. Place them under data/MMTab-eval_evaluation (see the sketch after this list).
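
A sketch run from the repository root, assuming both JSON files were downloaded into the current directory:

mkdir -p data/MMTab-eval_evaluation
mv MMTab-eval_test_data_49K.json MMTab-eval_test_tables_23K.json data/MMTab-eval_evaluation/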

Final file structure

TabFlash/
β”œβ”€β”€ InternVL/
β”‚   β”œβ”€β”€ internvl_chat/
β”‚   β”‚   β”œβ”€β”€ scripts/
β”‚   β”‚   β”œβ”€β”€ inference.py
β”‚   β”‚   β”œβ”€β”€ mmtab_eval.py
β”‚   β”‚   └── ...
β”‚   └── ...
β”œβ”€β”€ data/
β”‚   β”œβ”€β”€ LLaVA-Pretrain/
β”‚   β”‚   β”œβ”€β”€ images/
β”‚   β”‚   β”‚   β”œβ”€β”€ table_pretrain_part_1/
β”‚   β”‚   β”‚   └── table_pretrain_part_2/
β”‚   β”‚   └── table_only_pretrain_data_with_length.jsonl
β”‚   β”œβ”€β”€ LLaVA-Finetune/
β”‚   β”‚   β”œβ”€β”€ images/
β”‚   β”‚   β”‚   └── table_instructV/
β”‚   β”‚   β”‚       └── images/
β”‚   β”‚   └── table_only_sft_data_with_length.jsonl
β”‚   β”œβ”€β”€ LLaVA-Inference/
β”‚   β”‚   β”œβ”€β”€ all_test_image/
β”‚   β”‚   └── MMTab-eval_test_data_49K_llava_jsonl_format.jsonl
β”‚   └── MMTab-eval_evaluation/
β”‚       β”œβ”€β”€ MMTab-eval_test_data_49K.json
β”‚       └── MMTab-eval_test_tables_23K.json
β”œβ”€β”€ assets/
β”‚   β”œβ”€β”€ acc_tflops_plot.png
β”‚   └── ...
└── README.md

Move into the code directory

Move into the directory below before running training, inference, or evaluation.

cd InternVL/internvl_chat/

Training

Pre-trained models

If you only want to use the pre-trained models, download tabflash_stage2_4b.tar and tabflash_stage2_1b.tar and extract them under work_dirs/internvl_chat_v2_5/tabflash_4b and work_dirs/internvl_chat_v2_5/tabflash_1b, respectively (see the sketch below).
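
A minimal sketch, run from InternVL/internvl_chat/ and assuming both tar files were downloaded into the current directory and contain the checkpoint files at their top level:

mkdir -p work_dirs/internvl_chat_v2_5/tabflash_4b work_dirs/internvl_chat_v2_5/tabflash_1b
tar -xf tabflash_stage2_4b.tar -C work_dirs/internvl_chat_v2_5/tabflash_4b
tar -xf tabflash_stage2_1b.tar -C work_dirs/internvl_chat_v2_5/tabflash_1b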

If you want to train the model from scratch, follow the instructions below. TabFlash training consists of two stages:

Stage 1

bash scripts/4b_train_stage1.sh # For 4B model
bash scripts/1b_train_stage1.sh # For 1B model

Stage 2

bash scripts/4b_train_stage2.sh # For 4B model
bash scripts/1b_train_stage2.sh # For 1B model

Inference

Run inference on test set:

bash scripts/4b_inference.sh # For 4B model
bash scripts/1b_inference.sh # For 1B model

Evaluation

Evaluate the model predictions, replacing {exp_name} with your experiment name:

python mmtab_eval.py --pred_file results/{exp_name}/result.jsonl

Citation

If you find this work useful, please cite:

@inproceedings{kim2026tabflash,
    title={TabFlash: Efficient Table Understanding with Progressive Question Conditioning and Token Focusing},
    author={Kim, Jongha and Bae, Minseong and Lee, Sanghyeok and Yoon, Jinsung and Kim, Hyunwoo J},
    booktitle={AAAI},
    year={2026}
}

Acknowledgements

This codebase is based on InternVL and Table-LLaVA.
