flagos-ai/FlagTensor

Introduction

FlagTensor is part of FlagOS. It is a high-performance tensor-primitive library implemented in the Triton programming language developed by OpenAI, and it targets multiple hardware backends. It provides implementations of common tensor primitives (for example, unary, binary, and contraction operations) and supports correctness and performance comparisons against cuTENSOR baselines.

CI workflows

This repository provides two GitHub Actions workflows under .github/workflows:

flagtensor-ci: split into correctness and perf jobs for smoke-style automated validation.
flagtensor-weekly: runs the weekly correctness and benchmark pipeline from an operator list.

Operator registry

The authoritative operator list lives in conf/operators.yaml.

It is used to track:

operator category
implementation path
correctness / benchmark entry points
supported benchmark modes
blocked operators and skip reasons

By default, the local CI and weekly runners discover operators from this registry.
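
As a rough illustration of how a runner might consume such a registry, the sketch below filters entries by benchmark mode and collects blocked operators with their skip reasons. The field names (`name`, `category`, `modes`, `blocked`, `skip_reason`) and the entries other than `relu` are hypothetical; the real schema of conf/operators.yaml may differ, so treat this as a sketch under those assumptions rather than the project's actual loader.

```python
# Hypothetical registry entries, shaped like what a parsed YAML file might yield.
# Field names and operator names are illustrative, not the real schema.
REGISTRY = [
    {"name": "relu", "category": "unary",
     "modes": ["kernel", "operator"], "blocked": False},
    {"name": "add", "category": "binary",
     "modes": ["kernel"], "blocked": False},
    {"name": "example_blocked_op", "category": "contraction",
     "modes": ["kernel"], "blocked": True,
     "skip_reason": "example skip reason"},
]


def select_operators(registry, mode):
    """Return (runnable operator names, {blocked name: skip reason}) for a mode."""
    runnable, skipped = [], {}
    for entry in registry:
        if entry.get("blocked"):
            # Blocked operators are never run; keep the reason for reporting.
            skipped[entry["name"]] = entry.get("skip_reason", "unspecified")
        elif mode in entry.get("modes", []):
            runnable.append(entry["name"])
    return runnable, skipped


runnable, skipped = select_operators(REGISTRY, "operator")
print("runnable:", runnable)
print("skipped:", skipped)
```

Keeping skip reasons next to the blocked entries lets a report explain why an operator was not exercised instead of silently omitting it.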

Development quality gates

Install and enable pre-commit locally:

pip install pre-commit
pre-commit install

The repository ships a .pre-commit-config.yaml with YAML, formatting, import ordering, lint, and C/C++ formatting hooks.

Benchmark modes

Both workflows accept a benchmark mode input, which takes one of two values:

kernel
operator

The default mode is kernel.

How to use

Trigger flagtensor-ci from workflow_dispatch when you want a quick automated check of the currently covered operators.
Trigger flagtensor-weekly from workflow_dispatch when you want to run the weekly-style multi-operator pipeline.
For flagtensor-weekly, you can optionally provide a custom operator list file; otherwise the workflow generates one from the discovered tests.

Run CI correctness locally:

python tools/run_flagtensor_ci.py --smoke --run-correctness --exclude-op tensor_contraction_trinary --mode kernel --results-dir ci_results_correctness

Run CI perf locally in kernel mode:

python tools/run_flagtensor_ci.py --smoke --run-perf --exclude-op tensor_contraction_trinary --mode kernel --results-dir ci_results_perf

Run CI perf locally in operator mode:

python tools/run_flagtensor_ci.py --smoke --run-perf --exclude-op tensor_contraction_trinary --mode operator --results-dir ci_results_perf_operator

Run weekly locally in kernel mode:

python tools/run_flagtensor_weekly.py --project-root . --gpus 0 --mode kernel --results-dir weekly_results_ci

Run weekly locally in operator mode:

python tools/run_flagtensor_weekly.py --project-root . --gpus 0 --mode operator --results-dir weekly_results_ci_operator

Run weekly with an explicit operator list (optional; generated from registry if omitted):

python tools/run_flagtensor_weekly.py --project-root . --op-list my_ops.txt --gpus 0 --mode kernel --results-dir weekly_results_ci
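
The CI invocations above differ only in the job type, the mode, and the results directory, so they can be assembled programmatically. The helper below is a minimal sketch that uses only the flags shown in the commands above; the function name `build_ci_command` and its parameters are hypothetical and not part of the repository.

```python
import shlex

VALID_MODES = ("kernel", "operator")
VALID_JOBS = ("correctness", "perf")


def build_ci_command(job, mode="kernel", results_dir="ci_results", exclude_op=None):
    """Build the argv list for tools/run_flagtensor_ci.py (hypothetical helper)."""
    if job not in VALID_JOBS:
        raise ValueError(f"job must be one of {VALID_JOBS}, got {job!r}")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    argv = ["python", "tools/run_flagtensor_ci.py", "--smoke", f"--run-{job}"]
    if exclude_op is not None:
        argv += ["--exclude-op", exclude_op]
    argv += ["--mode", mode, "--results-dir", results_dir]
    return argv


cmd = build_ci_command(
    "perf",
    mode="operator",
    results_dir="ci_results_perf_operator",
    exclude_op="tensor_contraction_trinary",
)
# Print a shell-ready command line; pass `cmd` to subprocess.run to execute it.
print(shlex.join(cmd))
```

Validating the mode before launching avoids starting a long benchmark run with a mistyped value.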

Features

Performance-tuned tensor primitives
Optimized Triton kernel calls
Flexible multi-backend support mechanism
Support for common tensor primitives

Quick Installation

Install Dependencies

pip install -U pip setuptools wheel
pip install torch triton pytest pyyaml matplotlib openpyxl

Install FlagTensor

git clone https://github.com/flagos-ai/FlagTensor.git
cd FlagTensor
pip install -e .

Usage Example

import torch
import flagtensor

# Create a tensor
x = torch.randn(1024, device="cuda", dtype=torch.float32)

# Apply ReLU operator
y = flagtensor.relu(x)
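
To sanity-check the result, the output can be compared against PyTorch's own ReLU. The following is a minimal sketch, assuming `flagtensor.relu` returns a tensor with the same shape and dtype as its input; the helper name `relu_matches_reference` is hypothetical, and the function returns `None` when torch, flagtensor, or a CUDA device is unavailable.

```python
import importlib.util


def relu_matches_reference(n=1024):
    """Compare flagtensor.relu with torch.relu; None if prerequisites are missing."""
    for module in ("torch", "flagtensor"):
        if importlib.util.find_spec(module) is None:
            return None
    import torch
    import flagtensor

    if not torch.cuda.is_available():
        return None

    x = torch.randn(n, device="cuda", dtype=torch.float32)
    # torch.relu serves as the reference implementation here.
    return torch.allclose(flagtensor.relu(x), torch.relu(x))


print(relu_matches_reference())
```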

This project is licensed under the Apache License, Version 2.0.
