
Add Ascend NPU CI workflow for backend tests #1072

Draft

xuedinge233 wants to merge 3 commits into linkedin:main from xuedinge233:main

Conversation

@xuedinge233 (Contributor)

Summary

This PR introduces a new GitHub Actions workflow to validate the Ascend NPU backend using a dedicated CI pipeline.

The workflow is designed to run on Ascend NPU self-hosted runners and focuses on executing backend-related tests to ensure correctness and stability of NPU-specific implementations.


What’s included

  • Ascend NPU CI workflow triggered on:
    • push / pull_request to main (scoped to Ascend backend and test changes)
    • manual trigger via workflow_dispatch
    • scheduled daily runs
  • tests job running on Ascend NPU runners with:
    • Ascend CANN runtime container
    • NPU device passthrough and environment verification
    • installation of torch_npu and triton-ascend
    • execution of the transformers-related test suite for the Ascend backend

At this stage, the workflow runs the transformers test subset only. The full make test target is intentionally left commented out and can be enabled incrementally once the CI setup is fully validated and stable.
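To make the shape of the pipeline concrete, here is a minimal sketch of what such a workflow could look like. Everything below is illustrative: the path filters, runner labels, container image, device options, and test paths are assumptions, not the actual contents of this PR; only the triggers, the pinned packages, and the overall job structure follow the description above.

```yaml
# Illustrative sketch only -- labels, image tags, devices, and paths are assumptions.
name: Ascend NPU CI

on:
  push:
    branches: [main]
    paths:
      - "src/liger_kernel/**"  # assumed scope: Ascend backend changes
      - "test/**"              # assumed scope: test changes
  pull_request:
    branches: [main]
  workflow_dispatch:           # manual trigger
  schedule:
    - cron: "0 2 * * *"        # daily run; exact time assumed

jobs:
  tests:
    runs-on: [self-hosted, ascend-npu]  # runner label assumed
    container:
      image: ascendai/cann:latest       # CANN runtime image; name/tag assumed
      # NPU device passthrough; device nodes assumed
      options: --device /dev/davinci0 --device /dev/davinci_manager
    steps:
      - uses: actions/checkout@v4
      - name: Verify NPU environment
        run: npu-smi info               # CANN device query utility
      - name: Install dependencies      # assumes uv is available in the image
        run: |
          uv pip install torch==2.6.0 --index-url https://download.pytorch.org/whl/cpu
          uv pip install torch_npu==2.6.0 triton-ascend==3.2.0rc4
      - name: Run transformers test subset
        run: python -m pytest test/transformers  # test path assumed
      # Full suite, intentionally left disabled until the pipeline is stable:
      # - name: Run full test suite
      #   run: make test
```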


Motivation

The goal of this workflow is to provide early signal and continuous validation for Ascend NPU backend changes, while keeping the initial CI scope focused and reliable.

By starting with a targeted test suite, we can:

  • reduce CI resource pressure on NPU runners,
  • iterate on stability and environment setup,
  • and gradually expand coverage in follow-up PRs.

Future work

  • Enable the full make test target once the CI pipeline has proven stable over time
  • Extend test coverage as additional NPU features are exercised

@xuedinge233 xuedinge233 marked this pull request as draft February 5, 2026 11:13
uv pip install attrs==24.2.0 numpy==1.26.4 scipy==1.13.1 decorator==5.1.1 psutil==6.0.0 pytest==9.0.2 pytest-xdist==3.6.1 pyyaml pybind11 transformers==4.57.6
uv pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cpu
uv pip install torch_npu==2.6.0
uv pip install triton-ascend==3.2.0rc4
Contributor

triton-ascend has released its stable version. I recommend we switch to it: https://gitcode.com/Ascend/triton-ascend/releases/v3.2.0

@Tcc0403 (Collaborator) Feb 7, 2026

torch_npu and triton-ascend should be installed via uv pip install -e .[dev]. Are there any considerations behind pinning specific versions here?

Liger-Kernel/setup.py

Lines 27 to 28 in effb776

elif platform == "npu":
    return ["torch_npu==2.7.1", "triton-ascend"]

@xuedinge233 (Author)

> triton-ascend has released its stable version. I recommend we switch to it: https://gitcode.com/Ascend/triton-ascend/releases/v3.2.0

After testing, triton-ascend v3.2.0 completes the run normally. I will update the workflow to use this version.
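If it helps to see the concrete change, the pin would presumably move from the release candidate to the stable tag along these lines (step structure assumed):

```yaml
- name: Install triton-ascend
  run: uv pip install triton-ascend==3.2.0  # previously 3.2.0rc4
```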

@xuedinge233 (Author)

> torch_npu and triton-ascend should be installed via uv pip install -e .[dev]. Are there any considerations behind pinning specific versions here?

Liger-Kernel/setup.py

Lines 27 to 28 in effb776

elif platform == "npu":
    return ["torch_npu==2.7.1", "triton-ascend"]

Currently, triton-ascend is built against torch_npu v2.6.0, so the versions are pinned to match. If triton-ascend is updated later, the pins will be synchronized accordingly.
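One way to guard this coupling in CI would be a quick version-alignment check after installation (a sketch; assumes torch_npu imports cleanly in the container):

```yaml
- name: Check torch / torch_npu versions
  run: |
    python -c "import torch, torch_npu; print('torch', torch.__version__, '| torch_npu', torch_npu.__version__)"
```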

Contributor

> Currently, triton-ascend is built against torch_npu v2.6.0, so the versions are pinned to match. If triton-ascend is updated later, the pins will be synchronized accordingly.

Perhaps we could consider merging this PR first: #1055

