Issue: GFX1151 fails to build. Undeclared identifier 'CK_TILE_BUFFER_RESOURCE_3RD_DWORD'
kiram9 changed the title on Dec 17, 2024 (previously "GFX1151 fails to build: undeclared identifier 'CK_TILE_BUFFER_RESOURCE_3RD_DWORD'").
Problem Description
OS:
NAME="Ubuntu"
VERSION="24.04.1 LTS (Noble Numbat)"
6.11.0-1009-oem #9-Ubuntu SMP PREEMPT_DYNAMIC Wed Nov 27 04:51:58 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
I am trying to build flash-attention on ROCm 6.3, but the build fails with a compiler error when compiling for my GPU architecture (gfx1151).
Operating System
Ubuntu 24.04.1
CPU
GFX1151
GPU
GFX1151
ROCm Version
ROCm 6.3.0
ROCm Component
ROCm
Steps to Reproduce
Created a Python venv and activated it.
I have installed:
ROCm, using the prebuilt packages from https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/quick-start.html
PyTorch:
pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm6.3/
flash-attention, using the commands from https://rocm.blogs.amd.com/artificial-intelligence/flash-attention/README.html:
git clone --recursive https://github.com/ROCm/flash-attention.git
cd flash-attention
MAX_JOBS=$((`nproc` - 1)) pip install -v .
(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support
No response
Additional Information
I get the following compiler error when building the Python package:
`
building 'flash_attn_2_cuda' extension
creating build/temp.linux-x86_64-cpython-312/build
creating build/temp.linux-x86_64-cpython-312/csrc/flash_attn_ck
/opt/rocm-6.3.0/bin/hipcc -I/home/USER/flash-attention/csrc/composable_kernel/include -I/home/USER/flash-attention/csrc/composable_kernel/library/include -I/home/USER/flash-attention/csrc/composable_kernel/example/ck_tile/01_fmha -I/home/USER/HunyuanVideo/venv/lib/python3.12/site-packages/torch/include -I/home/USER/HunyuanVideo/venv/lib/python3.12/site-packages/torch/include/torch/csrc/api/include -I/home/USER/HunyuanVideo/venv/lib/python3.12/site-packages/torch/include/TH -I/home/USER/HunyuanVideo/venv/lib/python3.12/site-packages/torch/include/THC -I/home/USER/HunyuanVideo/venv/lib/python3.12/site-packages/torch/include/THH -I/opt/rocm-6.3.0/include -I/home/USER/HunyuanVideo/venv/include -I/usr/include/python3.12 -c build/fmha_bwd_api.hip -o build/temp.linux-x86_64-cpython-312/build/fmha_bwd_api.o -fPIC -D__HIP_PLATFORM_AMD__=1 -DUSE_ROCM=1 -DHIPBLAS_V2 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 --offload-arch=native -O3 -std=c++17 -DCK_TILE_FMHA_FWD_FAST_EXP2=1 -fgpu-flush-denormals-to-zero -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_USE_XDL -DUSE_PROF_API=1 -D__HIP_PLATFORM_HCC__=1 -DCK_TILE_FLOAT_TO_BFLOAT16_DEFAULT=3 -fno-offload-uniform-block -mllvm -enable-post-misched=0 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -mllvm -amdgpu-coerce-illegal-types=1 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -DTORCH_EXTENSION_NAME=flash_attn_2_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -fno-gpu-rdc
In file included from build/fmha_bwd_api.hip:6:
In file included from /home/USER/flash-attention/csrc/composable_kernel/example/ck_tile/01_fmha/fmha_bwd_hip.hpp:7:
In file included from /home/USER/flash-attention/csrc/composable_kernel/include/ck_tile/core_hip.hpp:11:
/home/USER/flash-attention/csrc/composable_kernel/include/ck_tile/core/arch/amd_buffer_addressing_hip.hpp:29:36: error: use of undeclared identifier 'CK_TILE_BUFFER_RESOURCE_3RD_DWORD'
29 | buffer_resource res{ptr, size, CK_TILE_BUFFER_RESOURCE_3RD_DWORD};
| ^
1 error generated when compiling for gfx1151.
failed to execute:/opt/rocm-6.3.0/lib/llvm/bin/clang++ --offload-arch=native -I/home/USER/flash-attention/csrc/composable_kernel/include -I/home/USER/flash-attention/csrc/composable_kernel/library/include -I/home/USER/flash-attention/csrc/composable_kernel/example/ck_tile/01_fmha -I/home/USER/HunyuanVideo/venv/lib/python3.12/site-packages/torch/include -I/home/USER/HunyuanVideo/venv/lib/python3.12/site-packages/torch/include/torch/csrc/api/include -I/home/USER/HunyuanVideo/venv/lib/python3.12/site-packages/torch/include/TH -I/home/USER/HunyuanVideo/venv/lib/python3.12/site-packages/torch/include/THC -I/home/USER/HunyuanVideo/venv/lib/python3.12/site-packages/torch/include/THH -I/opt/rocm-6.3.0/include -I/home/USER/HunyuanVideo/venv/include -I/usr/include/python3.12 -c -x hip build/fmha_bwd_api.hip -o "build/temp.linux-x86_64-cpython-312/build/fmha_bwd_api.o" -fPIC -D__HIP_PLATFORM_AMD=1 -DUSE_ROCM=1 -DHIPBLAS_V2 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS_=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -O3 -std=c++17 -DCK_TILE_FMHA_FWD_FAST_EXP2=1 -fgpu-flush-denormals-to-zero -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_USE_XDL -DUSE_PROF_API=1 -D__HIP_PLATFORM_HCC__=1 -DCK_TILE_FLOAT_TO_BFLOAT16_DEFAULT=3 -fno-offload-uniform-block -mllvm -enable-post-misched=0 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -mllvm -amdgpu-coerce-illegal-types=1 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -DTORCH_EXTENSION_NAME=flash_attn_2_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -fno-gpu-rdc
error: command '/opt/rocm-6.3.0/bin/hipcc' failed with exit code 1
error: subprocess-exited-with-error
× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
full command: /home/USER/HunyuanVideo/venv/bin/python3 -u -c '
exec(compile('"'"''"'"''"'"'
This is -- a caller that pip uses to run setup.py
- It imports setuptools before invoking setup.py, to enable projects that directly
  import from distutils.core to work with newer packaging standards.
- It provides a clear error message when setuptools is not installed.
- It sets sys.argv[0] to the underlying setup.py, when invoking setup.py so
  setuptools doesn'"'"'t think the script is -c. This avoids the following warning:
  manifest_maker: standard file '"'"'-c'"'"' not found".
- It generates a shim setup.py, for handling setup.cfg-only projects.
import os, sys, tokenize
try:
    import setuptools
except ImportError as error:
    print(
        "ERROR: Can not execute setup.py since setuptools is not available in "
        "the build environment.",
        file=sys.stderr,
    )
    sys.exit(1)
file = %r
sys.argv[0] = file
if os.path.exists(file):
    filename = file
    with tokenize.open(file) as f:
        setup_py_code = f.read()
else:
    filename = ""
    setup_py_code = "from setuptools import setup; setup()"
exec(compile(setup_py_code, filename, "exec"))
'"'"''"'"''"'"' % ('"'"'/home/USER/flash-attention/setup.py'"'"',), "", "exec"))' bdist_wheel -d /tmp/pip-wheel-cl0_tkkn
cwd: /home/USER/flash-attention/
Building wheel for flash_attn (setup.py) ... error
ERROR: Failed building wheel for flash_attn
`
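Most of the log above is command-line noise. The two lines that matter for triage are the undeclared-identifier error and the "compiling for" target line. As a rough sketch, the snippet below pulls both out of a saved build log. `summarize_build_log` is a hypothetical helper written for this report, not part of flash-attention, Composable Kernel, or ROCm, and its patterns assume the clang/hipcc diagnostic wording shown in the log above.

```python
import re

# Assumption: diagnostics are worded as in the log above
# ("error: use of undeclared identifier '...'" and
#  "N error(s) generated when compiling for <target>.").
UNDECLARED_RE = re.compile(r"error: use of undeclared identifier '([^']+)'")
TARGET_RE = re.compile(r"errors? generated when compiling for (\w+)")


def summarize_build_log(text):
    """Return undeclared identifiers and failing GPU targets found in a build log."""
    identifiers = sorted(set(UNDECLARED_RE.findall(text)))
    targets = sorted(set(TARGET_RE.findall(text)))
    return {"identifiers": identifiers, "targets": targets}


if __name__ == "__main__":
    # Two lines copied from the log in this report.
    sample = (
        "amd_buffer_addressing_hip.hpp:29:36: error: use of undeclared "
        "identifier 'CK_TILE_BUFFER_RESOURCE_3RD_DWORD'\n"
        "1 error generated when compiling for gfx1151.\n"
    )
    print(summarize_build_log(sample))
```

To use it on a full build, save the output of the `pip install -v .` command to a file and pass the file contents to `summarize_build_log`.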