Issue: GFX1151 fails to build. Undeclared identifier 'CK_TILE_BUFFER_RESOURCE_3RD_DWORD' #111

Open
kiram9 opened this issue Dec 17, 2024 · 1 comment

Comments


kiram9 commented Dec 17, 2024

Problem Description

OS:
NAME="Ubuntu"
VERSION="24.04.1 LTS (Noble Numbat)"
6.11.0-1009-oem #9-Ubuntu SMP PREEMPT_DYNAMIC Wed Nov 27 04:51:58 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

I am trying to build flash-attention on ROCm 6.3, but the build fails with a compiler error when compiling for my GPU architecture (gfx1151).

Operating System

Ubuntu 24.04.1

CPU

GFX1151

GPU

GFX1151

ROCm Version

ROCm 6.3.0

ROCm Component

ROCm

Steps to Reproduce

Created a Python venv and activated it.

Installed ROCm using the prebuilt packages from https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/quick-start.html

Installed PyTorch:
pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm6.3/

Built flash-attention following https://rocm.blogs.amd.com/artificial-intelligence/flash-attention/README.html with these commands:
git clone --recursive https://github.com/ROCm/flash-attention.git
cd flash-attention
MAX_JOBS=$(($(nproc) - 1)) pip install -v .
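A note on the `MAX_JOBS` value: inside bash arithmetic a bare `nproc` is read as a variable name rather than run as a command, so the command form `$(nproc)` is needed to get the CPU count. The same calculation can be sketched in Python, which avoids the shell quoting question. The helper name `max_jobs` and its `reserve` parameter are made up for illustration; only `os.cpu_count()` is a real standard-library call.

```python
import os


def max_jobs(reserve: int = 1) -> int:
    """Return a parallel build job count: all CPUs minus `reserve`, never below 1."""
    cpus = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, cpus - reserve)


if __name__ == "__main__":
    # Print a value that could be used as: MAX_JOBS=<value> pip install -v .
    print(max_jobs())
```

Clamping to 1 keeps the value valid on single-core machines, where "CPUs minus one" would otherwise be 0.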

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

I get the following compiler error when building the python package:
`
building 'flash_attn_2_cuda' extension
creating build/temp.linux-x86_64-cpython-312/build
creating build/temp.linux-x86_64-cpython-312/csrc/flash_attn_ck
/opt/rocm-6.3.0/bin/hipcc -I/home/USER/flash-attention/csrc/composable_kernel/include -I/home/USER/flash-attention/csrc/composable_kernel/library/include -I/home/USER/flash-attention/csrc/composable_kernel/example/ck_tile/01_fmha -I/home/USER/HunyuanVideo/venv/lib/python3.12/site-packages/torch/include -I/home/USER/HunyuanVideo/venv/lib/python3.12/site-packages/torch/include/torch/csrc/api/include -I/home/USER/HunyuanVideo/venv/lib/python3.12/site-packages/torch/include/TH -I/home/USER/HunyuanVideo/venv/lib/python3.12/site-packages/torch/include/THC -I/home/USER/HunyuanVideo/venv/lib/python3.12/site-packages/torch/include/THH -I/opt/rocm-6.3.0/include -I/home/USER/HunyuanVideo/venv/include -I/usr/include/python3.12 -c build/fmha_bwd_api.hip -o build/temp.linux-x86_64-cpython-312/build/fmha_bwd_api.o -fPIC -D__HIP_PLATFORM_AMD__=1 -DUSE_ROCM=1 -DHIPBLAS_V2 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 --offload-arch=native -O3 -std=c++17 -DCK_TILE_FMHA_FWD_FAST_EXP2=1 -fgpu-flush-denormals-to-zero -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_USE_XDL -DUSE_PROF_API=1 -D__HIP_PLATFORM_HCC__=1 -DCK_TILE_FLOAT_TO_BFLOAT16_DEFAULT=3 -fno-offload-uniform-block -mllvm -enable-post-misched=0 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -mllvm -amdgpu-coerce-illegal-types=1 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -DTORCH_EXTENSION_NAME=flash_attn_2_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -fno-gpu-rdc
In file included from build/fmha_bwd_api.hip:6:
In file included from /home/USER/flash-attention/csrc/composable_kernel/example/ck_tile/01_fmha/fmha_bwd_hip.hpp:7:
In file included from /home/USER/flash-attention/csrc/composable_kernel/include/ck_tile/core_hip.hpp:11:
/home/USER/flash-attention/csrc/composable_kernel/include/ck_tile/core/arch/amd_buffer_addressing_hip.hpp:29:36: error: use of undeclared identifier 'CK_TILE_BUFFER_RESOURCE_3RD_DWORD'
   29 |     buffer_resource res{ptr, size, CK_TILE_BUFFER_RESOURCE_3RD_DWORD};
      |                                    ^
1 error generated when compiling for gfx1151.
failed to execute:/opt/rocm-6.3.0/lib/llvm/bin/clang++ --offload-arch=native -I/home/USER/flash-attention/csrc/composable_kernel/include -I/home/USER/flash-attention/csrc/composable_kernel/library/include -I/home/USER/flash-attention/csrc/composable_kernel/example/ck_tile/01_fmha -I/home/USER/HunyuanVideo/venv/lib/python3.12/site-packages/torch/include -I/home/USER/HunyuanVideo/venv/lib/python3.12/site-packages/torch/include/torch/csrc/api/include -I/home/USER/HunyuanVideo/venv/lib/python3.12/site-packages/torch/include/TH -I/home/USER/HunyuanVideo/venv/lib/python3.12/site-packages/torch/include/THC -I/home/USER/HunyuanVideo/venv/lib/python3.12/site-packages/torch/include/THH -I/opt/rocm-6.3.0/include -I/home/USER/HunyuanVideo/venv/include -I/usr/include/python3.12 -c -x hip build/fmha_bwd_api.hip -o "build/temp.linux-x86_64-cpython-312/build/fmha_bwd_api.o" -fPIC -D__HIP_PLATFORM_AMD__=1 -DUSE_ROCM=1 -DHIPBLAS_V2 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -O3 -std=c++17 -DCK_TILE_FMHA_FWD_FAST_EXP2=1 -fgpu-flush-denormals-to-zero -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DCK_USE_XDL -DUSE_PROF_API=1 -D__HIP_PLATFORM_HCC__=1 -DCK_TILE_FLOAT_TO_BFLOAT16_DEFAULT=3 -fno-offload-uniform-block -mllvm -enable-post-misched=0 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -mllvm -amdgpu-coerce-illegal-types=1 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -DTORCH_EXTENSION_NAME=flash_attn_2_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -fno-gpu-rdc
error: command '/opt/rocm-6.3.0/bin/hipcc' failed with exit code 1
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
full command: /home/USER/HunyuanVideo/venv/bin/python3 -u -c '
exec(compile('"'"''"'"''"'"'
# This is <pip-setuptools-caller> -- a caller that pip uses to run setup.py
#
# - It imports setuptools before invoking setup.py, to enable projects that directly
#   import from distutils.core to work with newer packaging standards.
# - It provides a clear error message when setuptools is not installed.
# - It sets sys.argv[0] to the underlying setup.py, when invoking setup.py so
#   setuptools doesn'"'"'t think the script is -c. This avoids the following warning:
#     manifest_maker: standard file '"'"'-c'"'"' not found".
# - It generates a shim setup.py, for handling setup.cfg-only projects.
import os, sys, tokenize

try:
    import setuptools
except ImportError as error:
    print(
        "ERROR: Can not execute setup.py since setuptools is not available in "
        "the build environment.",
        file=sys.stderr,
    )
    sys.exit(1)

__file__ = %r
sys.argv[0] = __file__

if os.path.exists(__file__):
    filename = __file__
    with tokenize.open(__file__) as f:
        setup_py_code = f.read()
else:
    filename = "<auto-generated setuptools caller>"
    setup_py_code = "from setuptools import setup; setup()"

exec(compile(setup_py_code, filename, "exec"))
'"'"''"'"''"'"' % ('"'"'/home/USER/flash-attention/setup.py'"'"',), "<pip-setuptools-caller>", "exec"))' bdist_wheel -d /tmp/pip-wheel-cl0_tkkn
cwd: /home/USER/flash-attention/
Building wheel for flash_attn (setup.py) ... error
ERROR: Failed building wheel for flash_attn

`
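The pip output above is long, and only two lines identify the failure. As a rough sketch, the undeclared identifier and the target architecture can be pulled out of such a log with two regular expressions. The function name `summarize_build_error` and the `SAMPLE_LOG` excerpt are illustrative; the patterns are written against the message wording in this particular log and may not match other compiler versions.

```python
import re

# Excerpt of the two relevant lines from the log in this report.
SAMPLE_LOG = """\
amd_buffer_addressing_hip.hpp:29:36: error: use of undeclared identifier 'CK_TILE_BUFFER_RESOURCE_3RD_DWORD'
1 error generated when compiling for gfx1151.
"""


def summarize_build_error(log: str):
    """Return (undeclared_identifier, gpu_arch); either may be None if not found."""
    ident = re.search(r"use of undeclared identifier '([^']+)'", log)
    arch = re.search(r"errors? generated when compiling for (gfx\w+)", log)
    return (
        ident.group(1) if ident else None,
        arch.group(1) if arch else None,
    )


if __name__ == "__main__":
    print(summarize_build_error(SAMPLE_LOG))
```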

kiram9 changed the title from "GFX1151 fails to build:" to "GFX1151 fails to build: undeclared identifier 'CK_TILE_BUFFER_RESOURCE_3RD_DWORD'" on Dec 17, 2024
kiram9 changed the title from "GFX1151 fails to build: undeclared identifier 'CK_TILE_BUFFER_RESOURCE_3RD_DWORD'" to "Issue: GFX1151 fails to build. Undeclared identifier 'CK_TILE_BUFFER_RESOURCE_3RD_DWORD'" on Dec 17, 2024
zichguan-amd commented

Hi @kiram9, the CK backend supports MI cards only. Have you tried the Triton backend? See https://github.com/ROCm/flash-attention?tab=readme-ov-file#getting-started
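A minimal sketch of that suggestion, choosing the backend from the GPU architecture before starting the build. Several things here are assumptions to verify against the linked README: the environment variable `FLASH_ATTENTION_TRITON_AMD_ENABLE` is the Triton switch as recalled from that README, the set `CK_ARCHS` (gfx90a, gfx942) is an illustrative guess at what "MI cards" covers, and `build_env` is a hypothetical helper, not part of flash-attention.

```python
import os

# Assumption: architectures served by the CK backend ("MI cards").
# Check the repository README for the authoritative list.
CK_ARCHS = {"gfx90a", "gfx942"}

# Assumption: name of the Triton backend switch, as recalled from the README.
TRITON_FLAG = "FLASH_ATTENTION_TRITON_AMD_ENABLE"


def build_env(arch: str, base_env=None) -> dict:
    """Return a copy of the environment with the Triton switch set for non-CK archs."""
    env = dict(os.environ if base_env is None else base_env)
    if arch not in CK_ARCHS:
        env[TRITON_FLAG] = "TRUE"
    return env


if __name__ == "__main__":
    env = build_env("gfx1151")
    print(TRITON_FLAG, "=", env.get(TRITON_FLAG))
```

The returned dictionary could then be passed as `env=` to `subprocess.run` when invoking the install command, so the switch applies only to that build.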
