Merge release/2.6_ck_2 to release/2.6 #2012

Closed
wants to merge 2 commits into from

Conversation

akashveramd

Because release/2.6_ck is a protected branch, I can't push new commits to the existing open PR #2007, so I had to create a new PR to add more commits.

This PR includes:
On top of the features in the open PR (#2007), it adds USE_CK_FLASH_ATTENTION as a CMake variable and fixes lint/NIT errors using lintrunner.

…eate link target. Enable USE_CK_FLASH_ATTENTION based on USE_FLASH_ATTENTION option.
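
For readers unfamiliar with the pattern, the following is a minimal sketch of how one option can be enabled based on another in CMake. It is not the code from this PR: the help strings, the default values, and the use of `add_compile_definitions` are assumptions made for illustration. Only the variable names `USE_FLASH_ATTENTION` and `USE_CK_FLASH_ATTENTION` come from the PR description and commit message.

```cmake
# Illustrative sketch only; not the actual change in this PR.
include(CMakeDependentOption)

# Assumed parent option; the default shown here is hypothetical.
option(USE_FLASH_ATTENTION "Build flash attention kernels" ON)

# USE_CK_FLASH_ATTENTION is user-settable (default ON in this sketch) only
# while USE_FLASH_ATTENTION is true; otherwise it is forced to OFF.
cmake_dependent_option(
  USE_CK_FLASH_ATTENTION "Build the CK flash attention backend" ON
  "USE_FLASH_ATTENTION" OFF)

if(USE_CK_FLASH_ATTENTION)
  # Assumption: sources test this macro with #ifdef to select the CK path.
  add_compile_definitions(USE_CK_FLASH_ATTENTION)
endif()
```

With a setup like this, configuring with `-DUSE_FLASH_ATTENTION=OFF` would also turn off `USE_CK_FLASH_ATTENTION`, so the two settings cannot disagree.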

rocm-repo-management-api bot commented Apr 1, 2025

Jenkins build for 6943678cdc3d79c37906161e88df783828f8d403 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

Detected error during Pytorch building:

[7792/7892] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/jit/runtime/register_distributed_ops.cpp.o
cc1plus: warning: command-line option ‘-Wno-duplicate-decl-specifier’ is valid for C/ObjC but not for C++
[7793/7892] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/python_dispatch.cpp.o
cc1plus: warning: command-line option ‘-Wno-duplicate-decl-specifier’ is valid for C/ObjC but not for C++
[7794/7892] Linking CXX executable bin/static_runtime_bench
FAILED: bin/static_runtime_bench 
: && /opt/cache/bin/c++ -D_GLIBCXX_USE_CXX11_ABI=1 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOXPUPTI=ON -DUSE_FBGEMM -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=range-loop-construct -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=old-style-cast -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow -DHAVE_AVX512_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION -O3 -DNDEBUG -DNDEBUG -rdynamic     -Wl,--no-as-needed caffe2/CMakeFiles/static_runtime_bench.dir/__/benchmarks/static_runtime/deep_wide_pt.cc.o caffe2/CMakeFiles/static_runtime_bench.dir/__/benchmarks/static_runtime/deep_wide_pt_bench.cc.o -o bin/static_runtime_bench -L/lib/intel64   -L/lib/intel64_win   -L/lib/win-x64 -Wl,-rpath,/lib/intel64:/lib/intel64_win:/lib/win-x64:/opt/conda/envs/py_3.10/lib:/var/lib/jenkins/pytorch/build/lib:/opt/rocm-6.3.4/lib:/opt/rocm/lib  lib/libbenchmark.a  -Wl,--no-as-needed,"/var/lib/jenkins/pytorch/build/lib/libtorch.so" -Wl,--as-needed  -Wl,--no-as-needed,"/var/lib/jenkins/pytorch/build/lib/libtorch_cpu.so" -Wl,--as-needed  lib/libprotobuf.a  /opt/conda/envs/py_3.10/lib/libmkl_intel_lp64.so  /opt/conda/envs/py_3.10/lib/libmkl_gnu_thread.so  /opt/conda/envs/py_3.10/lib/libmkl_core.so  -fopenmp  /usr/lib/x86_64-linux-gnu/libpthread.a  -lm  /usr/lib/x86_64-linux-gnu/libdl.a  -Wl,--no-as-needed,"/var/lib/jenkins/pytorch/build/lib/libtorch_hip.so" -Wl,--as-needed  lib/libc10_hip.so  lib/libc10.so  /opt/rocm-6.3.4/lib/libMIOpen.so.1.0.60304  /opt/rocm/lib/libhiprtc.so.6.3.60304  -ldl  
/opt/rocm-6.3.4/lib/libhipblas.so.2.3.60304  /opt/rocm-6.3.4/lib/libhipfft.so.0.1.60304  /opt/rocm-6.3.4/lib/libhiprand.so.1.1.60304  /opt/rocm-6.3.4/lib/librocrand.so.1.1.60304  /opt/rocm-6.3.4/lib/libhipsparse.so.1.1.0.60304  /opt/rocm-6.3.4/lib/libhipsolver.so.0.3.60304  /opt/rocm-6.3.4/lib/libhipblaslt.so.0.10.60304  /opt/rocm/lib/libamdhip64.so.6.3.60304  -lrt  -Wl,-rpath-link,/opt/rocm-6.3.4/lib && /opt/conda/envs/py_3.10/bin/cmake -E __run_co_compile --lwyu="ldd;-u;-r" --source=bin/static_runtime_bench && :
/usr/bin/ld: /var/lib/jenkins/pytorch/build/lib/libtorch_hip.so: undefined reference to `fmha_bwd(fmha_bwd_traits, fmha_bwd_args, ck_tile::stream_config const&)'
/usr/bin/ld: /var/lib/jenkins/pytorch/build/lib/libtorch_hip.so: undefined reference to `fmha_fwd(fmha_fwd_traits, fmha_fwd_args, ck_tile::stream_config const&)'
collect2: error: ld returned 1 exit status
[7795/7892] Linking CXX executable bin/MaybeOwned_test


rocm-repo-management-api bot commented Apr 1, 2025

Jenkins build for 6943678cdc3d79c37906161e88df783828f8d403 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

Detected error during Pytorch building:

[7797/7892] Linking CXX shared library lib/libbackend_with_compiler.so
Warning: Unused direct dependencies:
	/var/lib/jenkins/pytorch/build/lib/libtorch.so
	/var/lib/jenkins/pytorch/build/lib/libtorch_hip.so
[7798/7892] Linking CXX executable bin/dispatch_key_set_test
FAILED: bin/dispatch_key_set_test 
: && /opt/cache/bin/c++ -D_GLIBCXX_USE_CXX11_ABI=1 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOXPUPTI=ON -DUSE_FBGEMM -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=range-loop-construct -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=old-style-cast -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow -DHAVE_AVX512_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION -O3 -DNDEBUG -DNDEBUG -rdynamic     -Wl,--no-as-needed caffe2/CMakeFiles/dispatch_key_set_test.dir/__/aten/src/ATen/test/dispatch_key_set_test.cpp.o -o bin/dispatch_key_set_test -L/lib/intel64   -L/lib/intel64_win   -L/lib/win-x64 -Wl,-rpath,/lib/intel64:/lib/intel64_win:/lib/win-x64:/opt/conda/envs/py_3.10/lib:/var/lib/jenkins/pytorch/build/lib:/opt/rocm-6.3.4/lib:/opt/rocm/lib:  lib/libgtest_main.a  -lstdc++  -Wl,--no-as-needed,"/var/lib/jenkins/pytorch/build/lib/libtorch.so" -Wl,--as-needed  -Wl,--no-as-needed,"/var/lib/jenkins/pytorch/build/lib/libtorch_cpu.so" -Wl,--as-needed  lib/libprotobuf.a  /opt/conda/envs/py_3.10/lib/libmkl_intel_lp64.so  /opt/conda/envs/py_3.10/lib/libmkl_gnu_thread.so  /opt/conda/envs/py_3.10/lib/libmkl_core.so  -fopenmp  /usr/lib/x86_64-linux-gnu/libpthread.a  -lm  /usr/lib/x86_64-linux-gnu/libdl.a  -Wl,--no-as-needed,"/var/lib/jenkins/pytorch/build/lib/libtorch_hip.so" -Wl,--as-needed  lib/libc10_hip.so  lib/libc10.so  /opt/rocm-6.3.4/lib/libMIOpen.so.1.0.60304  /opt/rocm/lib/libhiprtc.so.6.3.60304  -ldl  /opt/rocm-6.3.4/lib/libhipblas.so.2.3.60304  /opt/rocm-6.3.4/lib/libhipfft.so.0.1.60304  
/opt/rocm-6.3.4/lib/libhiprand.so.1.1.60304  /opt/rocm-6.3.4/lib/librocrand.so.1.1.60304  /opt/rocm-6.3.4/lib/libhipsparse.so.1.1.0.60304  /opt/rocm-6.3.4/lib/libhipsolver.so.0.3.60304  /opt/rocm-6.3.4/lib/libhipblaslt.so.0.10.60304  /opt/rocm/lib/libamdhip64.so.6.3.60304  lib/libgtest.a  -Wl,-rpath-link,/opt/rocm-6.3.4/lib && /opt/conda/envs/py_3.10/bin/cmake -E __run_co_compile --lwyu="ldd;-u;-r" --source=bin/dispatch_key_set_test && :
/usr/bin/ld: /var/lib/jenkins/pytorch/build/lib/libtorch_hip.so: undefined reference to `fmha_bwd(fmha_bwd_traits, fmha_bwd_args, ck_tile::stream_config const&)'
/usr/bin/ld: /var/lib/jenkins/pytorch/build/lib/libtorch_hip.so: undefined reference to `fmha_fwd(fmha_fwd_traits, fmha_fwd_args, ck_tile::stream_config const&)'
collect2: error: ld returned 1 exit status
[7799/7892] Linking CXX executable bin/kernel_function_test


rocm-repo-management-api bot commented Apr 2, 2025

Jenkins build for 6943678cdc3d79c37906161e88df783828f8d403 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

Detected error during Pytorch building:

                 from /var/lib/jenkins/pytorch/aten/src/ATen/hip/HIPSparseDescriptors.cpp:2:
/opt/rocm-6.3.4/include/hipsparse/hipsparse.h:13781:5: note: declared here
13781 |     HIPSPARSE_ORDER_COLUMN HIPSPARSE_DEPRECATED_MSG("Please use HIPSPARSE_ORDER_COL instead")
      |     ^~~~~~~~~~~~~~~~~~~~~~
[7343/7892] Building CXX object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/Blas.cpp.o
FAILED: caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/Blas.cpp.o 
/opt/cache/bin/sccache /opt/cache/bin/c++ -DAT_PER_OPERATOR_HEADERS -DFLASHATTENTION_DISABLE_ALIBI -DFMT_HEADER_ONLY=1 -DHAVE_MALLOC_USABLE_SIZE=1 -DHAVE_MMAP=1 -DHAVE_SHM_OPEN=1 -DHAVE_SHM_UNLINK=1 -DIDEEP_USE_MKL -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DPYTORCH_LAYERNORM_FAST_RECIPROCAL -DROCM_VERSION=60304 -DTORCH_ENABLE_LLVM -DTORCH_HIP_BUILD_MAIN_LIB -DTORCH_HIP_VERSION=603 -DUSE_C10D_GLOO -DUSE_C10D_NCCL -DUSE_CK_FLASH_ATTENTION -DUSE_DISTRIBUTED -DUSE_EXTERNAL_MZCRC -DUSE_FLASH_ATTENTION -DUSE_MEM_EFF_ATTENTION -DUSE_NCCL -DUSE_PROF_API=1 -DUSE_ROCM -DUSE_RPC -DUSE_TENSORPIPE -D_FILE_OFFSET_BITS=64 -D__HIP_PLATFORM_AMD__ -D__HIP_PLATFORM_AMD__=1 -Dtorch_hip_EXPORTS -I/var/lib/jenkins/pytorch/build/aten/src -I/var/lib/jenkins/pytorch/aten/src -I/var/lib/jenkins/pytorch/build -I/var/lib/jenkins/pytorch -I/var/lib/jenkins/pytorch/cmake/../third_party/benchmark/include -I/opt/llvm/include -I/var/lib/jenkins/pytorch/third_party/onnx -I/var/lib/jenkins/pytorch/build/third_party/onnx -I/var/lib/jenkins/pytorch/nlohmann -I/opt/rocm/hcc/include -I/opt/rocm/rocblas/include -I/opt/rocm/hipsparse/include -I/opt/rocm/include/rccl -I/var/lib/jenkins/pytorch/aten/src/THH -I/var/lib/jenkins/pytorch/aten/src/ATen/hip -I/var/lib/jenkins/pytorch/aten/src/ATen/../../../third_party/composable_kernel/include -I/var/lib/jenkins/pytorch/aten/src/ATen/../../../third_party/composable_kernel/library/include -I/var/lib/jenkins/pytorch/third_party/fmt/include -I/var/lib/jenkins/pytorch/aten/src/ATen/native/transformers/hip/flash_attn/ck -I/var/lib/jenkins/pytorch/build/caffe2/aten/src -I/var/lib/jenkins/pytorch/aten/src/ATen/.. -I/var/lib/jenkins/pytorch/torch/include -I/var/lib/jenkins/pytorch/c10/hip/../.. -I/var/lib/jenkins/pytorch/c10/.. 
-I/var/lib/jenkins/pytorch/torch/csrc/api -I/var/lib/jenkins/pytorch/torch/csrc/api/include -I/var/lib/jenkins/pytorch/build/third_party/gloo/hip -isystem /opt/rocm-6.3.4/include -isystem /var/lib/jenkins/pytorch/build/third_party/gloo -isystem /var/lib/jenkins/pytorch/cmake/../third_party/gloo -isystem /var/lib/jenkins/pytorch/cmake/../third_party/tensorpipe/third_party/libuv/include -isystem /var/lib/jenkins/pytorch/cmake/../third_party/googletest/googlemock/include -isystem /var/lib/jenkins/pytorch/cmake/../third_party/googletest/googletest/include -isystem /var/lib/jenkins/pytorch/third_party/protobuf/src -isystem /opt/conda/envs/py_3.10/include -isystem /var/lib/jenkins/pytorch/third_party/XNNPACK/include -isystem /var/lib/jenkins/pytorch/third_party/ittapi/include -isystem /var/lib/jenkins/pytorch/cmake/../third_party/eigen -isystem /var/lib/jenkins/pytorch/third_party/ideep/mkl-dnn/include/oneapi/dnnl -isystem /var/lib/jenkins/pytorch/third_party/ideep/include -isystem /var/lib/jenkins/pytorch/INTERFACE -isystem /var/lib/jenkins/pytorch/third_party/nlohmann/include -isystem /opt/rocm/include -isystem /opt/rocm-6.3.4/include/hiprand -isystem /opt/rocm-6.3.4/include/rocrand -isystem /opt/rocm/magma/include -D_GLIBCXX_USE_CXX11_ABI=1 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOXPUPTI=ON -DUSE_FBGEMM -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=range-loop-construct -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=old-style-cast -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format 
-Wno-stringop-overflow -DHAVE_AVX512_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION -O3 -DNDEBUG -DNDEBUG -std=gnu++17 -fPIC -DMKL_HAS_SBGEMM -DTORCH_USE_LIBUV -DCAFFE2_USE_GLOO -Wall -Wextra -Wdeprecated -Wno-unused-parameter -Wno-missing-field-initializers -Wno-array-bounds -Wno-unknown-pragmas -Wno-strict-overflow -Wno-strict-aliasing -Wunused-function -Wunused-variable -Wunused-but-set-variable -Wno-maybe-uninitialized -fvisibility=hidden -O2 -fPIC -D__HIP_PLATFORM_AMD__=1 -DCUDA_HAS_FP16=1 -DUSE_ROCM -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -DTORCH_HIP_VERSION=603 -Wno-shift-count-negative -Wno-shift-count-overflow -Wno-duplicate-decl-specifier -DCAFFE2_USE_MIOPEN -DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_HIP -std=c++17 -DHIPBLAS_V2 -DHIPBLASLT_VEC_EXT -D_GLIBCXX_USE_CXX11_ABI=1 -DHIP_ENABLE_WARP_SYNC_BUILTINS -DHIP_VERSION=6 -DUSE_MIOPEN -MD -MT caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/Blas.cpp.o -MF caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/Blas.cpp.o.d -o caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/Blas.cpp.o -c /var/lib/jenkins/pytorch/aten/src/ATen/native/hip/Blas.cpp
cc1plus: warning: command-line option ‘-Wno-duplicate-decl-specifier’ is valid for C/ObjC but not for C++
In file included from /var/lib/jenkins/pytorch/aten/src/ATen/native/hip/Blas.cpp:15:
/var/lib/jenkins/pytorch/aten/src/ATen/hip/tunable/TunableGemm.h:25:10: fatal error: c10/util/Float8_e8m0fnu.h: No such file or directory
   25 | #include <c10/util/Float8_e8m0fnu.h>

akashveramd closed this Apr 3, 2025
akashveramd deleted the release/2.6_ck_2 branch April 3, 2025 02:14