You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: content/examples/cuda-hip/README.md
+1Lines changed: 1 addition & 0 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -34,3 +34,4 @@ Here are some suggestions for playing around:
34
34
* implement a code using 1 GPU and do a reduction on a vector
35
35
* based on the [CUDA blog streams](https://developer.nvidia.com/blog/how-overlap-data-transfers-cuda-cc/) tray to implement a code doing vector additions using streams ovelap data transfers and computations.
36
36
* based on the [CUDA blog reduction](https://developer.download.nvidia.com/assets/cuda/files/reduction.pdf) try to implement a code doing a sum reduction.
37
+
* check on LUMI the memory bandwidth in the [memory transpose](https://github.com/ENCCS/gpu-programming/tree/main/content/examples/cuda-hip/hip/04_matrix_transpose) example.
0 commit comments