Commit 1594a91

Update README.md
1 parent 1fd2414 commit 1594a91

1 file changed: +1 -1 lines changed

content/examples/cuda-hip/hip/04_matrix_transpose/README.md

Lines changed: 1 addition & 1 deletion
@@ -31,7 +31,7 @@ __global__ void transpose__naive_kernel(float *in, float *out, int width, int he
 out[out_index] = in[in_index];
 }
 ```
-The index `in_index` increases with `threadIdx.x`, two adjacent threads, `threadIdx.x` and `threadIdx.x+1`, access elements near each other in the gloabl memory. This ensures coalesced reads. On the other hand the writing is strided. Two adjacent threads write to location in memory far away from each other by `height`.
+The index `in_index` increases with `threadIdx.x`, two adjacent threads, `threadIdx.x` and `threadIdx.x+1`, access elements near each other in the global memory. This ensures coalesced reads. On the other hand the writing is strided. Two adjacent threads write to location in memory far away from each other by `height`.
 
 ## Transpose with shared memory
 Shared Memory (SM) can be used in order to avoid the uncoalesced writing mentioned above.

0 commit comments

Comments
 (0)