Add PCIe bandwidth measurement test #3

gsitaram · 2022-10-01T00:27:12Z

Hi @Luke20000429, @joydddd, you can use this simple standalone test to measure the bandwidth achieved over PCIe and to compare transfers of:

- 1 large buffer vs multiple small buffers
- pinned memory vs pageable memory
- hipMemcpy vs hipMemcpyAsync

There is a convenient run script that you can use to sweep over various parameter values.

My conclusions are the following:

- The performance gets close to peak, and it is the same whether you transfer one large buffer of 128MB or 16 small buffers of 8MB each.
- Using pinned memory is better, even for hipMemcpy.
- hipMemcpyAsync seems to perform better even if we transfer only once (i.e., iter=1).
- Performance fluctuates when we test on the GPU in our workstation; it is more stable when testing on a GPU in a server.

YMMV, so it is best to test on your end with the cards you have access to.
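
For anyone who wants to sanity-check numbers by hand, here is a minimal host-only sketch of the bookkeeping behind such a measurement. It is not the code in this PR. The names `total_bytes`, `bandwidth_gbps`, and `time_copies` are hypothetical, and `std::memcpy` stands in for the real transfer, so the figures it produces are host memory bandwidth, not PCIe bandwidth. To measure PCIe you would replace the callable with a `hipMemcpy` or `hipMemcpyAsync` call (plus a synchronize before stopping the timer in the async case).

```cpp
#include <chrono>
#include <cstddef>
#include <cstring>
#include <vector>

// Total bytes moved by one pass over `num_buffers` buffers of `buffer_bytes` each.
// Hypothetical helper, not part of the PR.
std::size_t total_bytes(std::size_t num_buffers, std::size_t buffer_bytes) {
    return num_buffers * buffer_bytes;
}

// Bandwidth in GB/s, taking 1 GB = 1e9 bytes.
double bandwidth_gbps(std::size_t bytes, double seconds) {
    return static_cast<double>(bytes) / seconds / 1e9;
}

// Runs `iters` passes over `num_buffers` buffers, calling copy_one(buffer_index)
// for each buffer, and returns the elapsed wall-clock time in seconds.
// Assumption: copy_one blocks until the copy is done. For an asynchronous copy,
// the caller must synchronize inside copy_one or the timing is meaningless.
template <typename CopyFn>
double time_copies(CopyFn copy_one, std::size_t num_buffers, int iters) {
    auto start = std::chrono::steady_clock::now();
    for (int it = 0; it < iters; ++it) {
        for (std::size_t b = 0; b < num_buffers; ++b) {
            copy_one(b);
        }
    }
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

A call such as `time_copies([&](std::size_t) { std::memcpy(dst.data(), src.data(), src.size()); }, 16, iters)` times 16 copies per iteration. Dividing `iters * total_bytes(16, 8u << 20)` by the elapsed seconds via `bandwidth_gbps` gives a figure directly comparable to the single 128MB case, since both configurations move the same number of bytes per pass.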

ooreilly · 2022-10-17T16:51:59Z

Is this code relevant for ksw2? I don't see any dependencies on ksw2.

Add PCIe bandwidth measurement test

2aaac58
