Feature
Modern x86 and Arm processors provide prefetch instructions, which hint to the CPU that memory needed in the near future should be brought into the cache.
Benefit
Improved performance when prefetches are placed appropriately ahead of the loads that need the memory.
Implementation
Since prefetches are non-semantic (they do not change the result of the program), this should be relatively easy to implement. It may need some consideration with regard to Spectre and similar attacks (which I don't know much about), and with regard to instruction reordering, so that the prefetch stays approximately the same "distance" from the actual load as in the IR.
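To illustrate what "distance" means here, the following is a minimal sketch of a software prefetch placed a fixed number of iterations ahead of the load it serves. It uses Rust's `_mm_prefetch` intrinsic on x86_64 and falls back to a no-op elsewhere. The names `prefetch_read`, `gather_sum` and the value of `PREFETCH_DISTANCE` are assumptions made up for this example; a useful distance depends on the workload and has to be measured.

```rust
/// Hypothetical distance, in loop iterations, between a prefetch and the
/// load it is meant to speed up. A real value has to be tuned by measurement.
const PREFETCH_DISTANCE: usize = 16;

/// Hint that the cache line holding `value` will be read soon (x86_64).
#[cfg(target_arch = "x86_64")]
#[inline(always)]
fn prefetch_read<T>(value: &T) {
    use std::arch::x86_64::{_mm_prefetch, _MM_HINT_T0};
    // SAFETY: a prefetch is only a hint and never faults; the pointer is
    // derived from a valid reference.
    unsafe { _mm_prefetch::<_MM_HINT_T0>(value as *const T as *const i8) }
}

/// Fallback for other targets: no prefetch is issued.
#[cfg(not(target_arch = "x86_64"))]
#[inline(always)]
fn prefetch_read<T>(_value: &T) {}

/// Sums `values[idx]` for every `idx` in `indices`. The indirect loads are
/// hard for hardware prefetchers to predict, so each iteration prefetches
/// the element that will be loaded `PREFETCH_DISTANCE` iterations later.
pub fn gather_sum(values: &[u64], indices: &[usize]) -> u64 {
    let mut sum = 0u64;
    for (i, &idx) in indices.iter().enumerate() {
        // Bounds-checked lookups keep the prefetch address valid.
        if let Some(&future_idx) = indices.get(i + PREFETCH_DISTANCE) {
            if let Some(future_value) = values.get(future_idx) {
                prefetch_read(future_value);
            }
        }
        // The actual load; removing the prefetch above would not change it.
        sum = sum.wrapping_add(values[idx]);
    }
    sum
}
```

The result of `gather_sum` is the same with or without the prefetch, which is the non-semantic property mentioned above. If a compiler moved the prefetch next to the load, the hint would arrive too late to hide any memory latency, which is why the distance matters.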
Alternatives
Calling an external function that contains just the prefetch, which would incur overhead from the call.
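As a rough sketch of this workaround, the external function could look like the following. The names `prefetch_helper` and `prefetch_impl` are hypothetical, and how generated code would import and call the helper is not shown. It only illustrates that a single prefetch instruction ends up wrapped in a full function call.

```rust
/// Issues the prefetch hint on x86_64.
#[cfg(target_arch = "x86_64")]
fn prefetch_impl(addr: *const u8) {
    use std::arch::x86_64::{_mm_prefetch, _MM_HINT_T0};
    // SAFETY: a prefetch is only a hint and does not fault on any address.
    unsafe { _mm_prefetch::<_MM_HINT_T0>(addr as *const i8) }
}

/// Fallback for other targets: does nothing.
#[cfg(not(target_arch = "x86_64"))]
fn prefetch_impl(_addr: *const u8) {}

/// Hypothetical external helper that contains just the prefetch.
/// `inline(never)` models the fact that generated code cannot inline it,
/// so every prefetch pays for a call and a return.
#[inline(never)]
pub extern "C" fn prefetch_helper(addr: *const u8) {
    prefetch_impl(addr)
}
```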