Commit d38955c

OROCHI 2.0
- Support many more CUDA/HIP functions than Orochi 1; coverage should be almost exhaustive.
- One branch is kept per CUDA/HIP version pair (example branch name: 'release/hip5.7_cuda12.2'), so developers can switch branches depending on their environment. If you need a combination that doesn't exist, open an issue.
- Change compared to Orochi 1: you need to install the CUDA SDK corresponding to the branch you are using. For example, if you use branch release/hip5.7_cuda12.2, install CUDA SDK 12.2. CUDA is still loaded dynamically at runtime; only the SDK headers are used at compile time.
- New demo for textures.
- New demo for D3D12 interop.
- Some refactoring and improvements to OrochiUtils.
- Orochi.h can be included in kernel files to get the oro* names.
- The binding and naming between HIP and CUDA have been reworked so that they should be easier to maintain for future versions.
- Most of the Orochi/OrochiUtils API is unchanged, so updating a project from Orochi 1.0 to 2.0 should be straightforward.
1 parent 65de35c commit d38955c

125 files changed, +33087 -6214 lines changed

.gitignore

+4
@@ -21,3 +21,7 @@ dist/**
 .DS_Store
 .vs/
 build/
+
+result.xml
+UnitTest/bitcodes/*.fatbin
+Test/SimpleD3D12/cache/**

Orochi/GpuMemory.h

+48
@@ -1,3 +1,25 @@
+//
+// Copyright (c) 2021-2024 Advanced Micro Devices, Inc. All rights reserved.
+//
+// Permission is hereby granted, free of charge, to any person obtaining a copy
+// of this software and associated documentation files (the "Software"), to deal
+// in the Software without restriction, including without limitation the rights
+// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+// copies of the Software, and to permit persons to whom the Software is
+// furnished to do so, subject to the following conditions:
+//
+// The above copyright notice and this permission notice shall be included in
+// all copies or substantial portions of the Software.
+//
+// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+// THE SOFTWARE.
+//
+
 #pragma once
 
 #include <Orochi/OrochiUtils.h>
@@ -91,9 +113,35 @@ class GpuMemory final
 		*this = std::move( tmp );
 	}
 
+	/// @brief Asynchronous version of 'resize' using a given Orochi stream.
+	/// @param new_size The new memory size after the function is called.
+	/// @param copy If true, the function will copy the data to the newly created memory space as well.
+	/// @param stream The Orochi stream used for the underlying operations.
+	void resizeAsync( const size_t new_size, const bool copy = false, oroStream stream = 0 ) noexcept
+	{
+		if( new_size <= m_capacity )
+		{
+			m_size = new_size;
+			return;
+		}
+
+		GpuMemory tmp( new_size );
+
+		if( copy )
+		{
+			OrochiUtils::copyDtoDAsync( tmp.m_data, m_data, m_size, stream );
+		}
+
+		*this = std::move( tmp );
+	}
+
 	/// @brief Reset the memory space so that all bits inside are cleared to zero.
 	void reset() noexcept { OrochiUtils::memset( m_data, 0, m_size * sizeof( T ) ); }
 
+	/// @brief Asynchronous version of 'reset' using a given Orochi stream.
+	/// @param stream The Orochi stream used for the underlying operations.
+	void resetAsync( oroStream stream = 0 ) noexcept { OrochiUtils::memsetAsync( m_data, 0, m_size * sizeof( T ), stream ); }
+
 	/// @brief Copy the data from device memory to host.
 	/// @param host_ptr The host pointer.
 	/// @param host_data_size The size of the host memory which represents the number of elements.
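
Note on the hunk above: `resizeAsync` follows the same size/capacity rule as `resize`. A request that fits in the current capacity only updates the size. A larger request allocates a new buffer, optionally copies the first `m_size` elements, and replaces the old buffer. The sketch below reproduces that rule on the host so it can be run without a GPU. `HostBuffer`, its accessors, and `growKeepsPrefix` are hypothetical stand-ins invented for illustration; they are not part of Orochi, and the copy here is a plain synchronous `std::copy_n` rather than a stream-ordered device copy.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical host-side stand-in for GpuMemory<T>; not Orochi code.
template <typename T>
class HostBuffer
{
  public:
	explicit HostBuffer( std::size_t n = 0 )
		: m_data( n ? new T[n]() : nullptr ), m_size( n ), m_capacity( n ) {}

	// Same rule as the diff: shrink or fit in place, otherwise reallocate.
	void resize( std::size_t new_size, bool copy = false )
	{
		if( new_size <= m_capacity )
		{
			m_size = new_size; // storage and capacity are left untouched
			return;
		}

		HostBuffer tmp( new_size );

		if( copy )
		{
			// Only the first m_size elements of the old buffer are carried over.
			std::copy_n( m_data.get(), m_size, tmp.m_data.get() );
		}

		*this = std::move( tmp ); // old storage is released here
	}

	T* data() { return m_data.get(); }
	std::size_t size() const { return m_size; }
	std::size_t capacity() const { return m_capacity; }

  private:
	std::unique_ptr<T[]> m_data;
	std::size_t m_size;
	std::size_t m_capacity;
};

// Grows a buffer with copy = true and reports whether the old prefix survived.
inline bool growKeepsPrefix()
{
	HostBuffer<int> b( 4 );
	b.data()[0] = 7;
	b.resize( 8, true );
	return b.size() == 8 && b.data()[0] == 7;
}
```

In the real `resizeAsync` the copy is only enqueued on `stream`, and the old allocation is replaced immediately afterwards by the move assignment. The move assignment and destructor are not part of this hunk, so whether the old memory can be released before the copy completes cannot be determined from the diff alone; callers who rely on the copied data may want to synchronize the stream before using it.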

Orochi/Orochi.cpp

+3,320 -410
Large diffs are not rendered by default.

Orochi/Orochi.h

+1,199 -703
Large diffs are not rendered by default.
