Commit f60ea04

Chore: update versions for 0.3 release
1 parent f49291a commit f60ea04

16 files changed, +66 -66 lines changed

crates/blastoff/Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ repository = "https://github.com/Rust-GPU/Rust-CUDA"
 [dependencies]
 bitflags = "1.3.2"
 cublas_sys = { version = "0.1", path = "../cublas_sys" }
-cust = { version = "0.2", path = "../cust", features = ["impl_num_complex"] }
+cust = { version = "0.3", path = "../cust", features = ["impl_num_complex"] }
 num-complex = "0.4.0"

 [package.metadata.docs.rs]
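
Why every dependent manifest has to change along with the `cust` bump: under Cargo's default (caret) requirement rules, a requirement such as `"0.2"` accepts `0.2.x` but not `0.3.0`. The sketch below illustrates that rule using only the standard library. It is a simplified model, not Cargo's implementation: the names `parse` and `caret_matches` are hypothetical, and pre-release and build metadata are ignored.

```rust
/// Parse "MAJOR[.MINOR[.PATCH]]" into one to three numeric components.
/// Hypothetical helper for this sketch; not part of Cargo.
fn parse(s: &str) -> Option<Vec<u64>> {
    let parts: Option<Vec<u64>> = s.split('.').map(|p| p.parse().ok()).collect();
    parts.filter(|p| (1..=3).contains(&p.len()))
}

/// Returns Some(true) if `version` (a full MAJOR.MINOR.PATCH) satisfies the
/// caret requirement `req`, Some(false) if not, and None on malformed input.
fn caret_matches(req: &str, version: &str) -> Option<bool> {
    let r = parse(req)?;
    let v = parse(version)?;
    if v.len() != 3 {
        return None;
    }
    let v = [v[0], v[1], v[2]];

    // Lower bound: the requirement padded with zeros.
    let mut lower = [0u64; 3];
    lower[..r.len()].copy_from_slice(&r);

    // Upper bound: bump the leftmost non-zero component that was written;
    // if everything written is zero, bump the last written component.
    let pos = r.iter().position(|&c| c != 0).unwrap_or(r.len() - 1);
    let mut upper = [0u64; 3];
    upper[..pos].copy_from_slice(&lower[..pos]);
    upper[pos] = lower[pos] + 1;

    // Arrays compare lexicographically, which matches version ordering here.
    Some(lower <= v && v < upper)
}
```

With this model, `caret_matches("0.2", "0.3.0")` is `Some(false)` while `caret_matches("0.3", "0.3.0")` is `Some(true)`, which is why the requirement strings in this commit move from `0.2` to `0.3`.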

crates/cuda_builder/Cargo.toml

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
 [package]
 name = "cuda_builder"
-version = "0.2.0"
+version = "0.3.0"
 edition = "2021"
 authors = ["Riccardo D'Ambrosio <[email protected]>", "The rust-gpu Authors"]
 license = "MIT OR Apache-2.0"
@@ -9,7 +9,7 @@ repository = "https://github.com/Rust-GPU/Rust-CUDA"
 readme = "../../README.md"

 [dependencies]
-rustc_codegen_nvvm = { version = "0.2", path = "../rustc_codegen_nvvm" }
+rustc_codegen_nvvm = { version = "0.3", path = "../rustc_codegen_nvvm" }
 nvvm = { path = "../nvvm", version = "0.1" }
 serde = { version = "1.0.130", features = ["derive"] }
 serde_json = "1.0.68"

crates/cuda_std/CHANGELOG.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@

 Notable changes to this project will be documented in this file.

-## Unreleased
+## 0.2.2 - 2/7/22

 - Thread/Block/Grid index/dim intrinsics now hint to llvm that their range is in some bound declared by CUDA. Hopefully allowing for more optimizations.
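
To illustrate why a range hint on an index can help the optimizer, here is a CPU-side analogy in plain Rust. It is not how `cuda_std` implements the hint (the changelog entry only says the intrinsics hint their range to LLVM); instead it uses `std::hint::unreachable_unchecked` to state the same kind of bound. The names `thread_idx_x` and `load` are hypothetical, and the bound of 1024 threads along x is an assumption based on current CUDA block limits.

```rust
use std::hint::unreachable_unchecked;

/// Assumed upper bound on the x thread index within a block.
const MAX_THREADS_X: u32 = 1024;

/// Hypothetical stand-in for a thread-index intrinsic; `raw` models the
/// value read from hardware.
///
/// # Safety
/// `raw` must be less than `MAX_THREADS_X`.
unsafe fn thread_idx_x(raw: u32) -> u32 {
    if raw >= MAX_THREADS_X {
        // Promise to the optimizer that this branch is never taken, so
        // callers can assume the returned value is below 1024.
        unsafe { unreachable_unchecked() }
    }
    raw
}

/// With the bound known, the optimizer is free to drop the bounds check
/// on the array access below.
///
/// # Safety
/// Same contract as `thread_idx_x`.
unsafe fn load(table: &[f32; 1024], raw: u32) -> f32 {
    let idx = unsafe { thread_idx_x(raw) };
    table[idx as usize]
}
```

Whether a given bounds check is actually removed depends on the optimizer and build settings. The sketch only shows how a declared range makes that removal possible.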

crates/cudnn/Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -6,4 +6,4 @@ version = "0.1.0"

 [dependencies]
 bitflags = "1.3.2"
-cust = {version = "0.2.2", path = "../cust"}
+cust = {version = "0.3.0", path = "../cust"}

crates/cust/CHANGELOG.md

Lines changed: 46 additions & 46 deletions
@@ -2,7 +2,7 @@

 Notable changes to this project will be documented in this file.

-## [Unreleased]
+## 0.3.0 - 2/7/22

 ### TLDR

@@ -31,62 +31,62 @@ pull in `cust_core` in GPU crates for deriving `DeviceCopy` without cfg shenanig

 ### Removed

-- Deleted `DeviceBox::wrap`, use `DeviceBox::from_raw`.
-- Deleted `DeviceSlice::as_ptr` and `DeviceSlice::as_mut_ptr`. Use `DeviceSlice::as_device_ptr` then `DevicePointer::as_(mut)_ptr`.
-- Deleted `DeviceSlice::chunks` and consequently `DeviceChunks`.
-- Deleted `DeviceSlice::chunks_mut` and consequently `DeviceChunksMut`.
-- Deleted `DeviceSlice::from_slice` and `DeviceSlice::from_slice_mut` because it was unsound.
-- Deleted `DevicePointer::as_raw_mut` (use `DevicePointer::as_mut_ptr`).
-- Deleted `DevicePointer::wrap` (use `DevicePointer::from_raw`).
+- `DeviceBox::wrap`, use `DeviceBox::from_raw`.
+- `DeviceSlice::as_ptr` and `DeviceSlice::as_mut_ptr`. Use `DeviceSlice::as_device_ptr` then `DevicePointer::as_(mut)_ptr`.
+- `DeviceSlice::chunks` and consequently `DeviceChunks`.
+- `DeviceSlice::chunks_mut` and consequently `DeviceChunksMut`.
+- `DeviceSlice::from_slice` and `DeviceSlice::from_slice_mut` because it was unsound.
+- `DevicePointer::as_raw_mut` (use `DevicePointer::as_mut_ptr`).
+- `DevicePointer::wrap` (use `DevicePointer::from_raw`).
 - `DeviceSlice` no longer implements `Index` and `IndexMut`, switching away from `[T]` made this impossible to implement.
 Instead you can now use `DeviceSlice::index` which behaves the same.
 - `vek` is no longer re-exported.

 ### Deprecated

-- Deprecated `Module::from_str`, use `Module::from_ptx` and pass `&[]` for options.
-- Deprecated `Module::load_from_string`, use `Module::from_ptx_cstr`.
+- `Module::from_str`, use `Module::from_ptx` and pass `&[]` for options.
+- `Module::load_from_string`, use `Module::from_ptx_cstr`.

 ### Added

-- Added `cust::memory::LockedBox`, same as `LockedBuffer` except for single elements.
-- Added `cust::memory::cuda_malloc_async`.
-- Added `cust::memory::cuda_free_async`.
-- Added `impl AsyncCopyDestination<LockedBox<T>> for DeviceBox<T>` for async HtoD/DtoH memcpy.
-- Added `DeviceBox::new_async`.
-- Added `DeviceBox::drop_async`.
-- Added `DeviceBox::zeroed_async`.
-- Added `DeviceBox::uninitialized_async`.
-- Added `DeviceBuffer::uninitialized_async`.
-- Added `DeviceBuffer::drop_async`.
-- Added `DeviceBuffer::zeroed`.
-- Added `DeviceBuffer::zeroed_async`.
-- Added `DeviceBuffer::cast`.
-- Added `DeviceBuffer::try_cast`.
-- Added `DeviceSlice::set_8` and `DeviceSlice::set_8_async`.
-- Added `DeviceSlice::set_16` and `DeviceSlice::set_16_async`.
-- Added `DeviceSlice::set_32` and `DeviceSlice::set_32_async`.
-- Added `DeviceSlice::set_zero` and `DeviceSlice::set_zero_async`.
-- Added the `bytemuck` feature which is enabled by default.
-- Added mint integration behind `impl_mint`.
-- Added half integration behind `impl_half`.
-- Added glam integration behind `impl_glam`.
-- Added experimental linux external memory import APIs through `cust::external::ExternalMemory`.
-- Added `DeviceBuffer::as_slice`.
-- Added `DeviceVariable`, a simple wrapper around `DeviceBox<T>` and `T` which allows easy management of a CPU and GPU version of a type.
-- Added `DeviceMemory`, a trait describing any region of GPU memory that can be described with a pointer + a length.
-- Added `memcpy_htod`, a wrapper around `cuMemcpyHtoD_v2`.
-- Added `mem_get_info` to query the amount of free and total memory.
-- Added `DevicePointer::as_ptr` and `DevicePointer::as_mut_ptr` for `*const T` and `*mut T`.
-- Added `DevicePointer::from_raw` for `CUdeviceptr -> DevicePointer<T>` with a safe function.
-- Added `DevicePointer::cast`.
-- Added dependency on `cust_core` for `DeviceCopy`.
-- Added `ModuleJitOption`, `JitFallback`, `JitTarget`, and `OptLevel` for specifying options when loading a module. Note that
+- `cust::memory::LockedBox`, same as `LockedBuffer` except for single elements.
+- `cust::memory::cuda_malloc_async`.
+- `cust::memory::cuda_free_async`.
+- `impl AsyncCopyDestination<LockedBox<T>> for DeviceBox<T>` for async HtoD/DtoH memcpy.
+- `DeviceBox::new_async`.
+- `DeviceBox::drop_async`.
+- `DeviceBox::zeroed_async`.
+- `DeviceBox::uninitialized_async`.
+- `DeviceBuffer::uninitialized_async`.
+- `DeviceBuffer::drop_async`.
+- `DeviceBuffer::zeroed`.
+- `DeviceBuffer::zeroed_async`.
+- `DeviceBuffer::cast`.
+- `DeviceBuffer::try_cast`.
+- `DeviceSlice::set_8` and `DeviceSlice::set_8_async`.
+- `DeviceSlice::set_16` and `DeviceSlice::set_16_async`.
+- `DeviceSlice::set_32` and `DeviceSlice::set_32_async`.
+- `DeviceSlice::set_zero` and `DeviceSlice::set_zero_async`.
+- the `bytemuck` feature which is enabled by default.
+- mint integration behind `impl_mint`.
+- half integration behind `impl_half`.
+- glam integration behind `impl_glam`.
+- experimental linux external memory import APIs through `cust::external::ExternalMemory`.
+- `DeviceBuffer::as_slice`.
+- `DeviceVariable`, a simple wrapper around `DeviceBox<T>` and `T` which allows easy management of a CPU and GPU version of a type.
+- `DeviceMemory`, a trait describing any region of GPU memory that can be described with a pointer + a length.
+- `memcpy_htod`, a wrapper around `cuMemcpyHtoD_v2`.
+- `mem_get_info` to query the amount of free and total memory.
+- `DevicePointer::as_ptr` and `DevicePointer::as_mut_ptr` for `*const T` and `*mut T`.
+- `DevicePointer::from_raw` for `CUdeviceptr -> DevicePointer<T>` with a safe function.
+- `DevicePointer::cast`.
+- dependency on `cust_core` for `DeviceCopy`.
+- `ModuleJitOption`, `JitFallback`, `JitTarget`, and `OptLevel` for specifying options when loading a module. Note that
 `ModuleJitOption::MaxRegisters` does not seem to work currently, but NVIDIA is looking into it.
 You can achieve the same goal by compiling the ptx to cubin using nvcc then loading that: `nvcc --cubin foo.ptx -maxrregcount=REGS`
-- Added `Module::from_fatbin`.
-- Added `Module::from_cubin`.
-- Added `Module::from_ptx` and `Module::from_ptx_cstr`.
+- `Module::from_fatbin`.
+- `Module::from_cubin`.
+- `Module::from_ptx` and `Module::from_ptx_cstr`.
 - `Stream`, `Module`, `Linker`, `Function`, `Event`, `UnifiedBox`, `ArrayObject`, `LockedBuffer`, `LockedBox`, `DeviceSlice`, `DeviceBuffer`, and `DeviceBox` all now impl `Send` and `Sync`, this makes
 it much easier to write multigpu code. The CUDA API is fully thread-safe except for graph objects.
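
The `MaxRegisters` workaround in the changelog above (`nvcc --cubin foo.ptx -maxrregcount=REGS`) could be scripted from a build step roughly as follows. This is a minimal sketch using only `std::process::Command`. The function name `nvcc_cubin_command` is hypothetical, it assumes `nvcc` is on `PATH`, and it only builds the command without running it.

```rust
use std::path::Path;
use std::process::Command;

/// Build (but do not run) the nvcc invocation quoted in the changelog:
/// `nvcc --cubin <ptx> -maxrregcount=<max_registers>`.
/// Hypothetical helper; the caller decides when to execute it.
fn nvcc_cubin_command(ptx: &Path, max_registers: u32) -> Command {
    let mut cmd = Command::new("nvcc");
    cmd.arg("--cubin")
        .arg(ptx)
        .arg(format!("-maxrregcount={max_registers}"));
    cmd
}
```

A caller would run the returned command with `.status()` and then load the resulting cubin, presumably through the `Module::from_cubin` entry listed above. Its exact signature is not shown in this diff.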

crates/cust/Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [package]
 name = "cust"
-version = "0.2.2"
+version = "0.3.0"
 # Big thanks to the original author of rustacuda <3
 authors = [
 "Riccardo D'Ambrosio <[email protected]>",

crates/cust_derive/Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [package]
 name = "cust_derive"
-version = "0.1.0"
+version = "0.1.1"
 authors = ["Brook Heisler <[email protected]>", "Riccardo D'Ambrosio <[email protected]>"]
 edition = "2018"
 license = "MIT OR Apache-2.0"

crates/gpu_rand/Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [package]
 name = "gpu_rand"
-version = "0.1.2"
+version = "0.1.3"
 authors = ["The Rand Project Developers", "The Rust CUDA Project Developers"]
 license = "MIT OR Apache-2.0"
 edition = "2021"

crates/optix/Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -16,7 +16,7 @@ impl_glam=["cust/impl_glam", "glam"]
 impl_half=["cust/impl_half", "half"]

 [dependencies]
-cust = { version = "0.2", path = "../cust", features=["impl_mint"] }
+cust = { version = "0.3", path = "../cust", features=["impl_mint"] }
 cust_raw = { version = "0.11.2", path = "../cust_raw" }
 cfg-if = "1.0.0"
 bitflags = "1.3.2"

crates/optix/examples/ex02_pipeline/Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -13,4 +13,4 @@ device = { path = "./device" }

 [build-dependencies]
 find_cuda_helper = { version = "0.2", path = "../../../find_cuda_helper" }
-cuda_builder = { version = "0.2", path = "../../../cuda_builder" }
+cuda_builder = { version = "0.3", path = "../../../cuda_builder" }

crates/optix/examples/ex04_mesh/Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -16,4 +16,4 @@ glam = { version = "0.20", features=["cuda"] }

 [build-dependencies]
 find_cuda_helper = { version = "0.2", path = "../../../find_cuda_helper" }
-cuda_builder = { version = "0.2", path = "../../../cuda_builder" }
+cuda_builder = { version = "0.3", path = "../../../cuda_builder" }

crates/rustc_codegen_nvvm/Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [package]
 name = "rustc_codegen_nvvm"
-version = "0.2.3"
+version = "0.3.0"
 authors = [
 "Riccardo D'Ambrosio <[email protected]>",
 "The Rust Project Developers",

examples/cuda/cpu/add/Cargo.toml

Lines changed: 2 additions & 2 deletions
@@ -4,8 +4,8 @@ version = "0.1.0"
 edition = "2021"

 [dependencies]
-cust = { version = "0.2", path = "../../../../crates/cust" }
+cust = { version = "0.3", path = "../../../../crates/cust" }
 nanorand = "0.6.1"

 [build-dependencies]
-cuda_builder = { version = "0.2", path = "../../../../crates/cuda_builder" }
+cuda_builder = { version = "0.3", path = "../../../../crates/cuda_builder" }

examples/cuda/cpu/path_tracer/Cargo.toml

Lines changed: 2 additions & 2 deletions
@@ -6,7 +6,7 @@ edition = "2018"
 [dependencies]
 vek = { version = "0.15", features = ["bytemuck", "mint"] }
 bytemuck = { version = "1.7.2", features = ["derive"] }
-cust = { version = "0.2", path = "../../../../crates/cust", features = ["impl_vek"] }
+cust = { version = "0.3", path = "../../../../crates/cust", features = ["impl_vek"] }
 image = "0.23.14"
 path_tracer_gpu = { path = "../../gpu/path_tracer_gpu" }
 gpu_rand = { version = "0.1", path = "../../../../crates/gpu_rand" }
@@ -21,4 +21,4 @@ sysinfo = "0.20.5"
 anyhow = "1.0.53"

 [build-dependencies]
-cuda_builder = { version = "0.2", path = "../../../../crates/cuda_builder" }
+cuda_builder = { version = "0.3", path = "../../../../crates/cuda_builder" }

examples/optix/denoiser/Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -6,6 +6,6 @@ edition = "2021"
 [dependencies]
 optix = { version = "0.1", path = "../../../crates/optix" }
 structopt = "0.3"
-cust = { version = "0.2", path = "../../../crates/cust", features = ["impl_vek"] }
+cust = { version = "0.3", path = "../../../crates/cust", features = ["impl_vek"] }
 image = "0.23.14"
 vek = { version = "0.15.1" }

guide/src/features.md

Lines changed: 3 additions & 3 deletions
@@ -46,11 +46,11 @@ around to adding it yet.
 | ------------ | ------------- | ----- |
 | CUDA Runtime API || The CUDA Runtime API is for CUDA C++, we use the driver API |
 | CUDA Driver API | 🟨 | Most functions are implemented, but there is still a lot left to wrap because it is gigantic |
-| cuBLAS ||
+| cuBLAS || In-progress |
 | cuFFT ||
 | cuSOLVER ||
 | cuRAND || cuRAND only works with the runtime API, we have our own general purpose GPU rand library called `gpu_rand` |
-| cuDNN ||
+| cuDNN || In-progress |
 | cuSPARSE ||
 | AmgX ||
 | cuTENSOR ||
@@ -102,7 +102,7 @@ on things used by the wide majority of users.
 | SIMD Video Instructions ||
 | Cooperative Groups ||
 | Dynamic Parallelism ||
-| Stream Ordered Memory | |
+| Stream Ordered Memory | ✔️ |
 | Graph Memory Nodes ||
 | Unified Memory | ✔️ |
 | `__restrict__` || Not needed, you get that performance boost automatically through rust's noalias :) |
