Add an efficient cross-entropy CUDA kernel to accelerate training and reduce cross-entropy memory usage during training. #995
Open
cb521 wants to merge 18 commits into NVIDIA:main from cb521:add_efficient_cross_entropy
+622
Commits
Commits on Jun 14, 2024
- binc committed
- binc committed
Commits on Jun 16, 2024
- binc committed
- binc committed
Commits on Jul 7, 2024
- binc committed