Commit e7c8cb7

clarify old megatron code with warning
1 parent 40bbd34 commit e7c8cb7

1 file changed, +5 -6 lines changed

megatron/README.md

@@ -1,15 +1,14 @@
-# Megatron-DeepSpeed
+# Not maintained / deprecated

-The 3 Megatron-LM snapshots in this repo. are no longer being maintained.
+> __Warning__
+> all future/current changes are now in new [Megatron-DeepSpeed](https://github.com/microsoft/Megatron-DeepSpeed).

-Please use the new [Megatron-DeepSpeed fork](https://github.com/microsoft/Megatron-DeepSpeed).
-
-## Notes on 3 deprecated Megatron folders in this repository
+### Notes on 3 deprecated Megatron folders in this repository

 Megatron-LM : This is a fairly old snapshot of Megatron-LM , and we have been using it show case the earlier features of DeepSpeed. This does not contain ZeRO-3 or 3D parallelism.

 Megatron-LM-v1.1.5-3D_parallelism: This is a relatively new Megatron (Oct 2020), but before Megatron started supporting 3D parallelism. We ported this version to showcase how to use 3D parallelism inside DeepSpeed with Megatron.

 Megatron-LM-v1.1.5-ZeRO3: The underlying Megatron version is same as the 3D_parallelism but it does not contain the 3D parallelism port. It however contains the most recent advances in DeepSpeed including ZeRO-3, ZeRO-3 Offload and ZeRO-Infinity. We did this separately from 3D parallelism port to isolate the changes required for each of them and to avoid users combining them together which is not supported, and will likely lead to more confusion.

-3D parallelism is quite similar in both DeepSpeed and new Megatron, we don't have plans to support their combination. The Megatron-DeepSpeed repository supports DeepSpeed's 3D parallelism (pipeline-parallelism inside DeepSpeed and megatron/mpu-based tensor-parallelism).
+3D parallelism is quite similar in both DeepSpeed and new Megatron, we don't have plans to support their combination. The Megatron-DeepSpeed repository supports DeepSpeed's 3D parallelism (pipeline-parallelism inside DeepSpeed and megatron/mpu-based tensor-parallelism).
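
For readers unfamiliar with the term used in the README text above: 3D parallelism combines tensor, pipeline, and data parallelism, so the number of GPUs is usually the product of the three degrees. The sketch below only illustrates that arithmetic. The function name and parameters are hypothetical and are not taken from DeepSpeed or Megatron.

```python
def data_parallel_degree(world_size: int, tensor_parallel: int, pipeline_parallel: int) -> int:
    """Return the data-parallel degree implied by a 3D-parallel layout.

    Hypothetical helper for illustration only; not a DeepSpeed or Megatron API.
    Assumes world_size = tensor_parallel * pipeline_parallel * data_parallel.
    """
    if min(world_size, tensor_parallel, pipeline_parallel) < 1:
        raise ValueError("all sizes must be positive integers")
    # One model replica spans tensor_parallel * pipeline_parallel GPUs.
    gpus_per_replica = tensor_parallel * pipeline_parallel
    if world_size % gpus_per_replica != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by "
            f"tensor_parallel * pipeline_parallel = {gpus_per_replica}"
        )
    # The remaining factor is the number of model replicas (data parallelism).
    return world_size // gpus_per_replica


if __name__ == "__main__":
    # 64 GPUs, 4-way tensor parallel, 8-way pipeline parallel
    print(data_parallel_degree(64, 4, 8))
```

Under these assumptions, 64 GPUs with 4-way tensor and 8-way pipeline parallelism leave room for 2 data-parallel replicas.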
