Releases: lucidrains/gateloop-transformer
0.0.18
additional swish gate for gateloop module
0.0.16
state transition should act per gate loop head
0.0.15
increase default frac gradient for state transition projection
0.0.14
add an assert and encourage researchers to play around with heads
0.0.12
fix a misunderstanding, thanks to main author @tobiaskatsch for the d…
0.0.11
able to ablate state transitions
0.0.10
need to see something before deciding whether to invest time in cuda …
0.0.8
allow for training full attention with rotary + data dependent xpos s…
0.0.7
misunderstood how activation functions were applied
0.0.6