Transformers

Description:

This repo is a collection of PyTorch implementations of Transformer architectures, each with a simple, flexible config. The goal is learning and ease of experimentation.
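
As a rough illustration of what such a config could look like, here is a minimal sketch using a dataclass; the class and field names are assumptions for this example, not the repo's actual config.

```python
# Hypothetical sketch of a flexible architecture config; field names are
# illustrative assumptions, not the repo's actual configuration.
from dataclasses import dataclass

@dataclass
class TransformerConfig:
    vocab_size: int = 65      # e.g. a Shakespeare character-level vocabulary
    d_model: int = 256        # embedding / hidden dimension
    n_heads: int = 8          # attention heads
    n_layers: int = 6         # transformer blocks
    d_ff: int = 1024          # feed-forward hidden dimension
    dropout: float = 0.1
    max_seq_len: int = 256

# Trying a different model size is then just a matter of changing a few fields.
small_config = TransformerConfig(n_layers=4, d_model=128)
```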

Tests:

Tests can be run with pytest from the root directory. Any new architecture added to the repo should also be tested on Shakespeare character-level prediction via the online Colab notebooks (a minimal shape test is sketched after the list below):

  1. Basic Transformer
  2. MoE Transformer
  3. Relative Attention
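
A minimal pytest-style check of a character-level forward pass might look like the sketch below; it uses stock torch.nn layers as a stand-in, since the repo's actual model classes and config are not described in this README.

```python
# Minimal shape test for a character-level model's forward pass.
# Stand-in modules are used here; the repo's own classes would replace them.
import torch
import torch.nn as nn

def test_forward_shape():
    vocab_size, d_model, seq_len, batch = 65, 64, 32, 2
    embed = nn.Embedding(vocab_size, d_model)
    encoder = nn.TransformerEncoder(
        nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True),
        num_layers=2,
    )
    head = nn.Linear(d_model, vocab_size)

    tokens = torch.randint(0, vocab_size, (batch, seq_len))
    logits = head(encoder(embed(tokens)))
    # Logits should have one score per vocabulary entry at every position.
    assert logits.shape == (batch, seq_len, vocab_size)
```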

In addition, each architecture and layer should be benchmarked for speed using the following (a rough timing sketch follows this list):

  1. Transformer-benchmarks
  2. Runtime Comparison
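
For a rough sense of how such a speed benchmark can be set up, here is a small timing sketch; the layer, tensor sizes, and warmup/iteration counts are placeholders rather than the repo's benchmark configuration.

```python
# Rough forward-pass timing helper: warmup iterations and (when on GPU)
# CUDA synchronization keep the measured numbers honest.
import time
import torch
import torch.nn as nn

def benchmark_forward(fn, x, warmup=10, iters=100):
    for _ in range(warmup):
        fn(x)
    if x.is_cuda:
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        fn(x)
    if x.is_cuda:
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters

# Placeholder layer and input sizes for illustration.
attn = nn.MultiheadAttention(embed_dim=256, num_heads=8, batch_first=True)
x = torch.randn(8, 128, 256)
avg = benchmark_forward(lambda t: attn(t, t, t)[0], x)
print(f"avg forward time: {avg:.6f} s")
```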

Resources:

  1. Attention Is All You Need
  2. On Layer Normalization in the Transformer Architecture
  3. minGPT
  4. The Annotated Transformer
  5. d2l-vision-transformer
  6. vector-quantize-pytorch