Transformers

Description:

This repo is a collection of PyTorch implementations of Transformer architectures, each with a simple, flexible config. The goal is learning and ease of experimentation.
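
As a rough illustration of what such a config could look like, here is a minimal sketch using a dataclass; the class and field names are assumptions for this example, not the repo's actual config.

```python
# Hypothetical sketch of a flexible architecture config; field names are
# illustrative assumptions, not the repo's actual configuration.
from dataclasses import dataclass

@dataclass
class TransformerConfig:
    vocab_size: int = 65      # e.g. a Shakespeare character-level vocabulary
    d_model: int = 256        # embedding / hidden dimension
    n_heads: int = 8          # attention heads
    n_layers: int = 6         # transformer blocks
    d_ff: int = 1024          # feed-forward hidden dimension
    dropout: float = 0.1
    max_seq_len: int = 256

# Trying a different model size is then just a matter of changing a few fields.
small_config = TransformerConfig(n_layers=4, d_model=128)
```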

Tests:

Tests can be run with pytest from the root directory. Any new architecture added to the repo should also be tested on Shakespeare character-level prediction via the online Colab notebooks (a minimal shape test is sketched after the list below):

  1. Basic Transformer
  2. MoE Transformer
  3. Relative Attention
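
A minimal pytest-style check of a character-level forward pass might look like the sketch below; it uses stock torch.nn layers as a stand-in, since the repo's actual model classes and config are not described in this README.

```python
# Minimal shape test for a character-level model's forward pass.
# Stand-in modules are used here; the repo's own classes would replace them.
import torch
import torch.nn as nn

def test_forward_shape():
    vocab_size, d_model, seq_len, batch = 65, 64, 32, 2
    embed = nn.Embedding(vocab_size, d_model)
    encoder = nn.TransformerEncoder(
        nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True),
        num_layers=2,
    )
    head = nn.Linear(d_model, vocab_size)

    tokens = torch.randint(0, vocab_size, (batch, seq_len))
    logits = head(encoder(embed(tokens)))
    # Logits should have one score per vocabulary entry at every position.
    assert logits.shape == (batch, seq_len, vocab_size)
```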

In addition, each architecture and layer should be benchmarked for speed using the following (a rough timing sketch follows this list):

  1. Transformer-benchmarks
  2. Runtime Comparison
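
For a rough sense of how such a speed benchmark can be set up, here is a small timing sketch; the layer, tensor sizes, and warmup/iteration counts are placeholders rather than the repo's benchmark configuration.

```python
# Rough forward-pass timing helper: warmup iterations and (when on GPU)
# CUDA synchronization keep the measured numbers honest.
import time
import torch
import torch.nn as nn

def benchmark_forward(fn, x, warmup=10, iters=100):
    for _ in range(warmup):
        fn(x)
    if x.is_cuda:
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        fn(x)
    if x.is_cuda:
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters

# Placeholder layer and input sizes for illustration.
attn = nn.MultiheadAttention(embed_dim=256, num_heads=8, batch_first=True)
x = torch.randn(8, 128, 256)
avg = benchmark_forward(lambda t: attn(t, t, t)[0], x)
print(f"avg forward time: {avg:.6f} s")
```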

Resources:

  1. Attention Is All You Need
  2. On Layer Normalization in the Transformer Architecture
  3. minGPT
  4. The Annotated Transformer
  5. d2l-vision-transformer
  6. vector-quantize-pytorch