
MultiHeadAttention parameter setting #180

Description

@LXXiaogege

Is the output linear layer of the `MultiHeadAttention` class set incorrectly in `mha.py`? Shouldn't its `in_features` be `heads * d_k`?
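
For reference, a minimal sketch of the shapes in question, assuming the current structure of `mha.py` (the variable names here are illustrative, not copied from the repo):

```python
import torch.nn as nn

heads, d_k, d_model = 8, 32, 512  # hypothetical sizes where heads * d_k != d_model

# As currently set in mha.py, the output projection is (roughly):
output_current = nn.Linear(d_model, d_model)

# But the concatenated head outputs are heads * d_k features wide,
# so strictly the projection should be:
output_suggested = nn.Linear(heads * d_k, d_model)

# The two coincide only when heads * d_k == d_model.
```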

Activity

LXXiaogege (Author) commented on May 2, 2023

The `get_positional_encoding` method of the positional encoder also raises an error when `d_model` is odd.
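
A sketch of how the error shows up, following the structure of the usual sinusoidal implementation (names approximate, not verbatim from the repo):

```python
import math
import torch

def get_positional_encoding(d_model: int, max_len: int = 5000) -> torch.Tensor:
    encodings = torch.zeros(max_len, d_model)
    position = torch.arange(0, max_len, dtype=torch.float32).unsqueeze(1)
    two_i = torch.arange(0, d_model, 2, dtype=torch.float32)
    div_term = torch.exp(two_i * -(math.log(10000.0) / d_model))
    encodings[:, 0::2] = torch.sin(position * div_term)  # ceil(d_model / 2) columns
    # For odd d_model this tries to write ceil(d_model / 2) cosine columns
    # into floor(d_model / 2) slots and raises a RuntimeError:
    encodings[:, 1::2] = torch.cos(position * div_term)
    return encodings

get_positional_encoding(512)  # fine: d_model is even
get_positional_encoding(5)    # RuntimeError: shape mismatch on the cosine assignment

# One possible fix is to drop the extra cosine column when d_model is odd:
#     encodings[:, 1::2] = torch.cos(position * div_term)[:, : d_model // 2]
```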

vpj (Member) commented on Jun 30, 2023

Our implementation assumes that `heads * d_k = d_model`. We need to change that.
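
One way to drop that assumption, as a hedged sketch (this is not the repo's actual fix, just an illustration of the generalized shapes):

```python
import torch.nn as nn

class MHAShapes(nn.Module):
    """Hypothetical sketch where d_k is a free parameter rather than d_model // heads."""

    def __init__(self, heads: int, d_model: int, d_k: int):
        super().__init__()
        # Project d_model input features into heads * d_k query/key/value features.
        self.query = nn.Linear(d_model, heads * d_k)
        self.key = nn.Linear(d_model, heads * d_k)
        self.value = nn.Linear(d_model, heads * d_k)
        # Project the concatenated heads (heads * d_k features) back to d_model,
        # as suggested in this issue.
        self.output = nn.Linear(heads * d_k, d_model)
```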

vpj self-assigned this on Jun 30, 2023
