MultiHeadAttention parameter setting #180

Open

Opened on Apr 30, 2023

Is the output linear layer of the `MultiHeadAttention` class in `mha.py` configured incorrectly? Shouldn't its `in_features` be `heads * d_k`?

Author

Also, the `get_positional_encoding` method of the positional encoder raises an error when `d_model` is odd.
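For illustration, here is a minimal pure-Python sketch of a sinusoidal positional encoding that tolerates an odd `d_model`. It is not the repository's `get_positional_encoding`. The function name `positional_encoding` and its arguments are hypothetical, and it assumes the usual sine-on-even, cosine-on-odd column layout. With that layout an odd `d_model` leaves one more sine column than cosine columns, which is a plausible (unconfirmed) source of the error; the sketch simply skips the missing cosine column.

```python
import math


def positional_encoding(d_model: int, max_len: int) -> list:
    """Hypothetical helper: sinusoidal encodings as a max_len x d_model nested list."""
    pe = [[0.0] * d_model for _ in range(max_len)]
    for pos in range(max_len):
        # Step over column pairs (i, i + 1) that share one frequency.
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            # For odd d_model the last sine column has no cosine partner.
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe


pe = positional_encoding(d_model=5, max_len=4)
print(len(pe), len(pe[0]))
```

The same guard applies to a vectorized version: the cosine slice has to be filled with one fewer frequency term than the sine slice when `d_model` is odd.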

Member

Our implementation assumes that `heads * d_k = d_model`. We need to change that.
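To make the shape question concrete, the sketch below lists the `(in_features, out_features)` pair each linear layer would need if `d_k` were allowed to differ from `d_model / heads`. It is an illustrative assumption, not the code in `mha.py`. The helper `mha_linear_shapes` and its dictionary keys are hypothetical names.

```python
from typing import Dict, Optional, Tuple


def mha_linear_shapes(d_model: int, heads: int,
                      d_k: Optional[int] = None) -> Dict[str, Tuple[int, int]]:
    """Hypothetical helper: (in_features, out_features) for each projection."""
    if d_k is None:
        # Common default: split d_model evenly across heads.
        if d_model % heads != 0:
            raise ValueError("d_model must be divisible by heads when d_k is not given")
        d_k = d_model // heads
    concat = heads * d_k  # width of the concatenated head outputs
    return {
        "query": (d_model, concat),
        "key": (d_model, concat),
        "value": (d_model, concat),
        # The output layer consumes the concatenated heads, so its
        # in_features is heads * d_k, which equals d_model only by assumption.
        "output": (concat, d_model),
    }


print(mha_linear_shapes(512, 8))
print(mha_linear_shapes(512, 8, d_k=32))
```

When `heads * d_k == d_model` both conventions give the same output layer, which is why the mismatch only shows up once `d_k` is set independently.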

added

on Jun 30, 2023

self-assigned this on Jun 30, 2023


Assignees: vpj
