Fix two small problems #291

Open
wants to merge 2 commits into
base: main

Conversation

janEbert
Contributor

Problems:

  • The BERT/T5 dataset handles already-corrupted indices incorrectly.
  • The GPT tokenizer's vocab size does not include special tokens.

This PR fixes both issues. Since the changes are so small, I didn't bother creating separate PRs, but please tell me if you need them split.
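To illustrate why the second problem matters, here is a minimal sketch with a toy tokenizer. `ToyTokenizer`, `base_vocab_size`, and the token strings are hypothetical names made up for this example and are not the project's actual classes. The sketch assumes special tokens are assigned IDs directly after the base vocabulary, so a size that counts only the base vocabulary is smaller than the largest ID the tokenizer can emit.

```python
class ToyTokenizer:
    """Hypothetical stand-in for a BPE tokenizer (not the project's class)."""

    def __init__(self, base_vocab, special_tokens):
        # Regular tokens get IDs 0 .. len(base_vocab) - 1.
        self.encoder = {tok: i for i, tok in enumerate(base_vocab)}
        # Assumption: special tokens get the IDs right after the base vocab.
        self.special_tokens = {
            tok: len(self.encoder) + i for i, tok in enumerate(special_tokens)
        }

    @property
    def base_vocab_size(self):
        # Counts only regular tokens; too small if special tokens exist.
        return len(self.encoder)

    @property
    def vocab_size(self):
        # Counts regular and special tokens, so every ID is in range.
        return len(self.encoder) + len(self.special_tokens)


tok = ToyTokenizer(["a", "b", "c"], ["<pad>", "<eod>"])
largest_id = max(tok.special_tokens.values())

# An embedding table sized by base_vocab_size could not hold largest_id.
print(largest_id >= tok.base_vocab_size)  # True
print(largest_id < tok.vocab_size)        # True
```

If an embedding table or output layer is sized from the smaller number, any special-token ID indexes past its end, which is the kind of mismatch a vocab size that includes special tokens avoids.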

hyoo referenced this pull request in hyoo/Megatron-DeepSpeed Apr 21, 2023
* [QOL] Log the nodelist.

* Tweak.

* Tweak

* Tweak.

* lint
@github-actions

Marking as stale. No activity in 60 days. Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the "stale" label (No activity in 60 days on issue or PR) Jul 10, 2023
@github-actions

No activity on stale PR in 21 days.

@github-actions github-actions bot closed this Jul 18, 2023
@jon-barker jon-barker reopened this Jul 19, 2023
@github-actions github-actions bot removed the "stale" label (No activity in 60 days on issue or PR) Jul 20, 2023
@github-actions

Marking as stale. No activity in 60 days.

@github-actions github-actions bot added the "stale" label (No activity in 60 days on issue or PR) Sep 18, 2023
rraminen pushed a commit to rraminen/Megatron-LM that referenced this pull request Dec 12, 2023
Labels
stale No activity in 60 days on issue or PR
2 participants