@stephantul
Contributor

This PR adds a configurable pad token to the training pipeline. Previous versions always assumed the pad token was [PAD]. That holds for most tokenizers, but not all of them. The user can now configure the pad token directly, with [PAD] remaining the default.
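
To illustrate the idea, here is a minimal sketch of a configurable pad token with [PAD] as the default. It is not the code from this PR: `resolve_pad_id`, `pad_batch`, and the toy vocabulary are hypothetical names made up for the example, and the actual change in `model2vec/train/base.py` may look different.

```python
from __future__ import annotations

DEFAULT_PAD_TOKEN = "[PAD]"  # the default named in the PR description


def resolve_pad_id(vocab: dict[str, int], pad_token: str = DEFAULT_PAD_TOKEN) -> int:
    """Look up the ID of the configured pad token (hypothetical helper)."""
    if pad_token not in vocab:
        # Fail loudly instead of silently assuming "[PAD]" exists.
        raise ValueError(f"pad token {pad_token!r} is not in the vocabulary")
    return vocab[pad_token]


def pad_batch(sequences: list[list[int]], pad_id: int) -> list[list[int]]:
    """Right-pad every sequence to the length of the longest one."""
    max_len = max((len(seq) for seq in sequences), default=0)
    return [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]


# Toy vocabulary (an assumption) whose pad token is "<pad>" rather than "[PAD]".
vocab = {"<pad>": 0, "hello": 1, "world": 2}
pad_id = resolve_pad_id(vocab, pad_token="<pad>")
batch = pad_batch([[1, 2], [1]], pad_id)
```

With a hardcoded [PAD], the lookup above would fail for this vocabulary. Passing the token as a parameter keeps the old behavior for the common case while supporting tokenizers that name their pad token differently.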

@stephantul stephantul marked this pull request as ready for review September 7, 2025 18:32
@stephantul stephantul requested a review from Pringled September 7, 2025 18:32
@codecov

codecov bot commented Sep 7, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

| Files with missing lines | Coverage Δ |
|---|---|
| model2vec/train/base.py | 100.00% <100.00%> (ø) |

@stephantul stephantul merged commit 55b955a into main Sep 8, 2025
7 checks passed
@stephantul stephantul deleted the configure-pad-token branch September 8, 2025 11:57