
Conversation

@shubhamjain0594

What does this PR do?

It seems that during the addition of the assistant_mask_only mechanism, the signature was modified to remove attention_mask from the required columns. I believe this behaviour is wrong, since models generally need attention_mask as a parameter to work. This PR adds it back. It is especially useful when training with a preprocessed, tokenized dataset.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

Member

@qgallouedec qgallouedec left a comment


Thank you for your contribution.
We don't actually need "attention_mask" in the signature, because:

  • we don't get it from tokenization, so there is no "attention_mask" column anyway;
  • the collator doesn't use the attention mask, but builds it from the input_ids.

So, unless I'm mistaken, this PR can be closed.
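For context, the second point can be sketched in plain Python (a hypothetical illustration of the idea, not TRL's actual collator code): a padding collator can derive the attention mask from input_ids alone, which is why the "attention_mask" column does not need to survive column filtering.

```python
def collate(batch, pad_token_id=0):
    """Pad variable-length input_ids and build the attention mask from them.

    Illustrative sketch; pad_token_id and field names are assumptions.
    """
    max_len = max(len(ex["input_ids"]) for ex in batch)
    input_ids, attention_mask = [], []
    for ex in batch:
        ids = ex["input_ids"]
        pad = max_len - len(ids)
        # Real tokens get 1, padding gets 0 -- no stored mask required.
        input_ids.append(ids + [pad_token_id] * pad)
        attention_mask.append([1] * len(ids) + [0] * pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask}
```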

@shubhamjain0594
Author

shubhamjain0594 commented Nov 5, 2025

We use the tokenizer with encode_plus instead of just encode, which does return the attention mask:

import transformers

tokenizer = transformers.AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B-Instruct")
# Unlike encode, encode_plus returns a dict that includes "attention_mask".
tokenizer.encode_plus("Hello, world!")

To give you some context on why we need it: we have overridden the evaluation loop in SFTTrainer so that we can evaluate by generating the complete sequence instead of just predicting the next token. To do this, we pass a preprocessed dataset and a custom evaluation data collator that simply pads the input text and uses attention_mask for generation during evaluation. But the remove-unused-columns step strips attention_mask from the data, which leads to an error in loss calculation.
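To make the setup concrete, here is a minimal sketch of such an evaluation collator (hypothetical code, not our actual implementation; pad_token_id and field names are assumptions). It left-pads so that generation continues from the real tokens, and it keeps the attention_mask it was given:

```python
def eval_collate(batch, pad_token_id=0):
    """Left-pad input_ids for generation-based evaluation.

    Sketch of the kind of custom eval collator described above.
    """
    max_len = max(len(ex["input_ids"]) for ex in batch)
    input_ids, attention_mask = [], []
    for ex in batch:
        ids = ex["input_ids"]
        pad = max_len - len(ids)
        # Left padding: generate() should continue after the real tokens.
        input_ids.append([pad_token_id] * pad + ids)
        attention_mask.append([0] * pad + [1] * len(ids))
    return {"input_ids": input_ids, "attention_mask": attention_mask}
```

If the attention_mask column has already been dropped upstream, a collator like this (or one that relies on a preprocessed mask) breaks.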

@shubhamjain0594
Author

shubhamjain0594 commented Nov 6, 2025

Another place this can be used: testing the robustness of models to small perturbations. During dataset preprocessing, we modify attention_mask to add zeros at random positions to see how the model responds.
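For illustration, the perturbation step looks roughly like this (a hypothetical sketch of the idea, not our production preprocessing code):

```python
import random

def perturb_attention_mask(attention_mask, drop_prob=0.1, seed=0):
    """Zero out random positions of an attention mask to probe robustness.

    Illustrative sketch; drop_prob and the seeding scheme are assumptions.
    """
    rng = random.Random(seed)
    return [0 if m == 1 and rng.random() < drop_prob else m
            for m in attention_mask]
```

Since the perturbed mask is produced at preprocessing time, it must survive until the forward pass, which is exactly what column filtering prevents.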

@qgallouedec
Member

Ok, thanks for the clarification, so this requirement originates from your customization. In your case, I think the easiest fix is:

from trl import SFTTrainer as _SFTTrainer

class SFTTrainer(_SFTTrainer):
    def _set_signature_columns_if_needed(self):
        if self._signature_columns is None:
            self._signature_columns = [
                "input_ids",
                "labels",
                "attention_mask",
                "seq_lengths",
                "completion_mask",
                "assistant_masks",
            ]

@shubhamjain0594
Author

Thanks @qgallouedec for the tip. This is the workaround I have right now.

Though it does feel that this is not compatible with the following usage of SFTTrainer: someone passes a preprocessed dataset and a custom data_collator, and sets skip_prepare_dataset=True. In that case you would expect everything to still work.

@qgallouedec
Member

Would it work if you just pass remove_unused_columns=False in the config?
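A minimal sketch of that suggestion (config fragment; remove_unused_columns is inherited from transformers.TrainingArguments, and output_dir here is a placeholder):

```python
from trl import SFTConfig

config = SFTConfig(
    output_dir="out",
    # Keep attention_mask (and any other extra columns) in the batch.
    remove_unused_columns=False,
)
```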

@shubhamjain0594
Author

@qgallouedec, I think this might work. I will test it and close the PR if it works. Thank you :)

@qgallouedec
Member

I'll close this PR, feel free to re-open if needed :)
