
Remove unnecessary mask in batch inference #2034


Open
wants to merge 1 commit into base: main
Conversation

wzy3650
Contributor

@wzy3650 wzy3650 commented Feb 12, 2025

The attention mask can be regarded as already "containing" the padding mask, so the padding mask does not affect the final computation result.
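
A minimal sketch of this claim (toy tensors and names, not the repository's actual code), assuming an additive attention mask that already puts -inf on padded key columns: whatever sits in the padded slots cannot reach the output at valid positions, so an extra manual padding mask changes nothing there.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
B, T, D = 1, 6, 8       # toy batch, sequence length, feature dim
valid_len = 4           # the last two positions are padding

q = torch.randn(B, T, D)
k = torch.randn(B, T, D)
v = torch.randn(B, T, D)

# Additive attention mask: causal structure plus -inf on padded key columns,
# i.e. the padding information is already folded into the attention mask.
causal = torch.triu(torch.full((T, T), float("-inf")), diagonal=1)
pad_cols = torch.zeros(T)
pad_cols[valid_len:] = float("-inf")
attn_mask = causal + pad_cols           # broadcasts over query rows

def attend(q, k, v, mask):
    scores = q @ k.transpose(-1, -2) / D ** 0.5 + mask
    return F.softmax(scores, dim=-1) @ v

out_ref = attend(q, k, v, attn_mask)

# Corrupt the padded key/value slots; outputs at valid positions must not change,
# because the attention mask already drives their attention weights to zero.
k2, v2 = k.clone(), v.clone()
k2[:, valid_len:] = 123.0
v2[:, valid_len:] = -456.0
out_alt = attend(q, k2, v2, attn_mask)

print(torch.allclose(out_ref[:, :valid_len], out_alt[:, :valid_len]))  # True
```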

@RVC-Boss
Owner

Have you tested whether the tensor values are the same with this change?


That is, right before this line: take the two results (with the mask set to None vs. not), compute the L1 norm of their difference, and check whether it is 0.

@wzy3650
Contributor Author

wzy3650 commented Feb 12, 2025

Have you tested whether the tensor values are the same with this change?

That is, right before this line: take the two results (with the mask set to None vs. not), compute the L1 norm of their difference, and check whether it is 0.

Comparing the whole tensor, the two are not identical; the differing entries are exactly the positions that the padding mask masks out. Specifically, the current mainline code manually masks those positions, but that is unnecessary: under the attention mask, the results at those positions are simply ignored by the subsequent operations and never affect token prediction, so the finally predicted tokens are exactly the same.
In fact, this approach aligns with how it is handled in training (the forward_old function); the operations removed above are redundant for token prediction.
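
A toy version of the check described above (the shapes, the output projection, and the helper step are illustrative, not the repository's actual functions): the full tensors differ only at padded positions, while the values at valid positions and the token predicted at the last real position are identical.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
B, T, D, V = 2, 6, 8, 10                     # toy batch, time, dim, vocab
lengths = torch.tensor([6, 4])               # sample 1 has 2 padded steps
valid = torch.arange(T)[None, :] < lengths[:, None]   # (B, T) True at real tokens

x = torch.randn(B, T, D)
proj = torch.randn(D, V)                     # toy output projection

# Additive attention mask: causal, with -inf on padded key columns.
causal = torch.triu(torch.full((T, T), float("-inf")), diagonal=1)
pad = torch.zeros(B, T)
pad[~valid] = float("-inf")
attn_mask = causal[None] + pad[:, None, :]   # (B, T, T)

def step(inp):
    scores = inp @ inp.transpose(-1, -2) / D ** 0.5 + attn_mask
    out = F.softmax(scores, dim=-1) @ inp
    return out, out @ proj                   # hidden states and logits

out_a, logits_a = step(x)                        # padding mask left as None
out_b, logits_b = step(x * valid[..., None])     # padded positions manually masked

diff = (out_a - out_b).abs()
print(diff.sum().item() > 0)          # True: the whole tensors differ
print(diff[valid].sum().item())       # 0.0: identical at every non-padded position

last = lengths - 1                    # last real position per sample
same_token = torch.equal(logits_a[torch.arange(B), last].argmax(-1),
                         logits_b[torch.arange(B), last].argmax(-1))
print(same_token)                     # True: the predicted token is unchanged
```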
