
Fine-tuning loss is extremely large #11

Open
zixiliuUSC opened this issue Feb 9, 2022 · 1 comment

Comments

@zixiliuUSC

I fine-tuned your model on novel data and tried two input structures. In both cases the training loss starts dropping from 8.8, and the validation perplexity is very large. In the first structure, each line is {"prompt": a randomly sampled snippet of the novel, 256 characters long, "text": an entire novel with newlines removed and concatenated, about 100k characters}. In the second, each line is {"prompt": the preceding 256 characters of context, "text": the novel split into 512-character segments}. Which input structure is correct? From the code, it looks like you use the first one, but then why is the loss still so large?

@duzx16
Member

duzx16 commented Feb 14, 2022

It should be {"prompt": "", "text": novel text}. The prompt is the part that does not need to be generated; if you want to generate a novel from scratch, there is no prompt. Splitting the text into 512-length segments is done automatically by the program, so you don't need to do it in the data.
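A minimal sketch of preparing data in the layout described above: one JSON object per line, with an empty "prompt" and the full novel text (newlines removed) in "text". The function name and output path here are illustrative, not part of the project's tooling, and the 512-length segmentation is assumed to be handled by the training script as the maintainer states.

```python
import json

def write_finetune_data(novels, path="train.jsonl"):
    """Write one JSONL record per novel in the {"prompt": "", "text": ...} layout.

    `novels` is an iterable of raw novel strings. Newlines are stripped so each
    novel becomes a single continuous string, matching the format discussed
    in this thread; segmentation into 512-length chunks is left to the
    training script. (Hypothetical helper, not from the repository.)
    """
    with open(path, "w", encoding="utf-8") as f:
        for novel in novels:
            record = {"prompt": "", "text": novel.replace("\n", "")}
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
```

Reading the file back should yield records whose "prompt" field is empty and whose "text" field contains no newline characters.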
