I am fine-tuning your model on novel data. I tried two input structures; in both cases the training loss starts at 8.8 and goes down, but the validation perplexity is very large. In the first structure each line is: {"prompt": a randomly excerpted snippet of the novel, length 256, "text": a whole novel with newlines removed and concatenated, about 100k characters}. In the second, each line is: {"prompt": the 256 characters preceding the text, "text": the novel split into 512-character segments}. Which input structure is correct? From the code it looks like you use the first one, but why is the loss still so large?
It should be {"prompt": "", "text": novel text}. The prompt field marks the part that should not be generated; if you want to generate a novel from scratch, there is nothing to condition on and the prompt can be empty. Splitting into 512-length segments is done automatically by the training code, so you do not need to do it in the data.
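A minimal sketch of preparing the data in the format suggested above, assuming plain-text novel files and a JSONL training file (the helper name and file paths are hypothetical; only the "prompt"/"text" keys come from the reply):

```python
import json

def build_finetune_jsonl(novel_paths, out_path):
    """Write one JSON object per novel: an empty "prompt" (nothing to
    condition on) and the whole novel as "text". Segmenting into
    512-length chunks is left to the training code."""
    with open(out_path, "w", encoding="utf-8") as out:
        for path in novel_paths:
            with open(path, encoding="utf-8") as f:
                # Strip newlines so the novel is one continuous string.
                text = "".join(line.strip() for line in f)
            record = {"prompt": "", "text": text}
            # ensure_ascii=False keeps Chinese characters readable.
            out.write(json.dumps(record, ensure_ascii=False) + "\n")
```

Each output line is then a self-contained training example; no manual 512-character chunking is needed.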