docs: Utilizing Llama4 long context window to do data generation by role playing without RAG #2155
base: master
Conversation
Thanks @zjrwtx, left some comments.
{
"cell_type": "code",
"source": [
"!ls"
maybe this can be deleted
"\n",
"# add your topic input file\n",
"print(Fore.YELLOW + \"add your basic content file:\")\n",
"input_file = input(\"Enter the basic content file path (default basic_content.txt): \") or \"basic_content.txt\"\n",
Perhaps the path shouldn't be entered this way, as it doesn't align with common user habits. Moreover, the default value is "basic_content.txt", but the example doesn't provide the corresponding file.
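One way to address this comment is to fall back to the default path and create a placeholder file when it is missing, so the notebook runs even though the repository does not ship `basic_content.txt`. A minimal sketch (the function name and placeholder text are hypothetical, not from the PR):

```python
from pathlib import Path

DEFAULT_FILE = "basic_content.txt"

def resolve_input_file(path_str: str = "") -> Path:
    """Return the content file path, falling back to the default.

    If the file does not exist yet, write a placeholder so the
    example can run out of the box.
    """
    path = Path(path_str.strip() or DEFAULT_FILE)
    if not path.exists():
        path.write_text(
            "Replace this placeholder with your basic content.\n",
            encoding="utf-8",
        )
    return path
```

This keeps the interactive `input()` prompt optional: an empty answer silently uses the default instead of failing on a missing file.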
"\n",
"\n",
"\n",
"def load_input_file(file_path):\n",
Maybe more file formats can be supported; the IO module in CAMEL should be able to accomplish this.
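As a plain-Python sketch of what a multi-format `load_input_file` could look like (CAMEL's IO module could replace the per-format branches and add richer formats such as PDF or DOCX; the exact formats handled here are illustrative):

```python
import json
from pathlib import Path

def load_input_file(file_path: str) -> str:
    """Load text content from several file formats.

    Plain-text and Markdown files are read as-is; JSON documents are
    re-serialized into a readable text blob. Unknown extensions raise
    a ValueError rather than silently producing garbage.
    """
    path = Path(file_path)
    suffix = path.suffix.lower()
    if suffix in {".txt", ".md"}:
        return path.read_text(encoding="utf-8")
    if suffix == ".json":
        data = json.loads(path.read_text(encoding="utf-8"))
        return json.dumps(data, ensure_ascii=False, indent=2)
    raise ValueError(f"Unsupported file format: {suffix}")
```

Dispatching on the suffix keeps the function easy to extend: adding a new format is one more branch, and callers get a clear error for anything unhandled.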
"topics_file = input(\"Topics file name (default generated_topics.txt): \") or \"generated_topics.txt\"\n",
"output_dir = input(\"Output directory name (default generated_dialogues): \") or \"generated_dialogues\"\n",
"num_dialogues = int(input(\"Number of dialogues to generate per topic (default 1): \") or 1)\n",
"assistant_role = input(\"Assistant role name (default Python Programmer): \") or \"Python Programmer\"\n",
I think the `assistant_role` and `user_role` can be generated by the LLM when creating the topic, rather than being entered manually. And similar to the above, perhaps the way the file is input should be adjusted as well?
Thanks @fengju0213, I will change it.
Utilizing Llama4 long context window to do data generation by role playing without RAG
Description
Describe your changes in detail (optional if the linked issue already contains a detailed description of the changes).
Checklist
Go over all the following points, and put an `x` in all the boxes that apply.
- [ ] `Fixes #issue-number` in the PR description (required)
- [ ] `pyproject.toml` and `uv lock`

If you are unsure about any of these, don't hesitate to ask. We are here to help!