
docs:Utilizing Llama4 long context window to do data generation by role playing without RAG #2155

Open

wants to merge 6 commits into base: master
Conversation

zjrwtx (Collaborator)
@zjrwtx zjrwtx commented Apr 9, 2025

Utilizing Llama4 long context window to do data generation by role playing without RAG

Description

Describe your changes in detail (optional if the linked issue already contains a detailed description of the changes).

Checklist

Go over all the following points, and put an x in all the boxes that apply.

  • I have read the CONTRIBUTION guide (required)
  • I have linked this PR to an issue using the Development section on the right sidebar or by adding Fixes #issue-number in the PR description (required)
  • I have checked if any dependencies need to be added or updated in pyproject.toml and uv.lock
  • I have updated the tests accordingly (required for a bug fix or a new feature)
  • I have updated the documentation if needed
  • I have added examples if this is a new feature

If you are unsure about any of these, don't hesitate to ask. We are here to help!


@zjrwtx zjrwtx changed the title Utilizing Llama4 long context window to do data generation by role playing docs:Utilizing Llama4 long context window to do data generation by role playing Apr 9, 2025
@zjrwtx zjrwtx self-assigned this Apr 9, 2025
@zjrwtx zjrwtx added documentation Improvements or additions to documentation use case labels Apr 9, 2025
@zjrwtx zjrwtx changed the title docs:Utilizing Llama4 long context window to do data generation by role playing docs:Utilizing Llama4 long context window to do data generation by role playing with RAG Apr 9, 2025
@zjrwtx zjrwtx marked this pull request as draft April 9, 2025 15:15
@zjrwtx zjrwtx changed the title docs:Utilizing Llama4 long context window to do data generation by role playing with RAG docs:Utilizing Llama4 long context window to do data generation by role playing without RAG Apr 10, 2025
@zjrwtx zjrwtx requested a review from Wendong-Fan April 10, 2025 07:38
@zjrwtx zjrwtx marked this pull request as ready for review April 10, 2025 07:38
@zjrwtx zjrwtx requested a review from fengju0213 April 10, 2025 07:40
Collaborator

@fengju0213 fengju0213 left a comment

Thanks @zjrwtx, left some comments.

!ls
Collaborator

Maybe this can be deleted.

"\n",
"# add your topic input file\n",
"print(Fore.YELLOW + \"add your basic content file:\")\n",
"input_file =input(\"Enter the basic cotent file path (default basic_content.txt): \") or \"basic_content.txt\"\n",
Collaborator

Perhaps the path shouldn't be entered this way, as it doesn't align with common user habits. Moreover, the default value is "basic_content.txt", but the example doesn't provide the corresponding file.
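One possible direction for this suggestion (a minimal sketch, not part of the PR; `resolve_input_file` and the placeholder text are hypothetical): fall back to creating a small default file when the path is missing, so the notebook runs end to end without manual setup.

```python
from pathlib import Path

def resolve_input_file(path_str="basic_content.txt"):
    """Resolve the input path; create a small placeholder file if it is
    missing so first-time users are not blocked by a FileNotFoundError."""
    path = Path(path_str).expanduser()
    if not path.exists():
        path.write_text(
            "Example seed content for dialogue generation.\n",
            encoding="utf-8",
        )
    return path

# Read the (possibly freshly created) seed content.
content = resolve_input_file().read_text(encoding="utf-8")
```

A file-picker widget or a CLI argument would also fit common user habits better than a bare `input()` prompt.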

"\n",
"\n",
"\n",
"def load_input_file(file_path):\n",
Collaborator

Maybe more file formats can be supported; the IO module in CAMEL should be able to accomplish this.

"topics_file = input(\"Topics file name (default generated_topics.txt): \") or \"generated_topics.txt\"\n",
"output_dir = input(\"Output directory name (default generated_dialogues): \") or \"generated_dialogues\"\n",
"num_dialogues = int(input(\"Number of dialogues to generate per topic (default 1): \") or 1)\n",
"assistant_role = input(\"Assistant role name (default Python Programmer): \") or \"Python Programmer\"\n",
Collaborator

I think the assistant_role and user_role can be generated by the LLM when creating the topic, rather than being entered manually. And similar to the above, perhaps the way the file is input should be adjusted as well?
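One way this could work (a sketch under assumptions: the prompt format and `parse_roles` helper are hypothetical, and the actual model call would happen elsewhere): ask the model to emit the two roles on labeled lines alongside the topic, then parse them out instead of prompting the user.

```python
def parse_roles(reply: str) -> dict:
    """Extract 'Assistant role: X' / 'User role: Y' lines from a model
    reply in a hypothetical labeled-line format."""
    roles = {}
    for line in reply.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            key = key.strip().lower()
            if key in ("assistant role", "user role"):
                # "assistant role" -> "assistant_role", etc.
                roles[key.replace(" role", "_role")] = value.strip()
    return roles
```

Keeping the manual `input()` prompts as an override while defaulting to the parsed roles would preserve flexibility for users who want specific personas.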

Collaborator Author

@zjrwtx zjrwtx commented Apr 12, 2025

Thanks @fengju0213, I will change it.

Labels
documentation (Improvements or additions to documentation), use case
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants