File tree: 4 files changed, +37 −4 lines
recipes/A5000_24GB_x8/Mistral-7B-v0.1

New recipe (Mistral-7B-v0.1-ja-wikipedia-OpenMath-v0.1):
+target_task: tasks/nlp/translation.md
+base_model_id: yuiseki/Mistral-7B-v0.1-ja-wikipedia-v0.1
+model_name: Mistral-7B-v0.1-ja-wikipedia-OpenMath-v0.1
+output_base_dir: /data/output
+dataset_id: kunishou/OpenMathInstruct-1-1.8m-ja
+dataset_input_field_name: question_ja
+dataset_output_field_name: generated_solution_ja
+dataset_train_split_seed: 42
+dataset_train_split_test_size: 0.2
+lora_r: 8
+lora_alpha: 16
+lora_dropout: 0.05
+train_claim_gpu_num: 4
+train_per_device_train_batch_size: 8
+train_gradient_accumulation_steps: 2
+train_num_train_epochs: 4
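A minimal sketch of how the lora_* and train_* keys in a recipe like the one above could map onto PEFT/Transformers objects, assuming the recipe is consumed by a standard Hugging Face training loop. The loader and the "recipe.yaml" path are placeholders for illustration; only the hyperparameter values come from the config.

```python
# Illustrative only: the recipe loader shown here is hypothetical,
# the values in the comments are taken from the recipe above.
import yaml
from peft import LoraConfig
from transformers import TrainingArguments

with open("recipe.yaml") as f:          # placeholder path
    cfg = yaml.safe_load(f)

lora_config = LoraConfig(
    r=cfg["lora_r"],                    # 8
    lora_alpha=cfg["lora_alpha"],       # 16
    lora_dropout=cfg["lora_dropout"],   # 0.05
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir=cfg["output_base_dir"],                                     # /data/output
    per_device_train_batch_size=cfg["train_per_device_train_batch_size"],  # 8
    gradient_accumulation_steps=cfg["train_gradient_accumulation_steps"],  # 2
    num_train_epochs=cfg["train_num_train_epochs"],                        # 4
)
```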
New recipe (Mistral-7B-v0.1-ja-wikipedia-amenokaku-v0.1):
+target_task: tasks/text-generation/text2sql.md
+base_model_id: yuiseki/Mistral-7B-v0.1-ja-wikipedia-v0.1
+model_name: Mistral-7B-v0.1-ja-wikipedia-amenokaku-v0.1
+output_base_dir: /data/output
+dataset_id: kunishou/amenokaku-code-instruct
+dataset_input_field_name: instruction
+dataset_context_field_name: input
+dataset_output_field_name: output
+dataset_train_split_seed: 42
+dataset_train_split_test_size: 0.2
+lora_r: 8
+lora_alpha: 16
+lora_dropout: 0.05
+train_claim_gpu_num: 4
+train_per_device_train_batch_size: 8
+train_gradient_accumulation_steps: 4
+train_num_train_epochs: 4
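A sketch of how the dataset_* fields might be applied, under the assumption that the split settings are passed to Hugging Face `datasets` and the three field names are combined into a prompt. The prompt assembly below is an assumption for illustration, not the repository's actual template.

```python
# Sketch only: split seed/size and field names come from the recipe above;
# the prompt format is hypothetical.
from datasets import load_dataset

ds = load_dataset("kunishou/amenokaku-code-instruct", split="train")
split = ds.train_test_split(test_size=0.2, seed=42)  # dataset_train_split_test_size / _seed

def to_prompt(example):
    # instruction / input / output correspond to the recipe's
    # dataset_input_field_name / dataset_context_field_name / dataset_output_field_name
    context = example["input"]
    prompt = example["instruction"] if not context else f"{example['instruction']}\n\n{context}"
    return {"prompt": prompt, "completion": example["output"]}

train_ds = split["train"].map(to_prompt)
eval_ds = split["test"].map(to_prompt)
```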
Changed recipe (hunk @@ -13,6 +13,6 @@ lora_r: 8):
 lora_alpha: 16
 lora_dropout: 0.05
 train_claim_gpu_num: 8
-train_per_device_train_batch_size: 2
-train_gradient_accumulation_steps: 8
+train_per_device_train_batch_size: 1
+train_gradient_accumulation_steps: 16
 train_num_train_epochs: 2
Changed recipe (hunk @@ -11,6 +11,6 @@ lora_r: 8):
 lora_alpha: 16
 lora_dropout: 0.05
 train_claim_gpu_num: 8
-train_per_device_train_batch_size: 2
-train_gradient_accumulation_steps: 8
+train_per_device_train_batch_size: 1
+train_gradient_accumulation_steps: 16
 train_num_train_epochs: 2
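Both modified recipes halve the per-device batch size and double gradient accumulation. Assuming the usual Trainer semantics, where the effective global batch is per-device batch × accumulation steps × GPU count, this keeps the optimization schedule identical while lowering per-GPU activation memory. A quick check (variable names chosen for illustration):

```python
# Effective global batch size before vs. after the change, assuming
# global = per_device * grad_accum * num_gpus.
num_gpus = 8                   # train_claim_gpu_num

before = 2 * 8 * num_gpus      # per_device=2, grad_accum=8  -> 128
after = 1 * 16 * num_gpus      # per_device=1, grad_accum=16 -> 128

assert before == after == 128  # same schedule, lower peak memory per GPU
```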