File tree: 4 files changed, +34 −2 lines

recipes/A5000_24GB_x8/Mistral-7B-v0.1/

New file 1 of 4: French Wikipedia recipe (+16 lines)
+ target_task: tasks/i18n/ja.md
+ base_model_id: mistralai/Mistral-7B-v0.1
+ model_name: Mistral-7B-v0.1-fr-wikipedia-v0.1
+ output_base_dir: output
+ dataset_id: wikimedia/wikipedia
+ dataset_load_config: 20231101.fr
+ dataset_input_field_name: text
+ dataset_train_split_seed: 42
+ dataset_train_split_test_size: 0.2
+ lora_r: 8
+ lora_alpha: 16
+ lora_dropout: 0.05
+ train_claim_gpu_num: 8
+ train_per_device_train_batch_size: 1
+ train_gradient_accumulation_steps: 16
+ train_num_train_epochs: 2
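Both new recipes share this layout and differ only in the Wikipedia config and the model name. As a rough illustration of how such a recipe could be consumed, here is a minimal Python sketch assuming the recipe is a YAML file and the Hugging Face datasets/peft APIs; the load_recipe helper and the file path are hypothetical, and only the field names and values come from the recipe above.

# Minimal sketch (assumptions noted above): load a recipe and build the pieces it names.
import yaml
from datasets import load_dataset
from peft import LoraConfig

def load_recipe(path):
    # hypothetical helper; the repo's actual loader may differ
    with open(path) as f:
        return yaml.safe_load(f)

recipe = load_recipe("recipes/A5000_24GB_x8/Mistral-7B-v0.1/example.yaml")  # hypothetical filename

# 80/20 train/test split of wikimedia/wikipedia (config 20231101.fr), seeded for reproducibility
dataset = load_dataset(recipe["dataset_id"], recipe["dataset_load_config"], split="train")
splits = dataset.train_test_split(
    test_size=recipe["dataset_train_split_test_size"],
    seed=recipe["dataset_train_split_seed"],
)

# LoRA adapter settings taken from the recipe's lora_* fields
lora_config = LoraConfig(
    r=recipe["lora_r"],
    lora_alpha=recipe["lora_alpha"],
    lora_dropout=recipe["lora_dropout"],
    task_type="CAUSAL_LM",
)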
New file 2 of 4: Italian Wikipedia recipe (+16 lines)
+ target_task: tasks/i18n/ja.md
+ base_model_id: mistralai/Mistral-7B-v0.1
+ model_name: Mistral-7B-v0.1-it-wikipedia-v0.1
+ output_base_dir: output
+ dataset_id: wikimedia/wikipedia
+ dataset_load_config: 20231101.it
+ dataset_input_field_name: text
+ dataset_train_split_seed: 42
+ dataset_train_split_test_size: 0.2
+ lora_r: 8
+ lora_alpha: 16
+ lora_dropout: 0.05
+ train_claim_gpu_num: 8
+ train_per_device_train_batch_size: 1
+ train_gradient_accumulation_steps: 16
+ train_num_train_epochs: 2
Modified file 3 of 4: existing recipe config

@@ -13,4 +13,4 @@ lora_dropout: 0.05
  train_claim_gpu_num: 8
  train_per_device_train_batch_size: 1
  train_gradient_accumulation_steps: 16
- train_num_train_epochs: 2
+ train_num_train_epochs: 2
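As a quick sanity check on the unchanged fields in this hunk (just the standard effective-batch-size arithmetic, not code from the repo):

# effective batch size = GPUs * per-device batch * gradient accumulation steps
effective_batch_size = 8 * 1 * 16  # 128 sequences per optimizer step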
Modified file 4 of 4: training script (Python)

@@ -281,7 +281,7 @@ def load_model_and_tokenizer(model_id):
         args=training_arguments,
         tokenizer=tokenizer,
         packing=False,
-        max_seq_length=1024,
+        max_seq_length=512,
     )

     #
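For context, here is a hedged reconstruction of the kind of trainer call this hunk sits in, written against an older trl API in which tokenizer, packing, and max_seq_length are SFTTrainer constructor arguments. Only packing=False and max_seq_length=512 come from the diff; everything else mirrors the recipe fields above and is an assumption about the surrounding script.

# Hedged reconstruction of the trainer setup around the changed line (older trl API).
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer

model_id = "mistralai/Mistral-7B-v0.1"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
train_dataset = load_dataset("wikimedia/wikipedia", "20231101.fr", split="train[:1%]")

training_arguments = TrainingArguments(
    output_dir="output",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    num_train_epochs=2,
)

trainer = SFTTrainer(
    model=model,
    train_dataset=train_dataset,
    peft_config=LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05, task_type="CAUSAL_LM"),
    dataset_text_field="text",  # matches dataset_input_field_name in the recipes
    args=training_arguments,
    tokenizer=tokenizer,
    packing=False,
    max_seq_length=512,  # the value changed in this hunk, down from 1024
)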