Commit 3405136

Fix parsing of escaped and unicode characters from custom prompts
Signed-off-by: Alberto Mannari <[email protected]>
1 parent 3bfb311 commit 3405136

1 file changed, +3 -2 lines changed
aiu_fms_testing_utils/scripts/drive_paged_programs.py

Lines changed: 3 additions & 2 deletions
@@ -195,11 +195,12 @@
         "Using custom prompts from user, programs parameter will be ignored as it will be determined by user prompt"
     )
     result = []
-    with open(DATASET_PATH, "r") as file:
+    with open(DATASET_PATH, "rb") as file:
         for line in file:
-            res_line = line.strip()
+            res_line = line.decode("unicode_escape").strip()
             result.append((res_line, get_pad_size(len(tokenizer.encode(res_line)))))
     custom_shape = (len(result), max([_[1] for _ in result]))
+    dprint(f"Custom shape: {custom_shape}")
 
     def __custom_line_sampler(*args, **kwargs):
         return_key = kwargs.get("return_key", False)
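
To illustrate the effect of switching from text mode to binary mode plus `decode("unicode_escape")`, here is a minimal standalone sketch. The temporary file and the sample prompt are hypothetical stand-ins for `DATASET_PATH` and its contents; nothing here comes from the repository itself. It also shows one caveat that seems worth keeping in mind: the `unicode_escape` codec treats non-escaped bytes as Latin-1, so a prompt file containing raw UTF-8 (non-ASCII) characters would likely be mangled rather than preserved.

```python
import os
import tempfile

# Hypothetical prompt file: one prompt per line, containing a literal
# backslash-n and a literal \u00e9 escape written as plain ASCII text.
raw = b"Hello\\nWorld caf\\u00e9\n"

with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(raw)
    path = tmp.name

try:
    # Old behaviour: text mode leaves the escape sequences as literal text.
    with open(path, "r") as file:
        before = [line.strip() for line in file]

    # New behaviour: read bytes and let the unicode_escape codec
    # interpret the escape sequences.
    with open(path, "rb") as file:
        after = [line.decode("unicode_escape").strip() for line in file]
finally:
    os.remove(path)

# Caveat: bytes that are not part of an escape sequence are decoded as
# Latin-1, so raw UTF-8 input does not round-trip through this codec.
mangled = "caf\u00e9".encode("utf-8").decode("unicode_escape")

print(repr(before))
print(repr(after))
print(repr(mangled))
```

If prompt files may contain raw non-ASCII text as well as escape sequences, a different decoding strategy would probably be needed; the sketch above only demonstrates the codec's behaviour and does not propose a replacement.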
