Feedback #1
Open

JOrjales-CDDO opened this issue Jan 17, 2025 · 2 comments

Comments
@JOrjales-CDDO

JOrjales-CDDO commented Jan 17, 2025

Hey David,

Wasn't sure where you wanted your feedback, so I'm using issues.

General things:

  • Add docstrings to methods
  • Add exception handling
  • Add type hints
  • Note whether there is a minimum Python version we should be using (potentially in the readme?)
  • Use enums for model names instead of the Haiku ID string; same for the anthropic version, and maybe also the completion status (a rough sketch follows this list)
  • It checks for the AWS profile - does the region not need to be specified? (Is batch inference able to use LLMs from other regions that London doesn't have access to?)
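
For the enum point, a minimal sketch of what I mean, assuming str-backed enums; the Haiku ID and anthropic_version strings are the values commonly used with Bedrock, and the status values are only illustrative (the library's own valid_statuses list would be the source of truth):

```python
from enum import Enum


class ModelId(str, Enum):
    # Bedrock model ID string currently passed around as a raw literal
    CLAUDE_3_HAIKU = "anthropic.claude-3-haiku-20240307-v1:0"


class AnthropicVersion(str, Enum):
    # Value expected by Bedrock in the anthropic_version request field
    BEDROCK_2023_05_31 = "bedrock-2023-05-31"


class JobStatus(str, Enum):
    # Illustrative subset of batch job statuses; align with valid_statuses
    SUBMITTED = "Submitted"
    IN_PROGRESS = "InProgress"
    COMPLETED = "Completed"
    FAILED = "Failed"
```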

Specific stuff:

  • Line 224 of batch_inference.py references valid_statuses for the download_results method. Perhaps extend this to include what the actual status is?
  • Line 150, write_requests_locally: add error handling in case writing fails for whatever reason.
  • Tried running batch_inference_example; it failed because the role ARN on line 394 of batch_inference.py is hard-coded as role_arn="arn:aws:iam::992382722318:role/BatchInferenceRole". I think BatchInferenceRole needs to be read from the environment (see the sketch after this list).
  • Add a maximum batch size? I saw there's a minimum in place already; maybe as part of the config inputs.
  • I saw you're writing requests locally - maybe add some kind of cleanup function?
  • Poll progress: expand this to show a progress bar if possible?
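
On the hard-coded role ARN and the missing error handling, a rough sketch of what I had in mind; the environment variable name and the write function's signature are guesses, not the library's actual API:

```python
import json
import os
from typing import Dict, List


def get_role_arn() -> str:
    # Hypothetical env var name; avoids hard-coding the account-specific ARN
    role_arn = os.environ.get("BATCH_INFERENCE_ROLE_ARN")
    if not role_arn:
        raise EnvironmentError(
            "Set BATCH_INFERENCE_ROLE_ARN to the Bedrock batch inference role ARN"
        )
    return role_arn


def write_requests_locally(requests: List[Dict], path: str) -> None:
    # Illustrative version with error handling; the real function around
    # line 150 of batch_inference.py may take different arguments
    try:
        with open(path, "w") as f:
            for record in requests:
                f.write(json.dumps(record) + "\n")
    except OSError as exc:
        raise RuntimeError(f"Failed to write batch requests to {path}") from exc
```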
@gillespied
Collaborator

gillespied commented Jan 22, 2025

Hello, thanks for the feedback:

General things:

  • Docstrings on methods
  • Exception handling
  • Type hints
  • If there is a minimum Python version we should be using (potentially in the readme?)
  • Enums for model names instead of the Haiku ID string, same for the anthropic version, maybe also the completion status? [1]
  • It checks for the AWS profile - does the region not need to be specified? (Is batch inference able to use LLMs from other regions that London doesn't have access to?) [1]

Specific stuff:

  • Line 224 of batch_inference.py references valid_statuses for the download_results method. Perhaps extend this to include what the actual status is?
  • Line 150, write_requests_locally: add error handling in case writing fails. [2]
  • Tried running batch_inference_example; it failed because the role ARN on line 394 of batch_inference.py is hard-coded as role_arn="arn:aws:iam::992382722318:role/BatchInferenceRole". I think BatchInferenceRole needs to be read from the environment.
  • Add a maximum batch size? There's a minimum in place already; maybe as part of the config inputs. [1]
  • I saw you're writing requests locally - maybe some kind of cleanup function? [3]
  • Poll progress: expand this to show a progress bar if possible? [4]

@gillespied
Collaborator

  1. I think you are right about enums, or some kind of data class to specify max tokens and the region (rough sketch below). At the minute I've tested it with the cross-region inference models, so I haven't had the issue.
  2. Will add to the backlog.
  3. Again, good idea; will add to the backlog.
  4. I don't know what the progress would show. You don't know when the job will start when you submit it; it might never start in the time period you set.

I will open new issues for 1-3. I'm just pushing a new release candidate.
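
For point 1, a minimal sketch of the kind of data class I mean; the field names and defaults are illustrative, not the library's actual config:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    # Illustrative defaults; a London deployment would swap in eu-west-2 and,
    # if needed, a cross-region inference profile ID for model_id
    model_id: str = "anthropic.claude-3-haiku-20240307-v1:0"
    region: str = "us-east-1"
    max_tokens: int = 1024
```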
