Feedback #1
Open

JOrjales-CDDO opened this issue Jan 17, 2025 · 2 comments

Comments
@JOrjales-CDDO

JOrjales-CDDO commented Jan 17, 2025

Hey David,

Wasn't sure where you wanted your feedback, so I'm using issues.

General things:

  • Add docstrings to methods
  • Add exception handling
  • Add type hints
  • Note whether there is a minimum Python version we should be using (potentially in the readme?)
  • Use enums for model names instead of the Haiku ID string; same for the anthropic version, and maybe also the completion status (a rough sketch follows this list)
  • It checks for the AWS profile - does the region not need to be specified? (Is batch inference able to use LLMs from other regions that London doesn't have access to?)
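
For the enum point, a minimal sketch of what I mean, assuming str-backed enums; the Haiku ID and anthropic_version strings are the values commonly used with Bedrock, and the status values are only illustrative (the library's own valid_statuses list would be the source of truth):

```python
from enum import Enum


class ModelId(str, Enum):
    # Bedrock model ID string currently passed around as a raw literal
    CLAUDE_3_HAIKU = "anthropic.claude-3-haiku-20240307-v1:0"


class AnthropicVersion(str, Enum):
    # Value expected by Bedrock in the anthropic_version request field
    BEDROCK_2023_05_31 = "bedrock-2023-05-31"


class JobStatus(str, Enum):
    # Illustrative subset of batch job statuses; align with valid_statuses
    SUBMITTED = "Submitted"
    IN_PROGRESS = "InProgress"
    COMPLETED = "Completed"
    FAILED = "Failed"
```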

Specific stuff:

  • Line 224 of batch_inference.py references valid_statuses for the download_results method. Perhaps extend this to include what the actual status is?
  • Line 150, write_requests_locally: add error handling in case writing fails for whatever reason.
  • Tried running batch_inference_example; it failed because the role ARN on line 394 of batch_inference.py is hard-coded as role_arn="arn:aws:iam::992382722318:role/BatchInferenceRole". I think BatchInferenceRole needs to be read from the environment (see the sketch after this list).
  • Add a maximum batch size? I saw there's a minimum in place already; maybe as part of the config inputs.
  • I saw you're writing requests locally - maybe add some kind of cleanup function?
  • Poll progress: expand this to show a progress bar if possible?
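
On the hard-coded role ARN and the missing error handling, a rough sketch of what I had in mind; the environment variable name and the write function's signature are guesses, not the library's actual API:

```python
import json
import os
from typing import Dict, List


def get_role_arn() -> str:
    # Hypothetical env var name; avoids hard-coding the account-specific ARN
    role_arn = os.environ.get("BATCH_INFERENCE_ROLE_ARN")
    if not role_arn:
        raise EnvironmentError(
            "Set BATCH_INFERENCE_ROLE_ARN to the Bedrock batch inference role ARN"
        )
    return role_arn


def write_requests_locally(requests: List[Dict], path: str) -> None:
    # Illustrative version with error handling; the real function around
    # line 150 of batch_inference.py may take different arguments
    try:
        with open(path, "w") as f:
            for record in requests:
                f.write(json.dumps(record) + "\n")
    except OSError as exc:
        raise RuntimeError(f"Failed to write batch requests to {path}") from exc
```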
@gillespied
Collaborator

gillespied commented Jan 22, 2025

Hello, thanks for the feedback:

General things:

  • Docstrings on methods
  • Exception handling
  • Type hints
  • If there is a minimum Python version we should be using (potentially in the readme?)
  • Enums for model names instead of the Haiku ID string, same for the anthropic version, maybe also the completion status? [1]
  • It checks for the AWS profile - does the region not need to be specified? (Is batch inference able to use LLMs from other regions that London doesn't have access to?) [1]

Specific stuff:

  • Line 224 of batch_inference.py references valid_statuses for the download_results method. Perhaps extend this to include what the actual status is?
  • Line 150, write_requests_locally: add error handling in case writing fails. [2]
  • Tried running batch_inference_example; it failed because the role ARN on line 394 of batch_inference.py is hard-coded as role_arn="arn:aws:iam::992382722318:role/BatchInferenceRole". I think BatchInferenceRole needs to be read from the environment.
  • Add a maximum batch size? There's a minimum in place already; maybe as part of the config inputs. [1]
  • I saw you're writing requests locally - maybe some kind of cleanup function? [3]
  • Poll progress: expand this to show a progress bar if possible? [4]

@gillespied
Collaborator

  1. I think you are right about enums, or some kind of data class to specify max tokens and the region (rough sketch below). At the minute I've tested it with the cross-region inference models, so I haven't had the issue.
  2. Will add to the backlog.
  3. Again, good idea; will add to the backlog.
  4. I don't know what the progress would show. You don't know when the job will start when you submit it; it might never start in the time period you set.

I will open new issues for 1-3. I'm just pushing a new release candidate.
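
For point 1, a minimal sketch of the kind of data class I mean; the field names and defaults are illustrative, not the library's actual config:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    # Illustrative defaults; a London deployment would swap in eu-west-2 and,
    # if needed, a cross-region inference profile ID for model_id
    model_id: str = "anthropic.claude-3-haiku-20240307-v1:0"
    region: str = "us-east-1"
    max_tokens: int = 1024
```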
