Support for Custom Imported Models with User-Friendly IDs #1

Merged
bdruth merged 9 commits into main from feature/custom-imported-models
Apr 23, 2025

Conversation

@bdruth bdruth commented Apr 16, 2025

Summary

  • Add support for custom models imported into AWS Bedrock
  • Create user-friendly model IDs that include the model name (e.g., mistral-7b-instruct-id:custom.a1b2c3d4)
  • Maintain backward compatibility with original AWS IDs
  • Preserve the user-friendly IDs in API responses

Key Features

  • Custom models are always enabled by default
  • Models are discovered automatically from the AWS account
  • User-friendly IDs are created based on model names
  • Complete documentation in CUSTOM_MODELS_IMPLEMENTATION.md
  • Support for both streaming and non-streaming requests
  • Support for model invocation across regions
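As an illustration of the ID scheme described above, here is a minimal sketch of how a friendly ID could be built from a model name and resolved back to the AWS ID. The `-id:` separator is inferred from the example ID in this PR; the function names and the slug rules are hypothetical and are not taken from the PR's code.

```python
import re

# Assumption: the separator is inferred from the example
# "mistral-7b-instruct-id:custom.a1b2c3d4" in the PR description.
FRIENDLY_SEPARATOR = "-id:"


def slugify(model_name: str) -> str:
    """Lowercase the name and collapse unsupported characters into '-'."""
    slug = re.sub(r"[^a-z0-9._-]+", "-", model_name.strip().lower())
    return slug.strip("-")


def make_friendly_id(model_name: str, aws_id: str) -> str:
    """Combine a readable model name with the opaque AWS ID."""
    return f"{slugify(model_name)}{FRIENDLY_SEPARATOR}{aws_id}"


def resolve_aws_id(model_id: str) -> str:
    """Accept either ID format and return the original AWS ID."""
    _, separator, tail = model_id.rpartition(FRIENDLY_SEPARATOR)
    # Without the separator the input is already an AWS ID.
    return tail if separator else model_id
```

Because `resolve_aws_id` passes plain AWS IDs through unchanged, a caller can accept both formats with one lookup, which is one way to keep backward compatibility.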

Test plan

  1. Run the Models API to verify custom models appear with user-friendly IDs
  2. Test chat completions with both ID formats (user-friendly and original AWS format)
  3. Test streaming with custom models
  4. Verify that responses preserve the user-friendly IDs

🤖 Generated with Claude Code

bdruth and others added 3 commits April 16, 2025 11:45
This commit adds the ability to use custom models imported into AWS Bedrock
through the OpenAI-compatible API interface. Key features include:

- User-friendly model IDs that include the model name (e.g., mistral-7b-instruct-id:custom.a1b2c3d4)
- Support for both custom models and imported models
- Detailed documentation in CUSTOM_MODELS_IMPLEMENTATION.md
- Updates to README.md and Usage.md
- Custom models are always enabled by default
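
For context, discovery of both model kinds could look something like the sketch below. It assumes the boto3 `bedrock` client, whose `list_custom_models` and `list_imported_models` calls return a `modelSummaries` list with `modelName` and `modelArn` entries. Pagination via `nextToken` is omitted, the function name is hypothetical, and how the PR itself derives the short `custom.…` ID is not shown here.

```python
from typing import Any


def discover_models(client: Any) -> dict[str, str]:
    """Map model names to ARNs for custom and imported models.

    `client` is expected to behave like boto3.client("bedrock").
    Only the first page of each listing is read in this sketch.
    """
    found: dict[str, str] = {}
    for method_name in ("list_custom_models", "list_imported_models"):
        response = getattr(client, method_name)()
        for summary in response.get("modelSummaries", []):
            found[summary["modelName"]] = summary["modelArn"]
    return found
```

Passing the client in as an argument keeps the function easy to exercise with a stub, without AWS credentials.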

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>

This file provides guidance for the Claude Code assistant when working with this repository,
including commands and code style guidelines. Updated to use pipx for running ruff.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>

- Update schema.py to handle user-friendly custom model IDs
- Update model.py router to preserve user-friendly IDs in responses

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@bdruth bdruth changed the title from "Feature/custom imported models" to "Support for Custom Imported Models with User-Friendly IDs" Apr 16, 2025
Collaborator

@msharp9 msharp9 left a comment

Seems overly complex, especially compared to this PR.

That said, if it works this gets us up and running until the feature is officially supported.

- Add EcrAccountId parameter to both Lambda and Fargate templates
- Default to the current official account ID (366590864501)
- Use the parameter in ECR repository URLs and IAM policy

This change allows users to easily deploy with custom ECR repositories.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@bdruth
Author

bdruth commented Apr 17, 2025

> Seems overly complex, especially compared to this PR.
>
> That said, if it works this gets us up and running until the feature is officially supported.

@msharp9 - yeah - I think this PR would be simpler, too, but when I tested the naive implementation, the model ID you get back from Bedrock for custom models is a gross custom.as87df0923 string ... completely impenetrable. But, it does return the modelName in the Bedrock API, so at least some of the additional code is to make the ID that's exposed in the API more "friendly".

bdruth and others added 5 commits April 22, 2025 08:32
Enhances custom model streaming support to handle:
1. Missing contentType in stream chunks
2. The 'generation' field format used by some models

This ensures that custom imported models can be used with streaming mode.
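
A minimal sketch of the kind of chunk handling this commit describes, assuming the event shape that boto3's `invoke_model_with_response_stream` yields (`{"chunk": {"bytes": ...}}`) and a JSON payload carrying a `generation` field. The sketch never reads `contentType` and simply treats every chunk as JSON; the function name is hypothetical.

```python
import json
from typing import Any, Iterable, Iterator


def iter_stream_text(events: Iterable[dict[str, Any]]) -> Iterator[str]:
    """Yield text fragments from Bedrock-style stream events."""
    for event in events:
        raw = event.get("chunk", {}).get("bytes")
        if not raw:
            # Skip events that carry no chunk payload.
            continue
        # Assumption: the payload is JSON even when no contentType is given.
        payload = json.loads(raw)
        text = payload.get("generation")
        if isinstance(text, str) and text:
            yield text
```

Joining the fragments, for example `"".join(iter_stream_text(response["body"]))`, gives the complete text once the stream ends.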

Enhances custom model handling to support additional response formats:
1. Add support for 'generation' field in response
2. Add support for 'prompt_token_count' and 'generation_token_count' fields
3. Add debug logging of response keys to help troubleshoot

This ensures that custom imported models can be used with both
streaming and non-streaming modes.

These tests verify that:
1. Custom models appear in the models list API
2. Custom models can be invoked successfully with a non-empty response
3. Custom models support streaming mode

The tests require at least one custom model to be set up in the AWS account.

This utility script allows direct testing of custom imported models using boto3:
1. Tests direct invocation with the AWS Bedrock runtime
2. Tests streaming invocation with the AWS Bedrock runtime
3. Parses different response formats used by custom models
4. Extracts the complete text from streaming chunks

Useful for debugging or verifying custom model behavior outside of the gateway API.

- Extract helper function to reduce code duplication
- Simplify custom model listing logic
- Improve model lookup to handle friendly IDs and AWS IDs efficiently
- Add helper methods to extract completion text and usage
- Replace DEBUG blocks with logger.debug() calls
- Streamline validation and response handling

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@bdruth bdruth merged commit fbb63de into main Apr 23, 2025
@bdruth bdruth deleted the feature/custom-imported-models branch April 23, 2025 21:12