Support for Custom Imported Models with User-Friendly IDs #1
Conversation
This commit adds the ability to use custom models imported into AWS Bedrock through the OpenAI-compatible API interface. Key features include:
- User-friendly model IDs that include the model name (e.g., mistral-7b-instruct-id:custom.a1b2c3d4)
- Support for both custom models and imported models
- Detailed documentation in CUSTOM_MODELS_IMPLEMENTATION.md
- Updates to README.md and Usage.md
- Custom models are always enabled by default

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
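As a rough illustration of the naming scheme (not the PR's exact code), a friendly ID like the one above could be derived from the Bedrock model name and ARN along the following lines; the slug and suffix rules here are assumptions:

```python
import re

def make_friendly_id(model_name: str, model_arn: str) -> str:
    """Build a readable model ID such as 'mistral-7b-instruct-id:custom.a1b2c3d4'.

    The suffix is taken from the last path segment of the Bedrock model ARN,
    which is otherwise too opaque to expose directly to API users.
    """
    slug = re.sub(r"[^a-z0-9]+", "-", model_name.lower()).strip("-")
    suffix = model_arn.rsplit("/", 1)[-1][:8]  # short, stable identifier
    return f"{slug}-id:custom.{suffix}"

# Example with a hypothetical ARN:
# make_friendly_id("Mistral 7B Instruct",
#                  "arn:aws:bedrock:us-east-1:123456789012:imported-model/a1b2c3d4e5f6")
# -> "mistral-7b-instruct-id:custom.a1b2c3d4"
```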
This file provides guidance for the Claude Code assistant when working with this repository, including commands and code style guidelines. Updated to use pipx for running ruff.

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
- Update schema.py to handle user-friendly custom model IDs
- Update model.py router to preserve user-friendly IDs in responses

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
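For context, resolving such an ID back to the underlying Bedrock identifier can be sketched as follows; the function names and lookup table are hypothetical, not the PR's actual schema.py/model.py code:

```python
def is_custom_model_id(model_id: str) -> bool:
    """True for friendly IDs like 'mistral-7b-instruct-id:custom.a1b2c3d4'."""
    return ":custom." in model_id

def resolve_model_id(model_id: str, friendly_to_arn: dict[str, str]) -> str:
    """Return the Bedrock ARN for a friendly custom-model ID, or pass regular
    foundation-model IDs (e.g. 'anthropic.claude-3-sonnet-...') through unchanged."""
    if is_custom_model_id(model_id):
        return friendly_to_arn[model_id]
    return model_id
```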
Seems overly complex, especially compared to this PR.
That said, if it works, this gets us up and running until the feature is officially supported.
- Add EcrAccountId parameter to both Lambda and Fargate templates
- Default to the current official account ID (366590864501)
- Use the parameter in ECR repository URLs and IAM policy

This change allows users to easily deploy with custom ECR repositories.

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
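As an illustration, when deploying one of the templates with boto3 rather than the console, the new parameter could be overridden like this; the stack name, template path, and account ID below are placeholders:

```python
import boto3

cfn = boto3.client("cloudformation")

# Template path, stack name, and ECR account ID are illustrative only.
with open("deployment/BedrockProxyFargate.template", "r") as f:
    template_body = f.read()

cfn.create_stack(
    StackName="bedrock-access-gateway",
    TemplateBody=template_body,
    Parameters=[
        # Point the template at a private ECR registry instead of the default account.
        {"ParameterKey": "EcrAccountId", "ParameterValue": "123456789012"},
    ],
    Capabilities=["CAPABILITY_IAM", "CAPABILITY_NAMED_IAM"],
)
```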
@msharp9 - yeah - I think this PR would be simpler, too, but when I tested the naive implementation, the model ID you get back from Bedrock for custom models is a gross ARN-style identifier rather than anything human-readable.
Enhances custom model streaming support to handle:
1. Missing contentType in stream chunks
2. The 'generation' field format used by some models

This ensures that custom imported models can be used in streaming mode.
Enhances custom model handling to support additional response formats:
1. Add support for the 'generation' field in responses
2. Add support for the 'prompt_token_count' and 'generation_token_count' fields
3. Add debug logging of response keys to help with troubleshooting

This ensures that custom imported models can be used in both streaming and non-streaming modes.
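A rough sketch of how those fields might be consumed is shown below; the helper names are hypothetical and the field handling is inferred from the commit descriptions rather than taken from the PR's code:

```python
import logging

logger = logging.getLogger(__name__)

def extract_completion_text(body: dict) -> str:
    """Return the generated text from a custom-model response body."""
    logger.debug("custom model response keys: %s", list(body.keys()))
    if "generation" in body:       # Llama-style imported models
        return body["generation"]
    if "outputs" in body:          # Mistral-style responses
        return body["outputs"][0].get("text", "")
    return ""

def extract_usage(body: dict) -> dict:
    """Map custom-model token counts onto OpenAI-style usage fields."""
    prompt = body.get("prompt_token_count", 0)
    completion = body.get("generation_token_count", 0)
    return {
        "prompt_tokens": prompt,
        "completion_tokens": completion,
        "total_tokens": prompt + completion,
    }
```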
These tests verify that:
1. Custom models appear in the models list API
2. Custom models can be invoked successfully with a non-empty response
3. Custom models support streaming mode

The tests require at least one custom model to be set up in the AWS account.
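A test along those lines might look like the sketch below, assuming the gateway is reachable locally and exposes an OpenAI-compatible endpoint; the environment variable names, base URL, and model ID are placeholders:

```python
import os
import pytest
from openai import OpenAI

# Placeholder friendly ID; override via environment for your own account.
CUSTOM_MODEL_ID = os.environ.get("TEST_CUSTOM_MODEL_ID", "mistral-7b-instruct-id:custom.a1b2c3d4")

@pytest.fixture
def client() -> OpenAI:
    return OpenAI(
        base_url=os.environ.get("GATEWAY_BASE_URL", "http://localhost:8000/api/v1"),
        api_key=os.environ.get("GATEWAY_API_KEY", "bedrock"),
    )

def test_custom_model_listed(client):
    model_ids = [m.id for m in client.models.list().data]
    assert CUSTOM_MODEL_ID in model_ids

def test_custom_model_completion(client):
    resp = client.chat.completions.create(
        model=CUSTOM_MODEL_ID,
        messages=[{"role": "user", "content": "Say hello."}],
        max_tokens=32,
    )
    assert resp.choices[0].message.content  # non-empty response

def test_custom_model_streaming(client):
    stream = client.chat.completions.create(
        model=CUSTOM_MODEL_ID,
        messages=[{"role": "user", "content": "Say hello."}],
        max_tokens=32,
        stream=True,
    )
    chunks = [c.choices[0].delta.content or "" for c in stream if c.choices]
    assert "".join(chunks)
```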
This utility script allows direct testing of custom imported models using boto3:
1. Tests direct invocation with the AWS Bedrock runtime
2. Tests streaming invocation with the AWS Bedrock runtime
3. Parses different response formats used by custom models
4. Extracts complete texts from streaming chunks

Useful for debugging or verifying custom model behavior outside of the gateway API.
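For reference, a minimal version of such a check with boto3 could look like this, assuming a Llama-style imported model that accepts a prompt body and returns a 'generation' field; the region, model ARN, and request fields are placeholders:

```python
import json
import boto3

runtime = boto3.client("bedrock-runtime", region_name="us-east-1")
MODEL_ARN = "arn:aws:bedrock:us-east-1:123456789012:imported-model/a1b2c3d4e5f6"

body = json.dumps({"prompt": "Say hello.", "max_gen_len": 64})

# Non-streaming invocation
resp = runtime.invoke_model(modelId=MODEL_ARN, body=body, contentType="application/json")
payload = json.loads(resp["body"].read())
print(payload.get("generation", payload))

# Streaming invocation: collect 'generation' text from each chunk
stream_resp = runtime.invoke_model_with_response_stream(
    modelId=MODEL_ARN, body=body, contentType="application/json"
)
pieces = []
for event in stream_resp["body"]:
    chunk = json.loads(event["chunk"]["bytes"])
    pieces.append(chunk.get("generation", ""))
print("".join(pieces))
```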
- Extract helper function to reduce code duplication
- Simplify custom model listing logic
- Improve model lookup to handle friendly IDs and AWS IDs efficiently
- Add helper methods to extract completion text and usage
- Replace DEBUG blocks with logger.debug() calls
- Streamline validation and response handling

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
Summary
User-friendly model IDs include the model name (e.g., mistral-7b-instruct-id:custom.a1b2c3d4)
Key Features
Test plan
🤖 Generated with Claude Code