Question: Support for imported models? #99

Open
thiagoscodelerae opened this issue Feb 4, 2025 · 2 comments · May be fixed by #97
Labels
enhancement New feature or request

Comments

@thiagoscodelerae

Does this solution support the use of imported models? For example, I'm importing DeepSeek-R1-Distill-Llama-8B.

@thiagoscodelerae thiagoscodelerae added the enhancement New feature or request label Feb 4, 2025
@sean-smith sean-smith linked a pull request Feb 6, 2025 that will close this issue
@sean-smith
Contributor

@thiagoscodelerae I've got a branch that supports custom models and an open PR. If you set up via my branch you'll be all set:

  1. Clone my branch, see feat: Enable Imported models #97 for changes
git clone -b imported-models https://github.com/sean-smith/bedrock-access-gateway
cd bedrock-access-gateway/
  2. Build a new Docker container. I use --platform linux/amd64 to ensure it works even if built on Apple silicon:
cd src/
docker buildx build --platform linux/amd64 -t bedrock-access-gateway .
  3. Create an ECR repo called bedrock-access-gateway:
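One way to do this is from the CLI (a standard aws ecr command; same <region> placeholder as below):

aws ecr create-repository --repository-name bedrock-access-gateway --region <region>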
  4. Tag and push to ECR:
aws ecr get-login-password --region <region> | docker login --username AWS --password-stdin <account-id>.dkr.ecr.<region>.amazonaws.com
docker tag bedrock-access-gateway:latest <account-id>.dkr.ecr.<region>.amazonaws.com/bedrock-access-gateway:latest
docker push <account-id>.dkr.ecr.<region>.amazonaws.com/bedrock-access-gateway:latest
  5. Create a stack with a custom ECR image, making sure to update the ECR image URI <account-id>.dkr.ecr.<region>.amazonaws.com/bedrock-access-gateway:latest and the ApiKeyParam if you didn't call it BedrockProxyAPIKey:
aws cloudformation create-stack --stack-name bedrock-access-gateway \
  --template-body file://deployment/BedrockProxy.yaml \
  --parameters ParameterKey=ApiKeyParam,ParameterValue=BedrockProxyAPIKey ParameterKey=EnableImportedModels,ParameterValue=true ParameterKey=ImageUri,ParameterValue=<account-id>.dkr.ecr.<region>.amazonaws.com/bedrock-access-gateway:latest \
  --capabilities CAPABILITY_AUTO_EXPAND
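Once the stack finishes creating, you can find the gateway's API endpoint in the stack outputs (assuming the template exposes the API base URL as an output, as the upstream template does):

aws cloudformation describe-stacks --stack-name bedrock-access-gateway \
  --query "Stacks[0].Outputs" --output table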

Now you can call your imported model, and it will show up when you list models:

from openai import OpenAI

# Point the client at the gateway: set OPENAI_BASE_URL to the stack's API
# endpoint and OPENAI_API_KEY to the key stored in ApiKeyParam (or pass
# base_url= and api_key= to OpenAI() directly).
client = OpenAI()

# List available models
for model in client.models.list():
    print(model.id)

# Call the custom model by ARN
completion = client.chat.completions.create(
    model="arn:aws:bedrock:us-west-2:<account-id>:imported-model/<model-id>",
    messages=[
        {
            "role": "user",
            "content": "Hello! Please tell me a joke."
        }
    ],
)
print(completion.choices[0].message.content)

Voila! See #97 for more details.

@thiagoscodelerae
Author

@sean-smith thanks for sharing. Listing models works fine, but the call to the custom model fails with the following error:

openai.BadRequestError: Error code: 400 - {'detail': "An error occurred (ValidationException) when calling the Converse operation: This action doesn't support the model that you provided. Try again with a supported text or chat model."}

For now (local testing before pushing to AWS), I'm using the fork below with a couple of changes, and it is working fine so far.
Fork: https://github.com/didhd/bedrock-access-gateway

Changed this:
https://github.com/didhd/bedrock-access-gateway/blob/main/src/api/models/bedrock.py#L625

to:

    def get_message_text(self, response_body: dict) -> str:
        # Imported Llama-family models return the raw InvokeModel payload;
        # the completion text is under "generation".
        logger.info(response_body)
        return response_body["generation"]

    def get_message_finish_reason(self, response_body: dict) -> str:
        return response_body["stop_reason"]

    def get_message_usage(self, response_body: dict) -> tuple[int, int]:
        input_tokens = int(response_body.get("prompt_token_count", 0))
        output_tokens = int(response_body.get("generation_token_count", 0))
        return input_tokens, output_tokens

It might help you with your implementation.
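For context on why those keys work: the error above says the Converse API rejects the imported model, and the fields being parsed ("generation", "stop_reason", "prompt_token_count", "generation_token_count") match the raw Llama-style payload that InvokeModel returns. A minimal sketch of the underlying call, assuming the standard Llama request schema (the ARN, region, and max_gen_len are placeholders):

import json

import boto3

# Placeholders: substitute your region and your imported model's ARN.
client = boto3.client("bedrock-runtime", region_name="us-west-2")

response = client.invoke_model(
    modelId="arn:aws:bedrock:us-west-2:<account-id>:imported-model/<model-id>",
    body=json.dumps({"prompt": "Hello! Please tell me a joke.", "max_gen_len": 512}),
    contentType="application/json",
    accept="application/json",
)
body = json.loads(response["body"].read())

# The same fields the patched methods read:
print(body["generation"])      # completion text
print(body["stop_reason"])     # finish reason
print(body["prompt_token_count"], body["generation_token_count"])  # usage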
