Databricks Bad Request Error #4761

Open
Aljgutier opened this issue Dec 19, 2024 · 8 comments

@Aljgutier

Aljgutier commented Dec 19, 2024

What happened?

I am trying to run the AutoGen Databricks Hello World example.

Running on Databricks LTS 14.3 ML.

%pip install autogen-agentchat==0.2.40 openai==1.21.2 typing_extensions==4.11.0 --upgrade

The rest of the configuration is as specified in the example:

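import os

import autogen

# Personal access token for the Databricks workspace (value elided)
os.environ["DATABRICKS_TOKEN"] = "dapi...."

llm_config = {
    "config_list": [
        {
            "model": "databricks-dbrx-instruct",
            "api_key": str(os.environ["DATABRICKS_TOKEN"]),
            # DATABRICKS_HOST must already be set in the environment
            "base_url": str(os.getenv("DATABRICKS_HOST")) + "/serving-endpoints",
        }
    ],
}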

# Create Assistant and User
assistant = autogen.AssistantAgent(name="assistant", llm_config=llm_config)

user_proxy = autogen.UserProxyAgent(name="user", code_execution_config=False)

# Initiate chat from user_proxy side
chat_result = user_proxy.initiate_chat(assistant, message="What is MLflow?")
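
A side note on the config above: `str(os.getenv("DATABRICKS_HOST"))` silently yields `"None/serving-endpoints"` when the variable is unset. The following is a minimal sketch of a stricter way to build the URL. The helper name `serving_endpoints_url` and the example host are hypothetical and are not part of the original example or of AutoGen.

```python
from typing import Optional


def serving_endpoints_url(host: Optional[str]) -> str:
    """Build the serving-endpoints base URL, failing loudly if the host is missing.

    Hypothetical helper for illustration only.
    """
    if not host:
        raise ValueError("DATABRICKS_HOST is not set")
    # Avoid a double slash if the host was configured with a trailing "/"
    return host.rstrip("/") + "/serving-endpoints"


# Example with a made-up workspace host
print(serving_endpoints_url("https://example.cloud.databricks.com/"))
```

In the config, this would be called as `serving_endpoints_url(os.getenv("DATABRICKS_HOST"))` in place of the string concatenation.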

------
user (to assistant):

What is MLflow?
-------
BadRequestError: Error code: 400 - {'error_code': 'BAD_REQUEST', 'message': 'Bad request: json: unknown field "name"\n'}

...

.../openai/_base_client.py 922
return ...

What did you expect to happen?

I expected a response like in the referenced article. Something like

Sure, I'd be happy to explain MLflow to you. MLflow is an open-source platform for managing machine learning workflows. It was developed by Databricks and was open-sourced in 2018. MLflow provides a number of features to help data scientists and machine learning engineers manage the end-to-end machine learning lifecycle, including:

1. **MLflow Tracking**: This is a logging API that allows you to record and query experiments, including code, data, config, and results.
2. **MLflow Projects**: This is a format for packaging reusable and reproducible data science code, which can be run on different platforms.
3. **MLflow Models**: This is a convention for packaging machine learning models in multiple formats, making it easy to deploy in different environments.
4. **MLflow Model Registry**: This is a central repository to store, manage, and serve machine learning models.
...

How can we reproduce it (as minimally and precisely as possible)?

Try to run the code (as in the blog post) on a Databricks cluster. Does it still work? Note that, in addition to the code above, I had to add the following to get around the OpenAI client error "Client.__init__() got an unexpected keyword argument 'proxies'":

%pip install httpx==0.27.2

AutoGen version

0.2.40

Which package was this bug in

Core

Model used

No response

Python version

3.10

Operating system

Databricks LTS 14.3 ML

Any additional info you think would be helpful for fixing this bug

No response

@ekzhu
Collaborator

ekzhu commented Dec 19, 2024

@tj-cycyota Could you take a look at this?

@tj-cycyota
Contributor

Tracking this down - this notebook was developed in April 2024 on Autogen v0.2.25, which is no longer available on pypi. Something changed in the message handling logic (either in the OpenAI lib or Autogen) between 0.2.25 and 0.2.40 (which is what installs now).

@ekzhu the error is being thrown from the OpenAI SDK as there is an extra field "name" being submitted with the chat.completions request. The only place I can see this "name" field being populated is with some of the tool-calling functionality changes in Autogen.

{'error_code': 'BAD_REQUEST', 'message': 'Bad request: json: unknown field "name"\n'}

Here's the exact message array being sent to OAI that throws the error; notice the invalid schema. I'll also note this is exactly the quickstart example in the docs.

[ { "content": "You are a helpful AI assistant.", "role": "system" }, { "content": "What is MLflow?", "role": "user", "name": "user" } ]

@tj-cycyota
Contributor

This appears to be the root issue: a "name" key is appended to every message before the LLM call, which may be invalid for non-OpenAI clients: https://github.com/microsoft/autogen/blob/0.2/autogen/agentchat/conversable_agent.py#L670
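
To illustrate the kind of client-side workaround this suggests, here is a minimal sketch that drops the top-level "name" key from each message before the request is sent. It is not AutoGen's code and not a confirmed fix: the function name `strip_name_field` is hypothetical, and where to hook it into AutoGen 0.2 is left open. Messages with role "function" are left untouched, since the OpenAI schema requires "name" for that role.

```python
from typing import Any, Dict, List


def strip_name_field(messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return copies of the messages without the top-level "name" key.

    Hypothetical helper: "function" messages keep their "name", because
    the OpenAI chat schema requires it for that role.
    """
    cleaned = []
    for message in messages:
        if message.get("role") == "function":
            cleaned.append(dict(message))
        else:
            cleaned.append({k: v for k, v in message.items() if k != "name"})
    return cleaned


# The message array from the failing request quoted above
messages = [
    {"content": "You are a helpful AI assistant.", "role": "system"},
    {"content": "What is MLflow?", "role": "user", "name": "user"},
]
cleaned = strip_name_field(messages)
print(cleaned)
```

The input list is not mutated, so AutoGen's own orchestration, which relies on "name", would still see the original messages.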

@tj-cycyota
Contributor

@sonichi @marklysze I see Commit 77ae3c0 added a check to make sure every message dict includes the "name" key. Why was this needed? It breaks integrations with LLM providers that use the OpenAI SDK (in this case, Mosaic Model Serving)

@ekzhu
Collaborator

ekzhu commented Jan 10, 2025

@tj-cycyota the v0.2 code requires the name field for orchestration. We just released v0.4.0 (stable) and decoupled the message types from OpenAI. It would be good to have a Databricks model client as an extension package. See extensions for existing model clients: https://microsoft.github.io/autogen/stable/user-guide/extensions-user-guide/discover.html

@Aljgutier
Author

Hello @tj-cycyota and @ekzhu - thank you very much for tracking this. Indeed, this issue is unfortunate since, for me and our organization, Autogen is not usable in Databricks, which is the basis for our data science platform.

Is there any workaround, or will there be a fix?

Thank you again for your attention on this item

@ekzhu
Collaborator

ekzhu commented Jan 13, 2025

@tj-cycyota, the openai Python SDK supports the name field, so it's not just an AutoGen-specific thing.

See: https://github.com/openai/openai-python/blob/main/src/openai/types/chat/chat_completion_user_message_param.py#L20

@tj-cycyota
Contributor

Thanks @ekzhu that makes sense.

@Aljgutier I'm in touch with the relevant Databricks engineers and this is going to be fixed ASAP. I don't have an exact date for you at this time.
