
🚨 Fix instructions when using Anthropic #2190

Closed

Conversation

HamzaFarhan
Contributor

As discussed here: https://pydanticlogfire.slack.com/archives/C083V7PMHHA/p1752350400101139
Thanks to Raymond for finding it.

@samuelcolvin
Member

Thanks so much for this. Could you provide a bit more explanation of what's going on?

Also, can we add a test that covers this case?

@HamzaFarhan
Contributor Author

HamzaFarhan commented Jul 12, 2025

The issue was Anthropic models ignoring instructions.
It worked when using system_prompt, either as a param or as a decorator.
It did not work when using instructions either way.
I noticed an interesting difference when I compared the POST requests.
With system_prompt:

{
    "http.request.body.text": {
        "max_tokens": 4096,
        "messages": [{"role": "user", "content": [{"text": "What is my name?", "type": "text"}]}],
        "model": "claude-sonnet-4-20250514",
        "stream": False,
        "system": "my name is hamza",
    }
}

With instructions:

{
    "http.request.body.text": {
        "max_tokens": 4096,
        "messages": [{"role": "user", "content": [{"text": "What is my name?", "type": "text"}]}],
        "model": "claude-sonnet-4-20250514",
        "stream": False,
        "system": """my name is hamza

""",
    }
}

When using instructions, the agent did not know my name.
Notice the difference: for some reason, those trailing newlines broke it.
I added a .strip() before sending the POST request, and now it works.
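
For context, the whole "fix" is a one-line change in pydantic_ai_slim/pydantic_ai/models/anthropic.py, where the request body is built (the full diff is in the review below):

system=system_prompt or NOT_GIVEN,          # before: trailing newlines were sent verbatim
system=system_prompt.strip() or NOT_GIVEN,  # after: whitespace is stripped; an all-whitespace prompt becomes NOT_GIVEN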

@HamzaFarhan
Contributor Author

Let me think of a test and add it for review.

@HamzaFarhan
Contributor Author

HamzaFarhan commented Jul 12, 2025

Ok, even more interesting:

import logfire
from dotenv import load_dotenv
from pydantic_ai import Agent

load_dotenv()
logfire.configure()
logfire.instrument_pydantic_ai()
logfire.instrument_httpx(capture_all=True)


agent = Agent(model="anthropic:claude-sonnet-4-20250514", instructions="my name is hamza")

prompt1 = "What is my name?"
prompt2 = "What is my name? what info do you have from prev messages?"

result1 = agent.run_sync(user_prompt=prompt1)
print(result1.output)

Output:

I don't have access to information about your name from our conversation. If you'd like me to know your name, you're welcome to tell me! Is there something specific you'd like me to help you with today?

result2 = agent.run_sync(user_prompt=prompt2)
print(result2.output)

Output:

Your name is Hamza, based on what you told me in your previous message where you said "my name is hamza."

That's the only information I have from our conversation history. I don't have access to any information about you from outside our current conversation thread, and I don't retain information between separate conversation sessions. Each conversation starts fresh for me.

Without the "fix", result1 says it doesn't know my name, while result2 correctly states it.
With the "fix", both correctly state my name.

Maybe the instructions are just too short and get overpowered by Claude's actual system prompt?
But the fix is still useful either way.

@HamzaFarhan
Contributor Author

The overpowered theory might not be correct, though, if just adding a .strip() fixes it.

@samuelcolvin
Member

This is weird; thanks for providing the example.

For reference, here's an MRE:

# /// script
# requires-python = ">=3.13"
# dependencies = [
#     "logfire[httpx]",
#     "pydantic-ai==0.4.2",
# ]
# ///
import logfire
from pydantic_ai import Agent

logfire.configure(service_name='testing-anthropic-instructions')
logfire.instrument_pydantic_ai()
logfire.instrument_httpx(capture_all=True)

agent = Agent(model='anthropic:claude-sonnet-4-0', instructions='my name is Samuel')

result1 = agent.run_sync(user_prompt='What is my name? what info do you have from prev messages?')
print(result1.output)

And the request body from Logfire:

[screenshot: request body from Logfire]

@samuelcolvin
Member

And confirmed that with this fix, Anthropic behaves correctly. 🤯

[screenshots: behavior with the fix applied]

Needs a test, then let's merge this.

@samuelcolvin
Member

Okay, well it's not as simple as that. Running:

# /// script
# requires-python = ">=3.12"
# dependencies = [
#     "anthropic==0.52.0",
# ]
# ///
from anthropic import Anthropic

client = Anthropic()

message = client.messages.create(
    max_tokens=1024,
    model='claude-sonnet-4-0',
    system='My Name is Samuel',
    messages=[
        {
            'role': 'user',
            'content': 'what is my name',
        }
    ],
)
print(message.content)

Prints:

[TextBlock(citations=None, text="I don't have access to information about your name. You haven't told me what your name is in our conversation. If you'd like me to know your name, please feel free to share it with me!", type='text')]

I.e. it can't get this kind of information from system, for some reason.

But if I run:

# /// script
# requires-python = ">=3.13"
# dependencies = [
#     "pydantic-ai==0.4.2",
# ]
# ///
from pydantic_ai import Agent

agent = Agent(model='anthropic:claude-sonnet-4-0', instructions='Reply in French')

result1 = agent.run_sync(user_prompt='what color is the sky?')
print(result1.output)

The agent reliably responds in French.

@HamzaFarhan
Contributor Author

BUT

from anthropic import Anthropic

client = Anthropic()

message = client.messages.create(
    max_tokens=1024,
    model="claude-sonnet-4-0",
    system="reply in french",
    messages=[
        {
            "role": "user",
            "content": "what is the color of the sky?",
        }
    ],
)
print(message.content)

The agent does respond in French 🤔

@HamzaFarhan
Contributor Author

Thoughts on the green "System prompt tips" box at the start of this page?
It looks like system should just be used for setting the role:
https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/system-prompts
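
If I'm reading that tip right (system sets the role, task-specific instructions go in the user turn), the recommended usage would look roughly like this sketch, with a made-up example role:

from anthropic import Anthropic

client = Anthropic()

message = client.messages.create(
    max_tokens=1024,
    model='claude-sonnet-4-0',
    # system sets a persistent role; task-specific details stay in the user turn
    system='You are a seasoned customer-support agent for a software company.',
    messages=[{'role': 'user', 'content': 'Draft a short refund confirmation email.'}],
)
print(message.content)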

@HamzaFarhan
Contributor Author

This is from a year ago, which is ancient in AI terms: https://www.reddit.com/r/ClaudeAI/comments/1ek5dy3/comment/lgj3stf/

But as a last resort, we could send the instructions as the first user message instead and make this clear in the docs. Even if the instructions are dynamic, we could just update the first message; see the sketch below.
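
A minimal sketch of that workaround (a hypothetical helper, not pydantic-ai's actual behavior; assumes Anthropic-style message dicts with plain-string content):

def prepend_instructions(messages: list[dict], instructions: str) -> list[dict]:
    # Fold the instructions into the first user turn instead of `system`.
    if messages and messages[0]['role'] == 'user':
        first = {**messages[0], 'content': f"{instructions}\n\n{messages[0]['content']}"}
        return [first, *messages[1:]]
    # No leading user message: insert one carrying the instructions.
    return [{'role': 'user', 'content': instructions}, *messages]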

@HamzaFarhan
Contributor Author

Can confirm that this is not an issue when the instructions string is long.

@Kludex requested a review from Copilot July 14, 2025 16:20
Copilot AI left a comment

Pull Request Overview

This PR refactors formatting in Anthropic-related tests and updates the Anthropic model implementation to trim system prompts and reorganize imports.

  • tests/models/test_anthropic.py: Reformatted ToolCallPart entries, added a new async test for instruction-following behavior, and adjusted snapshot formatting for Usage.
  • pydantic_ai_slim/pydantic_ai/models/anthropic.py: Expanded import list to multiple lines, added .strip() when passing system_prompt, and reformatted request parameter objects for readability.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Files changed:
  • tests/models/test_anthropic.py: Added test_anthropic_model_following_instructions, reformatted ToolCallPart and snapshot calls
  • pydantic_ai_slim/pydantic_ai/models/anthropic.py: Reorganized imports, added .strip() on system_prompt, reformatted payload parameters for clarity

Comments suppressed due to low confidence (1)

tests/models/test_anthropic.py:969

  • [nitpick] The variable name m is not descriptive. Consider renaming it to model or anthropic_model for clarity.
    m = AnthropicModel('anthropic:claude-sonnet-4-20250514', provider=AnthropicProvider(api_key=anthropic_api_key))

@@ -238,7 +245,7 @@ async def _messages_create(
         extra_headers.setdefault('User-Agent', get_user_agent())
         return await self.client.beta.messages.create(
             max_tokens=model_settings.get('max_tokens', 4096),
-            system=system_prompt or NOT_GIVEN,
+            system=system_prompt.strip() or NOT_GIVEN,
Copilot AI Jul 14, 2025

Using .strip() will remove all leading and trailing whitespace, including intentional newlines or spaces used for formatting the system prompt. Consider using .rstrip() if you only want to remove trailing whitespace, or explicitly handle trimming of blank-only strings.

Suggested change:
-            system=system_prompt.strip() or NOT_GIVEN,
+            system=system_prompt.rstrip() if system_prompt.strip() else NOT_GIVEN,


tool_name='retrieve_entity_info',
args={'name': 'Alice'},
tool_call_id=IsStr(),
part_kind='tool-call',
),
ToolCallPart(
tool_name='retrieve_entity_info', args={'name': 'Bob'}, tool_call_id=IsStr(), part_kind='tool-call'
Copilot AI Jul 14, 2025

[nitpick] This ToolCallPart for Bob remains on one line while the others are broken out into multiple lines. For consistency and readability, apply the same multiline formatting here.

Suggested change:
-                tool_name='retrieve_entity_info', args={'name': 'Bob'}, tool_call_id=IsStr(), part_kind='tool-call'
+                tool_name='retrieve_entity_info',
+                args={'name': 'Bob'},
+                tool_call_id=IsStr(),
+                part_kind='tool-call',


@Kludex
Member

The confusion here and in the Slack thread seems to be based on a misleading example.

In the examples, you folks are doing system="my name is Samuel" when it should say something like system="the user's name is Samuel", which would then give the correct answer in every case for the question "What is my name?".
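
For illustration, here's the earlier raw-client snippet with only the system string rephrased (a sketch; same Messages API call as above):

from anthropic import Anthropic

client = Anthropic()

message = client.messages.create(
    max_tokens=1024,
    model='claude-sonnet-4-0',
    # describes the user rather than the assistant, so "what is my name" is answerable
    system="the user's name is Samuel",
    messages=[
        {
            'role': 'user',
            'content': 'what is my name',
        }
    ],
)
print(message.content)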

I'm unsure why it gives the "wrong expected answer" when the empty characters are removed. I think it does no harm to accept this PR, but it seems we would be accepting it for the wrong reason.


Also, if we do want this PR to be merged, can you please undo all the cosmetic changes? There are a lot of them in this PR.

@HamzaFarhan
Contributor Author

Ooh, so that's why the French one worked, right?
Yeah, sorry about the formatting; I guess ruff ignored pyproject.toml.
But it looks like we don't need this PR then.

@Kludex
Member

Kludex commented Jul 15, 2025

> Ooh, so that's why the French one worked, right?

Yes.


I'll close it then. 🙏

But it would be cool to understand the weird behavior anyway.

@Kludex closed this Jul 15, 2025
@HamzaFarhan deleted the fix-anthropic-instructions branch July 15, 2025 16:51