
🚨 Fix instructions when using Anthropic #2190

Closed

Conversation

HamzaFarhan
Contributor

As discussed here: https://pydanticlogfire.slack.com/archives/C083V7PMHHA/p1752350400101139
Thanks to Raymond for finding it.

@samuelcolvin
Member

Thanks so much for this. Could you provide a bit more explanation of what's going on?

Also, can we add a test that covers this case?

@HamzaFarhan
Contributor Author

HamzaFarhan commented Jul 12, 2025

The issue was Anthropic models ignoring instructions.
It worked when using system_prompt, either as a param or as a decorator.
It did not work when using instructions either way.
I noticed an interesting difference when I compared the POST requests.
With system_prompt:

{
    "http.request.body.text": {
        "max_tokens": 4096,
        "messages": [{"role": "user", "content": [{"text": "What is my name?", "type": "text"}]}],
        "model": "claude-sonnet-4-20250514",
        "stream": False,
        "system": "my name is hamza",
    }
}

With instructions:

{
    "http.request.body.text": {
        "max_tokens": 4096,
        "messages": [{"role": "user", "content": [{"text": "What is my name?", "type": "text"}]}],
        "model": "claude-sonnet-4-20250514",
        "stream": False,
        "system": """my name is hamza

""",
    }
}

When using instructions, the agent did not know my name.
Notice the difference: for some reason, those trailing newlines broke it.
I added a .strip() before sending the POST request, and now it works.
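
For context, the whole "fix" is a one-line change in pydantic_ai_slim/pydantic_ai/models/anthropic.py, where the request body is built (the full diff is in the review below):

system=system_prompt or NOT_GIVEN,          # before: trailing newlines were sent verbatim
system=system_prompt.strip() or NOT_GIVEN,  # after: whitespace is stripped; an all-whitespace prompt becomes NOT_GIVEN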

@HamzaFarhan
Contributor Author

Let me think of a test and add it for review.

@HamzaFarhan
Contributor Author

HamzaFarhan commented Jul 12, 2025

Ok, even more interesting:

import logfire
from dotenv import load_dotenv
from pydantic_ai import Agent

load_dotenv()
logfire.configure()
logfire.instrument_pydantic_ai()
logfire.instrument_httpx(capture_all=True)


agent = Agent(model="anthropic:claude-sonnet-4-20250514", instructions="my name is hamza")

prompt1 = "What is my name?"
prompt2 = "What is my name? what info do you have from prev messages?"

result1 = agent.run_sync(user_prompt=prompt1)
print(result1.output)

Output:

I don't have access to information about your name from our conversation. If you'd like me to know your name, you're welcome to tell me! Is there something specific you'd like me to help you with today?

result2 = agent.run_sync(user_prompt=prompt2)
print(result2.output)

Output:

Your name is Hamza, based on what you told me in your previous message where you said "my name is hamza."

That's the only information I have from our conversation history. I don't have access to any information about you from outside our current conversation thread, and I don't retain information between separate conversation sessions. Each conversation starts fresh for me.

Without the "fix", result1 says it doesn't know my name, while result2 correctly states it.
With the "fix", both correctly state my name.

Maybe the instructions are just too short and get overpowered by Claude's actual system prompt?
But the fix is still useful either way.

@HamzaFarhan
Contributor Author

The overpowered theory might not be correct, though, if just adding a .strip() fixes it.

@samuelcolvin
Member

This is weird; thanks for providing the example.

For reference, here's an MRE:

# /// script
# requires-python = ">=3.13"
# dependencies = [
#     "logfire[httpx]",
#     "pydantic-ai==0.4.2",
# ]
# ///
import logfire
from pydantic_ai import Agent

logfire.configure(service_name='testing-anthropic-instructions')
logfire.instrument_pydantic_ai()
logfire.instrument_httpx(capture_all=True)

agent = Agent(model='anthropic:claude-sonnet-4-0', instructions='my name is Samuel')

result1 = agent.run_sync(user_prompt='What is my name? what info do you have from prev messages?')
print(result1.output)

And the request body from Logfire:

[screenshot: request body from Logfire]

@samuelcolvin
Member

And confirmed that with this fix, Anthropic behaves correctly. 🤯

[screenshots: behavior with the fix applied]

Needs a test, then let's merge this.

@samuelcolvin
Member

Okay, well it's not as simple as that. Running:

# /// script
# requires-python = ">=3.12"
# dependencies = [
#     "anthropic==0.52.0",
# ]
# ///
from anthropic import Anthropic

client = Anthropic()

message = client.messages.create(
    max_tokens=1024,
    model='claude-sonnet-4-0',
    system='My Name is Samuel',
    messages=[
        {
            'role': 'user',
            'content': 'what is my name',
        }
    ],
)
print(message.content)

Prints:

[TextBlock(citations=None, text="I don't have access to information about your name. You haven't told me what your name is in our conversation. If you'd like me to know your name, please feel free to share it with me!", type='text')]

I.e. it can't get this kind of information from system, for some reason.

But if I run:

# /// script
# requires-python = ">=3.13"
# dependencies = [
#     "pydantic-ai==0.4.2",
# ]
# ///
from pydantic_ai import Agent

agent = Agent(model='anthropic:claude-sonnet-4-0', instructions='Reply in French')

result1 = agent.run_sync(user_prompt='what color is the sky?')
print(result1.output)

The agent reliably responds in French.

@HamzaFarhan
Contributor Author

BUT

from anthropic import Anthropic

client = Anthropic()

message = client.messages.create(
    max_tokens=1024,
    model="claude-sonnet-4-0",
    system="reply in french",
    messages=[
        {
            "role": "user",
            "content": "what is the color of the sky?",
        }
    ],
)
print(message.content)

The agent does respond in French 🤔

@HamzaFarhan
Contributor Author

Thoughts on the green "System prompt tips" box at the start of this page?
It looks like system should just be used for setting the role:
https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/system-prompts
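
If I'm reading that tip right (system sets the role, task-specific instructions go in the user turn), the recommended usage would look roughly like this sketch, with a made-up example role:

from anthropic import Anthropic

client = Anthropic()

message = client.messages.create(
    max_tokens=1024,
    model='claude-sonnet-4-0',
    # system sets a persistent role; task-specific details stay in the user turn
    system='You are a seasoned customer-support agent for a software company.',
    messages=[{'role': 'user', 'content': 'Draft a short refund confirmation email.'}],
)
print(message.content)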

@HamzaFarhan
Contributor Author

This is from a year ago, which is ancient in AI terms: https://www.reddit.com/r/ClaudeAI/comments/1ek5dy3/comment/lgj3stf/

But as a last resort, we could send the instructions as the first user message instead and make this clear in the docs. Even if the instructions are dynamic, we could just update the first message; see the sketch below.
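
A minimal sketch of that workaround (a hypothetical helper, not pydantic-ai's actual behavior; assumes Anthropic-style message dicts with plain-string content):

def prepend_instructions(messages: list[dict], instructions: str) -> list[dict]:
    # Fold the instructions into the first user turn instead of `system`.
    if messages and messages[0]['role'] == 'user':
        first = {**messages[0], 'content': f"{instructions}\n\n{messages[0]['content']}"}
        return [first, *messages[1:]]
    # No leading user message: insert one carrying the instructions.
    return [{'role': 'user', 'content': instructions}, *messages]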

@HamzaFarhan
Contributor Author

Can confirm that this is not an issue when the instructions string is long.

@Kludex requested a review from Copilot July 14, 2025 16:20
Copilot AI left a comment

Pull Request Overview

This PR refactors formatting in Anthropic-related tests and updates the Anthropic model implementation to trim system prompts and reorganize imports.

  • tests/models/test_anthropic.py: Reformatted ToolCallPart entries, added a new async test for instruction-following behavior, and adjusted snapshot formatting for Usage.
  • pydantic_ai_slim/pydantic_ai/models/anthropic.py: Expanded import list to multiple lines, added .strip() when passing system_prompt, and reformatted request parameter objects for readability.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Files changed:
  • tests/models/test_anthropic.py: Added test_anthropic_model_following_instructions, reformatted ToolCallPart and snapshot calls
  • pydantic_ai_slim/pydantic_ai/models/anthropic.py: Reorganized imports, added .strip() on system_prompt, reformatted payload parameters for clarity

Comments suppressed due to low confidence (1)

tests/models/test_anthropic.py:969

  • [nitpick] The variable name m is not descriptive. Consider renaming it to model or anthropic_model for clarity.
    m = AnthropicModel('anthropic:claude-sonnet-4-20250514', provider=AnthropicProvider(api_key=anthropic_api_key))

@@ -238,7 +245,7 @@ async def _messages_create(
         extra_headers.setdefault('User-Agent', get_user_agent())
         return await self.client.beta.messages.create(
             max_tokens=model_settings.get('max_tokens', 4096),
-            system=system_prompt or NOT_GIVEN,
+            system=system_prompt.strip() or NOT_GIVEN,
Copilot AI Jul 14, 2025

Using .strip() will remove all leading and trailing whitespace, including intentional newlines or spaces used for formatting the system prompt. Consider using .rstrip() if you only want to remove trailing whitespace, or explicitly handle trimming of blank-only strings.

Suggested change:
-            system=system_prompt.strip() or NOT_GIVEN,
+            system=system_prompt.rstrip() if system_prompt.strip() else NOT_GIVEN,


tool_name='retrieve_entity_info',
args={'name': 'Alice'},
tool_call_id=IsStr(),
part_kind='tool-call',
),
ToolCallPart(
tool_name='retrieve_entity_info', args={'name': 'Bob'}, tool_call_id=IsStr(), part_kind='tool-call'
Copilot AI Jul 14, 2025

[nitpick] This ToolCallPart for Bob remains on one line while the others are broken out into multiple lines. For consistency and readability, apply the same multiline formatting here.

Suggested change:
-                tool_name='retrieve_entity_info', args={'name': 'Bob'}, tool_call_id=IsStr(), part_kind='tool-call'
+                tool_name='retrieve_entity_info',
+                args={'name': 'Bob'},
+                tool_call_id=IsStr(),
+                part_kind='tool-call',


@Kludex
Member

The confusion here and in the Slack thread seems to be based on a misleading example.

In the examples, you folks are doing system="my name is Samuel" when it should say something like system="the user's name is Samuel", which would then give the correct answer in every case for the question "What is my name?".
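
For illustration, here's the earlier raw-client snippet with only the system string rephrased (a sketch; same Messages API call as above):

from anthropic import Anthropic

client = Anthropic()

message = client.messages.create(
    max_tokens=1024,
    model='claude-sonnet-4-0',
    # describes the user rather than the assistant, so "what is my name" is answerable
    system="the user's name is Samuel",
    messages=[
        {
            'role': 'user',
            'content': 'what is my name',
        }
    ],
)
print(message.content)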

I'm unsure why it gives the "wrong expected answer" when the empty characters are removed. I think it does no harm to accept this PR, but it seems we would be accepting it for the wrong reason.


Also, if we do want this PR to be merged, can you please undo all the cosmetic changes? There are a lot of them in this PR.

@HamzaFarhan
Contributor Author

Ooh, so that's why the French one worked, right?
Yeah, sorry about the formatting; I guess ruff ignored pyproject.toml.
But it looks like we don't need this PR then.

@Kludex
Member

Kludex commented Jul 15, 2025

> Ooh, so that's why the French one worked, right?

Yes.


I'll close it then. 🙏

But it would be cool to understand the weird behavior anyway.

@Kludex closed this Jul 15, 2025
@HamzaFarhan deleted the fix-anthropic-instructions branch July 15, 2025 16:51