Skip to content

Quoted charset values are not detected by get_encoding_from_headers #7234

@jmdotdev

Description

@jmdotdev

requests.utils.get_encoding_from_headers fails to detect charset values when they are surrounded by single or double quotes in the Content-Type header. In such cases, the function returns None instead of the correct encoding string.
Unquoted charset values continue to behave correctly.

Expected Result

When a quoted charset is present in the Content-Type header, the function should return the unquoted charset value.
Example:
headers = { "Content-Type": 'text/html; charset="utf-8"' }

Expected:
"utf-8"

Actual Result

The function returns:
None
for quoted charset values such as:

  •  `charset= "utf-8"`
    
  •  `charset= 'utf-8'`
    

Unquoted charset values (e.g., charset=utf-8) behave correctly.

Reproduction Steps

from requests.utils import get_encoding_from_headers

headers = {
    "Content-Type": 'text/html; charset="utf-8"'
}

print(get_encoding_from_headers(headers))

Output:
None

System Information

$ python -m requests.help
{
  "chardet": {
    "version": null
  },
  "charset_normalizer": {
    "version": "3.4.4"
  },
  "cryptography": {
    "version": ""
  },
  "idna": {
    "version": "3.11"
  },
  "implementation": {
    "name": "CPython",
    "version": "3.12.12"
  },
  "platform": {
    "release": "6.6.87.2-microsoft-standard-WSL2",
    "system": "Linux"
  },
  "pyOpenSSL": {
    "openssl_version": "",
    "version": null
  },
  "requests": {
    "version": "2.32.5"
  },
  "system_ssl": {
    "version": "30500040"
  },
  "urllib3": {
    "version": "2.6.3"
  },
  "using_charset_normalizer": true,
  "using_pyopenssl": false
}

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions