Skip to content

Remove MD5 hashing? #1258

@woodruffw

Description

@woodruffw

Is there an existing issue for this?

  • I have searched the existing issues (open and closed), and could not find an existing issue

What keywords did you use to search existing issues?

md5 hashing

Please describe why your using this option

This is a feature/change idea I'd like to get feedback on before implementing 🙂

TL;DR: PackageMetadata currently contains an md5_digest, which gets populated from HashManager, at least when MD5 isn't disabled by FIPS.

However, MD5 is (1) broken, and (2) completely ignored by PyPI anyways: it's both marked as an optional field and is completely ignored when processing the uploaded file. Consequently, computing the MD5 is primarily a waste of compute on the twine upload side, since it gets dropped immediately by the index.

So: I propose we remove md5_digest and drop it entirely from HashManager, leaving just sha256 and blake2b (both of which are still safe and are faster to compute on modern hardware to boot).

Anything else you'd like to mention?

This came to mind because of astral-sh/uv#10202 -- IMO uv publish's behavior (not providing optional broken hashes) is correct, and I think having uploaders unify on the behavior of not uploading broken hashes would be good for the ecosystem as a whole.

(FWICT this wouldn't be a breaking change on any private index implementations, but Artifactory might spit out a warning about it. So one potential downside here is a flood of issues from Artifactory user thinking that this is a twine bug rather than Artifactory bizarrely requiring MD5 on top of actually secure hashes. But I'm happy to triage those reports, if that'd help motivate making this change.)

Metadata

Metadata

Assignees

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions