-
Notifications
You must be signed in to change notification settings - Fork 319
Audit the use of unsafe in header/value.rs #419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Document the sound use of unsafe in HeaderValue::to_str().
The tests inlcude variation on the placement of unicode codepoints that encode to multi-byte UTF-8 and an instance of a [u8] with invalid UTF-8 that is still a valid HeaderValue.
The new code uses Iterator::try_fold() instead of external iteration to review the bytes in the HeaderValue and separate the printing of visible ascii bytes printing other bytes (and special casing b'"'). This new implementation may behave slightly different for a HeaderValue of length usize::MAX, but such a HeaderValue is likely to exhaust memory before the Debug impl comes into play (which also makes it hard to test this hypothetical edge case).
The comments describe the invariants that make the use of unsafe sound.
The simplification removes the need for one of the invariants that described how the use of unsafe is sound.
HeaderValue::from_maybe_shared_unchecked() is currently an unsafe function but it could be made a safe function without intoducing a soundness hole because it does not do anything directly that depends on `unsafe` and it isn't protecting any invariants of HeaderValue that are relied upon later when using `unsafe`. from_maybe_shared_unchecked() can accept bytes as a HeaderValue that are invalid bytes for what is described in RFC 7230 (section 3.2) but all such bytes would still be valid to convert directly to a str since they are all valid single-byte UTF-8.
HeaderValue::from_maybe_shared_unchecked had been marked unsafe but this was to indicate caller responsibility for api constraints and not caller responsibility for avoiding undefined behaviour. This change removes the use of unsafe for this purpose and instead uses a more explict doc comment to draw the caller's attention to the restrictions (as well as the continued use of the "_unchecked" suffix. This is not a breaking change (despite affecting the public API) because removing "unsafe" from a function still allows existing uses to compile. This use of "unsafe" was not protecting an invariant on the internal structre of HeaderValue that was relined on in unsafe blocks elsewhere in the module to ensure the absence of undefined behaviour. All (other) unsafe blocks in this module rely on local checking to ensure they avoid undefined behaviour.
Some of the comments describing the invariants necessary to ensure the sound use of "unsafe" refered to an older version of the reasoning that had used 2 invariants. This change makes them refer to "the invariant" instead of "invariant 1".
The latest commits (specifically 6364f79) make an opinionated change to marking As described in more detail in in the commit message for 6364f79, this is both sound and is not a breaking change. This change is taking the design view that I am making this change to have a concrete proposal for consideration. If there are reasons to prefer the other approach to |
Reorganized the
Debug
impl forHeaderValue
to make the invariants that its use ofunsafe
rely upon for soundness more obvious. This also adds a few more test cases for edges cases of theDebug
impl. This also adds comments to all of the uses ofunsafe
in header/value.rs to explain and document their soundness.One of the uses of
unsafe
in this file is in the functionfrom_maybe_shared_unchecked()
which is part of the public API of this crate. Currently this function is anunsafe
function, but ifunsafe
functions are intended to signify caller responsibility for preventing undefined behaviour then this function is safe. There are no parameters you can pass tofrom_maybe_shared_unchecked()
that will cause undefined behaviour.There are, however, parameters you can pass to
from_maybe_shared_unchecked()
that would create aHeaderValue
that did not comply with RFC 7230 (section 3.2). That is, aHeaderValue
that contained (for example) a NUL byte (0x00). Ifunsafe
functions are intended to signify caller responsibility for maintaining some precondition (not just avoiding undefined behaviour) then it may make sense to keepfrom_maybe_shared_unchecked()
as anunsafe
function.Removing
unsafe
fromfrom_maybe_shared_unchecked()
is an API change but it does not appear to be a breaking change.Before this is merged we have to make a decision on the approach to
unsafe
functions to use andmaymake further changes tofrom_maybe_shared_unchecked()
(I can at least improve the doc comments if theunsafe
should remain).This is part of #412.