Skip to content

Audit the use of unsafe in header/value.rs #419

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

sbosnick
Copy link
Contributor

@sbosnick sbosnick commented May 3, 2020

Reorganized the Debug impl for HeaderValue to make the invariants that its use of unsafe rely upon for soundness more obvious. This also adds a few more test cases for edges cases of the Debug impl. This also adds comments to all of the uses of unsafe in header/value.rs to explain and document their soundness.

One of the uses of unsafe in this file is in the function from_maybe_shared_unchecked() which is part of the public API of this crate. Currently this function is an unsafe function, but if unsafe functions are intended to signify caller responsibility for preventing undefined behaviour then this function is safe. There are no parameters you can pass to from_maybe_shared_unchecked() that will cause undefined behaviour.

There are, however, parameters you can pass to from_maybe_shared_unchecked() that would create a HeaderValue that did not comply with RFC 7230 (section 3.2). That is, a HeaderValue that contained (for example) a NUL byte (0x00). If unsafe functions are intended to signify caller responsibility for maintaining some precondition (not just avoiding undefined behaviour) then it may make sense to keep from_maybe_shared_unchecked() as an unsafe function.

Removing unsafe from from_maybe_shared_unchecked() is an API change but it does not appear to be a breaking change.

Before this is merged we have to make a decision on the approach to unsafe functions to use and may make further changes to from_maybe_shared_unchecked() (I can at least improve the doc comments if the unsafe should remain).

This is part of #412.

sbosnick added 6 commits May 2, 2020 12:09
Document the sound use of unsafe in HeaderValue::to_str().
The tests inlcude variation on the placement of unicode codepoints that
encode to multi-byte UTF-8 and an instance of a [u8] with invalid UTF-8
that is still a valid HeaderValue.
The new code uses Iterator::try_fold() instead of external iteration to
review the bytes in the HeaderValue and separate the printing of visible
ascii bytes printing other bytes (and special casing b'"').

This new implementation may behave slightly different for a HeaderValue
of length usize::MAX, but such a HeaderValue is likely to exhaust memory
before the Debug impl comes into play (which also makes it hard to test
this hypothetical edge case).
The comments describe the invariants that make the use of unsafe sound.
The simplification removes the need for one of the invariants that
described how the use of unsafe is sound.
HeaderValue::from_maybe_shared_unchecked() is currently an unsafe
function but it could be made a safe function without intoducing a
soundness hole because it does not do anything directly that depends on
`unsafe` and it isn't protecting any invariants of HeaderValue that are
relied upon later when using `unsafe`.

from_maybe_shared_unchecked() can accept bytes as a HeaderValue that are
invalid bytes for what is described in RFC 7230 (section 3.2) but all
such bytes would still be valid to convert directly to a str since they
are all valid single-byte UTF-8.
sbosnick added 2 commits May 24, 2020 09:58
HeaderValue::from_maybe_shared_unchecked had been marked unsafe but this
was to indicate caller responsibility for api constraints and not caller
responsibility for avoiding undefined behaviour. This change removes the
use of unsafe for this purpose and instead uses a more explict doc
comment to draw the caller's attention to the restrictions (as well as
the continued use of the "_unchecked" suffix.

This is not a breaking change (despite affecting the public API) because
removing "unsafe" from a function still allows existing uses to compile.

This use of "unsafe" was not protecting an invariant on the internal
structre of HeaderValue that was relined on in unsafe blocks elsewhere
in the module to ensure the absence of undefined behaviour. All (other)
unsafe blocks in this module rely on local checking to ensure they avoid
undefined behaviour.
Some of the comments describing the invariants necessary to ensure the
sound use of "unsafe" refered to an older version of the reasoning that
had used 2 invariants. This change makes them refer to "the invariant"
instead of "invariant 1".
@sbosnick
Copy link
Contributor Author

The latest commits (specifically 6364f79) make an opinionated change to marking from_maybe_shared_unchecked as an unsafe function. Specifically it removes the unsafe from the function definition.

As described in more detail in in the commit message for 6364f79, this is both sound and is not a breaking change.

This change is taking the design view that unsafe function in the public API should be restricted to functions where the caller bears responsibility for preventing undefined behaviour and not as a general means of indicating caller responsibility for ensuring that API constraints unrelated to undefined behaviour are met.

I am making this change to have a concrete proposal for consideration. If there are reasons to prefer the other approach to unsafe functions in the public API then I can change this back and explain the constraints (and their relationship to avoiding undefined behaviour) through comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant