Description
The T-compiler and T-lang teams signed off here that `bool` has the same representation as `_Bool`. On every platform that Rust currently supports, this implies that:
- `bool` has a size and an alignment of `1`,
- `true as i32 == 1` and `false as i32 == 0`. EDIT: that is always true; the valid bit patterns of `bool` are what matter here, e.g., on a platform where `bool` has the same size as `i8`, whether `transmute::<_, i8>(true) == 1` and `transmute::<_, i8>(false) == 0`.
These two properties are not guaranteed by Rust, and `unsafe` code cannot rely on them. In the last UCG WG meeting it was unclear whether we want to guarantee these two properties or not. As @rkruppe pointed out, this would be guaranteeing something different and incompatible with what T-lang and T-compiler guaranteed.
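A minimal sketch of how the two properties could be probed on a given target. The helper name `bool_bits` is hypothetical and not from the discussion. The printed values describe only the target the code is compiled for, not a language-level guarantee as discussed in this issue.

```rust
use std::mem::{align_of, size_of, transmute};

/// Hypothetical helper: returns the raw byte stored for `b`.
/// This only compiles on targets where `bool` and `u8` have the same size.
fn bool_bits(b: bool) -> u8 {
    // SAFETY: the sizes match (checked by the compiler), and every
    // initialized byte is a valid `u8`.
    unsafe { transmute::<bool, u8>(b) }
}

fn main() {
    // `as` casts are defined by the language, independent of representation.
    assert_eq!(true as i32, 1);
    assert_eq!(false as i32, 0);

    // These depend on the representation chosen for the target.
    println!("size = {}, align = {}", size_of::<bool>(), align_of::<bool>());
    println!("true = {:#x}, false = {:#x}", bool_bits(true), bool_bits(false));
}
```

The `as` casts hold regardless of representation, whereas the `transmute` results are what the question above is actually about.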
Note: any change that the UCG WG proposes will have to go through the RFC process anyway, where it might be rejected. This issue is being raised with stakeholders to evaluate whether there is something that needs changing or not, and if so, whether the change is possible, has chances to achieve consensus, etc.
The following arguments have been raised (hope I did not miss any):
- T-lang and T-compiler did not specify which version of the C standard `_Bool` conforms to. In C++20 and C20, P0907r4 (C++) and N2218 (C) specify that: `bool` and `_Bool` contain no padding bits (only value bits), `1 == (int)true` and `0 == (int)false`.
  In some of the merged PRs of the UCG we have already specified that the platform's C implementation needs to comply with some, e.g., C17 or "latest" C standard properties (e.g. for `repr(C) struct` layout). If we end up requiring C20 for representation / validity of `repr(C)`, we end up guaranteeing these properties. AFAICT the only property about `bool` that would remain implementation-defined is its size and alignment.
- In Representation of bool, integers and floating points #9 / added floating point/int summary #49, we ended up requiring that C's `CHAR_BIT == 8`. This implies that if `CHAR_BIT != 8` then `bool` cannot have the same representation as `_Bool`. Some stakeholders still wanted to be able to do C FFI with these platforms, e.g. @briansmith suggested that Rust should diagnose, e.g., using `bool` on FFI on these platforms (Representation of bool, integers and floating points #9 (comment)), but that interfacing with those platforms via C FFI (e.g. against assembly) should still be possible (e.g. in a DSP where `CHAR_BIT == 16`, passing a `u16` to C / assembly / ... expecting a `char` or 16 bit integer should be doable).
- What exactly T-lang and T-compiler actually ended up guaranteeing isn't 100% clear. In Lint for the reserved ABI of bool rust#46176 the decision seems to be that `bool == _Bool`, but the PR that was actually merged only mentions that `bool` has a size of `1`: Document the size of bool rust#46156. This might be an oversight in the docs, and some have mentioned that the reference is "non-normative". @briansmith pointed out (here and here) that the `bool` ABI (e.g. integer class or something else?), alignment, bit patterns denoting `true` and `false`, etc. don't appear to be properly documented. @gankro summarized the status quo in the Rust Layout and ABIs document and mentioned that projects like Firefox rely on these extra guarantees for correctness (e.g. for `bindgen`, etc. to work properly, see here).
There are a couple of comments by @withoutboats that I think show both T-lang and T-compiler's rationale and the spirit behind their decision, here:
I worry that if we don't specify `bool` as equivalent to C and C++ booleans, people will need to use `c_bool` in FFI to be cross platform compatible.
I worry that if we don't specify `bool` as byte sized, people will create a `struct Bool(u8)` to get that guarantee & keep their structs small.
and here:
People could come to the conclusion that they need a `c_bool` type for their FFI to be forward compatible with platforms we don't yet support. I think defining it as the same representation as `_Bool` / C++ `bool` makes it the least likely someone does something painful to avoid entirely hypothetical problems.
So even if the docs say that `bool` has a size of `1`, and that's it, I believe that this last comment shows that the spirit of the T-lang and T-compiler decision was to spare people from creating a `c_bool` type to be forward compatible on C FFI with platforms that we might never properly support.
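To make the concern concrete, here is a sketch of the kind of byte-sized wrapper people might write if `bool` were not known to be one byte. The types `Bool`, `WithBool` and `WithWrapper` are hypothetical illustrations, not anything proposed in the thread.

```rust
use std::mem::size_of;

/// Hypothetical byte-sized boolean wrapper of the kind described above.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Bool(u8);

impl Bool {
    pub fn new(b: bool) -> Self {
        Bool(b as u8)
    }

    pub fn get(self) -> bool {
        self.0 != 0
    }
}

/// A struct that stores the flag as a plain `bool`.
#[repr(C)]
pub struct WithBool {
    pub flag: bool,
    pub value: u8,
}

/// The same struct using the wrapper instead.
#[repr(C)]
pub struct WithWrapper {
    pub flag: Bool,
    pub value: u8,
}

fn main() {
    // Where `bool` is one byte, the wrapper buys nothing in terms of size.
    println!("WithBool:    {} bytes", size_of::<WithBool>());
    println!("WithWrapper: {} bytes", size_of::<WithWrapper>());
    println!("round trip:  {}", Bool::new(true).get());
}
```

On targets where `bool` already occupies one byte, both structs have the same size, so the wrapper only adds conversion boilerplate.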
I think that the open questions that have to be clarified are:
- Should we require C20 compatibility for `_Bool`, or do we want to stay backwards compatible with C99/11/17? (In this case, people can only rely on, e.g., `true == 1` on targets where the platform implementation is C20 "conforming" at least w.r.t. `_Bool`.)
- Do we want to require that `bool` has a size and an alignment of `1`? (In the hypothetical case that we ever support a platform where this is not the case, we could raise an `improper_ctypes` warning / error on the platform, or some other form of diagnostic, as @briansmith suggested.) This would be a change incompatible with `bool == _Bool`, might lead people to create and use a `c_bool` type, etc.
cc @rkruppe @withoutboats @briansmith @gankro @joshtriplett @cuviper @whitequark @est31 @SimonSapin
Activity
briansmith commented on Dec 6, 2018
Those are stated to be "proposals". I believe the C++ proposal might have been accepted by the C++ committee, but I don't know if the C committee accepted the proposal yet.
strega-nil commented on Dec 7, 2018
`as` will always make `true as INTEGRAL_TYPE == 1`, to be clear, no matter what the representation of `true` is (and the same for `false`).
gnzlbg commented on Dec 7, 2018
@ubsan yes, I should have used `transmute`, updated.
gnzlbg commented on Dec 7, 2018
The author of those proposals tweeted (really, this is the only source I have) about this proposal: https://twitter.com/jfbastien/status/989242576598327296?lang=en
I don't know which parts of p0907 and N2218 will make it into C and C++, but at least some of it is on track for C++20 and the next revision of the C standard.
RalfJung commented on Dec 7, 2018
What is the problem with just saying "`bool` has the same size as C's `_Bool`, here is what this means for popular platforms"? "`bool` == `_Bool`" and "`bool` has size 1" are only contradicting if they are both interpreted as normative for all platforms; I do not think the latter is normative.
TBH I am a bit surprised to see so much fuss about this, I feel there is an aspect to this discussion that I am missing.
The one possible problem with this definition is that it means we can only ever port Rust to platforms that support C; however, I think that is an entirely theoretical concern at this point.
Gankra commented on Dec 7, 2018
Defining `bool == _Bool` has no implications on c-less platforms, as it is implicitly "if we have a C platform to interoperate with".
I don't have much respect for the boogeyman of weird bools, since aiui that's mostly `CHAR_BIT != 8` or wildly ancient platforms (win 95). We should blindly assert bool has the proposed c/cpp-20 definition, and also that we only interoperate with c platforms that support it.
I might be willing to accept a weaker statement of "we don't interoperate with c's bool on such a weird platform" but that pushes people back to spreading the folklore that c_bool is useful and good.
gnzlbg commented on Dec 7, 2018
All options have problems. The problem with that approach is that it doesn't guarantee that Rust `bool` is 1 byte, so those who might care about that might be left wondering: "Should I use `u8` to store my booleans instead of `bool` just in case someone uses my code on some hypothetical platform in which `size_of::<bool>() != 1`?" Note that this is something that affects all Rust code, instead of just Rust's C FFI.
If we would say that `size_of::<bool>() == 1` then some people might decide to use `c_bool` in C FFI to be portable with hypothetical C platforms in which that does not hold. That looks like a lesser evil to me than leaving all Rust users, including those writing Rust code that does not interface with C, wondering about the size of `bool`.
SimonSapin commented on Dec 7, 2018
gnzlbg commented on Dec 7, 2018
@SimonSapin preventing code from compiling is not the same as producing portable code that compiles and works as expected.
An equivalent solution (that prevents some code from compiling) would be to not allow `bool` in C FFI on those platforms in which `bool != _Bool`. That would only affect Rust code doing C FFI, instead of affecting all Rust code, and `bool` would still continue to work properly on C FFI as long as the code only targets platforms in which `bool == _Bool`.
RalfJung commented on Dec 7, 2018
The lang and compiler teams decided that this is not the case (i.e., they decided that `bool` might have size different from 1)! It is not our place to change that decision. We are mostly documenting behavior here.
This was actually called out in rust-lang/rust#46176 (comment):
(emphasis mine)
I see no reason for us to overthrow their decision as part of this documentation process. If you want to change that decision, feel free to write an RFC, but that IMO has no place in the current UCG discussion phase.
RalfJung commented on Dec 7, 2018
@SimonSapin Here is a version of that code that actually compiles: (from your very brief message, I had no idea whether you wanted to express that this is accepted or rejected by the compiler^^)
Looks like a bug to me. We also don't allow transmuting between `u64` and `usize` on 64bit platforms. So we should reject this `transmute` for the same reason, or else decide that we will never support a platform where `_Bool` has a size different from 1, or else backtrack on rust-lang/rust#46176 (comment). "We", however, is the community as part of an RFC, or maybe T-lang + T-compiler, but not the UCG group.
hanna-kruppe commented on Dec 7, 2018
Huh? Of course we do. It compiles, and it's perfectly legitimate if you only target 64 bit platforms or properly `cfg`-gate the transmute.
RalfJung commented on Dec 7, 2018
You are right. Seems I misremembered. Even better, things are consistent then.
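For illustration, a `cfg`-gated transmute between `usize` and `u64` could look like the sketch below. The function name `usize_to_u64` and the fallback branch are assumptions made for this example and are not taken from the thread.

```rust
/// Hypothetical helper: on 64-bit targets `usize` and `u64` have the same
/// size, so the transmute is accepted by the compiler.
#[cfg(target_pointer_width = "64")]
fn usize_to_u64(x: usize) -> u64 {
    // SAFETY: both types are plain 64-bit integers on this configuration.
    unsafe { std::mem::transmute::<usize, u64>(x) }
}

/// Fallback for other pointer widths, where the transmute above would be
/// rejected because the sizes differ.
#[cfg(not(target_pointer_width = "64"))]
fn usize_to_u64(x: usize) -> u64 {
    x as u64
}

fn main() {
    println!("{}", usize_to_u64(42));
}
```

Only one of the two definitions is compiled for any given target, so the size check never sees a mismatched pair.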
RalfJung commented on Dec 7, 2018
A long discussion with @gnzlbg on Zulip uncovered why I am so confused here (in particular, I maintain that it makes no logical sense to bring up platforms with `CHAR_BIT != 8` in this discussion if we declared such platforms unsupported): @gnzlbg thinks that even if we say "for FFI we assume that `CHAR_BIT == 8`", we still want to say some things about FFI on platforms where `CHAR_BIT != 8`.
I think we should close this issue, and just document the T-lang + T-compiler decision that `bool` shares size and alignment with `_Bool`. And then some people that care should have a discussion about which platforms we support, where "no support" means "no support". And then, if there is agreement that it is worth supporting `CHAR_BIT != 8` at least a bit, we can see how that affects `bool`.
SimonSapin commented on Dec 7, 2018
(Code in my previous comment did not compile because I forgot `::` in the turbofish syntax. But you don’t need to call `transmute` to get the magic size equality check, only to "instantiate" it with concrete input and output types.)
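A sketch of what such a check might look like. The original snippet is not preserved in this thread, so the function name `instantiate_size_check` is hypothetical, and the `const` assertion is a separate, more explicit technique added here for comparison.

```rust
use std::mem::size_of;

// Explicit compile-time check: the build fails on any target where the
// two sizes differ.
const _: () = assert!(size_of::<bool>() == size_of::<u8>());

/// Hypothetical helper: naming `transmute` with concrete input and output
/// types is enough for the compiler to run its size-equality check.
/// `transmute` itself is never called here.
#[allow(dead_code)]
fn instantiate_size_check() {
    let _ = std::mem::transmute::<bool, u8>;
}

fn main() {
    println!("bool occupies {} byte(s) on this target", size_of::<bool>());
}
```

Either form turns a size assumption about `bool` into a build error on a target where it does not hold, rather than a silent miscompilation.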
cuviper commented on Dec 7, 2018
Historically, 32-bit `powerpc-darwin` was another one with `sizeof(bool) == 4`.
joshtriplett commented on Dec 7, 2018
Cray didn't have byte-level addressing, but I don't think anyone needs to target the Cray architecture anymore.
I think we should mandate that `bool` is always the platform's `_Bool`. And to the extent we need to, let's always assume 8-bit bytes, rather than spending time and resources future-proofing against a platform that doesn't yet exist.
gnzlbg commented on Dec 8, 2018
What does that buy us?
Pros of `bool == _Bool`:
- people do not need a `c_bool` on C FFI out of compatibility concerns with weird targets
Cons of `bool == _Bool`:
- `unsafe` code cannot rely on `size_of::<bool>() == 1` if it wants to be "truly portable"
- `unsafe` code might not be able to rely on `transmute::<_, i32>(true) == 1` and `transmute::<_, i32>(false) == 0` if it wants to be truly portable (portable to C99 platforms for example)
- people might use a `struct Bool(u8)` in Rust out of portability paranoia
On one hand, I think that we should not make it impossible to write Rust that targets weird C platforms - there should not be room or excuses for people to use C. On the other hand, targeting weird C platforms is going to be hard anyway, most of the ecosystem might not be reusable, at least "as is", on those, etc. Is it worth it to leave most Rust users wondering about `bool` to make targeting these weird C platforms slightly easier? I don't know. Can, e.g., `-sys` Rust C FFI wrappers be reused at all on those weird targets? I don't know either.
As @rkruppe argued, just because we leave those things as "implementation defined" does not mean that people won't assume that they hold everywhere if that holds on all platforms they care about. I don't care about platforms where `_Bool` doesn't have the properties above, so I can just assume that they hold. The `bool == _Bool` definition allows that, so as long as we word everything correctly, it's a good choice IMO.
Ixrec commented on Dec 9, 2018
To clarify, `c_bool` would become a legitimately useful type if we ever did add support for a "weird C platform"/platform with large bools, right?
So as long as there's a fundamental conflict between guaranteeing the size on weird future platforms and guaranteeing C interop on weird future platforms, and "targeting weird C platforms is going to be hard anyways", it seems we gain a lot more from making `bool` predictably small for all the ordinary pure-Rust use cases and then keeping in mind that we'll have to introduce an official `c_bool` type if we ever add a platform where it would be useful (and we probably shouldn't do that before the "portability lint" becomes a real thing).
strega-nil commented on Dec 9, 2018
@Ixrec the team has decided that it's better to be compatible with C than predictably 1 byte, with bit patterns exactly `0b0` and `0b1` (although there's a close to zero percent chance that this decision ever gets tested).
gnzlbg commented on Dec 9, 2018
Gankra commented on Dec 9, 2018
regardless of our decision, bool's sizing is no different from usize vs u32/u64, which I don't believe is considered an "interesting" problem in miri
RalfJung commented on Dec 9, 2018
No, it implements an abstract machine where `bool` is as large as it is on the target platform. Sizes and layout of types in miri are entirely determined by rustc, using the same code as what is used for codegen.
That's correct. But it seems that'll soon be fixed in C/C++ as well so it seems fine.
gnzlbg commented on Dec 13, 2018
We discussed this briefly in today's meeting, so I think we can close this. We agreed on writing down something along these lines (the exact wording will be in a PR, we can discuss the nits there - cc @avadacatavra ):
Representation invariant of bool
Rust's `bool` has the same layout as C17's `_Bool`, that is, its size and alignment are implementation-defined.
note: on all platforms that Rust currently supports, the size and alignment of `bool` are 1, and its ABI class is `INTEGER`.
Validity invariant of bool
Rust's `bool` has the same valid representations as C17's `_Bool`, that is, two valid implementation-defined bit-patterns corresponding to `true` and `false` - all other bit-patterns are invalid.
note: on all platforms that Rust currently supports, `0x0` is the bit-pattern of `false`, and `0x1` is the bit-pattern of `true` - all other bit-patterns are invalid.
note: there are two proposals, N2218 for C and P0907 for C++, which propose defining `0x0` as the bit-pattern for `false` and `0x1` as the bit-pattern for `true` in the C and C++ standards.
We didn't say anything about the safety invariant of `bool`, but I don't think there is much to say about that yet. It was also suggested that we should warn on `unsafe` code that makes assumptions about, at least, the size and alignment of `bool`. We might need to add a note about this, or maybe open a rust-lang/rust issue about doing that?
There was interest in documenting the "controversy" here but there was some agreement that there isn't really much controversy about this, just different goals that are in tension.
EDIT: briefly about the goals in tension: the unsafe code guidelines answer the question "What is `unsafe` code allowed to assume about `bool`?". From the point of view of answering this question, the goal would be a simple and clear answer, like: `bool` has size 1, `0x1` is `true` and `0x0` is `false`. Otherwise, `unsafe` code that assumes any of this is "wrong" in some sense, and people might end up writing more complicated `unsafe` code to avoid relying on these assumptions.
The T-lang and T-compiler team considered many goals, like "people should use `bool` in C FFI instead of `c_bool`", "people should use `bool` in Rust instead of `u8`", etc. The `bool == _Bool` decision balances all these goals. In practice, for users that write code (safe or `unsafe`) that aims to target "mainstream" platforms, the distinction does not really matter.
`bool` FFI-safe? rust-lang/rust#95184