Skip to content

fix Issue 18016 - using uninitialized value is considered @safe but has undefined behavior #2260

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
26 changes: 21 additions & 5 deletions spec/declaration.dd
Original file line number Diff line number Diff line change
Expand Up @@ -604,18 +604,34 @@ $(GNAME VoidInitializer):
$(P Normally, variables are initialized either with an explicit
$(GLINK Initializer) or are set to the default value for the
type of the variable. If the $(I Initializer) is $(D void),
however, the variable is not initialized. If its value is
used before it is set, undefined program behavior will result.
however, the variable is not initialized.
)

$(UNDEFINED_BEHAVIOR If a void initialized variable's value is
used before it is set, the behavior is undefined.
$(IMPLEMENTATION_DEFINED If a void initialized variable's value is
used before it is set, the value is implementation defined.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation defined means that the implementation must define the behavior, meaning that all D compilers must have this defined in their docs, which is obviously not the case. Hence why I think this should remain undefined.

Copy link
Member

@PetarKirov PetarKirov Mar 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for the bug report in question, I think void initialized variables should not be usable in @safe code, unless we add data flow to the frontend which can prove the safety, which is not planned for near future.
(Because while it can be argued that this is memory safe, it is neither type-safe, nor something that we should allow to be easily overlooked in code review, due to the huge potential for logic errors.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that the compilers don't define the behavior doesn't make it undefined according to the language spec. That's a bug against the implementation, not the spec.

As for safety, that means memory safety. Ints with random values are not memory unsafe.

Copy link
Member

@PetarKirov PetarKirov Mar 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation defined behavior is something that you can rely on, given that you make sure you understand the differences between the implementations (if you want to make your code portable). Are you saying that you can rely to find particular values in a void initialized static array like char[128] arr = void; even when compiling with optimizations? I suspect that some optimizing backends like GCC and LLVM may consider such code (after data flow analysis) as having undefined behavior and subsequently generate arbitrary behaving programs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation defined value means the implementation defines what the value is, which may include "arbitrary bit pattern". Undefined behavior is altogether different, as it would include "execute arbitrary code". If the implementation does the latter, it is bug in that implementation.

Copy link
Contributor

@JinShil JinShil Jun 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to imply that the implementation is free to define the contents of a void-initialized variable. The implementation may choose to define it as 0. I suppose a security-hardened implementation may choose to do that. The implementation could also define it to be "whatever value was last written to that memory location either by the currently running process or some other process", but that would still be defined. I'm leaning towards @WalterBright's interpretation.

I suspect that some optimizing backends like GCC and LLVM may consider such code (after data flow analysis) as having undefined behavior and subsequently generate arbitrary behaving programs.

That's what's kindof weird about this. If it is implementation-defined, then isn't the implementation free to define it as undefined behavior?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm nervous that you think this is @safe... why would you say code that interacts with verifiably uninitialised data is cool and normal? It could do anything as a side-effect, and it's basically a gift-wrapped invitation to anyone crafting exploits.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WalterBright If the value is implementation defined, the implementation needs to define what the value is for each type will be if you use void-initialization. The point of void-initialization is to not initialize the memory. How can an implementation correctly specify what the value should be if it reuses random memory from other parts of the program? This change is bad, it requires implementations to default-initialize void-initialized variables.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C compilers leave it implementation defined, and indeed some compilers leave it as whatever happens to be in that memory location, and others set it to zero. I think that's quite reasonable, and it isn't necessary to define it further.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C compilers leave it implementation defined

Sorry to drop in at random (the PR got linked from a forum discussion thread), but just as a point of fact: the C99 spec §6.7.8 "Initialization" para 10 states: If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate.

§3.17.2 defines an "indeterminate value" as "either an unspecified value or a trap representation". And §3.17.1 explicitly defines "implementation defined" as "unspecified value where each implementation documents how the choice is made".


---
import std.stdio;
void foo()
{
int x = void;
writeln(x); // will print garbage
writeln(x); // will likely print garbage
}
---
)

$(UNDEFINED_BEHAVIOR If a void initialized variable is a reference type
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does reference type wording cover types with indirections too?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A better way to word this paragraph would be:

Void-initialization of variables whose types have unsafe values is not allowed in @safe code.

and its value is dereferenced before it is set, the behavior is undefined.
Such initializations are not allowed in `@safe` code.

---
void foo()
{
int* p = void;
*p = 3; // undefined behavior
int x;
p = &x;
*p = 3; // ok
}
---
)
Expand Down