Skip to content

1.24.1 announcement #230

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Mar 2, 2018
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
257 changes: 257 additions & 0 deletions _posts/2018-03-01-Rust-1.24.1.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,257 @@
---
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if we're still planning on shipping today or waiting till tomorrow, but if we wait, the filename needs to change so that the URL is correct.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shipping today, so this is resolved!

layout: post
title: "Announcing Rust 1.24.1"
author: The Rust Core Team
---

The Rust team is happy to announce a new version of Rust, 1.24.1. Rust is a
systems programming language focused on safety, speed, and concurrency.

If you have a previous version of Rust installed via rustup, getting Rust
1.24.1 is as easy as:

```bash
$ rustup update stable
```

If you don't have it already, you can [get `rustup`][install] from the
appropriate page on our website, and check out the [detailed release notes for
1.24.1][notes] on GitHub.

[install]: https://www.rust-lang.org/install.html
[notes]: https://github.com/rust-lang/rust/blob/stable/RELEASES.md#version-1241-2018-03-01

## What's in 1.24.1 stable

As you know, we tend not to release point releases. In this case, we had a few regressions in 1.24.0.
Each of them individually was not a "hair on fire" moment, but there's two reasons we decided to
produce a point release. First, collectively, they were big enough to consider as a whole. Second, we
want our attitude towards regressions to be "if we're not sure, default to reverting." Given that
it was not clear that we *shouldn't* release a point release, we decided to err on the side of
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to read this sentence 3 times to get it straight in my head. Is there another way we could word this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I would nix this whole paragraph, and just say that "Several minor regressions were found which collectively merited a release."

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually am not fond of the phrasing starting at "First, collectively...". I wouldn't enumerate the reasons to produce a point release at all, in fact, because the main thing I think a typical reader will want enumerated is the set of regressions, and enumerating two lists is more confusing than enumerating one.

So I'd suggest replacing the entire paragraph with:

As you know, we avoid point releases: new features are released as part of the standard release cadence, and our release process and beta testing do a fairly good job of helping us avoid regressions in each new release. When regressions do occur, however, our philosophy is to err on the side of reverting. In this case, there were a few regressions in 1.24.0 that were collectively important enough to necessitate this release.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dang, I thought I refreshed the page before drafting my comment, but apparently I didn't. @aturon is correct that we can eliminate the paragraph entirely if we want.

cautuion.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*caution


A quick summary of the changes: there are four.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we may want to call out specifically that only two of these are actual regressions, and the other two were... well.. sort of "piled on" the release already happening.

Specifically the unwind abort stuff caused the lua regression, the linker files were a "regression" in the sense that debug builds stopped working in some cases on windows, and the index generator/cargo warnings aren't fixing regressions but rather just fixing builds.

I'd personally be ok not mentioning the error index stuff or cargo pieces at all, they're sort of "neutral motivation" for doing a point release anyway.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I figured they're pretty small, and this stuff happened so rarely, that it's worth mentioning them. I'd be okay removing the long explanations, and just linking to the issue, maybe that's a decent compromise?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nah I think the longer explanations are fine, but could it be explicitly called out here though that the first two are regressions that we're fixing and the latter two are "changes on beta we decided were important enough to accelerate to 1.24.1, but are not fixing regressions"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, the error index stuff actually is a regression, in that self-rebuilds worked fine on earlier releases. But it's not something I would have requested a point release for, as I can easily patch this at the distro level (and I already have).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense to call the Windows 7 TLS issue a regression as well. It might not have been caused by an immediate code change in Rust, but it was caused by the choice to use Github for the index, and inadequate preparation/research on our side to ensure there would be no impact from the TLS change. From an affected users point of view, builds went from working to not working with no action on their side. That's a regression IMO

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is basically the struggle with explaining this. I think they're all regressions, for the reasons laid out here, but weren't serious enough to normally request a release. I tried to explain that above, but I think that @aturon 's suggestion of conciseness is important too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"there are four" seems unnecessary given the bullets :)


* Do not abort when unwinding through FFI
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should make more clear up front that this is reverting the 1.24.0 change.

* Emit UTF-16 files for linker arguments on Windows
* Make the error index generator work again
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can cut this that'd be great, it's really only relevant for distros AFAIK.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it felt really weird to me with so few changes to leave one out

* Cargo will warn on Windows 7 if an update is needed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These last two are less immediately obvious than the first two. Is it worth linking to the respective issues here, even though they're explained below?


In general, if you haven't run into any build issues, the "do not abort when unwinding through FFI"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's good to have this summary, but it could be more clear. I think what you're saying is:

"If your code is continuing to build, then the only issue that may affect you is the unwinding issue."

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aturon 's suggested replacement sentence could replace this entire paragraph except for the last sentence ("We plan on bringing..."), which I think is still important.

change is the only one you should care about. The others are also important, but if 1.24.0 has been
working well for you, this is the only change that affects you. We plan on bringing this behavior back
in 1.25 or 1.26, depending on how smoothly the new strategy goes.

With that, let's dig into the details!

### Do not abort when unwinding through FFI

Quoting [the 1.24 annoucement](https://blog.rust-lang.org/2018/02/15/Rust-1.24.html):

> There’s one other change we’d like to talk about here: undefined behavior.
> Rust generally strives to minimize undefined behavior, having none of it in
> safe code, and as little as possible in unsafe code. One area where you could
> invoke UB is when a panic! goes across an FFI boundary. In other words, this:

```rust
extern "C" fn panic_in_ffi() {
panic!("Test");
}
```

> This cannot work, as the exact mechanism of how panics work would have to
> be reconciled with how the "C" ABI works, in this example, or any other ABI
> in other examples.
>
> In Rust 1.24, this code will now abort instead of producing undefined behavior.

We have reverted this behavior, and 1.24.1 will reintroduce the undefined behavior.
Fundamentally, this is a soundness fix, and we still plan on reintroducing the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when you say "this is a soundness fix" it's not clear that you're referring to the original 1.24.0 change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lede is a bit buried here. You should say up front that, as implemented, it caused some breakage, so ... (and then the things you're saying.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also if possible this should briefly summarize who is affected.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lemme update this, having read further: given how long and detailed the story you tell is, it's really important to provide a concise summary of what was affected.

abort, but are going to be rolling it out a bit more slowly, and with a different
implementation.

It started with [a bug filed against the `rlua`
crate](https://github.com/chucklefish/rlua/issues/71). `rlua` is a package
that provides high level bindings between Rust and the [Lua programming
language](https://www.lua.org/).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is probably worth explicitly acknowledging here that the rlua implementation depends on undefined behavior that just happens to work on all known OSes as of Rust 1.23. (That is correct, right?)

Copy link

@kyren kyren Mar 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is probably worth explicitly acknowledging here that the rlua implementation depends on undefined behavior that just happens to work on all known OSes as of Rust 1.23. (That is correct, right?)

I'm going to attempt to explain the situation so that if I get something wrong somebody knowledgeable will correct me, because I really want to know the precise answer as well!

I think the question of whether what I was doing was / is UB is entirely a question of Rust specification. As far as I am aware, there wasn't much documentation or official decisions on whether or not a longjmp as implemented on unixes was something that should work across Rust frames, but it was clear that if the Rust frame being jumped over was "trivial", this "should not" cause any issues. The problem only arose from a combination of two factors, which is that longjmp on windows was implemented using SEH and I was not aware of this, and Rust's addition of automatic exception handling on extern fn boundaries.

Now, I think it could be argued that since there was definite documentation on unwinding through rust being disallowed, and SEH is certainly the sort of unwinding that was being referred to, rlua with rust 1.23.0 on windows was relying on UB that just happened to work. You could possibly also argue about the precise definition of "unwinding" and say that rlua was relying on UB categorically, I'm honestly not sure whether unwinding refers only to things like exception handling or also to simple stack manipulation. I think there was consensus in the bug report, though, that disallowing interacting with C APIs that longjmp is not great for Rust's C interop story, so it SEEMED that the decision was made that longjmp is something that SHOULD be supported, at least in the precise limited way that rlua was trying to use it, because otherwise there are quite a few C APIs that just become extremely painful to use, and would REQUIRE extremely complex C wrapper APIs. There was also talk about supporting interop with exceptions and unwinding more generally going forward, and doing a better job of checking conditions such as "nothing that implements Drop is on the stack", but obviously that wasn't going to be part of the current fix.

So, the end result is that.. you could argue that rlua was relying on UB or at least unspecified behavior but is now NOT relying on UB, simply because it was decided that it shouldn't be UB. It's kind of a messy question!

If I've gotten any of that wrong let me know, because I don't want to misrepresent the intentions of the people who ACTUALLY make these decisions.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know I'm being verrrry hand wavey with the previous explanation, and if you feel skeptical about my reasoning, I don't blame you.

We were ultimately aware of how weird and potentially bad longjmping over Rust was, and rlua had an old issue with a pretty long discussion about potential fixes in it. The problem though, ultimately, was that the solutions were just really bad, and not being able to do this makes Rust / C interop feel very limited.

I hesitate even still to call the rust behavior a "bug" per se, it felt to me like the bug report was more a process to discuss an edge case of C interop, and decide what Rust should do going forward.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kyren

I would assert that UB is always a matter of language specification! (C++ folks will get touchy on this point; anything that is specified by the OS is merely unspecified, not undefined.)

Since Rust didn't provide a guarantee of a particular behavior, I think I would indeed consider the situation (as I understand it) UB. But I also wouldn't necessarily consider that a "bug", because UB just means that the Rust language doesn't impose a behavioral requirement. In this case, it seems that cross-platform behavior was consistent and correct, even without a formal specification of the desired or actual behavior.

I would also say that, regardless of whether we call this "UB" or not, the ideal path would be to make Rust guarantee that this use-case will work in the future, because it seems clear that it's a valuable behavior. And, yes, that would be equivalent to saying that in 1.24.0 and earlier, it was UB, even though it works correctly in 1.23 and earlier; and that in 1.24.1+ it won't be, "simply because it was decided that it shouldn't be UB."

Copy link

@kyren kyren Mar 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, okay we're on the same page then. I've had this conversation a few times, and sometimes when people say UB they mean it as a synonym for "definitely should not do it" or "the specification explicitly states that it is UB" or simply "you have a bug" and not "the specification does not explicitly allow it or does not address it yet". I was attempting to draw a distinction between "unspecified" and "undefined" that lumped in decisions yet to be made into "unspecified", which I guess is not really accurate. I think in C++ parlance they sometimes use the words unspecified or "platform specified" to delineate between behavior that they have actively decided compilers are allowed to do whatever they want, and behavior that is simply not part of the specification.

Communicating and reasoning about this is really hard, man!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kyren Yeah, I think we are on the same page. I didn't want to get too much into my personal gripes with the state of affairs in C and C++, but suffice to say that I think the attitude that "if you have UB you deserve whatever bad thing happens to you" is unhelpful at best. I have also come to believe that it's somewhat disingenuous and ahistorical; supercat's comments here were particularly enlightening to me. (They commented on almost every answer; their comments are, oddly, more valuable to me than their answer.)

You're right that "undefined" and "unspecified" are different, especially in C++ world. But the C++ standard explicitly states that "compilers are allowed to do whatever they want" and "behavior that is simply not part of the specification" is generally the same thing unless explicitly stated otherwise. I think that's...not a great approach, especially when the committee has little to no interest or incentive to reduce the amount of undefined behavior in the language over time.

In Rust things really are a little fuzzier, though, since we don't have a formal language specification the way C++ does. Then again, C++ got by without one for over a decade, which the UB zealots (sorry, can't help myself) conveniently tend to forget...


> Side note: `rlua` is maintained by [Chucklefish](https://chucklefish.org/),
> a game development studio from London that's using Rust. Lua is a very
> popular language to use for extending and scripting games. We care deeply about
> production Rust users, and so handling this was a very high priority for the
> Rust team.

In short, `rlua` error handling just... broke. On Windows, and only on Windows,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this "just... broke" phrase feels overly dramatic and vague.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I think you can just cut the sentence)

any attempt to handle errors from Lua would simply abort. This makes `rlua`
unusable, as any error of any kind within Lua causes your program to die.

After digging in, the culpurit was found: `setjmp`/`longjmp`. These functions
are provided by the C standard library as a means of handling errors. You
first call `setjmp`, and then, at some later point in time, call `longjmp`.
When you do, control flow returns to where you had previously called
`setjmp`. This is often used as a way to implement exceptions, and sometimes,
even coroutines. Lua's implementation uses `setjmp`/`longjmp` [to implement
exceptions](https://www.lua.org/pil/24.3.html):

> Unlike C++ or Java, the C language does not offer an exception handling
> mechanism. To ameliorate this difficulty, Lua uses the setjmp facility from
> C, which results in a mechanism similar to exception handling. (If you
> compile Lua with C++, it is not difficult to change the code so that it uses
> real exceptions instead.)

The issue is this: what happens when some C code `setjmp`/`longjmp`'s through
Rust stack frames? Because drop checking and borrow checking know nothing
about this style of control flow, if you `longjmp` across a Rust stack
frame that has any type that's not `Copy` on its stack, undefined
behavior will result. However, if the jump happens entirely in C, this
should work just fine. This is how `rlua` was managing it: every call
into Lua is [wrapped with `lua_pcall`](https://www.lua.org/pil/24.3.2.html):

> When you write library functions for Lua, however, there is a standard way
> to handle errors. Whenever a C function detects an error, it simply calls
> `lua_error`, (or better yet `luaL_error`, which formats the error message and
> then calls `lua_error`). The `lua_error` function clears whatever needs to be
> cleared in Lua and jumps back to the `lua_pcall` that originated that
> execution, passing along the error message.

So, the question becomes: Why does this break? And why does it break on
Windows?

When we talked about `setjmp`/`longjmp` inititally, a key phrase here wasn't
highlighted. Here it is:

> After digging in, the culpurit was found: `setjmp`/`longjmp`. These functions
> are *provided by the C standard library* as a means of handling errors. You
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The " You" at the end of this line needs to be deleted.


These functions aren't part of the C language, but part of the standard
library. That means that platform authors implement these functions, and
their implementations may differ.

Windows has a concept called SEH, short for ["Structured Exception
Handling"](https://msdn.microsoft.com/en-us/library/windows/desktop/ms680657(v=vs.85).aspx).
Windows uses SEH to implement `setjmp`/`longjmp`, as the whole idea of SEH
is to have uniform error handling. For similar reasons, C++ exceptions use
SEH, as do Rust panics.

Before we can sort the exact details of what's happening, let's look at how `rlua`
works. `rlua` has an internal function, `protect_lua_call`, used to call into
Lua. Using it looks like this:

```rust
protect_lua_call(self.state, 0, 1, |state| {
ffi::lua_newtable(state);
})?;
```

That is, `protect_lua_call` takes some arguments, one of which is a closure. This
closure is passed to `lua_pcall`, which catches any `longjmp`s that may be thrown
by the code passed to it, aka, that closure.

Consider the code above, and imagine that `lua_newtable` here could call
`longjmp`. Here's what should happen:

1. `protect_lua_call` takes our closure, and passes it to `lua_pcall`.
2. `lua_pcall` calls `setjmp` to handle any errors, and invokes our closure.
2. Inside our closure, `lua_newtable` has an error, and calls `longjmp`.
3. The initial `lua_pcall` catches the `longjmp` with the `setjmp` it called earlier.
4. Everyone is happy.

However, `lua_newtable` is an `extern fn` to Rust. So, with the chances in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this section may not be quite right in the sense that lua_newtable isn't necessarily the problem here (AFAIK) but rather the wrapper inside of the protect_lua_call function, do_call. It's true that lua_newtable is an extern fn but I think that's only b/c it's a macro in C (presumably?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the part i was gonna ask you about in gitter, heh.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah @alexcrichton is correct, all of those functions defined in the ffi module are just re-implementations of what is otherwise a macro in C. The easiest way to explain it might be to use some call that is a direct call into the Lua C API, and one that has a clear chance of longjmping to somebody familiar with Lua.. something like lua_settable would work fine? lua_newtable WILL longjmp on you, but it might not be widely known because it requires invoking errors in __gc metamethods and it's generally very much a corner case. When explaining it, you probably don't have to talk about do_call or anything, you can actually just use the closure itself as the example, since the closure is the one that's calling the callback that's longjmping, that's not allowed! You wouldn't really be able to call functions that longjmp at all, because you would always have to have some rust code in between to even name the functions you were calling.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, side note, you made me stare at protect_lua_call for a while and find a bug, so thank you for that.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eh, I take back what I said about describing the problem using the closure, because the problem is really at the other side of the boundary, where Rust is being called from C, and it doesn't highlight that. I'm not actually sure how to explain that well without going into the lua_CFunction protocol and messy internal details though?

1.24.1, it sets up a panic handler that will cause an abort. In other words,
the code sorta looks something like this psuedo code now:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: psuedo
also So, with the chances, is that supposed to be changes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And 1.24.1 should be 1.24.0 here, right? The panic handler is what we're reverting. (Unless I am completely lost, which is also possible.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's correct, great catch


```rust
protect_lua_call(self.state, 0, 1, |state| {
let result = panic::catch_unwind(|| {
ffi::lua_newtable(state);
});

if result.is_err() {
process::abort();
}
})?;
```

Earlier, when discussing `setjmp`/`longjmp`, we said that the issue with it in
Rust code is that it doesn't handle Rust destructors. So, on every platform but
Windows, the above `catch_unwind` shenanigans is effectively ignored, so
everything works. However, on Windows, since both `setjmp`/`longjmp` and Rust
panics use SEH, the `longjmp` gets "caught", and runs the new abort code!

The [solution here](https://github.com/rust-lang/rust/pull/48572) is to
generate the abort handler, but in a way that `longjmp` won't trigger. Eventually,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sentence is missing an "it"; it should end "...longjmp won't trigger it." Alternatively, "...won't trigger the handler."

it's likely we'll have to register our own personality functions and such, but
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd probably leave out this sentence about personalities, despite my own personal love of talking about "eh personalites" whenever exception handling comes up

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is probably at least worth mentioning that the solution described won't be sufficient in the long run, since that's otherwise not clear.

this is good enough for now. It's not 100% clear if this will make it into Rust 1.25;
if the landing is smooth, we may backport, otherwise, this functionality will
be back in 1.26.

### Emit UTF-16 files for linker arguments on Windows

In constrast with the previous bug, which is very complex and tough to understand,
this bug's impact is simple: if you have non-ASCII paths in the directory where
you invoke `rustc`, in 1.24, it would incorrectly error with a message like
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/it would/it may/

perhaps?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it happened sometimes but not always, it should be "it could", not "it may".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, my understanding is that this bug happens 100% of the time when the conditions were met. If that's not true, I'd go with "could", but I don't care a ton!


> fatal error LNK1181: cannot open input file

The PR that caused it, [#47507](https://github.com/rust-lang/rust/pull/47507),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how pedantic we want to be here, but the full story here would be that:

  • When incremental compilation is enabled, we pass a bunch more object files to the linker
  • Windows has relatively low limits on the length of a command line before a process fails to spawn
    • AKA when you enabled incremental compilation (as we did by default in 1.24.0) it would quickly blow the OS command line limits on windows
  • This PR fixed this "regression" by ensuring that debug builds work by default more often by doing the only thing we can do, passing arguments through a file rather than the command line

The lua explanation above though is already pretty long, though, so maybe not worth going into details here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with whatever level of pedantic everyone feels is appropriate. :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it looks like the paragraph quoted from #47507 in the release notes hints at but does not fully describe the problem 47507 is addressing, and does not at all indicate what actually changed in 47507.

Before bikeshedding possible verbiage for summarizing these points, one thing I don't understand is: if we were already generating these files on Windows when the cmd process failed to spawn, were the files already inappropriately encoded in UTF-8? If not, why did the encoding change in 47507, and if so, why do we consider the regression to have been "caused by" 47507?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the main thing I'm still not clear on.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should leave it as-is -- most people won't care, those that do probably know more than us / as much as us, so anything we say here will be overly verbose anyway.

In all likelihood this was actually technically caused by #44094, it's just that #475707 made it more likely (well, or incremental compilation, depending on how you look at it).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I care, because I've worked with various build systems in the C++ world on Microsoft and am familiar with this problem. So, in particular, I'd like to know if 1.23 and earlier can run into this issue.

has a good explanation of the behavior that ended up causing the problem:

> When spawning a linker rustc has historically been known to blow OS limits
> for the command line being too large, notably on Windows. This is especially
> true of incremental compilation where there can be dozens of object files per
> compilation. The compiler currently has logic for detecting a failure to
> spawn and instead passing arguments via a file instead, but this failure
> detection only triggers if a process actually fails to spawn.

However, when generating that file, we were doing it incorrectly. As [the
docs state](https://docs.microsoft.com/en-gb/cpp/build/reference/unicode-support-in-the-compiler-and-linker#linker-response-files-and-def-files):

> Response files and DEF files can be either UTF-16 with a BOM, or ANSI.

We were providing a UTF-8 encoded file, with no
[BOM](https://en.wikipedia.org/wiki/Byte_order_mark). The fix is therefore
straightforward: produce a UTF-16 file with a BOM.

### Make the error index generator work again

When packaging Rust for various Linux distros, it was found that [building
1.24 with 1.24 fails](https://github.com/rust-lang/rust/issues/48308).
This was caused by an incorrect path, causing certain metadata to not
be generated properly.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This paragraph doesn't include the phrase "error index" at all and therefore (in my opinion) doesn't shed much light on what "Make the error index generator work" means. Could someone (who actually knows) provide a bit more detail?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the least interesting of these bugs, and the post is already long. Since it only affects a very small number of people, I decided to skimp on explaining it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@steveklabnik Maybe "certain metadata" could be replaced with "the index of (...???...) errors"? (I'm not exactly sure what error_index_generator is or does.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in favor of removing this from the blog post entirely -- it's tiny, and AFAIK only affects distros.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, let's replace "Make the error index generator work again" with "Fix Rust's ability to build itself on certain Linux distros". I agree with @steveklabnik that there's not much of a good reason to leave out one of only four changes, but linking to the issue is sufficient for anyone interested to see what broke and how it was fixed.


### Cargo will warn on Windows 7 if an update is needed.

In February of 2017, [GitHub announced that they were dropping support for
weak cryptographic
standards](https://githubengineering.com/crypto-deprecation-notice/). One
year later, in February of 2018, [the deprecation period is over, and support
is
removed](https://blog.github.com/2018-02-23-weak-cryptographic-standards-removed/).
In general, this is a great thing.

Cargo uses GitHub to store the index of Crates.io, our package repository.
It also uses `libgit2` for `git` operations. `libgit2` uses
[WinHTTP](https://msdn.microsoft.com/en-us/library/windows/desktop/aa382925(v=vs.85).aspx)
for making HTTP calls. As part of the OS, its feature set depends on the OS you're using.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the phrase "As part of the OS" is unnecessary here...? If not, I'm not sure what it's intended to convey.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd leave it as-is I think, but not sure.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this saying that WinHTTP is part of the OS? I didn't realize that (didn't click on the link... oops). That does make some sense.

Maybe:

WinHTTP, part of the Windows OS, for making HTTP calls. The WinHTTP feature set depends on the specific version of Windows.


> This section uses "Windows 7" to mean "Windows 7, Windows Server 2018, and Windows Server 2012",
> because it's much shorter. The following applies to all three of these editions of Windows,
> however.

Windows 7 [received an update](https://support.microsoft.com/en-us/help/3140245/update-to-enable-tls-1-1-and-tls-1-2-as-a-default-secure-protocols-in)
in June of 2016 regarding TLS. Before the patch, Windows 7 would use TLS 1.0 by default. This update
allows for applications to use TLS 1.1 or 1.2 natively instead.

If your system has not received that update, then you'd still be using TLS 1.0. This means
that accessing GitHub would start to fail.

`libgit2` [created a fix](https://github.com/libgit2/libgit2/pull/4550), using the `WinHTTP` API
to request TLS 1.2. On master, we've [updated to fix this](https://github.com/rust-lang/cargo/pull/5091),
but for 1.24.1 stable, we're [issuing a warning](https://github.com/rust-lang/cargo/pull/5069),
suggesting that they upgrade their Windows version. There's a balance here: we could have backported
the `libgit2` fixes, but that'd require updating a lot of code, which feels incorrect for a point
release, especially since the issue does not affect patched systems.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested rewording for last sentence, removing the "balance" phrasing:

Although the libgit2 fix could have been backported, we felt that the code change footprint was too large for the point release, especially since..."


## Contributors to 1.24.1

[Thanks!](https://thanks.rust-lang.org/rust/1.24.1)