Skip to content

ENV as a normal Dict: populate at boot, copy when spawning children #36494

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

StefanKarpinski
Copy link
Member

@StefanKarpinski StefanKarpinski commented Jun 30, 2020

Instead of modifying the process environment in-place, which is inherently thread-unsafe because of how most libc libraries are implemented, this changes the ENV global to be populated from the process environment at Julia's startup time and when starting a child process it populates the child's environment with a copy of ENV so the process's environment is never modified.

@Keno
Copy link
Member

Keno commented Jun 30, 2020

I'm ok with it in general, but I'm not quite sure we can do it in 1.x, since there's certainly packages relying on being able to modify ENV and having that picked up by C libraries that they load.

@StefanKarpinski
Copy link
Member Author

There was a lot of concern in a recent discussion thread about the fact that modifying ENV is currently not thread-safe and this was the only viable solution anyone could come up with. Is that concern about thread safety not actual?

@Keno
Copy link
Member

Keno commented Jun 30, 2020

It's a problem, but somewhat tempered for the moment by threading being not very common and even when people use threading having these kinds of modifications happen mostly at load time. I'm just not sure we can do it now. At the very least, we'd need to survey existing packages and have a list of things that need to get fixed.

the child process. This means that `ENV` cannot be used to communicate via the
current proccess's environment with libraries.
"""
const ENV = Dict{String,String}()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you need a lock?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it were possible to lock on an arbitrary object, then I'd say it's the callers responsibility to do the locking, but that's not possible so I guess we either need an official ENV_LOCK object or need to make this a threadsafe dict type, which I guess would therefore need to live in Base and might as well be exported.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A thread-safe Dict would be useful for many use-cases in addition to this one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, regarding being able to lock on arbitrary objects, this PR doesn't implement that directly but implements a struct that bundles objects and their locks together: #34400

@tkf
Copy link
Member

tkf commented Jun 30, 2020

This is a breaking change because, as discussed in #34726, modifying the process environment is the only way to communicate with some C libraries.

I think we need to make the thread-safe version opt-in with something like using Future: ENV, withenv. If the new API is not going to mutate/reflect the process environment, I think it might be better to use some kind of context variables system (e.g., #35833) to get rid of the lock completely.

@StefanKarpinski
Copy link
Member Author

I don't really have a dog in this fight: I neither particularly care about the libc thread safety issues, nor do I particularly care about C libraries that need to be communicated to via environment variables. I made this PR because I was looking at the relevant code and the upshot of the discussion in #34726 seemed to be that this was the best approach. So if people don't want this, feel free to close this PR.

@DilumAluthge
Copy link
Member

We could do a PkgEval run and fix the packages that break?

@tkf
Copy link
Member

tkf commented Jul 2, 2020

I think there are three concurrency concerns:

  • Trying to make C-level environment variables data-race free (DRF), at least when used via non-unsafe_ Julia API. Of course, this can only be a best-effort basis API since C libraries might mutate environment variables.
  • Making ENV[key] = value and ENV[key] DRF.
  • Making withenv(f, ...) or an equivalent API "safe"; i.e., concurrent executions rollback the set of environment variables to the one before. (This is what @staticfloat requested in API for setting default env for run() invocations #36440)

(The first two points would be identical if we keep ENV as the wrapper of the C-level environment variable.)

Note that using a lock is an OK solution for the first two points but rather terrible one for the last point. This is because withenv(f, ...) has to hold a lock while executing f. It'd make things DRF but you can't run, e.g., multiple processes concurrently using different executables installed via JLLs. If the lock is only at getindex/setindex!-level, we can't grantee that the environment variables are consistent during and after executing withenv (although it's DRF).

On top of these complications, there is a concern of backward compatibility. I think the best approach is to just add a lock in the current EnvDict (hope-for-the-best "DRF") and then implement Plan 1 of #34726 (comment) in Future (which requires something like #35833).

We could do a PkgEval run and fix the packages that break?

I think we are pretty sure that this is breaking. IIUC, Cairo.jl and Gtk.jl examples in #36440 (comment) using environment variables to configure external libraries. Even Base uses it.

Also (it's not particularity criticizing your comment, @DilumAluthge), I think we need to remember that there are tons of private code that is invisible to PkgEval. I think PkgEval should primary be used to catch the breaking changes we missed in the review but not for breaking the backward compatibility promise on purpose (although I know there are exceptions).

@DilumAluthge
Copy link
Member

That's a good point. Fixing the PkgEval failures would not be sufficient in this case.

@DilumAluthge DilumAluthge deleted the sk/env branch March 25, 2021 22:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants