create_app never finishes on Julia v1.11.1 when JULIA_NUM_THREADS is set #990

Open
juanromerohb opened this issue Oct 29, 2024 · 15 comments
Labels
bug git bisect wanted regression Julia 1.11 Represents a regression between Julia 1.10 and Julia 1.11

Comments

@juanromerohb

juanromerohb commented Oct 29, 2024

The same app that I was successfully compiling with Julia v1.10.6 in under 15 minutes no longer compiles after updating Julia to v1.11.1.

It gets stuck for more than 20 minutes on

⠸ [03m:30s] PackageCompiler: compiling nonincremental system image

The error it shows after I press Ctrl+C is

InterruptException:
Stacktrace:
  [1] try_yieldto(undo::typeof(Base.ensure_rescheduled))       
    @ Base .\task.jl:958
  [2] wait()
    @ Base .\task.jl:1022
  [3] wait(c::Base.GenericCondition{Base.Threads.SpinLock}; first::Bool)
    @ Base .\condition.jl:130
  [4] wait
    @ .\condition.jl:125 [inlined]
  [5] wait(x::Base.Process, syncd::Bool)
    @ Base .\process.jl:694
  [6] wait
    @ .\process.jl:687 [inlined]
  [7] success
    @ .\process.jl:556 [inlined]
  [8] run(::Cmd; wait::Bool)
    @ Base .\process.jl:513
  [9] run
    @ .\process.jl:510 [inlined]
 [10] #20
    @ C:\Users\juanh\.julia\packages\PackageCompiler\dFEAA\ext\TerminalSpinners.jl:157 [inlined]
 [11] spin(f::PackageCompiler.var"#20#22"{Cmd}, s::PackageCompiler.TerminalSpinners.Spinner{Base.TTY})
    @ PackageCompiler.TerminalSpinners C:\Users\juanh\.julia\packages\PackageCompiler\dFEAA\ext\TerminalSpinners.jl:164       
 [12] macro expansion
    @ C:\Users\juanh\.julia\packages\PackageCompiler\dFEAA\ext\TerminalSpinners.jl:157 [inlined]
 [13] create_sysimg_object_file(object_file::String, packages::Vector{String}, packages_sysimg::Set{Base.PkgId}; project::String, base_sysimage::String, precompile_execution_file::Vector{String}, precompile_statements_file::Vector{String}, cpu_target::String, script::Nothing, sysimage_build_args::Cmd, extra_precompiles::String, incremental::Bool, import_into_main::Bool)      
    @ PackageCompiler C:\Users\juanh\.julia\packages\PackageCompiler\dFEAA\src\PackageCompiler.jl:134
 [14] create_sysimg_object_file
    @ C:\Users\juanh\.julia\packages\PackageCompiler\dFEAA\src\PackageCompiler.jl:315 [inlined]
 [15] create_sysimage(packages::Vector{String}; sysimage_path::String, project::String, precompile_execution_file::Vector{String}, precompile_statements_file::Vector{String}, incremental::Bool, filter_stdlibs::Bool, cpu_target::String, script::Nothing, sysimage_build_args::Cmd, include_transitive_dependencies::Bool, base_sysimage::Nothing, julia_init_c_file::Nothing, julia_init_h_file::Nothing, version::Nothing, soname::Nothing, compat_level::String, extra_precompiles::String, import_into_main::Bool)
    @ PackageCompiler C:\Users\juanh\.julia\packages\PackageCompiler\dFEAA\src\PackageCompiler.jl:652
 [16] create_sysimage
    @ C:\Users\juanh\.julia\packages\PackageCompiler\dFEAA\src\PackageCompiler.jl:540 [inlined]
 [17] create_app(package_dir::String, app_dir::String; executables::Nothing, precompile_execution_file::Vector{String}, precompile_statements_file::Vector{String}, incremental::Bool, filter_stdlibs::Bool, force::Bool, c_driver_program::String, cpu_target::String, include_lazy_artifacts::Bool, sysimage_build_args::Cmd, include_transitive_dependencies::Bool, include_preferences::Bool, script::Nothing)
    @ PackageCompiler C:\Users\juanh\.julia\packages\PackageCompiler\dFEAA\src\PackageCompiler.jl:899
 [18] create_app(package_dir::String, app_dir::String)
    @ PackageCompiler C:\Users\juanh\.julia\packages\PackageCompiler\dFEAA\src\PackageCompiler.jl:842
 [19] top-level scope
    @ REPL[4]:1
 [20] eval
    @ .\boot.jl:430 [inlined]
 [21] eval
    @ .\Base.jl:130 [inlined]
 [22] repleval(m::Module, code::Expr, ::String)
    @ VSCodeServer c:\Users\juanh\.vscode\extensions\julialang.language-julia-1.127.2\scripts\packages\VSCodeServer\src\repl.jl:229
 [23] #112
    @ c:\Users\juanh\.vscode\extensions\julialang.language-julia-1.127.2\scripts\packages\VSCodeServer\src\repl.jl:192 [inlined]
 [24] with_logstate(f::VSCodeServer.var"#112#114"{Module, Expr, REPL.LineEditREPL, REPL.LineEdit.Prompt}, logstate::Base.CoreLogging.LogState)
    @ Base.CoreLogging .\logging\logging.jl:522
 [25] with_logger
    @ .\logging\logging.jl:632 [inlined]
 [26] (::VSCodeServer.var"#111#113"{Module, Expr, REPL.LineEditREPL, REPL.LineEdit.Prompt})()
    @ VSCodeServer c:\Users\juanh\.vscode\extensions\julialang.language-julia-1.127.2\scripts\packages\VSCodeServer\src\repl.jl:193
 [27] #invokelatest#2
    @ .\essentials.jl:1055 [inlined]
 [28] invokelatest(::Any)
    @ Base .\essentials.jl:1052
 [29] (::VSCodeServer.var"#64#65")()
    @ VSCodeServer c:\Users\juanh\.vscode\extensions\julialang.language-julia-1.127.2\scripts\packages\VSCodeServer\src\eval.jl:34

The difference I noticed is that before "PackageCompiler: compiling fresh sysimage (incremental=false)" it now prints thousands of warnings like these:

Warning: .drectve `-exclude-symbols:"julia_IRInterpretationState_54964.reloc_slot" ' unrecognized
Warning: .drectve `-exclude-symbols:"julia_YY.tarjanNOT.YY.379_64959.reloc_slot" ' unrecognized
Warning: .drectve `-exclude-symbols:jl_fvar_count_5 ' unrecognized
Warning: .drectve `-exclude-symbols:jl_fvar_ptrs_5 ' unrecognized
Warning: .drectve `-exclude-symbols:jl_clone_slots_5 ' unrecognized
Warning: .drectve `-exclude-symbols:jl_clone_idxs_5 ' unrecognized
Warning: corrupt .drectve at end of def file
Warning: .drectve `-exclude-symbols:jl_shard_tables ' unrecognized
Warning: .drectve `-exclude-symbols:jl_pgcstack_func_slot ' unrecognized
Warning: .drectve `-exclude-symbols:jl_pgcstack_key_slot ' unrecognized
Warning: .drectve `-exclude-symbols:jl_tls_offset ' unrecognized
Warning: .drectve `-exclude-symbols:jl_ptls_table ' unrecognized
Warning: .drectve `-exclude-symbols:jl_small_typeof ' unrecognized
@DilumAluthge
Member

  1. Which operating system are you using?
  2. Do you have an MWE?

@DilumAluthge DilumAluthge changed the title create_app never finishes on Julia v1.11.1 create_app never finishes on Julia v1.11.1 Oct 29, 2024
@DilumAluthge DilumAluthge added bug regression Julia 1.11 Represents a regression between Julia 1.10 and Julia 1.11 labels Oct 29, 2024
@juanromerohb
Author

  1. Which operating system are you using?
  2. Do you have an MWE?

I'm working on Windows 11 using Visual Studio Code with the Julia REPL extension.

An MWE I just tried starts by generating an empty Julia application:

using Pkg
Pkg.generate("MyApp")

Next, I modified the MyApp/src/MyApp.jl file to create a basic "Hello World" program:

module MyApp

function julia_main()::Cint
    println("Hello, World!")
end

end
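
A side note on the MWE module, sketched here as a variant rather than a change to the reproduction: `println` returns `nothing`, so a `julia_main` annotated with `::Cint` would fail to convert its return value if it were actually called. As far as this thread shows, that is unrelated to the build-time hang. The variant below returns an explicit exit status.

```julia
module MyApp

# Entry point in the form used by the MWE above.
function julia_main()::Cint
    println("Hello, World!")
    return 0  # explicit exit status; println itself returns `nothing`
end

end
```

Calling `MyApp.julia_main()` in a REPL is a quick way to check the entry point before starting a long build.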

Then I ran

using PackageCompiler
create_app("MyApp", "MyAppCompiled")

Compiling the application with PackageCompiler produced this output:

PackageCompiler: bundled libraries:
  ├── Base:
  │    ├── libLLVM-16jl.dll - 81.921 MiB
  │    ├── libatomic-1.dll - 262.406 KiB
  │    ├── libdSFMT.dll - 114.227 KiB
  │    ├── libgcc_s_seh-1.dll - 767.898 KiB
  │    ├── libgfortran-5.dll - 11.162 MiB
  │    ├── libgmp-10.dll - 1.054 MiB
  │    ├── libgmp.dll - 1.054 MiB
  │    ├── libgmpxx-4.dll - 293.766 KiB
  │    ├── libgmpxx.dll - 293.766 KiB
  │    ├── libgomp-1.dll - 1.645 MiB
  │    ├── libjulia-codegen.dll - 103.560 MiB
  │    ├── libjulia-internal.dll - 12.976 MiB
  │    ├── libmpfr-6.dll - 2.526 MiB
  │    ├── libmpfr.dll - 2.527 MiB
  │    ├── libopenlibm.dll - 384.742 KiB
  │    ├── libpcre2-16-0.dll - 712.625 KiB
  │    ├── libpcre2-16.dll - 712.625 KiB
  │    ├── libpcre2-32-0.dll - 684.148 KiB
  │    ├── libpcre2-32.dll - 683.281 KiB
  │    ├── libpcre2-8-0.dll - 773.922 KiB
  │    ├── libpcre2-8.dll - 774.789 KiB
  │    ├── libpcre2-posix-3.dll - 127.914 KiB
  │    ├── libquadmath-0.dll - 1.137 MiB
  │    ├── libssp-0.dll - 144.766 KiB
  │    ├── libstdc++-6.dll - 25.186 MiB
  │    ├── libuv-2.dll - 984.586 KiB
  │    ├── libwinpthread-1.dll - 330.039 KiB
  │    ├── libz.dll - 233.195 KiB
  │    ├── libjulia.dll - 226.492 KiB
  ├── Stdlibs:
  │   ├── OpenBLAS_jll
  │   │   ├── libopenblas64_.dll - 37.669 MiB
  │   ├── libblastrampoline_jll
  │   │   ├── libblastrampoline-5.dll - 2.250 MiB
  Total library file size: 292.973 MiB
✔ [04m:09s] PackageCompiler: creating compiler .ji image (incremental=false)
⣠ [01m:35s] PackageCompiler: compiling fresh sysimage (incremental=false)
[pid 17396] waiting for IO to finish:
 Handle type        uv_handle_t->data
⠋ [02m:52s] PackageCompiler: compiling fresh sysimage (incremental=false)

## [thousands of warning lines]

Warning: .drectve `-exclude-symbols:"julia_copytoNOT._61547.reloc_slot" ' unrecognized 
Warning: .drectve `-exclude-symbols:"julia_domsort_ssaNOT._68697.reloc_slot" ' unrecognized
Warning: .drectve `-exclude-symbols:"julia_heapifyNOT._68559.reloc_slot" ' unrecognized
Warning: .drectve `-exclude-symbols:"julia__unsetindexNOT._61390.reloc_slot" ' unrecognized
Warning: .drectve `-exclude-symbols:"julia_heapifyNOT._68550.reloc_slot" ' unrecognized
Warning: .drectve `-exclude-symbols:"julia_valid_typeof_tparam_65525.reloc_slot" ' unrecognized
⣠ [05m:08s] PackageCompiler: compiling fresh sysimage (incremental=false)

## [thousands of warning lines]

Warning: .drectve `-exclude-symbols:jl_clone_idxs_5 ' unrecognized
Warning: corrupt .drectve at end of def file
Warning: .drectve `-exclude-symbols:jl_shard_tables ' unrecognized
Warning: .drectve `-exclude-symbols:jl_pgcstack_func_slot ' unrecognized
Warning: .drectve `-exclude-symbols:jl_pgcstack_key_slot ' unrecognized
Warning: .drectve `-exclude-symbols:jl_tls_offset ' unrecognized
Warning: .drectve `-exclude-symbols:jl_ptls_table ' unrecognized
Warning: .drectve `-exclude-symbols:jl_small_typeof ' unrecognized
✔ [05m:08s] PackageCompiler: compiling fresh sysimage (incremental=false)
Precompiling project...
  1 dependency successfully precompiled in 4 seconds
⠸ [01m:39s] PackageCompiler: compiling nonincremental system image
⢰ [20m:17s] PackageCompiler: compiling nonincremental system image

@DilumAluthge
Member

Any chance you could do a git bisect on Julia between 1.10 and 1.11, to identify the first broken commit?

@juanromerohb
Author

I don't know how to do that, but I tried it with 1.11.0 and got the same bug.

@markusgumbel

I have a very similar or probably the same problem on Ubuntu and can confirm the issue.

Julia Version 1.11.1
Commit 8f5b7ca12ad (2024-10-16 10:53 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 8 × Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz
WORD_SIZE: 64
LLVM: libLLVM-16.0.6 (ORCJIT, skylake)
Threads: 6 default, 0 interactive, 3 GC (on 8 virtual cores)
Environment:
JULIA_HOME = /home/markus/.juliaup
JULIA_NUM_THREADS = 6

@matthewgcooper

I am having the same problem using the same MWE. I ran multiple tests and found out that 1.11.1 only gets stuck if the env variable JULIA_NUM_THREADS is set. The tests I performed are listed below:

  • Julia 1.10.5 run with either 1 thread, "-t auto", or with the env var set to "auto". These three cases work as expected and take about 3 minutes to make an app.
  • Julia 1.11.1 run with 1 thread or "-t auto". These two cases also work as expected with similar times.
  • Julia 1.11.1 run with env var set to "auto". This case breaks: the compilation never ends (see below; I waited for 20 minutes).

✔ [01m:50s] PackageCompiler: creating compiler .ji image (incremental=false)
⠋ [01m:39s] PackageCompiler: compiling fresh sysimage (incremental=false)
Precompiling project...
1 dependency successfully precompiled in 2 seconds
✖ [20m:29s] PackageCompiler: compiling nonincremental system image

The specs for the failed case are as follows:

Julia Version 1.10.5
Commit 6f3fdf7b362 (2024-08-27 14:19 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Linux (x86_64-linux-gnu)
Ubuntu 22.04.4 LTS
uname: Linux 6.8.0-47-generic #47~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Wed Oct 2 16:16:55 UTC 2 x86_64 x86_64
CPU: 13th Gen Intel(R) Core(TM) i7-13800H:
speed user nice sys idle irq
#1-20 1001 MHz 435481 s 2002 s 86515 s 6243735 s 0 s
Memory: 31.034202575683594 GB (19788.57421875 MB free)
Uptime: 81331.34 sec
Load Avg: 3.25 3.15 3.03
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, goldmont)
Threads: 20 default, 0 interactive, 10 GC (on 20 virtual cores)
Environment:
JULIA_NUM_THREADS = auto
TERMINATOR_DBUS_PATH = /net/tenshu/Terminator2
WINDOWPATH = 2

@DilumAluthge
Member

@juanromerohb Can you check if JULIA_NUM_THREADS is defined in your environment when you are running create_app()?
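
A minimal sketch of one way to run that check from the same session that calls create_app(). The helper name `thread_config` is hypothetical; `ENV`, `haskey`, `get`, and `Threads.nthreads()` are standard Julia.

```julia
# Report how the thread count of the current session was configured.
function thread_config()
    return (
        env_set   = haskey(ENV, "JULIA_NUM_THREADS"),        # is the variable defined at all?
        env_value = get(ENV, "JULIA_NUM_THREADS", nothing),  # its value, or `nothing` if unset
        nthreads  = Threads.nthreads(),                      # threads actually in use
    )
end

cfg = thread_config()
println("JULIA_NUM_THREADS set:   ", cfg.env_set)
println("JULIA_NUM_THREADS value: ", something(cfg.env_value, "(unset)"))
println("Threads.nthreads():      ", cfg.nthreads)
```

A session started with `-t auto` and no environment variable would report the variable as unset while still showing more than one thread, which is the combination reported as working above.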

@juanromerohb
Author

@juanromerohb Can you check if JULIA_NUM_THREADS is defined in your environment when you are running create_app()?

I obtain the same results as @matthewgcooper

@DilumAluthge DilumAluthge changed the title from "create_app never finishes on Julia v1.11.1" to "create_app never finishes on Julia v1.11.1 when JULIA_NUM_THREADS is set" Nov 21, 2024
@VPetukhov

This is still a problem with PackageCompiler v2.2.0 and Julia 1.11.2.

@bclyons12

I had the same problem with create_sysimage. With JULIA_NUM_THREADS set, the compiler hung for hours. Deleting that environment variable let my extensive package compile in ~45 minutes. Here's the system that hung with [9b87118b] PackageCompiler v2.2.0:

Julia Version 1.11.2
Commit 5e9a32e7af2 (2024-12-01 20:02 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 128 × AMD EPYC 7513 32-Core Processor
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, znver3)
Threads: 10 default, 0 interactive, 5 GC (on 128 virtual cores)
Environment:
  LD_LIBRARY_PATH = /fusion/usc/opt/nvidia/hpc_sdk/Linux_x86_64/20.11/comm_libs/nvshmem/lib:/fusion/usc/opt/nvidia/hpc_sdk/Linux_x86_64/20.11/comm_libs/nccl/lib:/fusion/usc/opt/nvidia/hpc_sdk/Linux_x86_64/20.11/math_libs/lib64:/fusion/usc/opt/nvidia/hpc_sdk/Linux_x86_64/20.11/compilers/lib:/fusion/usc/opt/nvidia/hpc_sdk/Linux_x86_64/20.11/cuda/lib64
  JULIA_NUM_THREADS = 10

Making @orso82 aware

danlooo added a commit to EarthyScience/RQADeforestation.jl that referenced this issue Jan 31, 2025
enforce single threaded call of PackageCompiler.create_app
see JuliaLang/PackageCompiler.jl#990

* change code fold

* Fold line and not block

* single line command

* chmod all read julia

* change project dir

* add chown

* enforce pull before run

* add fold

* Fix stuck compiling nonincremental system image by enforcing single thread

* Ensure files are always owned by user

* Unset env var JULIA_NUM_THREADS

* do not depend on dir existence for chown
@paulmelis

paulmelis commented Apr 16, 2025

Any chance you could do a git bisect on Julia between 1.10 and 1.11, to identify the first broken commit?

I'm not the OP, but have been trying to bisect this issue as a fun side-challenge. However, I'm consistently running into this error when calling create_app("MyApp", "MyAppCompiled", force=true) and I can't figure out if it's actually reporting some issue in the MyApp code or not:

melis@blackbox 18:49:/tmp/t$ JULIA_NUM_THREADS=1 julia +1.10.9 --project=. t.jl 
  Generating  project MyApp:
    MyApp/Project.toml
    MyApp/src/MyApp.jl
...
✔ [01m:44s] PackageCompiler: creating compiler .ji image (incremental=false)
⠙ [01m:18s] PackageCompiler: compiling fresh sysimage (incremental=false)
[pid 43305] waiting for IO to finish:
 Handle type        uv_handle_t->data
 timer              0xf9a1df0->0x7b3ae8a49360
✔ [03m:04s] PackageCompiler: compiling fresh sysimage (incremental=false)
  ◐ MyApp
⢰ [00m:12s] PackageCompiler: compiling nonincremental system image
package `MyApp` did not define the expected module `MyApp`, check for typos in package module name
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:35
 [2] __require_prelocked(uuidkey::Base.PkgId, env::Nothing)
   @ Base ./loading.jl:1884
 [3] #invokelatest#2
   @ ./essentials.jl:892 [inlined]
 [4] invokelatest
   @ ./essentials.jl:889 [inlined]
 [5] _require_prelocked
   @ ./loading.jl:1875 [inlined]
 [6] _require_prelocked
   @ ./loading.jl:1872 [inlined]
 [7] macro expansion
   @ ./lock.jl:267 [inlined]
 [8] require(uuidkey::Base.PkgId)
   @ Base ./loading.jl:1867
 [9] top-level scope
   @ /tmp/jl_pwVyBe98Hc:6
in expression starting at /tmp/jl_pwVyBe98Hc:6
...

Is there anything wrong with defining module MyApp as below?

melis@blackbox 18:55:/tmp/t$ cat MyApp/src/MyApp.jl 
module MyApp

function julia_main()::Cint
    println("Hello, World!")
end

end

Edit: hmm, it seems to be caused by having a single script that generates the MyApp package and then calls create_app(). If I do the latter step in a new Julia session, the module check doesn't fail.

@paulmelis

paulmelis commented Apr 17, 2025

Okay, bisecting took ages at 15 minutes per iteration, but it converges for me on JuliaLang/julia@ab1dda2. That's a threading-related commit, but that's also the only thing I can add since I'm not deep into the Julia code 😃. Just trying to get this issue closer to solved.

Git bisect log is below, started from 1.10.9 being good, 1.11.0 being bad. Removed ./usr in the Julia repo before a rebuild, used [email protected] for most test runs due to #954 being applied in the bisecting range. Other than that preparation was pretty simple:

>>> using Pkg
>>> Pkg.generate("MyApp")

Followed by updating MyApp/src/MyApp.jl to

module MyApp

function julia_main()::Cint
    println("Hello, World!")
end

end

The actual test runs during bisection used JULIA_NUM_THREADS=7. I removed the .toml files and MyAppCompiled before each run (leaving MyApp in place), started Julia interactively, added [email protected] as mentioned, then ran create_app("MyApp", "MyAppCompiled", force=true) and checked whether the final "compiling nonincremental system image" step finished within roughly 1 minute (while letting it run for up to 5 minutes to be sure).
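
The "finished within N minutes" check described above could be automated along these lines. This is a sketch, not the procedure actually used in this thread: `finishes_within` is a hypothetical helper built on the standard `run`, `timedwait`, `process_exited`, and `kill` functions, and the driver script named in the comments is an assumption.

```julia
# Run `cmd` and report whether it exits successfully within `timeout_s` seconds.
# A run that exceeds the timeout is killed and counted as "stuck".
function finishes_within(cmd::Cmd, timeout_s::Real)
    # Start the process without blocking; discard its output.
    proc = run(pipeline(cmd; stdout=devnull, stderr=devnull); wait=false)
    # Poll until the process exits or the timeout elapses.
    status = timedwait(() -> process_exited(proc), Float64(timeout_s); pollint=0.5)
    if status === :timed_out
        kill(proc)   # stop the hung build
        wait(proc)   # reap the process before returning
        return false
    end
    return success(proc)
end

# Hypothetical use for one bisection step. `build_app.jl` is assumed to call
# create_app("MyApp", "MyAppCompiled", force=true) using the Julia build under test:
#
#   julia_bin = joinpath(Sys.BINDIR, Base.julia_exename())
#   ok = withenv("JULIA_NUM_THREADS" => "7") do
#       finishes_within(`$julia_bin --project=. build_app.jl`, 5 * 60)
#   end
#   exit(ok ? 0 : 1)
```

With `git bisect run`, an exit status of 0 marks a commit as good, 125 asks git to skip it, and any other status from 1 to 127 marks it as bad, so a wrapper like this can drive the bisection unattended.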

Edit 2: okay, just noticed that create_app() consistently does not get stuck when starting Julia with -t 7 instead of using JULIA_NUM_THREADS=7. It finishes the compiling nonincremental system image step in less than a minute when using -t 7 (similar observation in #990 (comment) above). Re-testing with JULIA_NUM_THREADS=7 a couple of times shows it does get stuck consistently. This is with a build of ab1dda237f as mentioned above.

Edit 3: when re-testing ab1dda237f with -t auto (which results in Threads.nthreads() of 8 in my case) the create_app() call does not get stuck.

git bisect start
# status: waiting for both good and bad commits
# good: [5595d20a2877560583cd4891ce91605d10b1bb75] set VERSION to 1.10.9 (#57695)
git bisect good 5595d20a2877560583cd4891ce91605d10b1bb75
# bad: [501a4f25c2b7626c5e368cd01d1c050b70bafdb9] set VERSION to 1.11.0 (#55953)
git bisect bad 501a4f25c2b7626c5e368cd01d1c050b70bafdb9
# skip: [0ba6ec2d2282937a084d7e5e5a0b026dc953bb31] Restore link to list of packages in Base docs (#50353)
git bisect skip 0ba6ec2d2282937a084d7e5e5a0b026dc953bb31
# skip: [53f1eb82b8a4265def61ab677746b63a583ef865] bugfix for dot of Hermitian{noncommutative} (#52333)
git bisect skip 53f1eb82b8a4265def61ab677746b63a583ef865
# skip: [5e9cd58c5f5aca41b7e041f42f9b977cc4093448] Use optimised string search methods for substrings, too (#52424)
git bisect skip 5e9cd58c5f5aca41b7e041f42f9b977cc4093448
# skip: [22a027607d4f5f5e0a78617762711e272f798594] Make Ctrl-D not hang in the fallback repl (#51384)
git bisect skip 22a027607d4f5f5e0a78617762711e272f798594
# good: [187e8c2222878c68b2afc9295ab8dc61773bd7f2] Add `BracketedSort` a new, faster algorithm for `partialsort` and friends (#52006)
git bisect good 187e8c2222878c68b2afc9295ab8dc61773bd7f2
# bad: [2bd4cf8090f2c651543f562a4c469d73e4b15bd6] Avoid allocations in views of views (#53231)
git bisect bad 2bd4cf8090f2c651543f562a4c469d73e4b15bd6
# bad: [bd3eab649c3ec9d405256afc599f37197f8daec7] static-show: improve accuracy of some printings (#52799)
git bisect bad bd3eab649c3ec9d405256afc599f37197f8daec7
# bad: [2c2ea3aa649fcea7a7b889c80e211b84cf6f2510] Document environment variable JULIA_PKG_PRESERVE_TIERED_INSTALLED (#52362)
git bisect bad 2c2ea3aa649fcea7a7b889c80e211b84cf6f2510
# good: [c731edb861540fc1b6737ebbe240ca83b2dbd913] irinterp: skip `nothing` statement when analyzing `:nothrow` and `:noub` (#52417)
git bisect good c731edb861540fc1b6737ebbe240ca83b2dbd913
# good: [456951fbbb0d47e45f49dd7942d5f7571831c049] Export several genericmemory-related functions from C (#52475)
git bisect good 456951fbbb0d47e45f49dd7942d5f7571831c049
# good: [15244663a345a98e2919837fbd6d9e0ad997d5ba] follow up #52309 (#52499)
git bisect good 15244663a345a98e2919837fbd6d9e0ad997d5ba
# good: [5195da2ca006d3e63dcd8973c7d867c88381f6be] Improve linear indexing performance for FastSubArrays (#45371)
git bisect good 5195da2ca006d3e63dcd8973c7d867c88381f6be
# bad: [ad2d770e9110022392509b3270a12418f28246b9] test: fix timeout changed by #52461 accidentally (#52534)
git bisect bad ad2d770e9110022392509b3270a12418f28246b9
# good: [3495404f9dba5bf4186debf3ee59d704c44ae556] handle data-race on nrunning==0 from scheduler_delete_thread
git bisect good 3495404f9dba5bf4186debf3ee59d704c44ae556
# bad: [ab1dda237fdabfd0fd38eb6c51ce219e5e15c1f4] add missing increment of nrunning for jl_adopt_thread
git bisect bad ab1dda237fdabfd0fd38eb6c51ce219e5e15c1f4
# first bad commit: [ab1dda237fdabfd0fd38eb6c51ce219e5e15c1f4] add missing increment of nrunning for jl_adopt_thread

Edit: on an Arch Linux system

julia> versioninfo()
Julia Version 1.11.0-DEV.1094
Commit ab1dda237f* (2023-12-13 21:24 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: 8 × Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
  WORD_SIZE: 64
  LLVM: libLLVM-15.0.7 (ORCJIT, skylake)
  Threads: 1 on 8 virtual cores

@paulmelis

paulmelis commented Apr 18, 2025

Could the difference in behaviour between setting JULIA_NUM_THREADS (lockup) and using -t <n> (works) be caused by the former also passing that env to the Julia subprocess actually running compiler.jl, while the latter does not pass -t <n> to the subprocess?
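
The mechanism behind this hypothesis can be illustrated with a small experiment: an environment variable is inherited by child processes, while a command-line flag given to the parent is not, unless the parent adds it to the child's command explicitly. The sketch below launches a bare Julia child. The `probe` command is an assumption for illustration and is not the command PackageCompiler builds.

```julia
julia_bin = joinpath(Sys.BINDIR, Base.julia_exename())

# A bare child process that only prints its own thread count.
probe = `$julia_bin --startup-file=no -e 'print(Threads.nthreads())'`

# Child started while the variable is set in the parent's environment.
with_env = withenv("JULIA_NUM_THREADS" => "2") do
    read(probe, String)
end

# Child started with the variable removed from the environment.
without_env = withenv("JULIA_NUM_THREADS" => nothing) do
    read(probe, String)
end

println("parent Threads.nthreads():        ", Threads.nthreads())
println("child, JULIA_NUM_THREADS=2:       ", with_env)
println("child, JULIA_NUM_THREADS removed: ", without_env)
```

The child's thread count follows the environment variable regardless of how many threads the parent was started with, because a parent `-t <n>` never appears in `probe`.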

@PatrickHaecker
Contributor

Could the difference in behaviour between setting JULIA_NUM_THREADS (lockup) and using -t <n> (works) be caused by the former also passing that env to the Julia subprocess actually running compiler.jl, while the latter does not pass -t <n> to the subprocess?

I think this is a very likely hypothesis. I can confirm that the build fails with the JULIA_NUM_THREADS set to auto while it works (even without #1041) when the environment variable is deleted in the script immediately before calling create_app.

This is also consistent with ab1dda2, which changed some uv mutex handling. That could mean a mutex never gets released again in the multi-threaded case, which would fully explain why the CPU load goes to 0 but the process never finishes: it would effectively be waiting on the mutex forever.
Note, I am only speculating on the possible effects of the uv mutex. I haven't actually identified a bug in the source code.
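
The workaround mentioned above, removing the variable immediately before calling create_app, can be scoped so that the rest of the session keeps its setting. A minimal sketch, assuming the build subprocesses inherit the calling session's environment; `without_thread_env` is a hypothetical helper around the standard `withenv`.

```julia
# Run `f` with JULIA_NUM_THREADS removed from the environment.
# `withenv` restores the previous value (or absence) when `f` returns or throws.
without_thread_env(f) = withenv(f, "JULIA_NUM_THREADS" => nothing)

# Intended use (requires PackageCompiler and the MyApp package from the MWE):
#
#   using PackageCompiler
#   without_thread_env() do
#       create_app("MyApp", "MyAppCompiled"; force=true)
#   end
```

Compared with `delete!(ENV, "JULIA_NUM_THREADS")`, this leaves the variable untouched for anything that runs after the build.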

@PatrickHaecker
Contributor

I can confirm that passing JULIA_NUM_THREADS to the process compiling the final system image is the problem. I'll prepare a targeted PR to only deactivate it there.

PatrickHaecker added a commit to PatrickHaecker/PackageCompiler.jl that referenced this issue Apr 22, 2025
Make sure that the final system image is built single-threaded and
override any values set by "-t", "--threads" in `sysimage_build_args`
or provided via JULIA_NUM_THREADS.
This is needed until the underlying bug is fixed (see JuliaLang#963 and especially
JuliaLang#990 containing a `git bisect` to the commit introducing the problem)

Fixes JuliaLang#963 and JuliaLang#990
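
The idea in the commit message above, dropping any thread flags and forcing a single thread, might look roughly like the sketch below. It is not the code from the referenced commit: `force_single_thread` is a hypothetical helper that rewrites a `Cmd` of build arguments. It relies on the documented behavior that a `--threads` flag on the command line takes precedence over JULIA_NUM_THREADS.

```julia
# Return a copy of `args` with every thread-count flag removed and `--threads=1` appended.
function force_single_thread(args::Cmd)
    kept = String[]
    skip_next = false
    for a in args.exec
        if skip_next
            skip_next = false          # this was the value following a bare -t / --threads
        elseif a == "-t" || a == "--threads"
            skip_next = true           # flag and value given as two separate arguments
        elseif startswith(a, "--threads=") ||
               (startswith(a, "-t") && !startswith(a, "--"))
            # --threads=N or -tN in a single argument: drop it
        else
            push!(kept, a)
        end
    end
    return `$kept --threads=1`
end

args = force_single_thread(`-O3 -t 4 --threads=auto`)
println(args)
```

Because a command-line `--threads` value overrides the environment variable, appending `--threads=1` covers both sources of a thread count in one place.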