-
-
Notifications
You must be signed in to change notification settings - Fork 5.6k
fix allocations macro with more functions #58057
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This part was dropped from the commit |
Move allocations macro to Test, since it is clearly just implementing testing heuristics, and change the implementation to add a function call (closure) wrapper so that the inner code can be optimized, even if the outer scope is `@latestworld`. The implementation creates a closure if necessary (the code is complex) or just makes a call if the expression is simple (just a call on symbols).
63caedb
to
35e7208
Compare
@@ -59,12 +58,14 @@ has_fma = Dict( | |||
@test clamp(typemax(UInt16), Int16) === Int16(32767) | |||
|
|||
# clamp should not allocate a BigInt for typemax(Int16) | |||
x = big(2) ^ 100 | |||
@test (@allocated clamp(x, Int16)) == 0 | |||
let x = big(2) ^ 100 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these let
s now needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
they aren't with the current macro implementation, but just to make this test a bit more clear, and more stable against the next time we change the macro again 😂
@KristofferC do you want PkgEval before or after this, so you can know about breakage |
Might as well run one here, otherwise things might get lost in the noise. |
@nanosoldier |
The package evaluation job you requested has completed - possible new issues were detected. Report summary❗ Packages that crashed2 packages crashed only on the current version.
6 packages crashed on the previous version too. ✖ Packages that failed25 packages failed only on the current version.
1260 packages failed on the previous version too. ✔ Packages that passed tests27 packages passed tests only on the current version.
5301 packages passed tests on the previous version too. ~ Packages that at least loaded9 packages successfully loaded only on the current version.
2946 packages successfully loaded on the previous version too. ➖ Packages that were skipped altogether916 packages were skipped on the previous version too. |
Looks like this breaks 2 packages (ExtendableFEMBase relies on using this to test that local variables updated inside a for loop to not allocate, which is incompatible with the updated test; and GraphicsMath relies on StaticArrays.MArarys doing lots of unsound things, which apparently LLVM is only willing to tolerate in very small amounts, so any tiny tweak to that test and LLVM detects that the optimization is invalid) and fixes 1 package (KiteUtils) |
The packages that fail their allocation tests now (that I have a regex for are): PtrArrays If this fixes one package and breaks two, is it worth doing? In what way those it improve things? |
@nanosoldier |
The package evaluation job you requested has completed - no new issues were detected. Report summary✖ Packages that failed1 packages failed on the previous version too. ✔ Packages that passed tests1 packages passed tests only on the current version.
~ Packages that at least loaded22 packages successfully loaded on the previous version too. |
That is not the correct link from nanosoldier. Is it possible to get the right one or rerun that? |
@nanosoldier |
The package evaluation job you requested has completed - no new issues were detected. Report summary✖ Packages that failed12 packages failed on the previous version too. ✔ Packages that passed tests12 packages passed tests only on the current version.
|
That looks good! Only 8 / 24 that seem to still error with allocated related reasons. |
Yes, the rest probably rely on MArrays or something with similar issues. The only other case I saw was this: |
So another alternative is this code: function try_get_vars!(@nospecialize(ex), vars::Vector{Symbol})
ex isa Symbol && push!(vars, ex)
ex isa Expr || return true
is_quoted(ex) && return true
# An allow-list of all expression types that don't change meaning (much) if
# we hoist their possible variable accesses into arguments, given that we
# don't allow :(=) in this list. We do allow conditionals, even though that
# means we could cause an UndefVar error to appear. We also hoist what may
# be global accesses, even though that means we change the order of evaluation.
if ex.head ∉ (:call, :., :var"'", :tuple, :ref, :vect, :hcat, :vcat, :ncat, :row, :nrow,
:block, :if, :&&, :||, :?, :let, :try, :catch, :finally, :global, :local)
return false
end
for a in ex.args
try_get_vars!(a, vars) || return false
end
return true
end
macro allocated(ex)
vars = Symbol[]
try_get_vars!(ex, vars) || empty!(vars)
body = quote
b0 = Ref{Int64}(0)
b1 = Ref{Int64}(0)
gc_bytes(b0)
$(esc(ex))
gc_bytes(b1)
return b1[] - b0[]
end
vars = map(esc, vars)
return remove_linenums!(:(((($(vars...)),) -> $body)($(vars...))))
end This doesn't have the specialization heuristics override anymore however, so it would break some tests, but our macroexpand.scm scope lisp pass is too buggy to be able to write this macro with that correctly specified (with all of the
|
Change the implementation of the allocations to always add a function call (closure) wrapper so that the inner code can be optimized, even if the outer scope is `@latestworld`. The implementation creates a closure if necessary (the code is complex) or just makes a call if the expression is simple (just a call on symbols). Many packages in the wild are observed to already be doing this themselves. (cherry picked from commit a47b58f)
Change the implementation of the allocations to always add a function call (closure) wrapper so that the inner code can be optimized, even if the outer scope is `@latestworld`. The implementation creates a closure if necessary (the code is complex) or just makes a call if the expression is simple (just a call on symbols). Many packages in the wild are observed to already be doing this themselves. (cherry picked from commit a47b58f)
Change the implementation of the allocations to always add a function call (closure) wrapper so that the inner code can be optimized, even if the outer scope is `@latestworld`. The implementation creates a closure if necessary (the code is complex) or just makes a call if the expression is simple (just a call on symbols). Many packages in the wild are observed to already be doing this themselves. (cherry picked from commit a47b58f)
I think this broke the tests of MOI due to the classic changing a captured variable issue: % julia +nightly --project=.
_
_ _ _(_)_ | Documentation: https://docs.julialang.org
(_) | (_) (_) |
_ _ _| |_ __ _ | Type "?" for help, "]?" for Pkg help.
| | | | | | |/ _` | |
| | |_| | | | (_| | | Version 1.13.0-DEV.468 (2025-04-29)
_/ |\__'_|_|_|\__'_| | Commit 2270fcdcc65 (0 days old master)
|__/ |
julia> using Test
julia> import MathOptInterface as MOI
julia> function test_main()
r = MOI.Nonlinear.OperatorRegistry()
x = [1.1, 1.0]
H = zeros(2, 2)
@test (@allocated MOI.Nonlinear.eval_multivariate_hessian(r, :^, H, x)) == 0
return
end
test_main (generic function with 1 method)
julia> function test_main_captured_x()
r = MOI.Nonlinear.OperatorRegistry()
x = [1.1, 1.0]
H = zeros(2, 2)
@test (@allocated MOI.Nonlinear.eval_multivariate_hessian(r, :^, H, x)) == 0
x = [1.1, 2.0]
return
end
test_main_captured_x (generic function with 1 method)
julia> function test_main_let()
r = MOI.Nonlinear.OperatorRegistry()
x = [1.1, 1.0]
H = zeros(2, 2)
let x = x
@test (@allocated MOI.Nonlinear.eval_multivariate_hessian(r, :^, H, x)) == 0
end
x = [1.1, 2.0]
return
end
test_main_let (generic function with 1 method)
julia> test_main()
julia> test_main_captured_x()
Test Failed at REPL[4]:5
Expression: #= REPL[4]:5 =# @allocated(MOI.Nonlinear.eval_multivariate_hessian(r, :^, H, x)) == 0
Evaluated: 112 == 0
ERROR: There was an error during testing
julia> test_main_let()
julia> |
It appears tests always fail to complete on PkgEval within the timeout, so we excluded that from the PR here. That is likely simply a matter of permitting quoted symbols. Something like: diff --git a/base/timing.jl b/base/timing.jl
index 9e3a4cf1284..f60f9d0ac53 100644
--- a/base/timing.jl
+++ b/base/timing.jl
@@ -495,7 +495,8 @@ function is_simply_call(@nospecialize ex)
for a in ex.args
a isa QuoteNode && continue
a isa Symbol && continue
- Base.is_self_quoting(a) && continue
+ is_self_quoting(a) && continue
+ is_quoted(a) && continue
return false
end
return true
@@ -516,6 +517,7 @@ julia> @allocated rand(10^6)
```
"""
macro allocated(ex)
+ ex = macroexpand(__module__, ex)
if !is_simply_call(ex)
ex = :((() -> $ex)())
end
@@ -541,6 +543,7 @@ julia> @allocations rand(10^6)
This macro was added in Julia 1.9.
"""
macro allocations(ex)
+ ex = macroexpand(__module__, ex)
if !is_simply_call(ex)
ex = :((() -> $ex)())
end |
Yeah the tests are a bit of a beast to run. I've partitioned the CI job to bring down the runtimes. The skip in PkgEval is one reason I have the tests running on nightly to catch these earlier. I don't know if that's the right fix though? The issue is that
is now equivalent to
so |
… macro optimization Refs #58057
Change the implementation of the allocations to always add a function call (closure) wrapper so that the inner code can be optimized, even if the outer scope is `@latestworld`. The implementation creates a closure if necessary (the code is complex) or just makes a call if the expression is simple (just a call on symbols). Many packages in the wild are observed to already be doing this themselves.
Move allocations macro to Test, since it is clearly just implementing testing heuristics, and change the implementation to add a function call (closure) wrapper so that the inner code can be optimized, even if the outer scope is
@latestworld
. The implementation creates a closure if necessary (the code is complex) or just makes a call if the expression is simple (just a call on symbols).