Skip to content

fix allocations macro with more functions #58057

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 24, 2025
Merged

Conversation

vtjnash
Copy link
Member

@vtjnash vtjnash commented Apr 9, 2025

Move allocations macro to Test, since it is clearly just implementing testing heuristics, and change the implementation to add a function call (closure) wrapper so that the inner code can be optimized, even if the outer scope is @latestworld. The implementation creates a closure if necessary (the code is complex) or just makes a call if the expression is simple (just a call on symbols).

@vtjnash vtjnash requested a review from JeffBezanson April 9, 2025 21:05
@Keno
Copy link
Member

Keno commented Apr 9, 2025

Move allocations macro to Test

This part was dropped from the commit

Move allocations macro to Test, since it is clearly just implementing
testing heuristics, and change the implementation to add a function call
(closure) wrapper so that the inner code can be optimized, even if the
outer scope is `@latestworld`. The implementation creates a closure if
necessary (the code is complex) or just makes a call if the expression
is simple (just a call on symbols).
@vtjnash vtjnash force-pushed the jn/test-allocations branch from 63caedb to 35e7208 Compare April 10, 2025 14:30
@@ -59,12 +58,14 @@ has_fma = Dict(
@test clamp(typemax(UInt16), Int16) === Int16(32767)

# clamp should not allocate a BigInt for typemax(Int16)
x = big(2) ^ 100
@test (@allocated clamp(x, Int16)) == 0
let x = big(2) ^ 100
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these lets now needed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they aren't with the current macro implementation, but just to make this test a bit more clear, and more stable against the next time we change the macro again 😂

@vtjnash
Copy link
Member Author

vtjnash commented Apr 15, 2025

@KristofferC do you want PkgEval before or after this, so you can know about breakage

@KristofferC
Copy link
Member

Might as well run one here, otherwise things might get lost in the noise.

@vtjnash
Copy link
Member Author

vtjnash commented Apr 18, 2025

@nanosoldier runtests()

@nanosoldier
Copy link
Collaborator

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.

Report summary

❗ Packages that crashed

2 packages crashed only on the current version.

  • An internal error was encountered: 1 packages
  • A segmentation fault happened: 1 packages

6 packages crashed on the previous version too.

✖ Packages that failed

25 packages failed only on the current version.

  • Illegal method overwrites during precompilation: 1 packages
  • Package has test failures: 7 packages
  • Package tests unexpectedly errored: 1 packages
  • Tests became inactive: 3 packages
  • Test duration exceeded the time limit: 13 packages

1260 packages failed on the previous version too.

✔ Packages that passed tests

27 packages passed tests only on the current version.

  • Other: 27 packages

5301 packages passed tests on the previous version too.

~ Packages that at least loaded

9 packages successfully loaded only on the current version.

  • Other: 9 packages

2946 packages successfully loaded on the previous version too.

➖ Packages that were skipped altogether

916 packages were skipped on the previous version too.

@vtjnash
Copy link
Member Author

vtjnash commented Apr 21, 2025

Looks like this breaks 2 packages (ExtendableFEMBase relies on using this to test that local variables updated inside a for loop to not allocate, which is incompatible with the updated test; and GraphicsMath relies on StaticArrays.MArarys doing lots of unsound things, which apparently LLVM is only willing to tolerate in very small amounts, so any tiny tweak to that test and LLVM detects that the optimization is invalid) and fixes 1 package (KiteUtils)

@KristofferC
Copy link
Member

The packages that fail their allocation tests now (that I have a regex for are):

PtrArrays
Adapt
StructArrays
MarchingCubes
BitBasis
Kronecker
GeometryTypes
AccessorsExtra
FixedPointDecimals
SparseMatricesCOO
KeywordCalls
KiteUtils
OperatorScaling
PlayingCards
OptimalSortingNetworks
Stencils
ConvexHulls2d
ShiftedProximalOperators
MultiThreadedCaches
MallocArrays
StaticArraysBlasInterfaces
SHTns
BoundedDegreeGraphs
NeumannKelvin

If this fixes one package and breaks two, is it worth doing? In what way those it improve things?

@vtjnash
Copy link
Member Author

vtjnash commented Apr 21, 2025

@nanosoldier runtests(["PtrArrays", "Adapt", "StructArrays", "MarchingCubes", "BitBasis", "Kronecker", "GeometryTypes", "AccessorsExtra", "FixedPointDecimals", "SparseMatricesCOO", "KeywordCalls", "KiteUtils", "OperatorScaling", "PlayingCards", "OptimalSortingNetworks", "Stencils", "ConvexHulls2d", "ShiftedProximalOperators", "MultiThreadedCaches", "MallocArrays", "StaticArraysBlasInterfaces", "SHTns", "BoundedDegreeGraphs", "NeumannKelvin"])

@nanosoldier
Copy link
Collaborator

The package evaluation job you requested has completed - no new issues were detected.
The full report is available.

Report summary

✖ Packages that failed

1 packages failed on the previous version too.

✔ Packages that passed tests

1 packages passed tests only on the current version.

  • Other: 1 packages

~ Packages that at least loaded

22 packages successfully loaded on the previous version too.

@vtjnash
Copy link
Member Author

vtjnash commented Apr 21, 2025

That is not the correct link from nanosoldier. Is it possible to get the right one or rerun that?

@vtjnash
Copy link
Member Author

vtjnash commented Apr 21, 2025

@nanosoldier runtests(["PtrArrays", "Adapt", "StructArrays", "MarchingCubes", "BitBasis", "Kronecker", "GeometryTypes", "AccessorsExtra", "FixedPointDecimals", "SparseMatricesCOO", "KeywordCalls", "KiteUtils", "OperatorScaling", "PlayingCards", "OptimalSortingNetworks", "Stencils", "ConvexHulls2d", "ShiftedProximalOperators", "MultiThreadedCaches", "MallocArrays", "StaticArraysBlasInterfaces", "SHTns", "BoundedDegreeGraphs", "NeumannKelvin"], vs=":release-1.12")

@nanosoldier
Copy link
Collaborator

The package evaluation job you requested has completed - no new issues were detected.
The full report is available.

Report summary

✖ Packages that failed

12 packages failed on the previous version too.

✔ Packages that passed tests

12 packages passed tests only on the current version.

  • Other: 12 packages

@KristofferC
Copy link
Member

KristofferC commented Apr 22, 2025

That looks good! Only 8 / 24 that seem to still error with allocated related reasons.

@vtjnash
Copy link
Member Author

vtjnash commented Apr 22, 2025

Yes, the rest probably rely on MArrays or something with similar issues. The only other case I saw was this:
allocs = @allocated mul!(xB, B', y)
which doesn't qualify for the optimized (non-capturing) form due to the ' there. We could potentially do a more substantial macro rewrite to collect all Symbols and make them arguments to the function if there are no = assignments.

@vtjnash
Copy link
Member Author

vtjnash commented Apr 22, 2025

So another alternative is this code:

function try_get_vars!(@nospecialize(ex), vars::Vector{Symbol})
    ex isa Symbol && push!(vars, ex)
    ex isa Expr || return true
    is_quoted(ex) && return true
    # An allow-list of all expression types that don't change meaning (much) if
    # we hoist their possible variable accesses into arguments, given that we
    # don't allow :(=) in this list. We do allow conditionals, even though that
    # means we could cause an UndefVar error to appear. We also hoist what may
    # be global accesses, even though that means we change the order of evaluation.
    if ex.head  (:call, :., :var"'", :tuple, :ref, :vect, :hcat, :vcat, :ncat, :row, :nrow,
        :block, :if, :&&, :||, :?, :let, :try, :catch, :finally, :global, :local)
        return false
    end
    for a in ex.args
        try_get_vars!(a, vars) || return false
    end
    return true
end

macro allocated(ex)
    vars = Symbol[]
    try_get_vars!(ex, vars) || empty!(vars)
    body = quote
        b0 = Ref{Int64}(0)
        b1 = Ref{Int64}(0)
        gc_bytes(b0)
        $(esc(ex))
        gc_bytes(b1)
        return b1[] - b0[]
    end
    vars = map(esc, vars)
    return remove_linenums!(:(((($(vars...)),) -> $body)($(vars...))))
end

This doesn't have the specialization heuristics override anymore however, so it would break some tests, but our macroexpand.scm scope lisp pass is too buggy to be able to write this macro with that correctly specified (with all of the () and that causes it to simply crash when written out like this.

    ws = map(i -> Symbol("_$i"), 1:length(vars))
    vars = map(esc, vars)
    args = map((v, w) -> :($v::$w), vars, ws)
    sig = :( ($(args...),) where {$(ws...)} )
    ex = remove_linenums!(:(($sig -> $body)($(vars...))))

@vtjnash vtjnash merged commit a47b58f into master Apr 24, 2025
8 checks passed
@vtjnash vtjnash deleted the jn/test-allocations branch April 24, 2025 15:22
@vtjnash vtjnash added the backport 1.12 Change should be backported to release-1.12 label Apr 24, 2025
KristofferC pushed a commit that referenced this pull request Apr 26, 2025
Change the implementation of the allocations to always add a function
call (closure) wrapper so that the inner code can be optimized, even if
the outer scope is `@latestworld`. The implementation creates a closure
if necessary (the code is complex) or just makes a call if the
expression is simple (just a call on symbols). Many packages in the
wild are observed to already be doing this themselves.

(cherry picked from commit a47b58f)
KristofferC pushed a commit that referenced this pull request Apr 26, 2025
Change the implementation of the allocations to always add a function
call (closure) wrapper so that the inner code can be optimized, even if
the outer scope is `@latestworld`. The implementation creates a closure
if necessary (the code is complex) or just makes a call if the
expression is simple (just a call on symbols). Many packages in the
wild are observed to already be doing this themselves.

(cherry picked from commit a47b58f)
KristofferC pushed a commit that referenced this pull request Apr 29, 2025
Change the implementation of the allocations to always add a function
call (closure) wrapper so that the inner code can be optimized, even if
the outer scope is `@latestworld`. The implementation creates a closure
if necessary (the code is complex) or just makes a call if the
expression is simple (just a call on symbols). Many packages in the
wild are observed to already be doing this themselves.

(cherry picked from commit a47b58f)
@KristofferC KristofferC mentioned this pull request Apr 29, 2025
53 tasks
@odow
Copy link
Contributor

odow commented Apr 30, 2025

I think this broke the tests of MOI due to the classic changing a captured variable issue:

% julia +nightly --project=.
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.13.0-DEV.468 (2025-04-29)
 _/ |\__'_|_|_|\__'_|  |  Commit 2270fcdcc65 (0 days old master)
|__/                   |

julia> using Test

julia> import MathOptInterface as MOI

julia> function test_main()
           r = MOI.Nonlinear.OperatorRegistry()
           x = [1.1, 1.0]
           H = zeros(2, 2)
           @test (@allocated MOI.Nonlinear.eval_multivariate_hessian(r, :^, H, x)) == 0
           return
       end
test_main (generic function with 1 method)

julia> function test_main_captured_x()
           r = MOI.Nonlinear.OperatorRegistry()
           x = [1.1, 1.0]
           H = zeros(2, 2)
           @test (@allocated MOI.Nonlinear.eval_multivariate_hessian(r, :^, H, x)) == 0
           x = [1.1, 2.0]
           return
       end
test_main_captured_x (generic function with 1 method)

julia> function test_main_let()
           r = MOI.Nonlinear.OperatorRegistry()
           x = [1.1, 1.0]
           H = zeros(2, 2)
           let x = x
               @test (@allocated MOI.Nonlinear.eval_multivariate_hessian(r, :^, H, x)) == 0
           end
           x = [1.1, 2.0]
           return
       end
test_main_let (generic function with 1 method)

julia> test_main()

julia> test_main_captured_x()
Test Failed at REPL[4]:5
  Expression: #= REPL[4]:5 =# @allocated(MOI.Nonlinear.eval_multivariate_hessian(r, :^, H, x)) == 0
   Evaluated: 112 == 0
ERROR: There was an error during testing

julia> test_main_let()

julia>

@vtjnash
Copy link
Member Author

vtjnash commented Apr 30, 2025

It appears tests always fail to complete on PkgEval within the timeout, so we excluded that from the PR here.
https://s3.amazonaws.com/julialang-reports/nanosoldier/pkgeval/by_date/2025-04/27/report.html
MathOptInterface v1.39.0 ▅▅▅▅▅▅▅▅▅▅▅▅▅▅ fail

That is likely simply a matter of permitting quoted symbols. Something like:

diff --git a/base/timing.jl b/base/timing.jl
index 9e3a4cf1284..f60f9d0ac53 100644
--- a/base/timing.jl
+++ b/base/timing.jl
@@ -495,7 +495,8 @@ function is_simply_call(@nospecialize ex)
     for a in ex.args
         a isa QuoteNode && continue
         a isa Symbol && continue
-        Base.is_self_quoting(a) && continue
+        is_self_quoting(a) && continue
+        is_quoted(a) && continue
         return false
     end
     return true
@@ -516,6 +517,7 @@ julia> @allocated rand(10^6)
 ```
 """
 macro allocated(ex)
+    ex = macroexpand(__module__, ex)
     if !is_simply_call(ex)
         ex = :((() -> $ex)())
     end
@@ -541,6 +543,7 @@ julia> @allocations rand(10^6)
     This macro was added in Julia 1.9.
 """
 macro allocations(ex)
+    ex = macroexpand(__module__, ex)
     if !is_simply_call(ex)
         ex = :((() -> $ex)())
     end

@odow
Copy link
Contributor

odow commented Apr 30, 2025

Yeah the tests are a bit of a beast to run. I've partitioned the CI job to bring down the runtimes. The skip in PkgEval is one reason I have the tests running on nightly to catch these earlier.

I don't know if that's the right fix though? The issue is that

@allocated MOI.Nonlinear.eval_multivariate_hessian(r, :^, H, x)

is now equivalent to

Base.allocated(() -> MOI.Nonlinear.eval_multivariate_hessian(r, :^, H, x))

so x gets Core.Boxed.

vtjnash added a commit that referenced this pull request May 1, 2025
LebedevRI pushed a commit to LebedevRI/julia that referenced this pull request May 2, 2025
Change the implementation of the allocations to always add a function
call (closure) wrapper so that the inner code can be optimized, even if
the outer scope is `@latestworld`. The implementation creates a closure
if necessary (the code is complex) or just makes a call if the
expression is simple (just a call on symbols). Many packages in the
wild are observed to already be doing this themselves.
@KristofferC KristofferC removed the backport 1.12 Change should be backported to release-1.12 label May 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants