Generating Julia function for log density evaluation #278

Open · wants to merge 51 commits into base: main

Conversation

sunxd3 (Member) commented Feb 27, 2025

Motivation

JuliaBUGS compiles BUGS programs into a directed probabilistic graphical model (PGM), which implicitly defines the dependency structure between variables. While this graph allows for execution strategies like topological traversal and parallelization, a significant challenge arises from the BUGS language semantics: every element within an array can be treated as an individual random variable.

This fine-grained dependency structure means that naively generating Julia source from the variable-level graph would often require fully unrolling all loops. This is infeasible for large datasets or complex models, and the resulting code is difficult for automatic differentiation (AD) tools to analyze.

Proposed Changes

This PR introduces an initial implementation for generating a specialized Julia function dedicated to computing the log density of the model. The core idea is to operate on a higher level of abstraction than individual variable nodes.

The algorithm proceeds as follows:

  1. Statement-Level Dependence Graph: Construct a dependence graph where nodes represent the statements in the BUGS program, rather than individual random variables. Edges represent dependencies between these statements.
  2. Acyclicity Check: Verify if this statement-level graph contains any cycles (including self-loops).
  3. Topological Sort & Loop Fission: If the graph is acyclic, perform a topological sort on the statements. Based on this order, restructure the program, applying full loop fission. This separates loops operating on different variables, ensuring that computations occur in a valid dependency order.
  4. Code Generation: Generate a Julia function based on the topologically sorted and fissioned sequence of statements. Specialized code is generated for:
    • Deterministic assignments (=).
    • Stochastic assignments / Priors (~).
    • Observations (likelihood terms).
      The generated function takes a flattened vector of parameter values and reconstructs them using Bijectors.jl to handle constraints and compute log Jacobian adjustments, accumulating the log-prior and log-likelihood terms. (A small sketch of steps 1–3 follows this list.)
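
To make steps 1–3 concrete, here is a minimal, self-contained sketch of the cycle check and topological sort on a statement graph, using the statement numbers from the Rats example below. This is illustrative only: the actual implementation in src/source_gen.jl works on the compiler's statement metadata rather than a hand-written edge list.

```julia
# Edges point from a statement to the statements that depend on it
# (statement numbers as in the Rats model below).
edges = Dict(
    8 => [3], 7 => [3], 10 => [4], 9 => [11, 4],
    4 => [11, 2], 3 => [2], 2 => [1], 5 => [1, 6],
)
nstatements = 11

# Kahn's algorithm: produces a topological order and detects cycles
# (an acyclic statement graph is what makes the loop fission step valid).
function topological_order(edges, n)
    indegree = zeros(Int, n)
    for deps in values(edges), d in deps
        indegree[d] += 1
    end
    queue = [s for s in 1:n if indegree[s] == 0]
    order = Int[]
    while !isempty(queue)
        s = popfirst!(queue)
        push!(order, s)
        for d in get(edges, s, Int[])
            indegree[d] -= 1
            indegree[d] == 0 && push!(queue, d)
        end
    end
    length(order) == n || error("statement graph has a cycle; cannot reorder")
    return order
end

topological_order(edges, nstatements)  # [5, 7, 8, 9, 10, 6, 3, 4, 11, 2, 1]
```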

Example: Rats Model

Consider the classic "Rats" example:

Original BUGS Code:

begin
    for i in 1:N
        for j in 1:T
            Y[i, j] ~ dnorm(mu[i, j], tau_c)              # (1) Likelihood
            mu[i, j] = alpha[i] + beta[i] * (x[j] - xbar) # (2) Deterministic
        end
        alpha[i] ~ dnorm(alpha_c, alpha_tau)              # (3) Prior
        beta[i] ~ dnorm(beta_c, beta_tau)                 # (4) Prior
    end
    tau_c ~ dgamma(0.001, 0.001)                          # (5) Prior
    sigma = 1 / sqrt(tau_c)                               # (6) Deterministic
    alpha_c ~ dnorm(0.0, 1.0e-6)                          # (7) Prior
    alpha_tau ~ dgamma(0.001, 0.001)                      # (8) Prior
    beta_c ~ dnorm(0.0, 1.0e-6)                           # (9) Prior
    beta_tau ~ dgamma(0.001, 0.001)                       # (10) Prior
    alpha0 = alpha_c - xbar * beta_c                      # (11) Deterministic
end

Statement Dependence Graph:

flowchart TB
    8 --> 3
    7 --> 3
    10 --> 4
    9 --> 11
    9 --> 4
    4 --> 11
    3 --> 2
    4 --> 2
    2 --> 1
    5 --> 1
    5 --> 6

(Note: Mermaid graph slightly adjusted for clarity based on variable dependencies)

Sequential Representation (after Topological Sort & Loop Fission):

This intermediate representation reflects the order determined by the statement graph dependencies and separates the original nested loops.

begin
    # Independent Priors first
    tau_c ~ dgamma(0.001, 0.001)           # (5)
    alpha_c ~ dnorm(0.0, 1.0e-6)           # (7)
    alpha_tau ~ dgamma(0.001, 0.001)       # (8)
    beta_c ~ dnorm(0.0, 1.0e-6)            # (9)
    beta_tau ~ dgamma(0.001, 0.001)        # (10)

    # Deterministic nodes depending only on above priors
    sigma = 1 / sqrt(tau_c)                # (6)  (Depends on 5)
    alpha0 = alpha_c - xbar * beta_c       # (11) (Depends on 7, 9)

    # Priors depending on hyperparameters (loop fissioned)
    for i in 1:N
        alpha[i] ~ dnorm(alpha_c, alpha_tau) # (3) (Depends on 7, 8)
    end

    for i in 1:N
        beta[i] ~ dnorm(beta_c, beta_tau)    # (4) (Depends on 9, 10)
    end

    # Deterministic node depending on loop variables (loop fissioned)
    for i in 1:N
        for j in 1:T
            mu[i, j] = alpha[i] + beta[i] * (x[j] - xbar) # (2) (Depends on 3, 4)
        end
    end

    # Likelihood (loop fissioned)
    for i in 1:N
        for j in 1:T
            Y[i, j] ~ dnorm(mu[i, j], tau_c) # (1) (Depends on 2, 5)
        end
    end
end

Generated Julia Log Density Function:

This function takes the model environment (__evaluation_env__) and flattened parameters (__flattened_values__), computes the log density (__logp__), and handles necessary transformations via Bijectors.jl.

function var"##__compute_log_density__#236"(__evaluation_env__, __flattened_values__)
    # Deconstruct the evaluation environment for easier access to variables
    (; alpha, var"beta.c", xbar, sigma, alpha0, x, N, var"alpha.c", mu, var"alpha.tau", Y, T, beta, var"beta.tau", var"tau.c") = __evaluation_env__

    # Initialize log density accumulator and flattened values index
    __logp__ = 0.0
    __current_idx__ = 1

    # --- Process Prior Distributions (in topological order) ---

    # Prior for beta_tau ~ dgamma(0.001, 0.001) [Statement 10]
    # (Gamma distribution requires positive values -> needs transformation)
    __dist__ = dgamma(0.001, 0.001)
    __b__ = Bijectors.bijector(__dist__) # Typically LogBijector for Gamma
    __b_inv__ = Bijectors.inverse(__b__)
    __transformed_length__ = length(Bijectors.transformed(__dist__, __b__)) # Usually 1 for scalar
    # Read value from flattened vector, reconstruct in transformed space
    __reconstructed_value__ = JuliaBUGS.reconstruct(__b_inv__, __dist__, view(__flattened_values__, __current_idx__:(__current_idx__ + __transformed_length__) - 1))
    __current_idx__ += __transformed_length__
    # Transform back to original space and get log Jacobian determinant
    (__value__, __logjac__) = Bijectors.with_logabsdet_jacobian(__b_inv__, __reconstructed_value__)
    # Calculate log prior: logpdf in original space + log Jacobian adjustment
    __logprior__ = Distributions.logpdf(__dist__, __value__) + __logjac__
    # Accumulate log density
    __logp__ = __logp__ + __logprior__
    # Assign the calculated value (in original space) to the variable
    var"beta.tau" = __value__ # Note: Using var"..." syntax for variables with dots

    # Prior for beta_c ~ dnorm(0.0, 1.0e-6) [Statement 9]
    # (Normal distribution is on R -> identity transform usually)
    __dist__ = dnorm(0.0, 1.0e-6)
    __b__ = Bijectors.bijector(__dist__) # Typically identity
    __b_inv__ = Bijectors.inverse(__b__)
    __transformed_length__ = if __b__ === identity
            length(__dist__)
        else
            length(Bijectors.transformed(__dist__, __b__))
        end
    __reconstructed_value__ = JuliaBUGS.reconstruct(__b_inv__, __dist__, view(__flattened_values__, __current_idx__:(__current_idx__ + __transformed_length__) - 1))
    __current_idx__ += __transformed_length__
    (__value__, __logjac__) = Bijectors.with_logabsdet_jacobian(__b_inv__, __reconstructed_value__) # logjac is 0 for identity
    __logprior__ = Distributions.logpdf(__dist__, __value__) + __logjac__
    __logp__ = __logp__ + __logprior__
    var"beta.c" = __value__

    # Prior for alpha_tau ~ dgamma(0.001, 0.001) [Statement 8]
    __dist__ = dgamma(0.001, 0.001)
    __b__ = Bijectors.bijector(__dist__)
    __b_inv__ = Bijectors.inverse(__b__)
    __transformed_length__ = length(Bijectors.transformed(__dist__, __b__))
    __reconstructed_value__ = JuliaBUGS.reconstruct(__b_inv__, __dist__, view(__flattened_values__, __current_idx__:(__current_idx__ + __transformed_length__) - 1))
    __current_idx__ += __transformed_length__
    (__value__, __logjac__) = Bijectors.with_logabsdet_jacobian(__b_inv__, __reconstructed_value__)
    __logprior__ = Distributions.logpdf(__dist__, __value__) + __logjac__
    __logp__ = __logp__ + __logprior__
    var"alpha.tau" = __value__

    # Prior for alpha_c ~ dnorm(0.0, 1.0e-6) [Statement 7]
    __dist__ = dnorm(0.0, 1.0e-6)
    __b__ = Bijectors.bijector(__dist__)
    __b_inv__ = Bijectors.inverse(__b__)
    __transformed_length__ = if __b__ === identity
            length(__dist__)
        else
            length(Bijectors.transformed(__dist__, __b__))
        end
    __reconstructed_value__ = JuliaBUGS.reconstruct(__b_inv__, __dist__, view(__flattened_values__, __current_idx__:(__current_idx__ + __transformed_length__) - 1))
    __current_idx__ += __transformed_length__
    (__value__, __logjac__) = Bijectors.with_logabsdet_jacobian(__b_inv__, __reconstructed_value__)
    __logprior__ = Distributions.logpdf(__dist__, __value__) + __logjac__
    __logp__ = __logp__ + __logprior__
    var"alpha.c" = __value__

    # --- Process Deterministic Calculations (in topological order) ---

    # alpha0 = alpha_c - xbar * beta_c [Statement 11]
    # Depends on alpha_c [7] and beta_c [9], which are now available
    alpha0 = var"alpha.c" - xbar * var"beta.c"

    # --- Process Remaining Priors & Deterministic Nodes ---

    # Prior for tau_c ~ dgamma(0.001, 0.001) [Statement 5]
    __dist__ = dgamma(0.001, 0.001)
    __b__ = Bijectors.bijector(__dist__)
    __b_inv__ = Bijectors.inverse(__b__)
    __transformed_length__ = length(Bijectors.transformed(__dist__, __b__))
    __reconstructed_value__ = JuliaBUGS.reconstruct(__b_inv__, __dist__, view(__flattened_values__, __current_idx__:(__current_idx__ + __transformed_length__) - 1))
    __current_idx__ += __transformed_length__
    (__value__, __logjac__) = Bijectors.with_logabsdet_jacobian(__b_inv__, __reconstructed_value__)
    __logprior__ = Distributions.logpdf(__dist__, __value__) + __logjac__
    __logp__ = __logp__ + __logprior__
    var"tau.c" = __value__

    # sigma = 1 / sqrt(tau_c) [Statement 6]
    # Depends on tau_c [5], which is now available
    sigma = 1 / sqrt(var"tau.c")

    # --- Process Looped Priors (Fissioned Loops) ---

    # Prior for beta[i] ~ dnorm(beta_c, beta_tau) [Statement 4]
    # Depends on beta_c [9] and beta_tau [10]
    for i = 1:N
        __dist__ = dnorm(var"beta.c", var"beta.tau") # Parameters are known values now
        __b__ = Bijectors.bijector(__dist__)
        __b_inv__ = Bijectors.inverse(__b__)
        __transformed_length__ = if __b__ === identity
                length(__dist__)
            else
                length(Bijectors.transformed(__dist__, __b__))
            end
        # Read value for beta[i]
        __reconstructed_value__ = JuliaBUGS.reconstruct(__b_inv__, __dist__, view(__flattened_values__, __current_idx__:(__current_idx__ + __transformed_length__) - 1))
        __current_idx__ += __transformed_length__
        (__value__, __logjac__) = Bijectors.with_logabsdet_jacobian(__b_inv__, __reconstructed_value__)
        __logprior__ = Distributions.logpdf(__dist__, __value__) + __logjac__
        __logp__ = __logp__ + __logprior__
        # Assign to the specific element beta[i]
        beta[Int(i)] = __value__ # Using Int(i) defensively for indexing
    end

    # Prior for alpha[i] ~ dnorm(alpha_c, alpha_tau) [Statement 3]
    # Depends on alpha_c [7] and alpha_tau [8]
    for i = 1:N
        __dist__ = dnorm(var"alpha.c", var"alpha.tau")
        __b__ = Bijectors.bijector(__dist__)
        __b_inv__ = Bijectors.inverse(__b__)
        __transformed_length__ = if __b__ === identity
                length(__dist__)
            else
                length(Bijectors.transformed(__dist__, __b__))
            end
        # Read value for alpha[i]
        __reconstructed_value__ = JuliaBUGS.reconstruct(__b_inv__, __dist__, view(__flattened_values__, __current_idx__:(__current_idx__ + __transformed_length__) - 1))
        __current_idx__ += __transformed_length__
        (__value__, __logjac__) = Bijectors.with_logabsdet_jacobian(__b_inv__, __reconstructed_value__)
        __logprior__ = Distributions.logpdf(__dist__, __value__) + __logjac__
        __logp__ = __logp__ + __logprior__
        # Assign to the specific element alpha[i]
        alpha[Int(i)] = __value__
    end

    # --- Process Looped Deterministic Calculations (Fissioned Loops) ---

    # mu[i, j] = alpha[i] + beta[i] * (x[j] - xbar) [Statement 2]
    # Depends on alpha[i] [3] and beta[i] [4]
    for i = 1:N
        for j = 1:T
            # Both alpha[i] and beta[i] are available from previous loops
            # x[j] and xbar are from the environment (constants/data)
            mu[Int(i), Int(j)] = alpha[Int(i)] + beta[Int(i)] * (x[Int(j)] - xbar)
        end
    end

    # --- Process Likelihood Contribution (Fissioned Loop) ---

    # Y[i, j] ~ dnorm(mu[i, j], tau_c) [Statement 1]
    # Depends on mu[i, j] [2] and tau_c [5]
    for i = 1:N
        for j = 1:T
            # mu[i,j] and tau_c are available
            # Y[i,j] is data from the environment
            # Add log-likelihood contribution directly (no Jacobian for data)
            __logp__ += logpdf(dnorm(mu[Int(i), Int(j)], var"tau.c"), Y[Int(i), Int(j)])
        end
    end

    # --- Final Assertion and Return ---

    # Sanity check: ensure all values from the flattened vector have been consumed
    @assert __current_idx__ == length(__flattened_values__) + 1 "Indexing error: Not all parameter values were used or too many were expected."

    # Return the total computed log density
    return __logp__
end
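
For readers unfamiliar with the Bijectors.jl calls above, the per-parameter pattern can be reproduced standalone. Below is a minimal sketch for a single Gamma prior; note that the generated code additionally goes through JuliaBUGS.reconstruct to handle array-valued variables, which this sketch omits. (BUGS dgamma(a, b) is shape/rate, so it corresponds to Gamma(a, 1/b) in Distributions.jl's shape/scale parameterization.)

```julia
using Bijectors, Distributions

dist  = Gamma(0.001, 1000.0)            # dgamma(0.001, 0.001) in BUGS notation
b     = Bijectors.bijector(dist)        # maps the support (0, ∞) to ℝ
b_inv = Bijectors.inverse(b)

y = 0.3                                 # an unconstrained value from the flattened vector
value, logjac = Bijectors.with_logabsdet_jacobian(b_inv, y)  # back to (0, ∞), with log |J|

logprior = logpdf(dist, value) + logjac # logpdf in original space + Jacobian adjustment
```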

Performance

The generated function demonstrates significant performance improvements and eliminates allocations compared to evaluating the log density through the generic LogDensityProblems.logdensity interface, which involves more overhead:

LogDensityProblems.logdensity Benchmark:

julia> @benchmark LogDensityProblems.logdensity($model, $x)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  33.708 μs …   9.379 ms  ┊ GC (min … max): 0.00% … 99.21%
 Time  (median):     39.459 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   41.575 μs ± 122.243 μs  ┊ GC (mean ± σ):  4.88% ±  1.71%
 # ... histogram ...
 Memory estimate: 60.53 KiB, allocs estimate: 1566.

Directly Generated Function Benchmark:

julia> @benchmark $(model.log_density_computation_function)($(model.evaluation_env), $x)
BenchmarkTools.Trial: 10000 samples with 103 evaluations per sample.
 Range (min … max):  781.146 ns …  1.123 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     873.786 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   873.010 ns ± 28.609 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%
 # ... histogram ...
 Memory estimate: 0 bytes, allocs estimate: 0.

Gradient Performance (using Mooncake AD on the generated function):

The generated function structure is also amenable to AD, yielding efficient gradient computations:

BenchmarkTools.Trial: 10000 samples with 6 evaluations per sample.
 Range (min … max):  5.160 μs …  18.174 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     5.938 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   5.958 μs ± 389.948 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%
 # ... histogram ...
 Memory estimate: 1.19 KiB, allocs estimate: 6.

This is on par with Stan.
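
For reference, here is a hedged sketch of how such a gradient benchmark could be set up with Mooncake through DifferentiationInterface.jl; the PR's actual benchmark code may differ, and the exact API depends on package versions. `model.log_density_computation_function`, `model.evaluation_env`, and `x` are the same objects as in the benchmarks above.

```julia
using BenchmarkTools, DifferentiationInterface
import Mooncake

# Close over the evaluation environment so the function takes only the flattened vector.
f = Base.Fix1(model.log_density_computation_function, model.evaluation_env)

backend = AutoMooncake(; config=nothing)
prep = prepare_gradient(f, backend, x)   # one-time setup (rule compilation, buffers)

@benchmark DifferentiationInterface.gradient($f, $prep, $backend, $x)
```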

sunxd3 and others added 4 commits February 27, 2025 09:22
coveralls commented Feb 27, 2025

Pull Request Test Coverage Report for Build 14398731557

Details

  • 300 of 349 (85.96%) changed or added relevant lines in 4 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage increased (+1.1%) to 83.081%

Changes Missing Coverage:

| File | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| src/logdensityproblems.jl | 4 | 6 | 66.67% |
| src/compiler_pass.jl | 0 | 3 | 0.0% |
| src/model.jl | 19 | 24 | 79.17% |
| src/source_gen.jl | 277 | 316 | 87.66% |

Files with Coverage Reduction:

| File | New Missed Lines | % |
|---|---|---|
| src/model.jl | 1 | 80.34% |

Totals:

  • Change from base Build 14383232100: 1.1%
  • Covered Lines: 2028
  • Relevant Lines: 2441

💛 - Coveralls

sunxd3 and others added 16 commits March 14, 2025 12:09
yebai (Member) commented Apr 6, 2025

I don't understand why it is related to Mooncake. The above example doesn't use Mooncake.

sunxd3 (Member, Author) commented Apr 7, 2025

Benchmark code for the gradient has been added (see the first code block of #278 (comment)).

@yebai yebai requested a review from Copilot April 7, 2025 19:22
Copilot AI left a comment:

Copilot reviewed 18 out of 35 changed files in this pull request and generated no comments.

Files not reviewed (17)
  • benchmark/benchmark.jl: Language not supported
  • benchmark/juliabugs.jl: Language not supported
  • benchmark/run_benchmarks.jl: Language not supported
  • src/BUGSExamples/Volume_3/03_Fire.jl: Language not supported
  • src/BUGSExamples/Volume_3/04_Circle.jl: Language not supported
  • src/BUGSExamples/Volume_3/04_FunShapes.jl: Language not supported
  • src/BUGSExamples/Volume_3/04_HollowSquare.jl: Language not supported
  • src/BUGSExamples/Volume_3/04_Parallelogram.jl: Language not supported
  • src/BUGSExamples/Volume_3/04_Ring.jl: Language not supported
  • src/BUGSExamples/Volume_3/04_SquareMinusCircle.jl: Language not supported
  • src/BUGSExamples/Volume_3/05_Hepatitis.jl: Language not supported
  • src/BUGSExamples/Volume_3/05_Hepatitis_ME.jl: Language not supported
  • src/BUGSExamples/Volume_3/06_Hips1.jl: Language not supported
  • src/BUGSExamples/Volume_3/07_Hips2.jl: Language not supported
  • src/BUGSExamples/Volume_3/08_Hips3.jl: Language not supported
  • src/BUGSExamples/Volume_3/09_Hips4.jl: Language not supported
  • src/BUGSExamples/Volume_3/11_PigWeights.jl: Language not supported

yebai (Member) commented Apr 8, 2025

@penelopeysm @mhauru, can you review this PR to the best of your ability? I'll take a look more carefully later.

@yebai yebai self-requested a review April 8, 2025 09:09
@penelopeysm penelopeysm self-requested a review April 8, 2025 09:29
sunxd3 (Member, Author) commented Apr 8, 2025

Thanks @penelopeysm, just to reiterate: I know the number of changed lines is very large, but only https://github.com/TuringLang/JuliaBUGS.jl/blob/sunxd/source_gen/src/source_gen.jl and its test https://github.com/TuringLang/JuliaBUGS.jl/blob/sunxd/source_gen/test/source_gen.jl are worth looking at. The others are either low risk or updates to benchmarking that are not super relevant.

penelopeysm (Member) commented:

I didn't look at any of the Julia files because I don't know how to, and I have no comments on the other files!

Just kidding. I'll do my best haha

penelopeysm (Member) left a comment:

I haven't yet looked at the code, but I've given the documentation a read through. It's very good! I just put in a couple of comments on places where I still had a bit of confusion. I'll look at the source code another time.

Comment on lines +276 to +278
All subsequent iterations follow the same pattern and have the same dependence vector of $(1)$. Because all dependence vectors are lexicographically non-negative, the loop is sequentially valid.

This requires storing the loop variable `i` for each variable, but we already computed this with JuliaBUGS compilation, so not much overhead is required.
penelopeysm (Member):

To make sure I'm understanding this properly: as part of the compilation, you step through the loop and on each iteration you calculate the dependency vector for the variables in the loop?

sunxd3 (Member, Author):

Currently, we don't compute the dependence vector, but we do store the values of the loop induction variables for each variable in the model.


We made a simple change to the program to prepare for lowering: we need to distinguish between observations and model parameters (because they correspond to different code). We introduce a new operator into the program `\eqsim` to indicate that the left hand side is an observation.
penelopeysm (Member):

Personally, I don't think I'm entirely comfortable with eqsim; I think it looks too much like an ordinary equals, and also it's hard to type.

I think something like ~= would be better for my eyes - would you be ok with that?

sunxd3 (Member, Author):

`~=` is a good proposal, but I don't think the Julia parser allows it...

julia> :(a ~= Normal(0, 1))
ERROR: ParseError:
# Error @ REPL[1]:1:6
:(a ~= Normal(0, 1))
#    ╙ ── unexpected `=`
Stacktrace:
 [1] top-level scope
   @ REPL:1

sunxd3 and others added 2 commits April 9, 2025 10:27
penelopeysm (Member) left a comment:

The only thing I really think is important is the macro hygiene one!

Comment on lines +45 to +48
abstract type EvaluationMode end

struct UseGeneratedLogDensityFunction <: EvaluationMode end
struct UseGraph <: EvaluationMode end
penelopeysm (Member):

I like this pattern very much (it's something like sum types / enums) and it makes me sad that it doesn't get used often enough in Julia

penelopeysm (Member):

Oh, any reason why you didn't use @enum here? Just curious.

sunxd3 (Member, Author):

No particular reason, it just didn't come to mind at the time.

I don't think an enum is more elegant than subtyping for this case (not saying it's worse either). Do you have a preference?
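
One practical upside of the subtype pattern, sketched below with hypothetical function names (`evaluate` and `evaluate_via_graph` are illustrative, not the PR's API): evaluation code can dispatch on the mode type instead of branching on an enum value.

```julia
# Hypothetical sketch: multiple dispatch replaces an if/else on an enum value.
evaluate(model, ::UseGeneratedLogDensityFunction, params) =
    model.log_density_computation_function(model.evaluation_env, params)
evaluate(model, ::UseGraph, params) = evaluate_via_graph(model, params)
```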

@@ -378,6 +429,16 @@ function settrans(model::BUGSModel, bool::Bool=!(model.transformed))
return BangBang.setproperty!!(model, :transformed, bool)
end

function set_evaluation_mode(model::BUGSModel, mode::EvaluationMode)
penelopeysm (Member):

Should this be set_evaluation_mode!!? I'm still not 100% sure when we use ! or !! (I suspect DynamicPPL isn't super consistent on this either)

sunxd3 (Member, Author):

In this case, because it never modifies the original object, I didn't use any `!`.
In my mind, `!!` means "mutate if possible"; if there is no way to mutate anyway, maybe it's okay not to use `!` at all?

Comment on lines 63 to 91
for example_name in test_examples
model, evaluation_env = _create_model(example_name)
bugs_models[example_name] = model
evaluation_envs[example_name] = evaluation_env
lowered_model_def, reconstructed_model_def = _generate_lowered_model_def(
model.model_def, model.g, evaluation_env
)
log_density_computation_expr = _gen_log_density_computation_function_expr(
lowered_model_def, evaluation_env, gensym(example_name)
)
log_density_computation_functions[example_name] = eval(log_density_computation_expr)
reconstructed_model_defs[example_name] = reconstructed_model_def
end

@testset "source_gen: $example_name" for example_name in test_examples
model_with_consistent_sorted_nodes = _create_bugsmdoel_with_consistent_sorted_nodes(
bugs_models[example_name], reconstructed_model_defs[example_name]
)
result_with_old_model = JuliaBUGS.evaluate!!(bugs_models[example_name])[2]
params = JuliaBUGS.getparams(model_with_consistent_sorted_nodes)
result_with_bugsmodel = JuliaBUGS.evaluate!!(
model_with_consistent_sorted_nodes, params
)[2]
result_with_log_density_computation_function = log_density_computation_functions[example_name](
evaluation_envs[example_name], params
)
@test result_with_old_model ≈ result_with_bugsmodel
@test result_with_log_density_computation_function ≈ result_with_bugsmodel
end
penelopeysm (Member):

Could the for loop and the testset be combined here? It seems to me that you don't need to construct the dictionaries here since each iteration just processes one symbol at a time.

sunxd3 (Member, Author):

I don't think so, unfortunately. Because I am generating functions and eval'ing them, combining the loop and the testset would cause world-age problems.

Comment on lines +516 to +519
MacroTools.@capture(stmt, lhs_ ~ rhs_)
return MacroTools.@q begin
__dist__ = $rhs
__b__ = Bijectors.bijector(__dist__)
penelopeysm (Member):

Is there a particular reason why you use the double-underscore here? My impression is that it'd be safer to gensym these variables, because otherwise it's leaky - if someone defines a model with a parameter called __logp__ for example it would interfere with this. For example, this is what DynamicPPL does:

Suggested change:

```julia
# before
MacroTools.@capture(stmt, lhs_ ~ rhs_)
return MacroTools.@q begin
    __dist__ = $rhs
    __b__ = Bijectors.bijector(__dist__)

# after
MacroTools.@capture(stmt, lhs_ ~ rhs_)
@gensym dist b
return MacroTools.@q begin
    $dist = $rhs
    $b = Bijectors.bijector($dist)
```

(The suggestion extends to the other variables in this Expr.)

I did consider that maybe your intention was to expose some of these variables to the user, like logp? That's what we do with __varinfo__ and __context__ in DynamicPPL. But I guessed that probably not all of these should be exposed haha
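
For illustration, `Base.@gensym` binds each listed name to a fresh, collision-free symbol, so generated code cannot clash with user-defined model variables (exact symbol numbers vary per session):

```julia
julia> @gensym dist b   # expands to dist = gensym("dist"); b = gensym("b")

julia> dist
Symbol("##dist#230")

julia> b
Symbol("##b#231")
```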

penelopeysm (Member):

Oh, I realised maybe you can't do this with __logp__, because it needs to be 'carried over' between different statements (or at the least, it'd be annoying, because you'd have to gensym it outside of this function and then pass it in, to ensure that the same variable is used throughout). But it seems to me that the rest can be gensym'd!
If you don't gensym logp, then I'd document somewhere that it's a special/reserved variable name that shouldn't be used (or alternatively, maybe better, issue a warning).

sunxd3 (Member, Author) commented Apr 11, 2025:

Yeah, it was more for ease of programming. Using gensym is much safer, but it requires threading all the generated names through to sub-functions, which is a bit annoying.

Regarding the use of double underscores, I always thought it was the convention to reserve underscored names as internal. But I just realized this might only apply to function names?

This is a great catch! I'll add a test for this.

sunxd3 (Member, Author):

this is done now

sunxd3 and others added 9 commits April 11, 2025 08:29
Contributor commented:

Benchmark results on macOS (aarch64)

BridgeStan not found at location specified by $BRIDGESTAN environment variable, downloading version 2.6.1 to /Users/runner/.bridgestan/bridgestan-2.6.1
Done!

Stan results:

| Model | Parameters | Density Time (µs) | Density+Gradient Time (µs) |
|---|---|---|---|
| rats | 65 | 4.90967 | 7.01367 |
| pumps | 12 | 0.819444 | 0.940362 |
| bones | 13 | 77.583 | 84.833 |
| oxford | 244 | 10.9375 | 12.8125 |
| epil | 303 | 31.083 | 35.083 |
| lsat | 1006 | 98.375 | 134.541 |
| schools | 133 | 221.5 | 344.625 |
| beetles | 2 | 0.708341 | 0.784514 |
| air | 5 | 0.763158 | 0.859375 |
air 5 0.763158 0.859375

JuliaBUGS Mooncake results:

| Model | Parameters | Density Time (µs) | Density+Gradient Time (µs) |
|---|---|---|---|
| rats | 65 | 1.16832 | 7.59375 |
| pumps | 12 | 0.436567 | 1.79163 |
| bones | 33 | 45.125 | 245.688 |
| oxford | 244 | 21.875 | 57.833 |
| epil | 303 | 7.11475 | 34.5 |
| lsat | 1006 | 51.375 | 226.334 |
| schools | 133 | 90.625 | 686.937 |
| beetles | 2 | 1.03425 | 3.97629 |
| air | 5 | 0.286856 | 1.25904 |

Benchmark results on Ubuntu (x64)

BridgeStan not found at location specified by $BRIDGESTAN environment variable, downloading version 2.6.1 to /home/runner/.bridgestan/bridgestan-2.6.1
Done!

Stan results:

| Model | Parameters | Density Time (µs) | Density+Gradient Time (µs) |
|---|---|---|---|
| rats | 65 | 5.2778 | 7.64433 |
| pumps | 12 | 0.9588 | 1.19838 |
| bones | 13 | 70.372 | 87.654 |
| oxford | 244 | 13.6705 | 16.511 |
| epil | 303 | 28.513 | 35.576 |
| lsat | 1006 | 154.073 | 195.925 |
| schools | 133 | 508.575 | 744.67 |
| beetles | 2 | 0.828686 | 0.989069 |
| air | 5 | 0.619447 | 0.758132 |

JuliaBUGS Mooncake results:

| Model | Parameters | Density Time (µs) | Density+Gradient Time (µs) |
|---|---|---|---|
| rats | 65 | 1.92627 | 13.746 |
| pumps | 12 | 0.600918 | 2.44636 |
| bones | 33 | 48.07 | 351.857 |
| oxford | 244 | 23.654 | 73.227 |
| epil | 303 | 10.8505 | 55.995 |
| lsat | 1006 | 90.55 | 366.825 |
| schools | 133 | 103.243 | 1110.35 |
| beetles | 2 | 1.10206 | 4.68033 |
| air | 5 | 0.341552 | 1.39119 |

sunxd3 (Member, Author) commented Apr 11, 2025

FYI, I switched to directly using the generated function for benchmarking, instead of going through LogDensityProblems.
