Hierarchical Gaussian Filter Model of Human Choice Data #419
-
Hi @cgoldman123, Thanks for checking out RxInfer! I can see you've done quite a bit of digging; impressive for a starter model, you almost got it right! The model looks good, and I'll take a closer look toward the end of the week if no one picks it up. A couple of suggestions: instead of using sigmoid + Bernoulli, try Probit or Binomial-Polya if it suits your needs. As a side note, I would personally use the not-so-well-known softdot node to write something like:

```julia
tmp[i] ~ softdot(x[i], β, 1e2)
resp_p[i] ~ Probit(tmp[i])
```

This would require defining a product between a Gaussian and a Gamma distribution, but it's not too difficult. I'd recommend giving it a try if the other methods don't work for you.
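For illustration, here is a minimal fragment of what the Probit route could look like on the observation side (a sketch only; `probit_walk` and `x` are illustrative names, with `x` standing in for your latent state):

```julia
using RxInfer

# Sketch (illustrative): a Gaussian random walk whose binary observations
# are linked to the latent state through a Probit likelihood.
@model function probit_walk(obs)
    x_prev ~ Normal(mean = 0.0, variance = 1.0)
    for i in eachindex(obs)
        x[i] ~ Normal(mean = x_prev, variance = 0.1) # latent tendency
        obs[i] ~ Probit(x[i])                        # binary observation, no sigmoid + Bernoulli needed
        x_prev = x[i]
    end
end
```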
-
@cgoldman123 Thanks for such a nice question. The model looks very interesting. This is our coordinated attempt with @HoangMHNguyen to rewrite your model. First, I'll drop in the end-to-end script that runs inference with all the changes to your model. Below it, I wrote a walkthrough showing how I modified your existing code so it is stable when modeling binary observations and responses with a Gaussian random walk. Just so you know, I generated the data myself to test my solution. I don't know whether my data-generation flow matches the domain, but that's more of a question for you; I only need it to test the inference procedure.

```julia
using RxInfer
using ExponentialFamilyProjection, ExponentialFamily
using Distributions
using Plots, StatsPlots
# Generate 30 trials of data that could represent a learning task
# where the correct response pattern changes over time
n_trials = 30
# Create a pattern where the correct response changes a few times
# This simulates a learning environment where rules change
function generate_task_data(n_trials)
    # Initialize arrays
    obs = zeros(n_trials)
    resp = zeros(n_trials)
    # Create blocks of 100 trials each
    block_size = 100
    n_blocks = ceil(Int, n_trials/block_size)
    current_idx = 1
    for block in 1:n_blocks
        if current_idx > n_trials
            break
        end
        end_idx = min(current_idx + block_size - 1, n_trials)
        # Alternate between different patterns
        pattern_type = mod(block, 3)
        if pattern_type == 0
            # Block type 1: mostly 1s
            obs[current_idx:end_idx] .= rand([1.0, 1.0, 1.0, 0.0], end_idx - current_idx + 1)
        elseif pattern_type == 1
            # Block type 2: mostly 0s
            obs[current_idx:end_idx] .= rand([0.0, 0.0, 0.0, 1.0], end_idx - current_idx + 1)
        else
            # Block type 3: alternating with noise
            base_pattern = repeat([1.0, 0.0], ceil(Int, (end_idx - current_idx + 1)/2))
            obs[current_idx:end_idx] .= base_pattern[1:(end_idx - current_idx + 1)]
            # Add some noise
            noise_idx = rand(1:(end_idx - current_idx + 1), ceil(Int, (end_idx - current_idx + 1)*0.1))
            obs[current_idx .+ noise_idx .- 1] .= 1.0 .- obs[current_idx .+ noise_idx .- 1]
        end
        # Generate responses with learning
        error_prob = 0.4 * exp(-block/5) # Exponentially decreasing error rate
        for i in current_idx:end_idx
            if rand() < error_prob
                resp[i] = 1.0 - obs[i] # Wrong response
            else
                resp[i] = obs[i] # Correct response
            end
        end
        current_idx += block_size
    end
    return obs, resp
end
# Generate the data
obs_data, resp_data = generate_task_data(n_trials)
@model function hgf_smoothing(obs, resp, z_variance)
    # Initial states - means chosen to be close to the data range
    z_prev ~ Normal(mean = 0.0, variance = 1.0) # Reduced variance
    x_prev ~ Normal(mean = 0.0, variance = 1.0) # Reduced variance
    # Priors on κ and ω - these may need adjusting
    κ ~ Normal(mean = 1.0, variance = 0.1)  # Reduced variance and mean
    ω ~ Normal(mean = 0.0, variance = 0.01) # Reduced variance
    β ~ Gamma(shape = 1.0, rate = 1.0)      # Less constrained
    for i in eachindex(obs)
        # Higher layer update (Gaussian random walk)
        z[i] ~ Normal(mean = z_prev, variance = z_variance)
        # Lower layer update (Gaussian with controlled variance)
        x[i] ~ GCV(x_prev, z[i], κ, ω)
        # Binary observation, linked to the latent state through a Probit likelihood
        obs[i] ~ Probit(x[i])
        # Binary response: soft dot product of β and x[i] (precision 1.0), then a Probit likelihood
        temp[i] ~ softdot(β, x[i], 1.0)
        resp[i] ~ Probit(temp[i])
        # Update previous states
        z_prev = z[i]
        x_prev = x[i]
    end
end
@constraints function hgfconstraints_smoothing()
    # Structured mean-field factorization constraints
    q(x_prev, x, temp, z, κ, ω, β) = q(x_prev, x, temp)q(z)q(κ)q(ω)q(β)
    # Project the marginal of β onto the Gamma family (requires ExponentialFamilyProjection)
    q(β) :: ProjectedTo(Gamma)
end
@meta function hgfmeta_smoothing()
    # Let's use 31 approximation points in the Gauss-Hermite cubature approximation method
    GCV() -> GCVMetadata(GaussHermiteCubature(31))
end
function run_inference_smoothing(obs, resp, z_variance, niterations=10)
    @initialization function hgf_init_smoothing()
        q(z) = NormalMeanVariance(0, 10)
        q(κ) = NormalMeanVariance(1.0, 0.1)
        q(ω) = NormalMeanVariance(0.0, 0.01)
        q(β) = GammaShapeRate(0.1, 0.1)
    end
    return infer(
        model = hgf_smoothing(z_variance = z_variance,),
        data = (obs = obs, resp = resp,),
        meta = hgfmeta_smoothing(),
        constraints = hgfconstraints_smoothing(),
        initialization = hgf_init_smoothing(),
        iterations = niterations,
        options = (limit_stack_depth = 500,), # Increased stack depth for the long chain of nodes
        returnvars = (x = KeepLast(), z = KeepLast(), ω = KeepLast(), κ = KeepLast(), β = KeepLast(),),
        free_energy = false, # Set to true to monitor convergence
        showprogress = true,
    )
end
# Reduced z_variance (0.1) and more iterations (30) keep the inference stable
infer_result = run_inference_smoothing(obs_data, resp_data, 0.1, 30)
# Extract posteriors
x_posterior = infer_result.posteriors[:x]
z_posterior = infer_result.posteriors[:z]
κ_posterior = infer_result.posteriors[:κ]
ω_posterior = infer_result.posteriors[:ω]
β_posterior = infer_result.posteriors[:β]
# Print mean and standard deviation of parameters
println("Parameter Estimates:")
println("κ: mean = $(mean(κ_posterior)), std = $(std(κ_posterior))")
println("ω: mean = $(mean(ω_posterior)), std = $(std(ω_posterior))")
println("β: mean = $(mean(β_posterior)), std = $(std(β_posterior))")
# Plot the hidden states x and z
p1 = plot(mean.(x_posterior), ribbon=2*std.(x_posterior),
          label="x (Lower layer)", title="Hidden States",
          fillalpha=0.3)
plot!(p1, mean.(z_posterior), ribbon=2*std.(z_posterior),
      label="z (Higher layer)", fillalpha=0.3)
scatter!(p1, 1:length(obs_data), obs_data, label="Observations",
         markersize=4)
scatter!(p1, 1:length(resp_data), resp_data, label="Responses",
         markersize=4)
# Plot parameter posteriors
p2 = plot(
    histogram(rand(κ_posterior, 1000), title="κ", label=""),
    histogram(rand(ω_posterior, 1000), title="ω", label=""),
    histogram(rand(β_posterior, 1000), title="β", label=""),
    layout=(1,3)
)
plot(p1, p2, layout=(2,1), size=(800,600))
```

Inference results are here.

Model definition changes: the main change is that instead of the sigmoid + Bernoulli construction, the observations go through a Probit likelihood, and the response is modeled with a softdot node followed by its own Probit likelihood. One problem I observe with this model is that it is really sensitive to priors, so I think this is a place where you should put some thought. The priors I selected are the ones in the model definition above.

Constraints changes: because the messages towards β combine a Gaussian with a Gamma, I added the projection constraint q(β) :: ProjectedTo(Gamma). This ensures that the approximate posterior for β stays in the Gamma family.

Nuances: the model has a subtle mathematical aspect that requires careful attention. The results are susceptible to how we initialize the marginal distributions, especially for the parameter β. In theory this should not happen, but I observe it for this model. The key intuition is this: if the updates sent to β's edges in the factor graph stray too far from zero on the left side, the posterior distribution for β (which is constrained to be a Gamma distribution) will try to maximize its entropy. This results in a Gamma distribution with extremely large parameters, effectively causing the inference to get stuck. This is why the initial choice of q(β) (and the other marginals) in the initialization block matters. In simpler terms, we need to be careful about our starting points for the variables, as poor choices can lead the model to numerical dead ends. Starting with moderate values, particularly for z, helps keep everything well-behaved during inference.

Hopefully, this helps you get the HGF model up and running. Good luck with your analysis!
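If you want to sanity-check convergence or compare different initializations, one option is to re-run the inference with free-energy tracking switched on and plot the values per iteration. A minimal sketch (it reuses the model, constraints, and meta from above; `init_fe` and `result_fe` are just illustrative names):

```julia
# Sketch: re-run inference with Bethe free-energy tracking to monitor convergence.
init_fe = @initialization begin
    q(z) = NormalMeanVariance(0, 10)
    q(κ) = NormalMeanVariance(1.0, 0.1)
    q(ω) = NormalMeanVariance(0.0, 0.01)
    q(β) = GammaShapeRate(0.1, 0.1)
end

result_fe = infer(
    model          = hgf_smoothing(z_variance = 0.1,),
    data           = (obs = obs_data, resp = resp_data,),
    meta           = hgfmeta_smoothing(),
    constraints    = hgfconstraints_smoothing(),
    initialization = init_fe,
    iterations     = 30,
    options        = (limit_stack_depth = 500,),
    free_energy    = true, # track free energy per iteration
)

# A curve that decreases and then flattens suggests the variational updates have converged.
plot(result_fe.free_energy, xlabel = "iteration", ylabel = "free energy", label = "")
```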
-
Dear @Nimrais, @albertpod, @HoangMHNguyen, Wow, I am floored by the help you all have provided in implementing the HGF model using RxInfer. I’m incredibly grateful for your support. The model appears to be fitting the data well—I’ve run it on a few subjects and extracted the probability assigned to each response using the following code:
Please let me know if you would advise using an alternative method. I may have more questions as I continue working with the model, but for now, I just wanted to express my deepest appreciation. I look forward to using RxInfer further and recommending it to others! Best, Carter
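For completeness, here is a simplified sketch of the kind of computation I mean (illustrative only, not my exact code; it assumes the Probit link and the unit-precision softdot node from the model above, and moment-matches temp ≈ β · x):

```julia
using Distributions # for the standard normal CDF

# Sketch: approximate per-trial P(resp = 1) from the returned marginals.
# For t ~ Normal(m, v) under a Probit link, E[Φ(t)] = Φ(m / sqrt(1 + v)).
mβ, vβ = mean(β_posterior), var(β_posterior)
resp_prob = map(x_posterior) do qx
    mx, vx = mean(qx), var(qx)
    m_temp = mβ * mx                                 # mean of temp ≈ β * x
    v_temp = mβ^2 * vx + mx^2 * vβ + vβ * vx + 1.0   # variance of β * x plus the softdot noise
    cdf(Normal(), m_temp / sqrt(1 + v_temp))
end

# Probability assigned to the response actually given on each trial
prob_of_choice = [r == 1 ? p : 1 - p for (r, p) in zip(resp_data, resp_prob)]
```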
-
Hello,
I hope this message finds everyone well. I am currently using the ReactiveBayes package to implement a Hierarchical Gaussian Filter (HGF) model for human choice data in a decision-making paradigm. In this task, participants observe one of two stimuli (obs variable, represented as 1 or 0) and respond with their belief about which stimulus they experienced (resp variable, also represented as 1 or 0). The true probability of observing each stimulus changes over time, along with the uncertainty of that probability, so we are modeling this process using a Gaussian random walk.
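For concreteness, the kind of generative process I have in mind looks roughly like the following sketch (the numbers are purely illustrative):

```julia
using Distributions

# Illustrative sketch of the assumed generative process (values are made up):
# a higher-level random walk drives the volatility of a lower-level random walk
# on the log-odds of observing stimulus 1.
n = 200
z = cumsum(0.1 .* randn(n))                      # higher layer: slowly drifting (log-)volatility
x = zeros(n)
for t in 2:n
    x[t] = x[t-1] + sqrt(exp(z[t])) * randn()    # lower layer: volatility-modulated walk on the log-odds
end
p_stim = 1 ./ (1 .+ exp.(-x))                    # probability of stimulus 1 on each trial
obs    = rand.(Bernoulli.(p_stim))               # observed stimulus (1 or 0)
```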
I am having difficulty configuring the model to infer both the latent states underlying the observation process and free parameters that describe participants' behavior. I was wondering if you had any insights into how I should structure the model to achieve this. I've pasted my code below:
I would greatly appreciate any guidance on building this model.
Thank you for your time,
Carter