Hierarchical Gaussian Filter Model of Human Choice Data #419
-
Hi @cgoldman123, Thanks for checking out RxInfer! I can see you've done quite a bit of digging; impressive for a starter model, you almost got it right! The model looks good, and I'll take a closer look toward the end of the week if no one picks it up. A couple of suggestions: instead of using sigmoid + Bernoulli, try Probit or Binomial-Polya if it suits your needs. As a side note, I would personally use the not-so-well-known softdot node to write something like:

```julia
tmp[i] ~ softdot(x[i], β, 1e2)
resp_p[i] ~ Probit(tmp[i])
```

This would require defining a product between a Gaussian and a Gamma distribution, but it's not too difficult. I'd recommend giving it a try if the other methods don't work for you.
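For illustration, here is a minimal fragment of what the Probit route could look like on the observation side (a sketch only; `probit_walk` and `x` are illustrative names, with `x` standing in for your latent state):

```julia
using RxInfer

# Sketch (illustrative): a Gaussian random walk whose binary observations
# are linked to the latent state through a Probit likelihood.
@model function probit_walk(obs)
    x_prev ~ Normal(mean = 0.0, variance = 1.0)
    for i in eachindex(obs)
        x[i] ~ Normal(mean = x_prev, variance = 0.1) # latent tendency
        obs[i] ~ Probit(x[i])                        # binary observation, no sigmoid + Bernoulli needed
        x_prev = x[i]
    end
end
```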
-
@cgoldman123 Thanks for such a nice question. The model looks very interesting. This is our coordinated attempt with @HoangMHNguyen to rewrite your model. First, I'll drop in the end-to-end script that runs inference with all the changes to your model. Below it, I wrote a walkthrough showing how I modified your existing code so it is stable when modeling binary observations and responses with a Gaussian random walk. Just so you know, I generated the data myself to test my solution. I don't know whether my data-generation flow matches the domain, but that's more of a question for you; I only need it to test the inference procedure.

```julia
using RxInfer
using ExponentialFamilyProjection, ExponentialFamily
using Distributions
using Plots, StatsPlots
# Generate 30 trials of data that could represent a learning task
# where the correct response pattern changes over time
n_trials = 30
# Create a pattern where the correct response changes a few times
# This simulates a learning environment where rules change
function generate_task_data(n_trials)
    # Initialize arrays
    obs = zeros(n_trials)
    resp = zeros(n_trials)
    # Create blocks of 100 trials each
    block_size = 100
    n_blocks = ceil(Int, n_trials/block_size)
    current_idx = 1
    for block in 1:n_blocks
        if current_idx > n_trials
            break
        end
        end_idx = min(current_idx + block_size - 1, n_trials)
        # Alternate between different patterns
        pattern_type = mod(block, 3)
        if pattern_type == 0
            # Block type 1: mostly 1s
            obs[current_idx:end_idx] .= rand([1.0, 1.0, 1.0, 0.0], end_idx - current_idx + 1)
        elseif pattern_type == 1
            # Block type 2: mostly 0s
            obs[current_idx:end_idx] .= rand([0.0, 0.0, 0.0, 1.0], end_idx - current_idx + 1)
        else
            # Block type 3: alternating with noise
            base_pattern = repeat([1.0, 0.0], ceil(Int, (end_idx - current_idx + 1)/2))
            obs[current_idx:end_idx] .= base_pattern[1:(end_idx - current_idx + 1)]
            # Add some noise
            noise_idx = rand(1:(end_idx - current_idx + 1), ceil(Int, (end_idx - current_idx + 1)*0.1))
            obs[current_idx .+ noise_idx .- 1] .= 1.0 .- obs[current_idx .+ noise_idx .- 1]
        end
        # Generate responses with learning
        error_prob = 0.4 * exp(-block/5) # Exponentially decreasing error rate
        for i in current_idx:end_idx
            if rand() < error_prob
                resp[i] = 1.0 - obs[i] # Wrong response
            else
                resp[i] = obs[i] # Correct response
            end
        end
        current_idx += block_size
    end
    return obs, resp
end
# Generate the data
obs_data, resp_data = generate_task_data(n_trials)
@model function hgf_smoothing(obs, resp, z_variance)
    # Initial states - means chosen to be close to the data range
    z_prev ~ Normal(mean = 0.0, variance = 1.0) # Reduced variance
    x_prev ~ Normal(mean = 0.0, variance = 1.0) # Reduced variance
    # Priors on κ and ω - these may need adjusting
    κ ~ Normal(mean = 1.0, variance = 0.1)  # Reduced variance and mean
    ω ~ Normal(mean = 0.0, variance = 0.01) # Reduced variance
    β ~ Gamma(shape = 1.0, rate = 1.0)      # Less constrained
    for i in eachindex(obs)
        # Higher layer update (Gaussian random walk)
        z[i] ~ Normal(mean = z_prev, variance = z_variance)
        # Lower layer update (Gaussian with controlled variance)
        x[i] ~ GCV(x_prev, z[i], κ, ω)
        # Binary observation, linked to the latent state through a Probit likelihood
        obs[i] ~ Probit(x[i])
        # Binary response: soft dot product of β and x[i] (precision 1.0), then a Probit likelihood
        temp[i] ~ softdot(β, x[i], 1.0)
        resp[i] ~ Probit(temp[i])
        # Update previous states
        z_prev = z[i]
        x_prev = x[i]
    end
end
@constraints function hgfconstraints_smoothing()
    # Structured mean-field factorization constraints
    q(x_prev, x, temp, z, κ, ω, β) = q(x_prev, x, temp)q(z)q(κ)q(ω)q(β)
    # Project the marginal of β onto the Gamma family (requires ExponentialFamilyProjection)
    q(β) :: ProjectedTo(Gamma)
end
@meta function hgfmeta_smoothing()
    # Let's use 31 approximation points in the Gauss-Hermite cubature approximation method
    GCV() -> GCVMetadata(GaussHermiteCubature(31))
end
function run_inference_smoothing(obs, resp, z_variance, niterations=10)
    @initialization function hgf_init_smoothing()
        q(z) = NormalMeanVariance(0, 10)
        q(κ) = NormalMeanVariance(1.0, 0.1)
        q(ω) = NormalMeanVariance(0.0, 0.01)
        q(β) = GammaShapeRate(0.1, 0.1)
    end
    return infer(
        model = hgf_smoothing(z_variance = z_variance,),
        data = (obs = obs, resp = resp,),
        meta = hgfmeta_smoothing(),
        constraints = hgfconstraints_smoothing(),
        initialization = hgf_init_smoothing(),
        iterations = niterations,
        options = (limit_stack_depth = 500,), # Increased stack depth for the long chain of nodes
        returnvars = (x = KeepLast(), z = KeepLast(), ω = KeepLast(), κ = KeepLast(), β = KeepLast(),),
        free_energy = false, # Set to true to monitor convergence
        showprogress = true,
    )
end
# Reduced z_variance (0.1) and more iterations (30) keep the inference stable
infer_result = run_inference_smoothing(obs_data, resp_data, 0.1, 30)
# Extract posteriors
x_posterior = infer_result.posteriors[:x]
z_posterior = infer_result.posteriors[:z]
κ_posterior = infer_result.posteriors[:κ]
ω_posterior = infer_result.posteriors[:ω]
β_posterior = infer_result.posteriors[:β]
# Print mean and standard deviation of parameters
println("Parameter Estimates:")
println("κ: mean = $(mean(κ_posterior)), std = $(std(κ_posterior))")
println("ω: mean = $(mean(ω_posterior)), std = $(std(ω_posterior))")
println("β: mean = $(mean(β_posterior)), std = $(std(β_posterior))")
# Plot the hidden states x and z
p1 = plot(mean.(x_posterior), ribbon=2*std.(x_posterior),
          label="x (Lower layer)", title="Hidden States",
          fillalpha=0.3)
plot!(p1, mean.(z_posterior), ribbon=2*std.(z_posterior),
      label="z (Higher layer)", fillalpha=0.3)
scatter!(p1, 1:length(obs_data), obs_data, label="Observations",
         markersize=4)
scatter!(p1, 1:length(resp_data), resp_data, label="Responses",
         markersize=4)
# Plot parameter posteriors
p2 = plot(
    histogram(rand(κ_posterior, 1000), title="κ", label=""),
    histogram(rand(ω_posterior, 1000), title="ω", label=""),
    histogram(rand(β_posterior, 1000), title="β", label=""),
    layout=(1,3)
)
plot(p1, p2, layout=(2,1), size=(800,600))
```

Inference results are here.

Model definition changes: the main change is that instead of the sigmoid + Bernoulli construction, the observations go through a Probit likelihood, and the response is modeled with a softdot node followed by its own Probit likelihood. One problem I observe with this model is that it is really sensitive to priors, so I think this is a place where you should put some thought. The priors I selected are the ones in the model definition above.

Constraints changes: because the messages towards β combine a Gaussian with a Gamma, I added the projection constraint q(β) :: ProjectedTo(Gamma). This ensures that the approximate posterior for β stays in the Gamma family.

Nuances: the model has a subtle mathematical aspect that requires careful attention. The results are susceptible to how we initialize the marginal distributions, especially for the parameter β. In theory this should not happen, but I observe it for this model. The key intuition is this: if the updates sent to β's edges in the factor graph stray too far from zero on the left side, the posterior distribution for β (which is constrained to be a Gamma distribution) will try to maximize its entropy. This results in a Gamma distribution with extremely large parameters, effectively causing the inference to get stuck. This is why the initial choice of q(β) (and the other marginals) in the initialization block matters. In simpler terms, we need to be careful about our starting points for the variables, as poor choices can lead the model to numerical dead ends. Starting with moderate values, particularly for z, helps keep everything well-behaved during inference.

Hopefully, this helps you get the HGF model up and running. Good luck with your analysis!
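If you want to sanity-check convergence or compare different initializations, one option is to re-run the inference with free-energy tracking switched on and plot the values per iteration. A minimal sketch (it reuses the model, constraints, and meta from above; `init_fe` and `result_fe` are just illustrative names):

```julia
# Sketch: re-run inference with Bethe free-energy tracking to monitor convergence.
init_fe = @initialization begin
    q(z) = NormalMeanVariance(0, 10)
    q(κ) = NormalMeanVariance(1.0, 0.1)
    q(ω) = NormalMeanVariance(0.0, 0.01)
    q(β) = GammaShapeRate(0.1, 0.1)
end

result_fe = infer(
    model          = hgf_smoothing(z_variance = 0.1,),
    data           = (obs = obs_data, resp = resp_data,),
    meta           = hgfmeta_smoothing(),
    constraints    = hgfconstraints_smoothing(),
    initialization = init_fe,
    iterations     = 30,
    options        = (limit_stack_depth = 500,),
    free_energy    = true, # track free energy per iteration
)

# A curve that decreases and then flattens suggests the variational updates have converged.
plot(result_fe.free_energy, xlabel = "iteration", ylabel = "free energy", label = "")
```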
-
Dear @Nimrais, @albertpod, @HoangMHNguyen, Wow, I am floored by the help you all have provided in implementing the HGF model using RxInfer. I’m incredibly grateful for your support. The model appears to be fitting the data well—I’ve run it on a few subjects and extracted the probability assigned to each response using the following code:
Please let me know if you would advise using an alternative method. I may have more questions as I continue working with the model, but for now, I just wanted to express my deepest appreciation. I look forward to using RxInfer further and recommending it to others! Best, Carter
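For completeness, here is a simplified sketch of the kind of computation I mean (illustrative only, not my exact code; it assumes the Probit link and the unit-precision softdot node from the model above, and moment-matches temp ≈ β · x):

```julia
using Distributions # for the standard normal CDF

# Sketch: approximate per-trial P(resp = 1) from the returned marginals.
# For t ~ Normal(m, v) under a Probit link, E[Φ(t)] = Φ(m / sqrt(1 + v)).
mβ, vβ = mean(β_posterior), var(β_posterior)
resp_prob = map(x_posterior) do qx
    mx, vx = mean(qx), var(qx)
    m_temp = mβ * mx                                 # mean of temp ≈ β * x
    v_temp = mβ^2 * vx + mx^2 * vβ + vβ * vx + 1.0   # variance of β * x plus the softdot noise
    cdf(Normal(), m_temp / sqrt(1 + v_temp))
end

# Probability assigned to the response actually given on each trial
prob_of_choice = [r == 1 ? p : 1 - p for (r, p) in zip(resp_data, resp_prob)]
```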
-
Hello,
I hope this message finds everyone well. I am currently using the ReactiveBayes package to implement a Hierarchical Gaussian Filter (HGF) model for human choice data in a decision-making paradigm. In this task, participants observe one of two stimuli (obs variable, represented as 1 or 0) and respond with their belief about which stimulus they experienced (resp variable, also represented as 1 or 0). The true probability of observing each stimulus changes over time, along with the uncertainty of that probability, so we are modeling this process using a Gaussian random walk.
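For concreteness, the kind of generative process I have in mind looks roughly like the following sketch (the numbers are purely illustrative):

```julia
using Distributions

# Illustrative sketch of the assumed generative process (values are made up):
# a higher-level random walk drives the volatility of a lower-level random walk
# on the log-odds of observing stimulus 1.
n = 200
z = cumsum(0.1 .* randn(n))                      # higher layer: slowly drifting (log-)volatility
x = zeros(n)
for t in 2:n
    x[t] = x[t-1] + sqrt(exp(z[t])) * randn()    # lower layer: volatility-modulated walk on the log-odds
end
p_stim = 1 ./ (1 .+ exp.(-x))                    # probability of stimulus 1 on each trial
obs    = rand.(Bernoulli.(p_stim))               # observed stimulus (1 or 0)
```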
I am having difficulty configuring the model to infer both the latent states underlying the observation process and free parameters that describe participants' behavior. I was wondering if you had any insights into how I should structure the model to achieve this. I've pasted my code below:
I would greatly appreciate any guidance on building this model.
Thank you for your time,
Carter