Problem Statement
Methods like Flow Matching and Diffusion Models use differential equations to integrate their output velocity from latent to sample space. For a fixed step size, this is unproblematic, as we perform exactly $n$ function calls for the integration, e.g. with the explicit Euler method: $x_{k+1} = x_k + \Delta t \, f(x_k, t_k)$, where $\Delta t = (t_1 - t_0) / n$.
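As a minimal sketch of the fixed-step case (plain Euler in numpy; `integrate_fixed` is a hypothetical name, not BayesFlow API), note that the number of calls to $f$ is known up front:

```python
import numpy as np

def integrate_fixed(f, x0, t0=0.0, t1=1.0, n=10):
    """Fixed-step Euler integration: exactly n calls to f, known in advance."""
    x, t = np.asarray(x0, dtype=float), t0
    dt = (t1 - t0) / n
    for _ in range(n):
        x = x + dt * f(x, t)  # one function call per step
        t += dt
    return x

# Constant velocity 1 moves x from 0 to 1 regardless of n.
x1 = integrate_fixed(lambda x, t: np.ones_like(x), np.zeros(3), n=10)
```

This works identically for a batch, since every sample takes the same $n$ steps.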
Dynamic step sizes are trickier, since the convergence criterion and update rule are well-defined only for a single sample, not for a batch. The step size is computed and adjusted on the fly during integration and may differ across the batch, which means we may no longer have a fixed number of function calls $n$ to perform.
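To make the per-sample nature of the controller concrete, here is a minimal adaptive sketch for a single sample (step-doubling error control on Euler; a toy stand-in for a real embedded Runge-Kutta controller, and not BayesFlow's solver):

```python
import numpy as np

def integrate_adaptive(f, x0, t0=0.0, t1=1.0, dt0=0.5, tol=1e-3):
    """Adaptive Euler via step doubling: compare one full step with two half
    steps; shrink dt on large error, grow it on small error. The resulting
    number of calls to f depends on the sample's velocity field."""
    x, t, dt, n_calls = np.asarray(x0, dtype=float), t0, dt0, 0
    while t < t1 - 1e-12:
        dt = min(dt, t1 - t)
        v = f(x, t); n_calls += 1
        full = x + dt * v                    # one full Euler step
        half = x + 0.5 * dt * v
        v_half = f(half, t + 0.5 * dt); n_calls += 1
        two_half = half + 0.5 * dt * v_half  # two half steps
        err = np.max(np.abs(two_half - full))
        if err <= tol:
            x, t = two_half, t + dt          # accept, then grow dt
            dt *= 2.0
        else:
            dt *= 0.5                        # reject and shrink dt
    return x, n_calls

# A straight (constant) velocity needs few calls; a jagged one needs many more.
_, calls_straight = integrate_adaptive(lambda x, t: np.ones_like(x), np.zeros(2))
_, calls_jagged = integrate_adaptive(lambda x, t: np.cos(40 * t) * np.ones_like(x), np.zeros(2))
```

The accept/reject decision is a scalar per sample, which is exactly what fails to vectorize naively over a batch.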
As an example, consider a batch of 2 samples, A and B. The velocity moving sample A from latent to sample space is very straight, so a single step solves it exactly. The velocity moving sample B from latent to sample space is very jagged, so it requires many smaller steps to move accurately from latent to sample space without accumulating numerical integration error.
After the first function evaluation, our dynamic step size algorithm chooses the step sizes (1, 1000) for this batch. How often do we now call into $f$? See also Figure 1 of https://arxiv.org/pdf/2307.03672.
With this issue, I intend to open a discussion on the optimal method we should implement for BayesFlow. Here are some options we have:
Suggested Methods
1. vmap over $f$ and perform $n_i$ calls to $f$ for each sample $i$ in the batch.
2. Update each entry as normal and ignore updates to finished samples once their respective time reaches $t_1$.
3. Consolidate the batch of step sizes into one step size that is used across the whole batch (e.g. the minimum).
4. ...? There are probably more methods that I cannot think of. Contributions welcome.
Benefits and Drawbacks
Method 1 removes the problem of batches entirely. However, it may be wasteful if the vmap is inefficient or the batch size is very large. At the same time, it performs the minimum number of calls for each sample, so it could be more efficient than the other methods.
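A conceptual sketch of method 1, with a plain Python loop standing in for an actual `vmap` (the function names and the precomputed per-sample step sizes are hypothetical illustrations, not BayesFlow API):

```python
import numpy as np

def solve_one(f, x0, dts):
    """Integrate one sample with its own step sizes: n_i = len(dts) calls."""
    x, t = float(x0), 0.0
    for dt in dts:
        x += dt * f(x, t)
        t += dt
    return x

def solve_batch_per_sample(f, batch, step_sizes):
    # Each sample is solved independently, as vmapping over f would allow;
    # no sample pays for another sample's step count.
    return np.array([solve_one(f, x0, dts) for x0, dts in zip(batch, step_sizes)])

# Sample A takes 1 step, sample B takes 1000 small steps (the example above).
out = solve_batch_per_sample(
    lambda x, t: 1.0,
    batch=[0.0, 0.0],
    step_sizes=[[1.0], [1e-3] * 1000],
)
```

The total work is $\sum_i n_i$ function calls, the minimum possible, at the cost of losing batched evaluation of $f$.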
Method 2 is harder to implement, but does not have the performance downsides (or benefits) of vmap. It may also be wasteful if a single sample in the batch requires significantly more steps than all the others.
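A minimal sketch of method 2 (batched Euler updates with a mask; fixed per-sample step sizes are assumed here purely for illustration):

```python
import numpy as np

def batched_step_with_mask(f, x, t, dt, t1=1.0):
    """One batched update: f is evaluated on the whole batch, but samples
    whose time has reached t1 receive a zero-length step and stop moving."""
    active = t < t1
    dt_eff = np.where(active, np.minimum(dt, t1 - t), 0.0)
    x = x + dt_eff * f(x, t)  # finished samples are updated by 0
    t = t + dt_eff
    return x, t

x = np.zeros(2)
t = np.zeros(2)
dt = np.array([1.0, 0.25])  # per-sample step sizes
n_calls = 0
while np.any(t < 1.0):
    x, t = batched_step_with_mask(lambda x, t: np.ones_like(x), x, t, dt)
    n_calls += 1
# The batch runs until the slowest sample finishes: 4 batched calls here,
# even though the first sample was done after 1.
```

The wasted work is the masked-out evaluations of already-finished samples, which grows with the gap between the fastest and slowest sample.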
Method 3 is easiest to implement, but presents us with an arbitrary choice of consolidation function, which is critical for good performance. The minimum may be too pessimistic if we could just discard bad samples, for instance. The biggest benefit here is that we can use the forced redundancy of repeated calls to $f$ to further improve the quality of samples.
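A minimal sketch of method 3, with the minimum as the (assumed) consolidation function and fixed per-sample proposals for illustration:

```python
import numpy as np

def batched_step_consolidated(f, x, t, dt_per_sample):
    """One batched update using a single consolidated step size for the
    whole batch; here the minimum, the most conservative choice."""
    dt = np.min(dt_per_sample)
    x = x + dt * f(x, t)
    return x, t + dt

x, t = np.zeros(2), 0.0
proposed = np.array([1.0, 1e-3])  # per-sample proposals, as in the (1, 1000) example
n_calls = 0
while t < 1.0 - 1e-12:
    x, t = batched_step_consolidated(lambda x, t: np.ones_like(x), x, t, proposed)
    n_calls += 1
# The whole batch advances at the most demanding sample's pace: 1000 calls.
```

Note the easy sample is integrated far more finely than it needs, which is the redundancy mentioned above; whether that improves its quality or merely wastes compute depends on the error characteristics of $f$.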
If anyone knows of any scientific publications that cover the problem of batched dynamic step size integration, I would be glad to read them.