API

Part of the API of DynamicPPL is defined in the more lightweight interface package AbstractPPL.jl and reexported here.

Model

Macros

A core component of DynamicPPL is the @model macro. It can be used to define probabilistic models in an intuitive way by specifying random variables and their distributions with ~ statements. These statements are rewritten by @model as calls of [internal functions](@ref model_internal) for sampling the variables and computing their log densities.

@model

Type

A Model can be created by calling the model function, as defined by @model.

Model

Models are callable structs.

Model()

Basic properties of a model can be accessed with getargnames, getmissings, and nameof.

nameof(::Model)
getargnames
getmissings
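
As a minimal sketch of the pieces above, the following defines a small model and inspects it. The model name `demo` and its contents are illustrative only, and Distributions.jl is assumed to be installed alongside DynamicPPL.

```julia
using DynamicPPL, Distributions

# Hypothetical example model: `m` is a latent variable, `x` is an argument.
@model function demo(x)
    m ~ Normal(0, 1)
    x ~ Normal(m, 1)
    return m + 1
end

# Calling the model function creates a `Model`; no sampling happens yet.
model = demo(1.5)

# Basic properties of the model.
model_name = nameof(model)       # name of the model function
arg_names = getargnames(model)   # names of the model arguments

# Models are callable: this runs the model once, sampling `m` from its
# prior, and gives back the return value of the model function (`m + 1`).
retval = model()
```

Because `x` is passed as a concrete value, it is treated as an observation, so only `m` is sampled when the model is called.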

Evaluation

With rand one can draw samples from the prior distribution of a Model.

rand

One can also evaluate the log prior, log likelihood, and log joint probability.

logprior
loglikelihood
logjoint
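
The sketch below shows how these functions might be used together. It assumes that the installed version accepts a `NamedTuple` of parameter values as the second argument; the model `demo` is illustrative only.

```julia
using DynamicPPL, Distributions

# Hypothetical example model, not part of DynamicPPL.
@model function demo(x)
    m ~ Normal(0, 1)
    x ~ Normal(m, 1)
end

model = demo(1.5)

# Draw the unobserved variables from the prior (a NamedTuple by default).
draw = rand(model)

# Evaluate the log densities at a chosen parameter value.
θ = (m = 0.0,)
lp = logprior(model, θ)        # log density of `m` under its prior
ll = loglikelihood(model, θ)   # log density of the observation `x` given `m`
lj = logjoint(model, θ)        # sum of the two
```

For this model, `lj` should equal `lp + ll` up to floating-point error.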

LogDensityProblems.jl interface

The LogDensityProblems.jl interface is also supported by wrapping a Model in a DynamicPPL.LogDensityFunction.

LogDensityFunction
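
A minimal sketch of this wrapping, assuming LogDensityProblems.jl is installed and that the default constructor evaluates the log joint in the original (unlinked) parameter space. The model `demo` is illustrative only.

```julia
using DynamicPPL, Distributions
using LogDensityProblems

# Hypothetical example model, not part of DynamicPPL.
@model function demo(x)
    m ~ Normal(0, 1)
    x ~ Normal(m, 1)
end

model = demo(1.5)

# Wrap the model so that it can be queried with a flat parameter vector.
ldf = DynamicPPL.LogDensityFunction(model)

# Number of parameters (here only `m`) and the log density at m = 0.
dim = LogDensityProblems.dimension(ldf)
logp = LogDensityProblems.logdensity(ldf, [0.0])
```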

Condition and decondition

A Model can be conditioned on a set of observations with AbstractPPL.condition or its alias |.

|(::Model, ::Union{Tuple,NamedTuple,AbstractDict{<:VarName}})
condition
DynamicPPL.conditioned

Similarly, one can specify with AbstractPPL.decondition that certain, or all, random variables are not observed.

decondition
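
As a short sketch, conditioning and deconditioning might look as follows. The model `demo` is illustrative only; here `x` is not an argument, so it starts out as a random variable.

```julia
using DynamicPPL, Distributions

# Hypothetical example model with two random variables.
@model function demo()
    m ~ Normal(0, 1)
    x ~ Normal(m, 1)
end

model = demo()

# Condition on an observation of `x`; `condition(model, (x = 1.5,))` is equivalent.
conditioned_model = model | (x = 1.5,)

# Inspect the values the model has been conditioned on.
observations = DynamicPPL.conditioned(conditioned_model)

# Only `m` is sampled from the conditioned model.
draw_conditioned = rand(conditioned_model)

# Deconditioning makes `x` a random variable again.
draw_deconditioned = rand(decondition(conditioned_model))
```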

Fixing and unfixing

We can also fix a collection of variables in a Model to certain values using fix.

This might seem quite similar to the aforementioned condition and its siblings, but they are indeed different operations:

  • conditioned variables are considered to be observations, and are thus included in the computation of logjoint and loglikelihood, but not in logprior.
  • fixed variables are considered to be constant, and are thus not included in any log-probability computations.

The differences are more clearly spelled out in the docstring of fix below.

fix
DynamicPPL.fixed

The difference between fix and condition is described in the docstring of fix above.

Similarly, we can unfix variables, i.e. return them to their original meaning:

unfix
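
The difference between the two operations can be sketched by comparing log joint densities. The model `demo` is illustrative only, and the sketch assumes that `logjoint` accepts a `NamedTuple` of parameter values.

```julia
using DynamicPPL, Distributions

# Hypothetical example model with two random variables.
@model function demo()
    m ~ Normal(0, 1)
    x ~ Normal(m, 1)
end

θ = (m = 0.0,)

# Conditioning: `x` is an observation and contributes to the log joint.
conditioned_model = demo() | (x = 1.5,)
lj_conditioned = logjoint(conditioned_model, θ)

# Fixing: `x` is a constant and contributes nothing.
fixed_model = fix(demo(), (x = 1.5,))
lj_fixed = logjoint(fixed_model, θ)

# Unfixing makes `x` a random variable again.
draw_unfixed = rand(unfix(fixed_model))
```

The two log joints differ by exactly the log density of `x = 1.5` under `Normal(m, 1)`.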

Predicting

DynamicPPL provides functionality for generating samples from the posterior predictive distribution through the predict function. This allows you to use posterior parameter samples to generate predictions for unobserved data points.

The predict function has two main methods:

  1. For AbstractVector{<:AbstractVarInfo} - useful when you have a collection of VarInfo objects representing posterior samples.
  2. For MCMCChains.Chains (only available when MCMCChains.jl is loaded) - useful when you have posterior samples in the form of an MCMCChains.Chains object.
predict

Basic Usage

The typical workflow for posterior prediction involves:

  1. Fitting a model to observed data to obtain posterior samples
  2. Creating a new model instance with some variables marked as missing (unobserved)
  3. Using predict to generate samples for these missing variables based on the posterior parameter samples

When using predict with MCMCChains.Chains, you can control which variables are included in the output with the include_all parameter:

  • include_all=false (default): Include only newly predicted variables
  • include_all=true: Include both parameters from the original chain and predicted variables

Models within models

One can include a model within another model by calling it inside the model function with left ~ to_submodel(model).

to_submodel

Note that a [to_submodel](@ref) is only sampleable; one cannot compute logpdf for its realizations.
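
A minimal sketch of nesting one model inside another. The model names `inner` and `outer` are illustrative only, and the default prefixing behaviour of to_submodel is assumed.

```julia
using DynamicPPL, Distributions

# Hypothetical inner model: its return value is what the outer model sees.
@model function inner()
    a ~ Normal(0, 1)
    return a + 100
end

# Hypothetical outer model: `b` is bound to the return value of `inner()`,
# not to the random variable `a` itself.
@model function outer()
    b ~ to_submodel(inner())
    return b
end

# Running the outer model samples `a` inside the submodel and returns `a + 100`.
retval = outer()()

# The only random variable in `outer()` is the one defined in the submodel.
draw = rand(outer())
```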

In the past, one would instead embed sub-models using @submodel, which has been deprecated since the introduction of to_submodel(model).

@submodel

In the context of including models within models, it's also useful to prefix the variables in sub-models to avoid variable names clashing:

DynamicPPL.prefix

Under the hood, to_submodel makes use of the following method to indicate that the model it's wrapping is a model over its return-values rather than something else.

returned(::Model)

Utilities

It is possible to manually increase (or decrease) the accumulated log density from within a model function.

@addlogprob!
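
As an illustration, the sketch below adds a penalty term to the log density. The model `penalized` is illustrative only, and `logjoint` is assumed to accept a `NamedTuple` of parameter values.

```julia
using DynamicPPL, Distributions

# Hypothetical example model with a manual log-density adjustment.
@model function penalized(x)
    m ~ Normal(0, 1)
    # Add an extra term to the accumulated log density.
    @addlogprob! -abs(m)
    x ~ Normal(m, 1)
end

model = penalized(1.5)
lj = logjoint(model, (m = 2.0,))

# The same quantity computed by hand: prior + penalty + likelihood.
expected = logpdf(Normal(0, 1), 2.0) - 2.0 + logpdf(Normal(2.0, 1), 1.5)
```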

Return values of the model function for a collection of samples can be obtained with returned(model, chain).

returned(::DynamicPPL.Model, ::NamedTuple)
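
For a single set of parameter values, the `NamedTuple` method listed above can be sketched as follows. The model `demo` is illustrative only.

```julia
using DynamicPPL, Distributions

# Hypothetical example model whose return value depends on `m`.
@model function demo(x)
    m ~ Normal(0, 1)
    x ~ Normal(m, 1)
    return m + 1
end

model = demo(1.5)

# Evaluate the model function's return value with `m` set to 0.
retval = returned(model, (m = 0.0,))
```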

For a chain of samples, one can compute the pointwise log-likelihoods of each observed random variable with pointwise_loglikelihoods. Similarly, one can compute the pointwise log-densities of the priors with pointwise_prior_logdensities, or of both, i.e. of all variables, with pointwise_logdensities.

pointwise_logdensities
pointwise_loglikelihoods
pointwise_prior_logdensities

For converting a chain into a format that can more easily be fed into a Model again, for example using condition, you can use value_iterator_from_chain.

value_iterator_from_chain

Sometimes it can be useful to extract the priors of a model. This is possible using extract_priors.

extract_priors
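
A short sketch, assuming the result can be indexed by `VarName`. The model `demo` is illustrative only.

```julia
using DynamicPPL, Distributions

# Hypothetical example model, not part of DynamicPPL.
@model function demo(x)
    m ~ Normal(0, 1)
    x ~ Normal(m, 1)
end

# Collect the prior distribution of every unobserved variable.
priors = extract_priors(demo(1.5))

# Look up the prior of `m` by its variable name.
prior_m = priors[@varname(m)]
```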

Safe extraction of values from a given AbstractVarInfo as they are seen in the model can be done using values_as_in_model.

values_as_in_model
NamedDist

Testing Utilities

DynamicPPL provides several demo models and helpers for testing samplers in the DynamicPPL.TestUtils submodule.

DynamicPPL.TestUtils.test_sampler
DynamicPPL.TestUtils.test_sampler_on_demo_models
DynamicPPL.TestUtils.test_sampler_continuous
DynamicPPL.TestUtils.marginal_mean_of_samples
DynamicPPL.TestUtils.DEMO_MODELS

For every demo model, one can define the true log prior, log likelihood, and log joint probabilities.

DynamicPPL.TestUtils.logprior_true
DynamicPPL.TestUtils.loglikelihood_true
DynamicPPL.TestUtils.logjoint_true

And in the case where the model includes constrained variables, it can also be useful to define:

DynamicPPL.TestUtils.logprior_true_with_logabsdet_jacobian
DynamicPPL.TestUtils.logjoint_true_with_logabsdet_jacobian

Finally, the following methods can also be of use:

DynamicPPL.TestUtils.varnames
DynamicPPL.TestUtils.posterior_mean
DynamicPPL.TestUtils.setup_varinfos
DynamicPPL.TestUtils.update_values!!
DynamicPPL.TestUtils.test_values

Debugging Utilities

DynamicPPL provides a few methods for checking validity of a model-definition.

check_model
check_model_and_trace
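
A minimal sketch of checking a model definition, assuming that check_model returns `true` when no issues are found. The model `demo` is illustrative only.

```julia
using DynamicPPL, Distributions

# Hypothetical example model with no known issues.
@model function demo(x)
    m ~ Normal(0, 1)
    x ~ Normal(m, 1)
end

# Run the model once under the debugging machinery and report validity.
is_valid = check_model(demo(1.5))
```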

There are also methods that can be used to determine certain properties of the model based on the debug trace.

DynamicPPL.has_static_constraints

For determining whether one might have type instabilities in the model, the following can be useful:

DynamicPPL.DebugUtils.model_warntype
DynamicPPL.DebugUtils.model_typed

Internally, the type-checking methods make use of the following method for construction of the call with the argument types:

DynamicPPL.DebugUtils.gen_evaluator_call_with_types

Advanced

Variable names

Names and possibly nested indices of variables are described with AbstractPPL.VarName. They can be defined with AbstractPPL.@varname. Please see the documentation of AbstractPPL.jl for further information.

Data Structures of Variables

DynamicPPL provides different data structures for storing samples and accumulating log-probabilities, all of which are subtypes of AbstractVarInfo.

AbstractVarInfo

But exactly how an AbstractVarInfo stores this information can vary.

For constructing the "default" typed and untyped varinfo types used in DynamicPPL (see the section on varinfo design for more on this), we have the following two methods:

DynamicPPL.untyped_varinfo
DynamicPPL.typed_varinfo

VarInfo

VarInfo
TypedVarInfo

One main characteristic of VarInfo is that samples are transformed to unconstrained Euclidean space and stored in a linearized form, as described in the main Turing documentation. The Transformations section below describes the methods used for this. In the specific case of VarInfo, it keeps track of whether samples have been transformed by setting flags on them, using the following functions.

set_flag!
unset_flag!
is_flagged

The following functions were used for sequential Monte Carlo methods.

get_num_produce
set_num_produce!
increment_num_produce!
reset_num_produce!
setorder!
set_retained_vns_del!
Base.empty!

SimpleVarInfo

SimpleVarInfo
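
The sketch below contrasts the two varinfo types on the same model. The model `demo` is illustrative only; the sketch relies on `VarInfo(model)` sampling the variables from the prior.

```julia
using DynamicPPL, Distributions

# Hypothetical example model, not part of DynamicPPL.
@model function demo(x)
    m ~ Normal(0, 1)
    x ~ Normal(m, 1)
end

model = demo(1.5)

# A `VarInfo` built from a model holds one prior sample of each variable.
vi = VarInfo(model)
vn = @varname(m)
m_sampled = vi[vn]

# A `SimpleVarInfo` can be built directly from a NamedTuple of values.
svi = SimpleVarInfo((m = m_sampled,))

# Both represent the same point, so they give the same log joint.
lj_vi = logjoint(model, vi)
lj_svi = logjoint(model, svi)
```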

Common API

Accumulation of log-probabilities

getlogp
setlogp!!
acclogp!!
resetlogp!!

Variables and their realizations

keys
getindex
push!!
empty!!
isempty
DynamicPPL.getindex_internal
DynamicPPL.setindex_internal!
DynamicPPL.update_internal!
DynamicPPL.insert_internal!
DynamicPPL.length_internal
DynamicPPL.reset!
DynamicPPL.update!
DynamicPPL.insert!
DynamicPPL.loosen_types!!
DynamicPPL.tighten_types
values_as

Transformations

DynamicPPL.AbstractTransformation
DynamicPPL.NoTransformation
DynamicPPL.DynamicTransformation
DynamicPPL.StaticTransformation
DynamicPPL.istrans
DynamicPPL.settrans!!
DynamicPPL.transformation
DynamicPPL.link
DynamicPPL.invlink
DynamicPPL.link!!
DynamicPPL.invlink!!
DynamicPPL.default_transformation
DynamicPPL.link_transform
DynamicPPL.invlink_transform
DynamicPPL.maybe_invlink_before_eval!!
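
As a rough sketch of linking, consider a model with a positive-valued variable. The model `positive` is illustrative only, and the sketch assumes that the default transformation for an `Exponential` variable is the logarithm.

```julia
using DynamicPPL, Distributions

# Hypothetical example model: `s` is constrained to be positive.
@model function positive()
    s ~ Exponential(1)
end

model = positive()
vi = VarInfo(model)
vn = @varname(s)

# Transform the stored value to unconstrained space.
vi_linked = DynamicPPL.link(vi, model)

# Indexing still gives the value as seen in the model...
s_model = vi_linked[vn]

# ...while the internal representation holds the unconstrained value.
s_internal = only(DynamicPPL.getindex_internal(vi_linked, vn))

# Transforming back recovers the original varinfo contents.
vi_roundtrip = DynamicPPL.invlink(vi_linked, model)
```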

Utils

Base.merge(::AbstractVarInfo)
DynamicPPL.subset
DynamicPPL.unflatten
DynamicPPL.varname_leaves
DynamicPPL.varname_and_value_leaves

Evaluation Contexts

Internally, both sampling and evaluation of log densities are performed with AbstractPPL.evaluate!!.

AbstractPPL.evaluate!!

The behaviour of a model execution can be changed with evaluation contexts that are passed as an additional argument to the model function. Contexts are subtypes of AbstractPPL.AbstractContext.

SamplingContext
DefaultContext
LikelihoodContext
PriorContext
MiniBatchContext
PrefixContext
ConditionContext

Samplers

In DynamicPPL two samplers are defined that are used to initialize unobserved random variables: SampleFromPrior which samples from the prior distribution, and SampleFromUniform which samples from a uniform distribution.

SampleFromPrior
SampleFromUniform

Additionally, a generic sampler for inference is implemented.

Sampler

The default implementation of Sampler uses the following unexported functions.

DynamicPPL.initialstep
DynamicPPL.loadstate
DynamicPPL.initialsampler

Finally, the varinfo type that a Sampler should use for a given Model is specified by DynamicPPL.default_varinfo, which can be overloaded for each model-sampler combination. This can be useful in cases where one has explicit knowledge that one type of varinfo will be more performant for the given model and sampler.

DynamicPPL.default_varinfo

There is also the experimental DynamicPPL.Experimental.determine_suitable_varinfo, which uses static checking via JET.jl to determine whether one should use DynamicPPL.typed_varinfo or DynamicPPL.untyped_varinfo, depending on which supports the model:

DynamicPPL.Experimental.determine_suitable_varinfo
DynamicPPL.Experimental.is_suitable_varinfo

[Model-Internal Functions](@id model_internal)

tilde_assume
tilde_observe