Laplace

Inherits From: MAP

Aliases:

ed.Laplace
ed.inferences.Laplace

Defined in edward/inferences/laplace.py.
Laplace approximation (Laplace, 1986).

It approximates the posterior distribution using a multivariate normal distribution centered at the mode of the posterior. We implement this by running MAP to find the posterior mode, which forms the mean of the normal approximation. We then compute the inverse Hessian of \(-\log p(x, z)\) at the mode, which forms the covariance of the normal approximation.
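Concretely, if \(z_{\text{MAP}}\) denotes the posterior mode and \(\Lambda\) the Hessian of the negative log joint density at the mode, the approximation is

\[
q(z) = \mathcal{N}\left(z \mid z_{\text{MAP}}, \Lambda^{-1}\right),
\qquad
z_{\text{MAP}} = \operatorname{arg\,max}_z \log p(x, z),
\qquad
\Lambda = \nabla_z^2 \left[-\log p(x, z)\right] \Big|_{z = z_{\text{MAP}}}.
\]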
If MultivariateNormalDiag or Normal random variables are specified as approximations, then the Laplace approximation will only produce the diagonal. This does not capture correlation among the variables, but it avoids a potentially expensive matrix inversion.
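For instance, a diagonal approximation can be requested as in this sketch, which borrows the model (w, X, D, X_train, y_train) from the example below:

from edward.models import MultivariateNormalDiag

# Diagonal normal approximation to the posterior of w.
qw_diag = MultivariateNormalDiag(
    loc=tf.Variable(tf.random_normal([D])),
    scale_diag=tf.Variable(tf.random_normal([D])))

inference = ed.Laplace({w: qw_diag}, data={X: X_train, y: y_train})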
Random variables with both scalar batch and event shape are not supported, as tf.hessians is currently not applicable to scalars.
Note that Laplace finds the location parameter of the normal approximation using MAP, which is performed on the latent variable's original (constrained) support. The scale parameter is calculated by evaluating the Hessian of \(-\log p(x, z)\) in the constrained space at the mode. This implies the Laplace approximation always has real support, even if the target distribution has constrained support.
Examples

import edward as ed
import tensorflow as tf
from edward.models import MultivariateNormalTriL, Normal

# Bayesian linear regression model.
X = tf.placeholder(tf.float32, [N, D])
w = Normal(loc=tf.zeros(D), scale=tf.ones(D))
y = Normal(loc=ed.dot(X, w), scale=tf.ones(N))

# Full-covariance normal approximation to the posterior of w.
qw = MultivariateNormalTriL(
    loc=tf.Variable(tf.random_normal([D])),
    scale_tril=tf.Variable(tf.random_normal([D, D])))

inference = ed.Laplace({w: qw}, data={X: X_train, y: y_train})
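Once constructed, the approximation can be fit in one call; a minimal usage sketch (250 iterations is illustrative):

inference.run(n_iter=250)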
__init__
__init__(
    latent_vars,
    data=None
)
Create an inference algorithm.
latent_vars: list of RandomVariable or dict of RandomVariable to RandomVariable. Collection of random variables to perform inference on. If a list, each random variable will be implicitly optimized using a MultivariateNormalTriL random variable that is defined internally with unconstrained support and initialized using standard normal draws. If a dictionary, each value must be a MultivariateNormalDiag, MultivariateNormalTriL, or Normal random variable. A minimal sketch of both forms follows.
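The sketch reuses w, qw, X, X_train, and y_train from the example above:

# List form: a MultivariateNormalTriL approximation for w is
# constructed internally with unconstrained support.
inference = ed.Laplace([w], data={X: X_train, y: y_train})

# Dict form: bind w to an explicitly constructed approximation.
inference = ed.Laplace({w: qw}, data={X: X_train, y: y_train})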
build_loss_and_gradients
build_loss_and_gradients(var_list)
Build loss function. Its automatic differentiation is the gradient of \(-\log p(x, z)\).
finalize
finalize(feed_dict=None)
Function to call after convergence.
Computes the Hessian at the mode.
feed_dict: dict. Feed dictionary for a TensorFlow session run during evaluation of the Hessian. It is used to feed placeholders that are not fed during initialization.
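The computation amounts to inverting the Hessian of the negative log joint at the mode; a minimal standalone sketch (z and neg_log_joint are illustrative stand-ins, not Edward's internals):

import tensorflow as tf

z = tf.Variable(tf.random_normal([3]))
# Stand-in for the negative log joint density -log p(x, z) at the mode.
neg_log_joint = tf.reduce_sum(z ** 2)

hessian = tf.hessians(neg_log_joint, z)[0]  # [3, 3] Hessian matrix
covariance = tf.matrix_inverse(hessian)     # covariance of the normal approximation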
initialize

initialize(
    *args,
    **kwargs
)
print_progress
print_progress(info_dict)
Print progress to output.
run
run(
    variables=None,
    use_coordinator=True,
    *args,
    **kwargs
)
A simple wrapper to run inference. It runs the following sequence:

1. initialize.
2. update for self.n_iter iterations.
3. print_progress.
4. finalize.

To customize the way inference is run, run these steps individually; see the sketch after the parameter list below.
variables: list. A list of TensorFlow variables to initialize during inference. Default is to initialize all variables (this includes reinitializing variables that were already initialized). To avoid initializing any variables, pass in an empty list.

use_coordinator: bool. Whether to start and stop queue runners during inference using a TensorFlow coordinator. For example, queue runners are necessary for batch training with file readers.

*args, **kwargs: Passed into initialize.
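A minimal sketch of running the steps individually, assuming inference is constructed as in the example above (n_iter is illustrative):

sess = ed.get_session()
inference.initialize(n_iter=250)
tf.global_variables_initializer().run()

for _ in range(inference.n_iter):
  info_dict = inference.update()
  inference.print_progress(info_dict)

inference.finalize()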
update
update(feed_dict=None)
Run one iteration of optimization.
feed_dict: dict. Feed dictionary for a TensorFlow session run. It is used to feed placeholders that are not fed during initialization.

Returns: dict. Dictionary of algorithm-specific information. In this case, it contains the loss function value after one iteration.
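A minimal usage sketch; the 'loss' key name below is an assumption about the returned dictionary:

info_dict = inference.update()
print(info_dict['loss'])  # loss value after this iteration ('loss' key assumed)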
Laplace, P. S. (1986). Memoir on the probability of the causes of events. Statistical Science, 1(3), 364–378.