We describe how to perform inference in probabilistic models. For background, see the Inference of Probability Models tutorial.
Suppose we have a model \(p(\mathbf{x}, \mathbf{z}, \beta)\) of data \(\mathbf{x}_{\text{train}}\) with latent variables \((\mathbf{z}, \beta)\). Consider the posterior inference problem, \[q(\mathbf{z}, \beta)\approx p(\mathbf{z}, \beta\mid \mathbf{x}_{\text{train}}),\] in which the task is to approximate the posterior \(p(\mathbf{z}, \beta\mid \mathbf{x}_{\text{train}})\) using a family of distributions, \(q(\mathbf{z},\beta; \lambda)\), indexed by parameters \(\lambda\).
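As a concrete illustration of the posterior inference problem, the sketch below uses a toy conjugate model that is not part of Edward: a single latent mean \(\mu\) with a standard normal prior and unit-variance Normal observations. In this special case the posterior is itself Normal, so a Normal family \(q(\mu; \lambda)\) with \(\lambda\) = (mean, variance) can match it exactly. The function name normal_posterior and the data values are assumptions chosen for illustration.

```python
def normal_posterior(x_train, prior_mean=0.0, prior_var=1.0, noise_var=1.0):
    """Exact posterior over the mean of a Normal with known variance.

    Toy model (an assumption for illustration, not an Edward API):
        mu ~ Normal(prior_mean, prior_var)
        x_i | mu ~ Normal(mu, noise_var)
    """
    n = len(x_train)
    # Precisions (inverse variances) add under conjugate Normal updates.
    post_precision = 1.0 / prior_var + n / noise_var
    post_var = 1.0 / post_precision
    post_mean = post_var * (prior_mean / prior_var + sum(x_train) / noise_var)
    return post_mean, post_var


x_train = [0.5, 1.5, 1.0]
# lambda = (q_mean, q_var) parameterizes q(mu; lambda).
q_mean, q_var = normal_posterior(x_train)
print(q_mean, q_var)
```

For most models no such closed form exists, which is why an inference algorithm is needed to choose \(\lambda\).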
In Edward, let z and beta be latent variables in the model, where we observe the random variable x with data x_train. Let qz and qbeta be random variables defined to approximate the posterior. We write this problem as follows:
inference = ed.Inference({z: qz, beta: qbeta}, {x: x_train})
Inference is an abstract class which takes two inputs: a collection of latent variables, with model variables bound to posterior variables; and a collection of observed variables, with model variables bound to data. The choice of approximation \(q(\mathbf{z}, \beta; \lambda)\) and rules to update parameters \(\lambda\) are specified by an inference algorithm.
Running inference is as simple as running one method.
inference = ed.Inference({z: qz, beta: qbeta}, {x: x_train})
inference.run()
Inference also supports fine control of the training procedure.
inference = ed.Inference({z: qz, beta: qbeta}, {x: x_train})
inference.initialize()
tf.initialize_all_variables().run()
for _ in range(inference.n_iter):
    info_dict = inference.update()
    inference.print_progress(info_dict)
inference.finalize()
initialize() builds the algorithm’s update rules (computational graph) for \(\lambda\); tf.initialize_all_variables() initializes \(\lambda\) (TensorFlow variables in the graph); update() runs the graph once to update \(\lambda\), and is called in a loop until convergence; finalize() runs any computation as the algorithm terminates.
The run() method is a simple wrapper for this procedure.
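The control flow described above can be sketched in plain Python. The class ToyInference below is a hypothetical stand-in, not Edward's implementation: it reuses the method names initialize, update, print_progress, finalize, and run, but its "algorithm" is plain gradient descent on a one-dimensional quadratic loss. The separate TensorFlow variable initialization step is folded into initialize here for brevity.

```python
class ToyInference:
    """Hypothetical stand-in mirroring the method names described above."""

    def __init__(self, target):
        self.target = target

    def initialize(self, n_iter=100, step_size=0.1):
        # Store settings and set the starting value of the parameter lambda.
        self.n_iter = n_iter
        self.step_size = step_size
        self.lam = 0.0
        self.t = 0

    def update(self):
        # One gradient step on the loss (lam - target)^2.
        grad = 2.0 * (self.lam - self.target)
        self.lam -= self.step_size * grad
        self.t += 1
        return {"t": self.t, "loss": (self.lam - self.target) ** 2}

    def print_progress(self, info_dict):
        # Report every 25 iterations.
        if info_dict["t"] % 25 == 0:
            print("iter {t}: loss {loss:.3e}".format(**info_dict))

    def finalize(self):
        # Nothing to clean up in this toy example.
        pass

    def run(self, *args, **kwargs):
        # Wrapper: the same loop a caller could write by hand.
        self.initialize(*args, **kwargs)
        for _ in range(self.n_iter):
            info_dict = self.update()
            self.print_progress(info_dict)
        self.finalize()


inference = ToyInference(target=3.0)
inference.run(n_iter=100)
```

Calling the steps individually instead of run makes it possible to insert custom logic between updates, such as early stopping or extra logging.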
We highlight other settings during inference.
Model parameters. Model parameters are parameters in a model for which we always compute point estimates, without representing any uncertainty about them. They are defined with tf.Variables, where the inference problem is \[\hat{\theta} \leftarrow^{\text{optimize}} p(\mathbf{x}_{\text{train}}; \theta)\]
import edward as ed
import tensorflow as tf
from edward.models import Normal
theta = tf.Variable(0.0)
x = Normal(mu=tf.ones(10) * theta, sigma=1.0)
inference = ed.Inference({}, {x: x_train})
Only a subset of inference algorithms support estimation of model parameters. (Note also that this inference example does not have any latent variables. It is only about estimating theta given that we observe \(\mathbf{x} = \mathbf{x}_{\text{train}}\). Latent variables can be added so that inference is both posterior inference and parameter estimation.)
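To make the point-estimation problem concrete, the following sketch fits theta by gradient ascent on the log-likelihood of a unit-variance Normal, using only the Python standard library. It is an illustrative assumption about what "optimize" can mean here, not a description of how any Edward algorithm performs the optimization. The name fit_theta, the step size, and the data are hypothetical.

```python
def fit_theta(x_train, n_iter=100, step_size=0.5):
    """Maximum likelihood estimate of theta for x_i ~ Normal(theta, 1)."""
    theta = 0.0  # same starting value as tf.Variable(0.0) in the example
    n = len(x_train)
    for _ in range(n_iter):
        # Gradient of the average log-likelihood with respect to theta.
        grad = sum(x_i - theta for x_i in x_train) / n
        theta += step_size * grad
    return theta


x_train = [0.8, 1.2, 1.0, 1.4, 0.6]
theta_hat = fit_theta(x_train)
sample_mean = sum(x_train) / len(x_train)
print(theta_hat, sample_mean)
```

For this model the maximizer is the sample mean, so the two printed values should agree up to numerical precision.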
For example, model parameters are useful when applying neural networks from high-level libraries such as Keras and TensorFlow Slim. See the model compositionality page for more details.
Conditional inference. In conditional inference, only a subset of the posterior is inferred while the rest is held fixed using other inferences. The inference problem is \[q(\beta)q(\mathbf{z})\approx p(\mathbf{z}, \beta\mid\mathbf{x}_{\text{train}})\] where parameters in \(q(\beta)\) are estimated and \(q(\mathbf{z})\) is fixed. In Edward, we enable conditioning by binding random variables to other random variables in data.
inference = ed.Inference({beta: qbeta}, {x: x_train, z: qz})
In the compositionality page, we describe how to construct inference by composing many conditional inference algorithms.
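The idea of composing conditional inferences can be sketched on a toy Gaussian chain, chosen here purely for illustration: \(\beta\sim N(0,1)\), \(z\mid\beta\sim N(\beta,1)\), \(x\mid z\sim N(z,1)\). Under a factorized \(q(\beta)q(z)\), each factor has a closed-form update when the other is held fixed. The functions update_qbeta and update_qz are hypothetical names, and the code is plain Python rather than Edward.

```python
def update_qbeta(qz_mean):
    """Update q(beta) with q(z) held fixed.

    Collecting the beta terms of E_q(z)[log p(beta) + log p(z | beta)]
    gives a Normal with precision 2 and mean E[z] / 2.
    """
    return qz_mean / 2.0, 0.5


def update_qz(qbeta_mean, x):
    """Update q(z) with q(beta) held fixed.

    Collecting the z terms of E_q(beta)[log p(z | beta)] + log p(x | z)
    gives a Normal with precision 2 and mean (E[beta] + x) / 2.
    """
    return (qbeta_mean + x) / 2.0, 0.5


x = 3.0  # a single observed data point (assumed value)
qz_mean, qbeta_mean = 0.0, 0.0
for _ in range(50):
    # Each step is one conditional inference: one factor moves, one is fixed.
    qbeta_mean, qbeta_var = update_qbeta(qz_mean)
    qz_mean, qz_var = update_qz(qbeta_mean, x)

print(qbeta_mean, qz_mean)
```

Alternating the two conditional updates drives the means to a fixed point, here \(x/3\) for \(\beta\) and \(2x/3\) for \(z\).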
Implicit prior samples. Latent variables can be defined in the model without any posterior inference over them. They are implicitly marginalized out with a single sample. The inference problem is \[q(\beta)\approx p(\beta\mid\mathbf{x}_{\text{train}}, \mathbf{z}^*)\] where \(\mathbf{z}^*\sim p(\mathbf{z}\mid\beta)\) is a prior sample.
inference = ed.Inference({beta: qbeta}, {x: x_train})
For example, implicit prior samples are useful for generative adversarial networks. Their inference problem does not require any inference over the latent variables; it uses samples from the prior.
edward.inferences.Inference(latent_vars=None, data=None, model_wrapper=None)
Base class for Edward inference methods.
Attributes
latent_vars  (dict of RandomVariable to RandomVariable) Collection of random variables to perform inference on. Each random variable is bound to another random variable; the latter will infer the former conditional on data.
data  (dict) Data dictionary whose values may vary at each session run. 
model_wrapper  (ed.Model or None) An optional wrapper for the probability model. If specified, the dictionary keys of latent_vars are strings used accordingly by the wrapper.
Methods
Initialization.
Parameters:  latent_vars : dict of RandomVariable to RandomVariable, optional
data : dict, optional
model_wrapper : ed.Model, optional


Notes
If data is not passed in, the dictionary is empty.
Three options are available for batch training:
Examples
>>> mu = Normal(mu=tf.constant(0.0), sigma=tf.constant(1.0))
>>> x = Normal(mu=tf.ones(N) * mu, sigma=tf.constant(1.0))
>>>
>>> qmu_mu = tf.Variable(tf.random_normal([1]))
>>> qmu_sigma = tf.nn.softplus(tf.Variable(tf.random_normal([1])))
>>> qmu = Normal(mu=qmu_mu, sigma=qmu_sigma)
>>>
>>> Inference({mu: qmu}, {x: tf.constant([0.0] * N)})
Methods
run(variables=None, use_coordinator=True, *args, **kwargs)
A simple wrapper to run inference. Its steps include: initializing the algorithm via initialize; optionally building a tf.train.SummaryWriter for TensorBoard; running update for self.n_iter iterations; calling print_progress while running; and finalizing the algorithm via finalize.
To customize the way inference is run, run these steps individually.
Parameters:  variables : list, optional
use_coordinator : bool, optional
*args
**kwargs


initialize(n_iter=1000, n_print=None, n_minibatch=None, scale=None, logdir=None, debug=False)
Initialize inference algorithm.
Parameters:  n_iter : int, optional
n_print : int, optional
n_minibatch : int, optional
scale : dict of RandomVariable to tf.Tensor, optional
logdir : str, optional
debug : bool, optional


update(feed_dict=None)
Run one iteration of inference.
Parameters:  feed_dict : dict, optional


Returns:  dict

print_progress(info_dict)
Print progress to output.
Parameters:  info_dict : dict


finalize()
Function to call after convergence.