API and Documentation


We describe how to perform inference in probabilistic models. For background, see the Inference tutorial.

Suppose we have a model \(p(\mathbf{x}, \mathbf{z}, \beta)\) of data \(\mathbf{x}_{\text{train}}\) with latent variables \((\mathbf{z}, \beta)\). Consider the posterior inference problem, \[q(\mathbf{z}, \beta)\approx p(\mathbf{z}, \beta\mid \mathbf{x}_{\text{train}}),\] in which the task is to approximate the posterior \(p(\mathbf{z}, \beta\mid \mathbf{x}_{\text{train}})\) using a family of distributions, \(q(\mathbf{z},\beta; \lambda)\), indexed by parameters \(\lambda\).

In Edward, let z and beta be latent variables in the model, where we observe the random variable x with data x_train. Let qz and qbeta be random variables defined to approximate the posterior. We write this problem as follows:

inference = ed.Inference({z: qz, beta: qbeta}, {x: x_train})

Inference is an abstract class which takes two inputs. The first is a collection of latent random variables beta and z, along with “posterior variables” qbeta and qz, which are associated with their respective latent variables. The second is a collection of observed random variables x, which is associated with the data x_train.

Inference adjusts parameters of the distribution of qbeta and qz to be close to the posterior \(p(\mathbf{z}, \beta\,|\,\mathbf{x}_{\text{train}})\).

Running inference is as simple as running one method.

inference = ed.Inference({z: qz, beta: qbeta}, {x: x_train})
inference.run()

Inference also supports fine control of the training procedure.

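inference = ed.Inference({z: qz, beta: qbeta}, {x: x_train})
inference.initialize()

tf.global_variables_initializer().run()

for _ in range(inference.n_iter):
  info_dict = inference.update()
  inference.print_progress(info_dict)

inference.finalize()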


initialize() builds the algorithm’s update rules (computational graph) for \(\lambda\); tf.global_variables_initializer().run() initializes \(\lambda\) (TensorFlow variables in the graph); update() runs the graph once to update \(\lambda\) and is called in a loop until convergence; finalize() runs any remaining computation as the algorithm terminates.

The run() method is a simple wrapper for this procedure.
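The control flow behind this procedure can be sketched in plain Python. The class below is a hypothetical stand-in, not Edward's implementation: ToyInference, its step_size argument, the quadratic objective, and the exact rule for when progress is reported are all assumptions made for illustration. Only the method names initialize, update, print_progress, finalize, and run mirror the ones described above.

```python
class ToyInference:
    """Hypothetical stand-in that mimics the initialize/update/finalize cycle.

    It minimizes (lam - target)^2 by gradient descent; Edward instead builds
    and runs a TensorFlow graph.
    """

    def __init__(self, target):
        self.target = target

    def initialize(self, n_iter=1000, n_print=None, step_size=0.1):
        # Store settings and set up state, analogous to building update rules.
        self.n_iter = n_iter
        # Default follows the documented int(n_iter / 10).
        self.n_print = int(n_iter / 10) if n_print is None else n_print
        self.step_size = step_size
        self.lam = 0.0      # the parameter being optimized
        self.t = 0          # iteration counter
        self.printed = []   # iterations at which progress was reported

    def update(self):
        # One gradient step on the loss (lam - target)^2.
        grad = 2.0 * (self.lam - self.target)
        self.lam -= self.step_size * grad
        self.t += 1
        return {"t": self.t, "loss": (self.lam - self.target) ** 2}

    def print_progress(self, info_dict):
        # n_print == 0 suppresses progress reports (reporting rule is assumed).
        if self.n_print != 0 and info_dict["t"] % self.n_print == 0:
            self.printed.append(info_dict["t"])
            print("iteration {t}: loss = {loss:.3e}".format(**info_dict))

    def finalize(self):
        # Nothing to clean up in this toy example.
        pass

    def run(self, *args, **kwargs):
        # Wrapper: extra arguments are passed into initialize.
        self.initialize(*args, **kwargs)
        for _ in range(self.n_iter):
            info_dict = self.update()
            self.print_progress(info_dict)
        self.finalize()


inference = ToyInference(target=3.0)
inference.run(n_iter=50, n_print=10)
```

Calling the steps one by one instead of run() leaves room to interleave custom logic, such as early stopping or extra logging, between updates.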

Other Settings

We highlight other settings during inference.

Model parameters. Model parameters are parameters in a model for which we always compute point estimates rather than a posterior distribution. They are defined with tf.Variable objects, and the inference problem is \[\hat{\theta} \leftarrow^{\text{optimize}} p(\mathbf{x}_{\text{train}}; \theta)\]

import edward as ed
import tensorflow as tf
from edward.models import Normal

theta = tf.Variable(0.0)
x = Normal(mu=tf.ones(10) * theta, sigma=1.0)

inference = ed.Inference({}, {x: x_train})

Only a subset of inference algorithms supports estimation of model parameters. (Note also that this inference example does not have any latent variables. It is only about estimating theta given that we observe \(\mathbf{x} = \mathbf{x}_{\text{train}}\). We can add latent variables so that inference is both posterior inference and parameter estimation.)
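For this particular model the point estimate has a simple form: with unit variance, the log-likelihood is maximized at the sample mean of the data. The sketch below checks this with plain gradient ascent. It does not use Edward, and the data values, step size, iteration count, and function names are illustrative assumptions.

```python
def log_lik_grad(theta, xs):
    # Gradient of sum_i log Normal(x_i; theta, 1) with respect to theta.
    return sum(x - theta for x in xs)


def fit_theta(xs, step_size=0.05, n_iter=500):
    theta = 0.0  # same starting value as tf.Variable(0.0) in the example
    for _ in range(n_iter):
        theta += step_size * log_lik_grad(theta, xs)
    return theta


x_train = [1.2, 0.7, 1.9, 1.1, 0.4, 1.6, 0.9, 1.3, 1.0, 0.8]  # made-up data
theta_hat = fit_theta(x_train)
sample_mean = sum(x_train) / len(x_train)
print(theta_hat, sample_mean)
```

The two printed values should agree up to floating-point error, since gradient ascent converges to the maximizer of the log-likelihood.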

For example, model parameters are useful when applying neural networks from high-level libraries such as Keras and TensorFlow Slim. See the model compositionality page for more details.

Conditional inference. In conditional inference, only a subset of the posterior is inferred while the rest is held fixed using other inferences. The inference problem is \[q(\beta)q(\mathbf{z})\approx p(\mathbf{z}, \beta\mid\mathbf{x}_{\text{train}})\] where parameters in \(q(\beta)\) are estimated and \(q(\mathbf{z})\) is fixed. In Edward, we enable conditioning by binding random variables to other random variables in data.

inference = ed.Inference({beta: qbeta}, {x: x_train, z: qz})

In the compositionality page, we describe how to construct inference by composing many conditional inference algorithms.

Implicit prior samples. Latent variables can be defined in the model without any posterior inference over them. They are implicitly marginalized out with a single sample. The inference problem is \[q(\beta)\approx p(\beta\mid\mathbf{x}_{\text{train}}, \mathbf{z}^*)\] where \(\mathbf{z}^*\sim p(\mathbf{z}\mid\beta)\) is a prior sample.

inference = ed.Inference({beta: qbeta}, {x: x_train})

For example, implicit prior samples are useful for generative adversarial networks. Their inference problem does not require any inference over the latent variables; it uses samples from the prior.

class edward.inferences.Inference(latent_vars=None, data=None, model_wrapper=None)

Base class for Edward inference methods.


latent_vars (dict of RandomVariable to RandomVariable) Collection of random variables to perform inference on. Each random variable is bound to another random variable; the latter will infer the former conditional on data.
data (dict) Data dictionary whose values may vary at each session run.
model_wrapper (ed.Model or None) An optional wrapper for the probability model. If specified, the keys of the latent_vars dictionary are strings, used accordingly by the wrapper.




latent_vars : dict of RandomVariable to RandomVariable, optional

Collection of random variables to perform inference on. Each random variable is bound to another random variable; the latter will infer the former conditional on data.

data : dict, optional

Data dictionary which binds observed variables (of type RandomVariable or tf.Tensor) to their realizations (of type tf.Tensor). It can also bind placeholders (of type tf.Tensor) used in the model to their realizations; and prior latent variables (of type RandomVariable) to posterior latent variables (of type RandomVariable).

model_wrapper : ed.Model, optional

A wrapper for the probability model. If specified, the keys of the latent_vars dictionary are strings, used accordingly by the wrapper. The key and value types of data also change. For TensorFlow, Python, and Stan models, the key type is a string; for PyMC3, the key type is a Theano shared variable. For TensorFlow, Python, and PyMC3 models, the value type is a NumPy array or TensorFlow tensor; for Stan, the value type is the type according to the Stan program’s data block.


If data is not passed in, the dictionary is empty.

Three options are available for batch training:

  1. internally if user passes in data as a dictionary of NumPy arrays;
  2. externally if user passes in data as a dictionary of TensorFlow placeholders (and manually feeds them);
  3. externally if user passes in data as TensorFlow tensors which are the outputs of data readers.


>>> mu = Normal(mu=tf.constant(0.0), sigma=tf.constant(1.0))
>>> x = Normal(mu=tf.ones(N) * mu, sigma=tf.constant(1.0))
>>> qmu_mu = tf.Variable(tf.random_normal([1]))
>>> qmu_sigma = tf.nn.softplus(tf.Variable(tf.random_normal([1])))
>>> qmu = Normal(mu=qmu_mu, sigma=qmu_sigma)
>>> Inference({mu: qmu}, {x: tf.constant([0.0] * N)})
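Because the prior and likelihood in this example are both normal with known unit variance, the exact posterior over mu is available in closed form, which gives a reference point for checking an approximation such as qmu. The helper below is a hypothetical illustration in plain Python, not part of Edward, and the value chosen for N is an assumption.

```python
def normal_posterior(xs, prior_mean=0.0, prior_var=1.0, lik_var=1.0):
    """Exact posterior of mu for mu ~ Normal(prior_mean, prior_var) and
    x_i ~ Normal(mu, lik_var) independently. Returns (mean, variance)."""
    n = len(xs)
    # Precisions (inverse variances) add under a conjugate normal model.
    post_precision = 1.0 / prior_var + n / lik_var
    post_var = 1.0 / post_precision
    post_mean = post_var * (prior_mean / prior_var + sum(xs) / lik_var)
    return post_mean, post_var


N = 50  # assumed data set size
post_mean, post_var = normal_posterior([0.0] * N)
print(post_mean, post_var)
```

With all observations equal to zero, the posterior mean stays at zero and the posterior variance shrinks to 1 / (N + 1), so a well-fitted qmu should have a mean near 0 and a standard deviation near the square root of that value.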


run(variables=None, use_coordinator=True, *args, **kwargs)

A simple wrapper to run inference.

  1. Initialize algorithm via initialize.
  2. (Optional) Build a TensorFlow summary writer for TensorBoard.
  3. (Optional) Initialize TensorFlow variables.
  4. (Optional) Start queue runners.
  5. Run update for self.n_iter iterations.
  6. While running, print_progress.
  7. Finalize algorithm via finalize.
  8. (Optional) Stop queue runners.

To customize the way inference is run, run these steps individually.


variables : list, optional

A list of TensorFlow variables to initialize during inference. Default is to initialize all variables (this includes reinitializing variables that were already initialized). To avoid initializing any variables, pass in an empty list.

use_coordinator : bool, optional

Whether to start and stop queue runners during inference using a TensorFlow coordinator. For example, queue runners are necessary for batch training with the n_minibatch argument or with file readers.


*args

Passed into initialize.

**kwargs

Passed into initialize.

initialize(n_iter=1000, n_print=None, n_minibatch=None, scale=None, logdir=None, debug=False)

Initialize inference algorithm.


n_iter : int, optional

Number of iterations for algorithm.

n_print : int, optional

Number of iterations between progress printouts. To suppress progress printing, specify 0. Default is int(n_iter / 10).

n_minibatch : int, optional

Number of samples for data subsampling. Default is to use all the data. n_minibatch is available only for TensorFlow, Python, and PyMC3 model wrappers; use scale for Edward’s language. All data must be passed in as NumPy arrays. For subsampling details, see tf.train.slice_input_producer and tf.train.batch.

scale : dict of RandomVariable to tf.Tensor, optional

A scalar value to scale computation for any random variable that it is bound to. For example, this is useful for scaling computations with respect to local latent variables.
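One common use of such a scale factor, sketched here under stated assumptions, is data subsampling: when only M of N data points are evaluated, multiplying their log-likelihood by N / M keeps the estimate on the scale of the full data set. The code is plain Python rather than Edward, and the function names and data values are hypothetical.

```python
import math


def normal_log_prob(x, mu, sigma=1.0):
    # Log density of Normal(mu, sigma) evaluated at x.
    return (-0.5 * math.log(2.0 * math.pi * sigma ** 2)
            - (x - mu) ** 2 / (2.0 * sigma ** 2))


def scaled_log_lik(batch, mu, scale):
    # Log-likelihood of a minibatch, multiplied by the scale factor.
    return scale * sum(normal_log_prob(x, mu) for x in batch)


data = [0.5, -1.0, 2.0, 0.3, 1.1, -0.4, 0.9, 1.7]  # made-up data, N = 8
M = 2
scale = len(data) / M  # N / M

full = sum(normal_log_prob(x, 0.0) for x in data)
batches = [data[i:i + M] for i in range(0, len(data), M)]
estimates = [scaled_log_lik(b, 0.0, scale) for b in batches]
average = sum(estimates) / len(estimates)
print(full, average)
```

Averaged over the disjoint minibatches, the scaled estimates recover the full-data log-likelihood, which is why the factor N / M is the natural choice in this setting.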

logdir : str, optional

Directory where event file will be written. For details, see tf.summary.FileWriter. Default is to write nothing.

debug : bool, optional

If True, add checks for NaN and Inf to all computations in the graph. May result in substantially slower execution times.


update(feed_dict=None)

Run one iteration of inference.


feed_dict : dict, optional

Feed dictionary for a TensorFlow session run. It is used to feed placeholders that are not fed during initialization.



Returns a dict of algorithm-specific information.


print_progress(info_dict)

Print progress to output.


info_dict : dict

Dictionary of algorithm-specific information.


finalize()

Function to call after convergence.