`KLqp`

Inherits From: `VariationalInference`

- Class `ed.KLqp`
- Class `ed.inferences.KLqp`

Defined in `edward/inferences/klqp.py`.

Variational inference with the KL divergence

\(\text{KL}( q(z; \lambda) \| p(z \mid x) ).\)

This class minimizes the objective by automatically selecting from a variety of black box inference techniques.
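The link between this KL objective and the loss described under `build_loss_and_gradients` follows from a standard identity, written here in this page's notation:

\(\log p(x) = \mathbb{E}_{q(z; \lambda)} [ \log p(x, z) - \log q(z; \lambda) ] + \text{KL}( q(z; \lambda) \| p(z \mid x) ).\)

The first term on the right is the ELBO. Because \(\log p(x)\) does not depend on \(\lambda\), minimizing the KL divergence over \(\lambda\) is equivalent to maximizing the ELBO, that is, minimizing \(-\text{ELBO}\).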

`KLqp` also optimizes any model parameters \(p(z \mid x; \theta)\). It does this by variational EM, maximizing

\(\mathbb{E}_{q(z; \lambda)} [ \log p(x, z; \theta) ]\)

with respect to \(\theta\).

In conditional inference, we infer \(z\) in \(p(z, \beta \mid x)\) while fixing inference over \(\beta\) using another distribution \(q(\beta)\). During gradient calculation, instead of using the model's density

\(\log p(x, z^{(s)}), z^{(s)} \sim q(z; \lambda),\)

for each sample \(s=1,\ldots,S\), `KLqp` uses

\(\log p(x, z^{(s)}, \beta^{(s)}),\)

where \(z^{(s)} \sim q(z; \lambda)\) and \(\beta^{(s)} \sim q(\beta)\).
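The sampling scheme described above can be illustrated with a small standard-library sketch. The toy model in `log_joint`, the helper names, and the `(mean, std)` tuples for `q_z` and `q_beta` are all assumptions made for this example; this is not Edward's implementation.

```python
import math
import random


def normal_logpdf(value, mean, std):
    """Log density of a univariate Normal(mean, std) at value."""
    return (-0.5 * math.log(2.0 * math.pi) - math.log(std)
            - 0.5 * ((value - mean) / std) ** 2)


def log_joint(x, z, beta):
    """Hypothetical toy model: beta ~ N(0, 1), z ~ N(0, 1), x ~ N(z + beta, 1)."""
    return (normal_logpdf(beta, 0.0, 1.0)
            + normal_logpdf(z, 0.0, 1.0)
            + normal_logpdf(x, z + beta, 1.0))


def conditional_log_joint_estimate(x, q_z, q_beta, n_samples, rng):
    """Average log p(x, z_s, beta_s) with z_s ~ q(z; lambda), beta_s ~ q(beta)."""
    total = 0.0
    for _ in range(n_samples):
        z_s = rng.gauss(q_z[0], q_z[1])           # draw from the q(z) being trained
        beta_s = rng.gauss(q_beta[0], q_beta[1])  # draw from the fixed q(beta)
        total += log_joint(x, z_s, beta_s)
    return total / n_samples


rng = random.Random(0)
estimate = conditional_log_joint_estimate(
    x=1.0, q_z=(0.5, 1.0), q_beta=(0.2, 0.1), n_samples=1000, rng=rng)
```

Only the parameters of `q_z` would be updated in such a scheme; `q_beta` supplies samples but stays fixed.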

`__init__`

```
__init__(
    *args,
    **kwargs
)
```

`build_loss_and_gradients`

`build_loss_and_gradients(var_list)`

Wrapper for the `KLqp` loss function.

\(-\text{ELBO} = -\mathbb{E}_{q(z; \lambda)} [ \log p(x, z) - \log q(z; \lambda) ]\)

`KLqp` supports

- score function gradients (Paisley et al., 2012)
- reparameterization gradients (Kingma and Welling, 2014)

of the loss function.
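To make the two gradient families concrete, here is a minimal sketch on a toy problem rather than on the ELBO itself: estimating the gradient of \(\mathbb{E}_{q(z; \mu)}[f(z)]\) with \(q = \text{Normal}(\mu, 1)\) and \(f(z) = z^2\), where the exact gradient is \(2\mu\). The function names and the toy integrand are assumptions for illustration and are not taken from Edward's source.

```python
import random


def f(z):
    """Toy integrand; E_{N(mu, 1)}[f(z)] = mu**2 + 1, so the true gradient is 2 * mu."""
    return z * z


def score_function_gradient(mu, n_samples, rng):
    """Estimate d/dmu E_q[f(z)] as the mean of f(z) * d/dmu log q(z; mu)."""
    total = 0.0
    for _ in range(n_samples):
        z = rng.gauss(mu, 1.0)
        score = z - mu  # d/dmu log N(z; mu, 1)
        total += f(z) * score
    return total / n_samples


def reparameterization_gradient(mu, n_samples, rng):
    """Estimate the same gradient by writing z = mu + eps and differentiating f."""
    total = 0.0
    for _ in range(n_samples):
        eps = rng.gauss(0.0, 1.0)
        z = mu + eps
        total += 2.0 * z  # df/dz * dz/dmu, with dz/dmu = 1
    return total / n_samples


rng = random.Random(0)
mu = 1.5
sf_grad = score_function_gradient(mu, 200000, rng)
rp_grad = reparameterization_gradient(mu, 200000, rng)
```

Both estimates target \(2\mu = 3\). In this example the reparameterization estimate has lower variance, but it requires a sampling path that is differentiable in the parameter, whereas the score function estimate only needs \(\log q\) to be differentiable.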

If the KL divergence between the variational model and the prior is tractable, then the loss function can be written as

\(-\mathbb{E}_{q(z; \lambda)}[\log p(x \mid z)] + \text{KL}( q(z; \lambda) \| p(z) ),\)

where the KL term is computed analytically (Kingma and Welling, 2014). We compute this automatically when \(p(z)\) and \(q(z; \lambda)\) are Normal.
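For the Normal case mentioned above, the analytic KL term has a well-known closed form. The sketch below computes it for univariate Normals and checks it against a sample-based estimate; `kl_normal` and `kl_monte_carlo` are hypothetical helper names for this example, not Edward functions.

```python
import math
import random


def kl_normal(mean_q, std_q, mean_p, std_p):
    """Closed-form KL( N(mean_q, std_q) || N(mean_p, std_p) ) for univariate Normals."""
    return (math.log(std_p / std_q)
            + (std_q ** 2 + (mean_q - mean_p) ** 2) / (2.0 * std_p ** 2)
            - 0.5)


def kl_monte_carlo(mean_q, std_q, mean_p, std_p, n_samples, rng):
    """Sample-based estimate of E_q[log q(z) - log p(z)] for comparison."""
    def logpdf(value, mean, std):
        return (-0.5 * math.log(2.0 * math.pi) - math.log(std)
                - 0.5 * ((value - mean) / std) ** 2)

    total = 0.0
    for _ in range(n_samples):
        z = rng.gauss(mean_q, std_q)
        total += logpdf(z, mean_q, std_q) - logpdf(z, mean_p, std_p)
    return total / n_samples


analytic = kl_normal(1.0, 0.5, 0.0, 1.0)
sampled = kl_monte_carlo(1.0, 0.5, 0.0, 1.0, 100000, random.Random(0))
```

Using the closed form removes the sampling noise that the KL term would otherwise contribute to the loss.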

`finalize`

`finalize()`

Function to call after convergence.

`initialize`

```
initialize(
    n_samples=1,
    kl_scaling=None,
    *args,
    **kwargs
)
```

Initialize inference algorithm. It initializes hyperparameters and builds ops for the algorithm's computation graph.

`n_samples`: int, optional. Number of samples from the variational model for calculating stochastic gradients.

`kl_scaling`: dict of RandomVariable to float, optional. Provides the option to scale terms when using the ELBO with KL divergence. If the KL divergence terms are

\(\alpha_p \mathbb{E}_{q(z\mid x, \lambda)} [ \log q(z\mid x, \lambda) - \log p(z)],\)

then pass {\(p(z)\): \(\alpha_p\)} as `kl_scaling`, where \(\alpha_p\) is a float that specifies how much to scale the KL term.
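The effect of a per-variable scale can be sketched in plain Python. This example keys the dictionaries by string names for simplicity, whereas the argument documented here is keyed by `RandomVariable` objects. The function name, the numeric values, and the choice to treat missing entries as a scale of 1.0 are assumptions of this sketch.

```python
def scaled_negative_elbo(expected_log_lik, kl_terms, kl_scaling=None):
    """Combine a log-likelihood estimate with per-variable scaled KL terms.

    kl_terms maps a (hypothetical) variable name to its KL value. Treating
    entries missing from kl_scaling as scale 1.0 is an assumption of this sketch.
    """
    kl_scaling = kl_scaling or {}
    penalty = sum(kl_scaling.get(name, 1.0) * kl for name, kl in kl_terms.items())
    return -expected_log_lik + penalty


kl_terms = {"z": 0.8, "w": 0.3}
unscaled = scaled_negative_elbo(-12.0, kl_terms)                          # all scales 1.0
annealed = scaled_negative_elbo(-12.0, kl_terms, kl_scaling={"z": 0.5})   # halve the KL for z
```

A scale below 1 weakens the pull of \(q\) toward the prior for that variable, and a scale above 1 strengthens it.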

`print_progress`

`print_progress(info_dict)`

Print progress to output.

`run`

```
run(
    variables=None,
    use_coordinator=True,
    *args,
    **kwargs
)
```

A simple wrapper to run inference.

- Initialize algorithm via `initialize`.
- (Optional) Build a TensorFlow summary writer for TensorBoard.
- (Optional) Initialize TensorFlow variables.
- (Optional) Start queue runners.
- Run `update` for `self.n_iter` iterations.
- While running, `print_progress`.
- Finalize algorithm via `finalize`.
- (Optional) Stop queue runners.

To customize the way inference is run, run these steps individually.
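The order of the steps can be sketched with a toy stand-in. `ToyInference` and `run_manually` are hypothetical names invented for this illustration. They use neither TensorFlow nor Edward, only mirror the method names documented on this page, and skip the optional summary writer, variable initialization, and queue runner steps.

```python
class ToyInference:
    """Hypothetical stand-in that mirrors the method names listed above."""

    def __init__(self, n_iter=100, step_size=0.1):
        self.n_iter = n_iter
        self.step_size = step_size

    def initialize(self):
        """Set up state before the first update."""
        self.t = 0
        self.param = 5.0

    def update(self):
        """One gradient-descent step on the toy loss (param - 2)**2."""
        grad = 2.0 * (self.param - 2.0)
        self.param -= self.step_size * grad
        self.t += 1
        return {"t": self.t, "loss": (self.param - 2.0) ** 2}

    def print_progress(self, info_dict):
        """Report the loss every 25 iterations."""
        if info_dict["t"] % 25 == 0:
            print("iteration", info_dict["t"], "loss", info_dict["loss"])

    def finalize(self):
        """Nothing to clean up in this toy example."""


def run_manually(inference):
    """Call the individual steps in the same order a run wrapper would."""
    inference.initialize()
    info_dict = None
    for _ in range(inference.n_iter):
        info_dict = inference.update()
        inference.print_progress(info_dict)
    inference.finalize()
    return info_dict


final_info = run_manually(ToyInference())
```

Splitting the loop out this way leaves room to insert custom logic between updates, such as feeding a new batch or checking a stopping criterion.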

`variables`: list, optional. A list of TensorFlow variables to initialize during inference. Default is to initialize all variables (this includes reinitializing variables that were already initialized). To avoid initializing any variables, pass in an empty list.

`use_coordinator`: bool, optional. Whether to start and stop queue runners during inference using a TensorFlow coordinator. For example, queue runners are necessary for batch training with file readers.

`*args, **kwargs`: Passed into `initialize`.

`update`

`update(feed_dict=None)`

Run one iteration of optimization.

`feed_dict`: dict, optional. Feed dictionary for a TensorFlow session run. It is used to feed placeholders that are not fed during initialization.

Returns a dict of algorithm-specific information. In this case, the loss function value after one iteration.