## Bayesian Neural Network

A Bayesian neural network is a neural network with a prior distribution on its weights (Neal, 2012).

Consider a data set $$\{(\mathbf{x}_n, y_n)\}$$, where each data point comprises features $$\mathbf{x}_n\in\mathbb{R}^D$$ and output $$y_n\in\mathbb{R}$$. Define the likelihood for each data point as $$\begin{aligned} p(y_n \mid \mathbf{w}, \mathbf{x}_n, \sigma^2) &= \text{Normal}(y_n \mid \mathrm{NN}(\mathbf{x}_n\;;\;\mathbf{w}), \sigma^2),\end{aligned}$$ where $$\mathrm{NN}$$ is a neural network whose weights and biases form the latent variables $$\mathbf{w}$$. Assume $$\sigma^2$$ is a known variance.

Define the prior on the weights and biases $$\mathbf{w}$$ to be the standard normal $$\begin{aligned} p(\mathbf{w}) &= \text{Normal}(\mathbf{w} \mid \mathbf{0}, \mathbf{I}).\end{aligned}$$
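As a short worked step, the likelihood and prior combine into a posterior over $$\mathbf{w}$$ by Bayes' rule. This assumes the data points are conditionally independent given $$\mathbf{w}$$, which the per-data-point likelihood suggests but the text does not state explicitly. The symbols $$\mathbf{X}$$ and $$\mathbf{y}$$ are shorthand introduced here for all features and all outputs.

$$\begin{aligned} p(\mathbf{w} \mid \mathbf{X}, \mathbf{y}, \sigma^2) &\propto p(\mathbf{w}) \prod_{n} p(y_n \mid \mathbf{w}, \mathbf{x}_n, \sigma^2), \\ \log p(\mathbf{y}, \mathbf{w} \mid \mathbf{X}, \sigma^2) &= -\frac{1}{2\sigma^2} \sum_{n} \big(y_n - \mathrm{NN}(\mathbf{x}_n\;;\;\mathbf{w})\big)^2 - \frac{1}{2} \mathbf{w}^\top \mathbf{w} + \text{const},\end{aligned}$$

where the constant does not depend on $$\mathbf{w}$$. Under these assumptions, the standard normal prior acts like an L2 penalty on the weights, and the known variance $$\sigma^2$$ sets how strongly the squared-error term is weighted against it.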

Let’s build the model in Edward. We define a 3-layer Bayesian neural network with $$\tanh$$ nonlinearities.

import tensorflow as tf
from edward.models import Normal

def neural_network(x):
    # Two tanh hidden layers followed by a linear output layer.
    h = tf.tanh(tf.matmul(x, W_0) + b_0)
    h = tf.tanh(tf.matmul(h, W_1) + b_1)
    h = tf.matmul(h, W_2) + b_2
    return tf.reshape(h, [-1])  # flatten [N, 1] to [N]

N = 40  # number of data points
D = 1   # number of features

# Standard normal priors on all weights and biases.
W_0 = Normal(loc=tf.zeros([D, 10]), scale=tf.ones([D, 10]))
W_1 = Normal(loc=tf.zeros([10, 10]), scale=tf.ones([10, 10]))
W_2 = Normal(loc=tf.zeros([10, 1]), scale=tf.ones([10, 1]))
b_0 = Normal(loc=tf.zeros(10), scale=tf.ones(10))
b_1 = Normal(loc=tf.zeros(10), scale=tf.ones(10))
b_2 = Normal(loc=tf.zeros(1), scale=tf.ones(1))

x = tf.cast(x_train, dtype=tf.float32)
# Likelihood with a known noise standard deviation of 0.1.
y = Normal(loc=neural_network(x), scale=0.1 * tf.ones(N))

This program builds the model assuming the features x_train already exist in the Python environment. Alternatively, one can define a TensorFlow placeholder,

x = tf.placeholder(tf.float32, [N, D])

The placeholder must be fed with data later during inference.
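To get a feel for what the priors in the model imply before any inference is run, one can draw a single function from the prior predictive. The following is a minimal NumPy sketch, not Edward code: it mirrors the layer shapes and the noise scale of 0.1 from the model above, while the input grid x_demo, the helper sample_prior, and the random seed are hypothetical choices made only for illustration.

```python
import numpy as np

def neural_network(x, params):
    """Forward pass with the same layer structure as the Edward model."""
    W_0, b_0, W_1, b_1, W_2, b_2 = params
    h = np.tanh(x @ W_0 + b_0)
    h = np.tanh(h @ W_1 + b_1)
    h = h @ W_2 + b_2
    return h.reshape(-1)  # flatten [N, 1] to [N]

def sample_prior(rng, D=1, H=10):
    """Draw every weight and bias from the standard normal prior."""
    shapes = [(D, H), (H,), (H, H), (H,), (H, 1), (1,)]
    return [rng.standard_normal(shape) for shape in shapes]

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
N, D = 40, 1

# Hypothetical inputs standing in for x_train.
x_demo = np.linspace(-3.0, 3.0, N).reshape(N, D)

params = sample_prior(rng, D=D)
mean = neural_network(x_demo, params)          # NN(x; w) for one prior draw of w
y_draw = mean + 0.1 * rng.standard_normal(N)   # add Normal(0, 0.1^2) noise
```

Repeating the last three lines with fresh draws gives a set of curves that shows how varied the functions are under the standard normal prior, which can help when judging whether the prior scale is reasonable for a given data set.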

A toy demonstration is available in the Getting Started section. Source code is available at examples/bayesian_nn.py in the GitHub repository.

Neal, R. M. (2012). Bayesian learning for neural networks (Vol. 118). Springer Science & Business Media.