janmr blog

Neural Networks - Multiple Inputs

When implementing the evaluation of a neural network as described in the previous post, it is useful to be able to evaluate multiple inputs simultaneously. Doing so also makes it possible to exploit efficient linear algebra routines.

Let there be given $m$ inputs to the neural network, each represented as a column of a matrix

$$A^0 \in \mathbb{R}^{n^0 \times m}.$$

Let furthermore the weights and biases be represented as matrices and (column) vectors, respectively:

$$W^l \in \mathbb{R}^{n^l \times n^{l-1}}, \quad b^l \in \mathbb{R}^{n^l}$$

for $l=1,\ldots,L$.

We can then compute $Z^l \in \mathbb{R}^{n^l \times m}$ from $A^{l-1}$ as

$$Z^l = W^l A^{l-1} + b^l \begin{bmatrix} 1 & \cdots & 1 \end{bmatrix} = \begin{bmatrix} W^l & b^l \end{bmatrix} \begin{bmatrix} A^{l-1} \\ 1 \; \cdots \; 1 \end{bmatrix},$$

for $l=1,\ldots,L$, where $\begin{bmatrix} 1 & \cdots & 1 \end{bmatrix} \in \mathbb{R}^{1 \times m}$ is a row of ones.
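As a sketch of this affine step (assuming NumPy; the names `W`, `b`, `A_prev` and the sizes are made up for illustration), note that broadcasting adds the bias column to every column of $W^l A^{l-1}$, which is equivalent to multiplying $b^l$ by the explicit row of ones:

```python
import numpy as np

rng = np.random.default_rng(0)
n_prev, n_l, m = 4, 3, 5  # example layer sizes n^{l-1}, n^l and input count m

W = rng.standard_normal((n_l, n_prev))      # weight matrix W^l
b = rng.standard_normal((n_l, 1))           # bias as a column vector b^l
A_prev = rng.standard_normal((n_prev, m))   # m inputs as columns of A^{l-1}

# Broadcasting adds b to every column of W @ A_prev ...
Z = W @ A_prev + b

# ... which matches the explicit formulation with a row of m ones.
assert Z.shape == (n_l, m)
assert np.allclose(Z, W @ A_prev + b @ np.ones((1, m)))
```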

Now $A^l \in \mathbb{R}^{n^l \times m}$ can be computed from $Z^l$ by applying $g^l$ to each element:

$$A^l_{ic} = g^l(Z^l_{ic}),$$

for $i=1,\ldots,n^l$, $c=1,\ldots,m$ and $l=1,\ldots,L$.
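In code, this elementwise application is a single vectorized call (here $g^l = \tanh$ is just an example choice of activation, not prescribed by the text):

```python
import numpy as np

Z = np.array([[0.0, 1.0, -1.0],
              [2.0, -2.0, 0.5]])  # an example Z^l with n^l = 2 rows, m = 3 columns

A = np.tanh(Z)  # A^l_{ic} = g^l(Z^l_{ic}) for every i and c at once

assert A.shape == Z.shape
assert np.isclose(A[0, 0], np.tanh(0.0))
```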

Each column of $A^L$ will now represent the output:

$$\begin{bmatrix} N(A^0_{\ast,1}) & N(A^0_{\ast,2}) & \cdots & N(A^0_{\ast,m}) \end{bmatrix} = A^L.$$
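The whole batched evaluation can be sketched in a few lines of NumPy (a minimal sketch; `forward`, the layer sizes, and the use of $\tanh$ for every $g^l$ are illustrative assumptions). The final assertions check the claim above: column $c$ of $A^L$ equals the network evaluated on input column $c$ alone.

```python
import numpy as np

def forward(A0, weights, biases, g=np.tanh):
    """Evaluate the network on all m inputs (the columns of A0) at once."""
    A = A0
    for W, b in zip(weights, biases):
        A = g(W @ A + b)  # bias column broadcasts across the m columns
    return A

rng = np.random.default_rng(1)
sizes = [4, 5, 3, 2]  # example n^0, n^1, n^2, n^3 (so L = 3)
weights = [rng.standard_normal((sizes[l + 1], sizes[l])) for l in range(3)]
biases = [rng.standard_normal((sizes[l + 1], 1)) for l in range(3)]

m = 6
A0 = rng.standard_normal((sizes[0], m))  # m inputs as columns
AL = forward(A0, weights, biases)
assert AL.shape == (sizes[-1], m)

# Column c of A^L equals the network applied to input column c by itself.
for c in range(m):
    assert np.allclose(AL[:, [c]], forward(A0[:, [c]], weights, biases))
```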

The next post will set the scene for training a neural network to best fit a set of input/output pairs.