When considering efficient implementations for evaluating a neural network as described in the
previous post, it becomes useful to evaluate multiple inputs simultaneously, and to express
the computation in terms of efficient linear algebra routines.
Let there be given $m$ inputs to the neural network, each represented as a column of a matrix
$A^0 \in \mathbb{R}^{n_0 \times m}$.
Let furthermore the weights and biases be represented as matrices and (column) vectors, respectively:

$$W^l \in \mathbb{R}^{n_l \times n_{l-1}}, \quad b^l \in \mathbb{R}^{n_l},$$

for $l = 1, \ldots, L$.
We can then compute $Z^l \in \mathbb{R}^{n_l \times m}$ from $A^{l-1}$ as

$$Z^l = W^l A^{l-1} + b^l \begin{bmatrix} 1 & \cdots & 1 \end{bmatrix} = \begin{bmatrix} W^l & b^l \end{bmatrix} \begin{bmatrix} A^{l-1} \\ 1 \; \cdots \; 1 \end{bmatrix},$$

for $l = 1, \ldots, L$, where $\begin{bmatrix} 1 & \cdots & 1 \end{bmatrix} \in \mathbb{R}^{1 \times m}$ is a row vector of ones.
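As a minimal sketch in NumPy (all names and sizes here are illustrative, not fixed by the post), the broadcasted bias and the augmented-matrix form of this equation compute the same $Z^l$:

```python
import numpy as np

rng = np.random.default_rng(0)
n_prev, n_l, m = 4, 3, 5                    # n_{l-1}, n_l, and batch size m

W = rng.standard_normal((n_l, n_prev))      # W^l in R^{n_l x n_{l-1}}
b = rng.standard_normal((n_l, 1))           # b^l as a column vector
A_prev = rng.standard_normal((n_prev, m))   # A^{l-1}: one input per column

# Direct form: broadcasting replicates b across the m columns,
# which is exactly the rank-one term b^l [1 ... 1].
Z = W @ A_prev + b

# Augmented form: [W^l  b^l] [A^{l-1}; 1 ... 1] gives the same result.
Z_aug = np.hstack([W, b]) @ np.vstack([A_prev, np.ones((1, m))])

assert np.allclose(Z, Z_aug)
```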
Now $A^l \in \mathbb{R}^{n_l \times m}$ can be computed from $Z^l$ by applying $g^l$ to each element:

$$A^l_{ic} = g^l(Z^l_{ic}),$$

for $i = 1, \ldots, n_l$, $c = 1, \ldots, m$ and $l = 1, \ldots, L$.
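In NumPy this elementwise step is a single vectorized call; ReLU below is only an illustrative choice of $g^l$, not one prescribed here:

```python
import numpy as np

def relu(z):
    # An example g^l, applied entry by entry: A^l_{ic} = g^l(Z^l_{ic}).
    return np.maximum(z, 0.0)

Z = np.array([[-1.0,  2.0, 0.5],
              [ 3.0, -0.5, 1.0]])   # a small Z^l with m = 3 columns
A = relu(Z)                         # applies g^l to every entry at once
```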
Each column of $A^L$ will now represent the output:

$$\begin{bmatrix} N(A^0_{*,1}) & N(A^0_{*,2}) & \cdots & N(A^0_{*,m}) \end{bmatrix} = A^L.$$
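Putting the steps together, here is a hedged sketch of the full batched evaluation; `evaluate` and the parameter lists `Ws`, `bs`, `gs` are hypothetical names for illustration, as are the ReLU and identity activations:

```python
import numpy as np

def evaluate(A0, Ws, bs, gs):
    """Return A^L, whose column c is the network output N(A^0_{*,c})."""
    A = A0
    for W, b, g in zip(Ws, bs, gs):
        A = g(W @ A + b)             # Z^l = W^l A^{l-1} + b^l [1...1], then A^l
    return A

# Example: L = 2 layers mapping R^4 -> R^3 -> R^2, evaluated on m = 5 inputs.
rng = np.random.default_rng(1)
Ws = [rng.standard_normal((3, 4)), rng.standard_normal((2, 3))]
bs = [rng.standard_normal((3, 1)), rng.standard_normal((2, 1))]
gs = [lambda z: np.maximum(z, 0.0), lambda z: z]   # ReLU, then identity
A0 = rng.standard_normal((4, 5))
AL = evaluate(A0, Ws, bs, gs)
print(AL.shape)                      # (2, 5): one output column per input
```

Stacking the $m$ inputs as columns means each layer is a single matrix-matrix product, which BLAS-backed libraries execute far more efficiently than $m$ separate matrix-vector products.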
The next post will
set the scene for training a neural network to best fit a set of input/output pairs.