Below is the overall diagram of the neural network.
Above is a “simple” mathematical function whose output our neural network will try to predict.
```mermaid
graph LR
    subgraph InputLayer[Input Layer]
        direction LR
        x1((x1))
        x2((x2))
    end
    subgraph HiddenLayer_1[Hidden Layer 1]
        direction LR
        n11((n11))
        n12((n12))
        n13((n13))
    end
    subgraph HiddenLayer_2[Hidden Layer 2]
        direction LR
        n21((n21))
        n22((n22))
        n23((n23))
    end
    subgraph OutputLayer[Output Layer]
        direction LR
        o1((o1))
    end
    x1 -- w11 --> n11
    x1 -- w12 --> n12
    x1 -- w13 --> n13
    x2 -- w21 --> n11
    x2 -- w22 --> n12
    x2 -- w23 --> n13
    n11 -- w11 --> n21
    n11 -- w12 --> n22
    n11 -- w13 --> n23
    n12 -- w21 --> n21
    n12 -- w22 --> n22
    n12 -- w23 --> n23
    n13 -- w31 --> n21
    n13 -- w32 --> n22
    n13 -- w33 --> n23
    n21 -- w11 --> o1
    n22 -- w21 --> o1
    n23 -- w31 --> o1
```
Here we will represent everything as matrices, because that makes the computations far easier.
The elements inside these matrices DO NOT follow standard matrix notation; their subscripts instead follow the labels in the graph above, for convenience.
Input Matrix ($X$)

$$X = \begin{bmatrix} x_1 & x_2 \end{bmatrix}$$

Weight (Input Layer → Hidden Layer 1) Matrix ($W_1$)

$$W_1 = \begin{bmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \end{bmatrix}$$

Hidden Layer 1 Output Matrix ($H_1$)

$$H_1 = \begin{bmatrix} n_{11} & n_{12} & n_{13} \end{bmatrix}$$

Weight (Hidden Layer 1 → Hidden Layer 2) Matrix ($W_2$)

$$W_2 = \begin{bmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \\ w_{31} & w_{32} & w_{33} \end{bmatrix}$$

Hidden Layer 2 Output Matrix ($H_2$)

$$H_2 = \begin{bmatrix} n_{21} & n_{22} & n_{23} \end{bmatrix}$$

Weight (Hidden Layer 2 → Output Layer) Matrix ($W_3$)

$$W_3 = \begin{bmatrix} w_{11} \\ w_{21} \\ w_{31} \end{bmatrix}$$

Output Layer Matrix ($O$)

$$O = \begin{bmatrix} o_1 \end{bmatrix}$$
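As a minimal sketch, the matrices above can be set up in NumPy with shapes that match the diagram: $X$ is $1 \times 2$, $W_1$ is $2 \times 3$, $W_2$ is $3 \times 3$, and $W_3$ is $3 \times 1$. The input values and the random initialization here are illustrative choices, not something the text specifies.

```python
import numpy as np

# Seeded generator so the example is reproducible (illustrative choice)
rng = np.random.default_rng(0)

# Input: a 1x2 row vector [x1, x2]; the values are hypothetical
X = np.array([[0.5, -0.2]])

# Weight matrices, shaped to match the diagram above
W1 = rng.standard_normal((2, 3))  # Input Layer -> Hidden Layer 1
W2 = rng.standard_normal((3, 3))  # Hidden Layer 1 -> Hidden Layer 2
W3 = rng.standard_normal((3, 1))  # Hidden Layer 2 -> Output Layer

print(X.shape, W1.shape, W2.shape, W3.shape)  # (1, 2) (2, 3) (3, 3) (3, 1)
```

With these shapes, each matrix product `X @ W1`, `H1 @ W2`, `H2 @ W3` lines up exactly with the layer sizes in the diagram.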
Feed Forward
The very first step in the working of a neural network is feed-forwarding, which is just a fancy name for calculating an output from a given input. Initially, you simply provide some input to the neural network and it tells you what it thinks the output should be. Of course, it will almost certainly be incorrect, because our network has not been trained yet, which means the weights have not been adjusted.
Let’s not worry about that for now and just focus on calculating the output with the weights we currently have.
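The feed-forward pass described above can be sketched as a chain of matrix products, one per layer. This is only a sketch under stated assumptions: the text has not yet specified an activation function, so a sigmoid is used here as a common placeholder, and the input values and random weights are hypothetical.

```python
import numpy as np

def sigmoid(z):
    # A common activation choice; the text has not specified one yet.
    return 1.0 / (1.0 + np.exp(-z))

def feed_forward(X, W1, W2, W3):
    H1 = sigmoid(X @ W1)   # Hidden Layer 1 output, shape (1, 3)
    H2 = sigmoid(H1 @ W2)  # Hidden Layer 2 output, shape (1, 3)
    O = sigmoid(H2 @ W3)   # Output Layer, shape (1, 1)
    return O

rng = np.random.default_rng(0)
X = np.array([[0.5, -0.2]])       # hypothetical input [x1, x2]
W1 = rng.standard_normal((2, 3))  # Input -> Hidden 1
W2 = rng.standard_normal((3, 3))  # Hidden 1 -> Hidden 2
W3 = rng.standard_normal((3, 1))  # Hidden 2 -> Output

# Untrained weights, so the prediction is essentially arbitrary
print(feed_forward(X, W1, W2, W3))
```

Since the weights are random, the printed value means nothing yet; the point is only the shape of the computation, which training will later refine by adjusting the weights.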