Neural Network Architecture
You take the probability the point gets from each of the two models, add them, and apply the sigmoid function to get the new probability.
What if we want to weigh the sum?
Then we have a combination of the two previous models, plus the weights, plus the bias.
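As a minimal sketch of this combination (the probabilities, weights, and bias below are made-up values for illustration, not from the notes):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical probabilities the point gets from the two linear models
p1 = 0.7
p2 = 0.8

# Unweighted combination: add the probabilities, squash with the sigmoid
combined = sigmoid(p1 + p2)

# Weighted combination: each model gets a weight, plus a bias term
w1, w2, b = 7.0, 5.0, -6.0  # illustrative values
weighted = sigmoid(w1 * p1 + w2 * p2 + b)

print(combined, weighted)
```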
Simple version
The perceptron will output a probability. In this case it will be small because the point is incorrectly classified. This process is known as feedforward.
y_hat is the probability that the point is labeled blue. This is what neural networks do: they take the input vector and then apply a sequence of linear models and sigmoid functions. When combined, these maps become highly non-linear.
Our prediction is therefore y_hat = sigmoid(W^(2) * sigmoid(W^(1) * x)): multiplications by matrices alternated with sigmoid functions.
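A minimal sketch of that feedforward pass, assuming a tiny network with 2 inputs, 3 hidden units, and 1 output (all shapes and weight values here are hypothetical):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 2))  # first-layer weights (hypothetical)
b1 = rng.normal(size=3)
W2 = rng.normal(size=(1, 3))  # second-layer weights (hypothetical)
b2 = rng.normal(size=1)

x = np.array([0.5, -1.2])  # an input point

# Feedforward: matrix multiplication, sigmoid, matrix multiplication, sigmoid
hidden = sigmoid(W1 @ x + b1)
y_hat = sigmoid(W2 @ hidden + b2)
print(y_hat)  # probability that the point is labeled blue
```

What about the error?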
Recall that the error for our perceptron was the cross-entropy:
E = -(1/m) * Σ_i [ y_i ln(y_hat_i) + (1 - y_i) ln(1 - y_hat_i) ]
We can actually use the same error function for the multilayer perceptron. It's just that y_hat will be a little more complicated, since it now comes from the composition above.
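As a sketch, assuming the cross-entropy above (the labels and predictions are made-up values):

```python
import numpy as np

def cross_entropy_error(y, y_hat):
    """Mean cross-entropy between binary labels y and predicted probabilities y_hat."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

print(cross_entropy_error([1, 0, 1], [0.9, 0.2, 0.6]))
```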
In a nutshell, backpropagation will consist of:
Doing a feedforward operation.
Comparing the output of the model with the desired output.
Calculating the error.
Running the feedforward operation backwards (backpropagation) to spread the error to each of the weights.
Using this to update the weights and get a better model.
Continuing this process until we have a good model.
Gradient descent: you take each weight W_ij^(k) and update it by subtracting a small number: the learning rate times the partial derivative of E with respect to that same weight. This gives us the new updated weight W'_ij^(k) = W_ij^(k) - (learning rate) * ∂E/∂W_ij^(k).
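Putting the loop together, here is a minimal sketch of training a 2-3-1 network with these steps (the toy data, architecture, and learning rate are all hypothetical, chosen for illustration):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy data: 4 points with 2 features each, and binary labels (hypothetical)
X = np.array([[0.3, 0.8], [0.9, 0.1], [0.2, 0.4], [0.7, 0.6]])
y = np.array([1.0, 0.0, 1.0, 0.0])

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)  # layer 1
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)  # layer 2
alpha = 0.5  # learning rate (illustrative)

for epoch in range(1000):
    # 1. Feedforward
    H = sigmoid(X @ W1.T + b1)             # hidden activations, shape (4, 3)
    y_hat = sigmoid(H @ W2.T + b2)[:, 0]   # predictions, shape (4,)

    # 2-3. Compare with the labels and compute the cross-entropy error
    E = -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

    # 4. Backpropagation: spread the error back to each weight via the chain rule
    d_out = (y_hat - y)[:, None] / len(y)  # dE/d(pre-activation of the output)
    dW2 = d_out.T @ H
    db2 = d_out.sum(axis=0)
    d_hid = (d_out @ W2) * H * (1 - H)     # chain rule through the hidden sigmoid
    dW1 = d_hid.T @ X
    db1 = d_hid.sum(axis=0)

    # 5. Gradient descent step: W' = W - alpha * dE/dW
    W1 -= alpha * dW1
    b1 -= alpha * db1
    W2 -= alpha * dW2
    b2 -= alpha * db2

    if epoch % 200 == 0:
        print(f"epoch {epoch}: error {E:.4f}")
```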
Then we feedforward again with the updated weights and repeat the process.