Neural Network Architecture

You take the probability of the point under each of the two models, add them, and apply the sigmoid function to get the new probability.

What if we want to weight the terms of the sum?

We then have a combination of the 2 previous models: each model's output gets a weight, and we add a bias before applying the sigmoid.
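
As a minimal sketch, this is how the combination might look in NumPy; the point, the two linear models, and the combination weights and bias below are all made-up values for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# A hypothetical point and two hypothetical linear models.
x = np.array([1.0, 1.0])
w_a, b_a = np.array([5.0, -2.0]), -8.0   # first linear model
w_b, b_b = np.array([7.0, -3.0]), 1.0    # second linear model

# Probability of the point under each model.
p_a = sigmoid(w_a @ x + b_a)
p_b = sigmoid(w_b @ x + b_b)

# Simple version: add the two probabilities and apply the sigmoid.
p_simple = sigmoid(p_a + p_b)

# Weighted version: weight each model and add a bias before the sigmoid.
w1, w2, b = 7.0, 5.0, -6.0               # assumed weights and bias
p_weighted = sigmoid(w1 * p_a + w2 * p_b + b)
```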

Feedforward

Simple version

The perceptron outputs a probability. In this case it will be small, because the point is incorrectly classified. This process is known as feedforward.

y_hat is the probability that the point is labeled blue. This is what neural networks do: they take the input vector and apply a sequence of linear models and sigmoid functions. The maps, when combined, become highly non-linear.

Our prediction is therefore y_hat = sigmoid(W^(2) sigmoid(W^(1) x)): multiplications by weight matrices, alternating with sigmoid functions. What about the error?
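
A small sketch of that feedforward computation, assuming one hidden layer with 3 units, randomly initialized weight matrices, and bias terms omitted for brevity:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def feedforward(x, W1, W2):
    """y_hat = sigmoid(W2 . sigmoid(W1 . x)) for one hidden layer."""
    hidden = sigmoid(W1 @ x)       # first linear model + sigmoid
    return sigmoid(W2 @ hidden)    # second linear model + sigmoid

# Assumed shapes: 2 inputs, 3 hidden units, 1 output; random weights.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 2))
W2 = rng.normal(size=(1, 3))

y_hat = feedforward(np.array([0.5, -1.0]), W1, W2)  # prob. the point is blue
```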

Recall that the error for our perceptron was the cross-entropy:

E = -(1/m) * Σ_i [ y_i ln(y_hat_i) + (1 - y_i) ln(1 - y_hat_i) ]

We can actually use the same error function for the multilayer perceptron; it's just that y_hat will be a little more complicated.
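
For concreteness, a sketch of that error function in NumPy (the labels and predictions here are made up):

```python
import numpy as np

def cross_entropy_error(y, y_hat):
    """E = -(1/m) * sum(y_i * ln(y_hat_i) + (1 - y_i) * ln(1 - y_hat_i))."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

# Made-up labels and predictions for illustration.
print(cross_entropy_error([1, 0, 1, 1], [0.9, 0.2, 0.7, 0.6]))
```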

Backpropagation

In a nutshell, backpropagation will consist of:

  • Doing a feedforward operation.

  • Comparing the output of the model with the desired output.

  • Calculating the error.

  • Running the feedforward operation backwards (backpropagation) to spread the error to each of the weights.

  • Using this to update the weights and get a better model.

  • Continuing this process until we have a good model.

Gradient descent: we take every weight W_ij^(k) and update it by subtracting a small number: the learning rate times the partial derivative of E with respect to that same weight. This gives us the new, updated weight: W'_ij^(k) = W_ij^(k) - α · ∂E/∂W_ij^(k).
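
Putting this update together with the backpropagation steps above, here is a sketch of one training step for the assumed one-hidden-layer network (shapes, weights, and learning rate are illustrative choices; biases are again omitted for brevity):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent_step(x, y, W1, W2, learn_rate=0.1):
    """One feedforward pass, backpropagation of the cross-entropy error,
    and the update W'_ij^(k) = W_ij^(k) - learn_rate * dE/dW_ij^(k)."""
    # Feedforward.
    hidden = sigmoid(W1 @ x)        # hidden-layer activations
    y_hat = sigmoid(W2 @ hidden)    # output probability

    # Backpropagation: spread the error back to each layer.
    delta_out = y_hat - y           # dE/d(pre-sigmoid output) for cross-entropy
    delta_hidden = (W2.T @ delta_out) * hidden * (1 - hidden)

    # Gradient descent update on every weight.
    W2 = W2 - learn_rate * np.outer(delta_out, hidden)
    W1 = W1 - learn_rate * np.outer(delta_hidden, x)
    return W1, W2

# One step on a made-up point with label 1 (2 inputs, 3 hidden units, 1 output).
rng = np.random.default_rng(1)
W1, W2 = rng.normal(size=(3, 2)), rng.normal(size=(1, 3))
W1, W2 = gradient_descent_step(np.array([0.5, -1.0]), 1, W1, W2)
```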

Feedforward again: with the updated weights, we run the feedforward operation once more and repeat the process.
