Neural Networks

Each node in the hidden layer adds to the model's ability to capture interactions or abstract patterns. The more nodes we have, the more interactions we can capture.
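A minimal sketch of this idea, with assumed weights and inputs chosen for illustration: each hidden node combines all input features through a nonlinearity, so even a tiny hidden layer can capture interactions between features that a purely linear model would miss.

```python
import numpy as np

def relu(x):
    # Nonlinearity; without it the network collapses to a linear model
    return np.maximum(0, x)

x = np.array([2.0, 3.0])            # two input features (assumed values)

# Hidden layer with 2 nodes; weights are illustrative, not learned
W_hidden = np.array([[1.0, 1.0],    # node 1 mixes both inputs
                     [-1.0, 1.0]])  # node 2 mixes them differently
h = relu(W_hidden @ x)              # hidden activations: [5.0, 1.0]

W_out = np.array([2.0, -1.0])       # output weights (illustrative)
y = W_out @ h                       # prediction: 2*5 - 1*1 = 9.0
print(y)
```

Adding more hidden nodes adds more rows to `W_hidden`, i.e. more learned combinations of the inputs.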

Representation learning

  • Deep networks internally build representations of patterns in the data = representation learning

  • Partially replace the need for feature engineering

  • Subsequent layers build increasingly sophisticated representations of raw data.

Some advantages:

  • The modeler doesn't need to specify the interactions

  • When you train the model, the neural network gets weights that find the relevant patterns to make better predictions
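The training step above can be sketched with a single gradient-descent update on one weight; the model, data point, and learning rate are all assumed for illustration.

```python
# Model: y_hat = w * x, loss = (y - y_hat)^2 (assumed toy setup)
x, y = 2.0, 10.0
w = 1.0                        # initial weight
lr = 0.1                       # learning rate

y_hat = w * x                  # prediction: 2.0
grad = -2 * (y - y_hat) * x    # dLoss/dw = -2*(10-2)*2 = -32
w -= lr * grad                 # w becomes 1.0 - 0.1*(-32) = 4.2
print(w)
```

After the update the prediction `w * x = 8.4` is closer to the target 10.0; repeating this step across many data points is how the network "gets weights that find the relevant patterns".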

Loss function

  • Aggregates errors in predictions from many data points into a single number.

  • It's a measure of the model's predictive performance.

  • A common loss function for a regression task = squared error
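A short sketch of squared error as a loss: it averages the per-point squared errors into the single number described above (the example targets and predictions are assumed).

```python
import numpy as np

def mse(y_true, y_pred):
    # Mean squared error: average of per-point squared residuals
    return np.mean((y_true - y_pred) ** 2)

y_true = np.array([3.0, -0.5, 2.0])   # targets (assumed)
y_pred = np.array([2.5, 0.0, 2.0])    # model predictions (assumed)

loss = mse(y_true, y_pred)            # (0.25 + 0.25 + 0.0) / 3
print(loss)
```

Lower values mean better predictive performance; training adjusts the weights to reduce this number.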

Where do NNs shine

  • Input is high-dimensional discrete or real-valued

  • Output is discrete or real-valued, or a vector of values

  • Possibly noisy data

  • Form of target function is unknown

  • Human interpretability is not important

  • The computation of the output based on the input has to be fast

  • (Highly) non-linear models

  • Can learn to order/rank inputs easily

  • Scale to very large datasets

  • Very flexible models

  • Composed of simple units (neurons)

  • Adapt to different types of data

  • May require “fiddling” with model architecture + optimization hyperparameters

  • Standardizing data can be very important