Neural Networks
Each node in the hidden layer adds to the model's ability to capture interactions or abstract patterns. The more nodes we have, the more interactions we can capture.
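A minimal sketch of this idea in NumPy (layer sizes and weights are made up for illustration): each hidden node computes a weighted combination of the inputs passed through a non-linearity, which is what lets it capture interactions.

```python
import numpy as np

# Illustrative sizes: 2 input features -> 3 hidden nodes -> 1 output
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 2))          # 4 data points, 2 input features

W1 = rng.normal(size=(2, 3))         # input -> hidden weights (one column per hidden node)
b1 = np.zeros(3)
W2 = rng.normal(size=(3, 1))         # hidden -> output weights
b2 = np.zeros(1)

hidden = np.maximum(0, X @ W1 + b1)  # ReLU: each hidden node is a non-linear feature
output = hidden @ W2 + b2            # output combines the hidden-node activations

print(output.shape)                  # (4, 1): one prediction per data point
```

Adding more columns to `W1` (more hidden nodes) gives the model more non-linear features to combine, i.e. more capacity to capture interactions.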
Representation learning
Deep networks internally build representations of patterns in the data = representation learning
This partially replaces the need for feature engineering
Subsequent layers build increasingly sophisticated representations of the raw data.
Some advantages:
The modeler doesn't need to specify the interactions
During training, the network learns weights that capture the relevant patterns, improving its predictions
Loss function
Aggregates errors in predictions from many data points into a single number.
It's a measure of the model's predictive performance.
A common loss function for a regression task = squared error
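As a concrete sketch, squared error averaged over the data points (commonly called mean squared error; the function name here is my own):

```python
import numpy as np

def squared_error(y_true, y_pred):
    """Aggregate per-point squared errors into a single loss value (the mean)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)

# Two predictions: one off by 1, one exact -> mean of [1, 0] = 0.5
print(squared_error([3.0, 5.0], [2.0, 5.0]))  # 0.5
```

Lower values mean better predictive performance; training searches for weights that minimize this number.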
Where do NNs shine
Input is high-dimensional discrete or real-valued
Output is discrete or real-valued, or a vector of values
Possibly noisy data
Form of target function is unknown
Human interpretability is not important
The computation of the output based on the input has to be fast
(Highly) non-linear models
Can learn to order/rank inputs easily
Scale to very large datasets
Very flexible models
Composed of simple units (neurons)
Adapt to different types of data
May require “fiddling” with model architecture + optimization hyperparameters
Standardizing data can be very important
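A quick sketch of standardization (zero mean, unit variance per feature), using made-up data where the two features are on very different scales:

```python
import numpy as np

X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])

# Standardize each feature column: subtract its mean, divide by its std
mean = X.mean(axis=0)
std = X.std(axis=0)
X_std = (X - mean) / std

print(X_std.mean(axis=0))  # ~[0, 0]
print(X_std.std(axis=0))   # [1, 1]
```

Without this step, the large-scale feature would dominate the gradients, making optimization slower and more sensitive to hyperparameters.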