
Normal Distributions


Let's construct the formula for the bell curve function. We're working with the mean μ and the variance σ².

We start from a simple quadratic:

$$f(x) = (x - \mu)^2$$

The graph of this first formula is an upward-opening parabola. As expected, when x = μ, f(x) = 0, and since the function is quadratic it has a parabolic shape.

If we divide by σ², we control how wide or narrow the parabola is:

$$f(x) = \frac{(x - \mu)^2}{\sigma^2}$$

If σ = 4, our curve becomes wider. The larger the variance, the wider the parabola: for a large σ, the resulting f(x) is smaller at any given x, so the curve rises more slowly. This is how σ shapes the resulting bell curve, which is why it must appear in the formula.
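
As a quick sanity check, here is a minimal Python sketch (μ = 0 and the evaluation point x = 2 are arbitrary choices for illustration) showing that a larger σ flattens the scaled quadratic:

```python
# Check that a larger sigma flattens the scaled quadratic (x - mu)^2 / sigma^2.
mu = 0.0  # assumed mean for the illustration

def scaled_quadratic(x, sigma):
    return (x - mu) ** 2 / sigma ** 2

for sigma in (1.0, 4.0):
    print(f"sigma={sigma}: f(2) = {scaled_quadratic(2.0, sigma)}")
# sigma=1.0: f(2) = 4.0   -> steep, narrow parabola
# sigma=4.0: f(2) = 0.25  -> shallow, wide parabola
```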

Now say we multiply by -1/2:

$$f(x) = -\frac{(x - \mu)^2}{2\sigma^2}$$

The quadratic is flipped over, so the parabola opens downward and peaks at x = μ.

We need this flip because a bell curve peaks in the middle and falls away on both sides; without multiplying by a negative number we'd have the wrong shape. But this still isn't enough to have our bell curve: the entire flipped parabola sits at or below zero.

Now, if we exponentiate:

$$f(x) = e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$

Where is f(x) maximized?

Answer: at x = μ. The function is maximized when whatever is in the exponent is largest, and the exponent is largest (equal to 0) exactly when x = μ.

How do we find the maximum height? When x = μ, f(μ) = e⁰ = 1, so the maximum height of the curve is 1.

This function is also minimized as x → +∞ or -∞: there, f(x) → 0, which is exactly the tail behavior we want for a bell curve!
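
Here's a tiny numeric check of both properties (peak height e⁰ = 1 at the mean, tails collapsing to 0), with assumed values μ = 0 and σ = 1:

```python
import math

mu, sigma = 0.0, 1.0  # assumed values for the check

def unnormalized_bell(x):
    # exp of the flipped, scaled quadratic; not yet a proper density
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

print(unnormalized_bell(mu))    # 1.0: the peak, e^0
print(unnormalized_bell(10.0))  # ~1.9e-22: effectively 0 in the tail
```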

So a first candidate for the normal distribution function is:

$$f(x) = e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$

However, the area under the bell curve with this formula is not 1. That is problematic because a probability density must integrate to 1. The area is actually √(2πσ²).

Therefore, in order for the bell curve's area to be equal to 1, we multiply the formula by a normalizer, 1/√(2πσ²).

This means the bell curve function is described by:

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$
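
We can sanity-check the normalizer numerically. A crude Riemann-sum sketch, with assumed values μ = 0 and σ = 2 (the raw area should come out to √(2πσ²) ≈ 5.013, and dividing by that should give 1):

```python
import math

mu, sigma = 0.0, 2.0  # assumed values for the check

def bell(x):
    # unnormalized bell curve
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

# Crude Riemann sum over mu +/- 10 sigma with step dx
dx = 0.001
n_steps = int(20 * sigma / dx)
area = sum(bell(mu - 10 * sigma + i * dx) for i in range(n_steps)) * dx

print(area)                                        # ~5.01326
print(math.sqrt(2 * math.pi * sigma ** 2))         # ~5.01326: the claimed area
print(area / math.sqrt(2 * math.pi * sigma ** 2))  # ~1.0 after normalizing
```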

In summary, the normal distribution's density is:

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$

To interpret this: inside the exponential sits a quadratic penalty on deviations of x from the mean, and the exponential squeezes that penalty back into the bell-shaped curve.

So a value X is twice as likely to be drawn as a value X' if the height of the curve at X is twice the height at X'.

This means that the height of the curve at a value is proportional to the probability of that value being drawn.
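
As a quick illustration (assuming μ = 0, σ = 1 and two arbitrary points x = 0 and x = 1), the ratio of the curve's heights gives the relative likelihood:

```python
import math

mu, sigma = 0.0, 1.0  # assumed values

def normal_pdf(x):
    return (math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))
            / math.sqrt(2 * math.pi * sigma ** 2))

x1, x2 = 0.0, 1.0
print(normal_pdf(x1) / normal_pdf(x2))  # ~1.65: x1 is ~1.65x as likely as x2
```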

Depending on the number of coin flips we have, we use one formula or the other: the binomial distribution gives the exact probabilities for a count of coin flips, while the Gaussian exponential approximates it when the number of flips is large.

The purpose of the normal distribution is to let us tackle many coin flips, or any similar repeated experiment, at once. So when we do hypothesis testing, confidence intervals, and so on, we'll work with the Gaussian exponential rather than with binomial distributions or raw coin flips.
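
To make this concrete, here is a small sketch, assuming n = 100 fair coin flips, comparing the exact binomial probabilities with the Gaussian density evaluated at the same points:

```python
import math

n, p = 100, 0.5                     # assumed example: 100 fair coin flips
mu = n * p                          # binomial mean: 50
sigma = math.sqrt(n * p * (1 - p))  # binomial standard deviation: 5

def binomial_pmf(k):
    # exact probability of k heads in n flips
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def normal_pdf(x):
    return (math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))
            / math.sqrt(2 * math.pi * sigma ** 2))

for k in (45, 50, 55):
    print(k, round(binomial_pmf(k), 5), round(normal_pdf(k), 5))
# 45 0.04847 0.04839
# 50 0.07959 0.07979
# 55 0.04847 0.04839
# The columns nearly match: the Gaussian approximates the binomial.
```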