Hypothesis Testing

Hypothesis testing and confidence intervals allow us to use only sample data to draw conclusions about an entire population.

H0 (the null hypothesis) is the condition we believe to be true before we collect any data. Mathematically, the null hypothesis is a statement that two groups are equal, or that an effect is equal to zero.

The null and alternative hypotheses should be competing and non-overlapping.

The alternative hypothesis (H1) is usually associated with what you want to prove to be true. Mathematically, the alternative holds a ≠, >, or < sign.

Ex: for a new version of a web page, the null hypothesis might state that the new page converts the same as (or worse than) the old page, while the alternative states that the new page converts better.

Type I and II error


Type I Errors

Type I errors have the following features:

  1. You should set up your null and alternative hypotheses so that the worst of your errors is the type I error.

  2. They are denoted by the symbol α.

  3. The definition of a type I error is: deciding the alternative (H1) is true when actually the null (H0) is true.

  4. Type I errors are often called false positives.

Type II Errors

  1. They are denoted by the symbol β.

  2. The definition of a type II error is: deciding the null (H0) is true when actually the alternative (H1) is true.

  3. Type II errors are often called false negatives.
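
As a quick sketch of what the type I error rate α means in practice (the sample size, population parameters, and number of experiments below are made-up values for the simulation): if H0 is actually true and we run many tests at the 5% level, we should falsely reject about 5% of the time.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
alpha = 0.05
n_experiments = 10_000

# H0 is true by construction: every sample comes from a mean-0 population.
false_positives = 0
for _ in range(n_experiments):
    sample = rng.normal(loc=0, scale=1, size=30)
    _, p = stats.ttest_1samp(sample, 0)
    if p < alpha:            # we (wrongly) decide H1 is true
        false_positives += 1

print(false_positives / n_experiments)  # ≈ 0.05, the type I error rate α
```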


Keep in mind: hypothesis tests and confidence intervals tell us about parameters, not statistics.

P-value

The definition of a p-value is the conditional probability of observing your statistic (or one more extreme in favor of the alternative) if the null hypothesis is true.

The "more extreme in favor of the alternative" portion of this definition determines the shading associated with your p-value.

The p-value therefore depends on your alternative hypothesis: it determines what counts as "more extreme".

Therefore, you have the following cases (see the simulation sketch after the decision rule below):

If your alternative hypothesis says the parameter is greater than some value, you shade the area greater than your observed statistic to obtain your p-value. For example, if the sample mean we found was 5, we shade everything above 5.

If we instead reverse the hypotheses, so that the alternative says the parameter is less than that value, you shade the area below your statistic. With the same sample mean of 5, the p-value would be much higher, because most of the null distribution now lies on the shaded side.

If your alternative hypothesis says the parameter is not equal to some value (a ≠ sign in the alternative), you shade both tails. In this case we care about statistics that are far from the null value in either direction.

How to choose H0 or H1 based on the p-value:

If you're willing to accept a 5% chance of choosing the hypothesis incorrectly (a type I error rate of α = 0.05), the acceptable way to draw conclusions is: reject H0 in favor of H1 when the p-value is less than α, and fail to reject H0 when the p-value is greater than or equal to α.
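
As a rough sketch of how these shaded areas become numbers, the simulation below assumes, purely for illustration, a normal null distribution centered at 4 with a standard error of 1, together with the observed sample mean of 5 from the text, and computes the p-value for each of the three alternatives:

```python
import numpy as np

# Illustrative assumptions: H0 says the population mean is 4; the
# observed sample mean is 5 (as in the shading discussion above).
null_mean = 4
obs_mean = 5

# Simulate the sampling distribution of the mean under H0
# (assumed normal with a standard error of 1).
rng = np.random.default_rng(42)
null_dist = rng.normal(loc=null_mean, scale=1, size=100_000)

# H1: mean > 4  -> shade to the right of the observed statistic.
p_greater = (null_dist >= obs_mean).mean()

# H1: mean < 4  -> shade to the left of the observed statistic.
p_less = (null_dist <= obs_mean).mean()

# H1: mean != 4 -> shade both tails: values at least as far from the
# null value as the observed statistic, in either direction.
p_two_sided = (np.abs(null_dist - null_mean) >= abs(obs_mean - null_mean)).mean()

print(p_greater, p_less, p_two_sided)
```

With these numbers, p_less comes out much higher than p_greater, matching the point above about reversing the hypotheses.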

Impact of large sample size:

When conducting a hypothesis test, we should be asking ourselves: is my sample representative of my population of interest? Are there ways to ensure that everyone in my population is accurately represented in my sample? If our sample isn't representative, our conclusions won't be good.

We should also understand the impact of sample size on our results. As sample size increases, everything (even the smallest differences) becomes statistically significant, so we end up always choosing the alternative hypothesis. The simulation below illustrates this.
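
A minimal simulation of this effect (the 0.01 difference in means and the group sizes are arbitrary illustrative values): the same practically meaningless difference goes from "not significant" to "highly significant" purely because n grows.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
true_diff = 0.01  # a practically meaningless difference between two groups

for n in (100, 10_000, 1_000_000):
    a = rng.normal(0, 1, n)
    b = rng.normal(true_diff, 1, n)
    t, p = stats.ttest_ind(a, b)
    print(f"n={n:>9}: p-value = {p:.4f}")

# As n grows, the p-value shrinks toward 0, so the tiny difference
# eventually looks "significant" even though it is practically irrelevant.
```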

As sample sizes in today's data world keep increasing, we are moving away from hypothesis tests and toward different techniques for exactly this reason.

Ex: deciding which of two coffee types will sell better on average. Say type 1 sells better on average than type 2. However, millions of individuals may still prefer type 2. With large sample sizes we should do better than this: discussing averages leaves out the entire part of the population that prefers type 2.

With machine learning we can individualize the approach. Maybe we sell 20 types of coffee and we know which type every member of the population wants. So large sample sizes are detrimental to hypothesis testing, but they are great for machine learning.

One of the most important aspects of interpreting any statistical results (and one that is frequently overlooked) is ensuring that your sample is truly representative of your population of interest.

Hypothesis testing takes an aggregate approach towards the conclusions made based on data, as these tests are aimed at understanding population parameters (which are aggregate population values).

Doing multiple hypothesis tests

When performing more than one hypothesis test, your type I error compounds. To correct for this, a common technique is the Bonferroni correction. This correction is very conservative: it says that your new type I error rate should be the error rate you actually want, divided by the number of tests you are performing.

Therefore, if you would like to hold a type I error rate of 1% for each of 20 hypothesis tests, the Bonferroni corrected rate would be 0.01/20 = 0.0005. This would be the new rate you should use as your comparison to the p-value for each of the 20 tests to make your decision.
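
The same arithmetic as a short sketch in code (the 1% rate and the 20 tests are the example values from the text; the p-values are hypothetical):

```python
# Bonferroni correction: divide the desired overall type I error rate
# by the number of tests being performed.
desired_alpha = 0.01
n_tests = 20

bonferroni_alpha = desired_alpha / n_tests
print(bonferroni_alpha)  # 0.0005

# Each of the 20 p-values is then compared against 0.0005 instead of 0.01.
p_values = [0.0001, 0.004, 0.03]  # hypothetical p-values for illustration
decisions = ["reject H0" if p < bonferroni_alpha else "fail to reject H0"
             for p in p_values]
print(decisions)
```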

Other Techniques

Additional techniques to protect against compounding type I errors include the Tukey correction and q-values.

Confidence intervals vs. hypothesis tests

A two-sided hypothesis test (that is, a test involving a ≠ in the alternative) is the same, in terms of the conclusions made, as a confidence interval as long as:

1 - CI = α

For example, a 95% confidence interval will draw the same conclusions as a hypothesis test with a type I error rate of 0.05 in terms of which hypothesis to choose, because:

1 - 0.95 = 0.05

assuming that the alternative hypothesis is a two-sided test.
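
A hedged sketch of this equivalence (the sample is simulated and the null value of 0 is arbitrary): a t-based 95% confidence interval for the mean excludes the null value exactly when the two-sided one-sample t-test's p-value falls below 0.05.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample = rng.normal(loc=0.3, scale=1, size=50)  # simulated sample
null_value = 0  # H0: the population mean equals 0

# 95% confidence interval for the mean (t-based)
ci = stats.t.interval(0.95, df=len(sample) - 1,
                      loc=sample.mean(),
                      scale=stats.sem(sample))

# Two-sided one-sample t-test against the null value
t, p = stats.ttest_1samp(sample, null_value)

print(f"95% CI: ({ci[0]:.3f}, {ci[1]:.3f}), p = {p:.4f}")
# The CI excludes 0 exactly when p < 0.05 = 1 - 0.95.
```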

Particularly given the way data is collected today in the age of computers, response bias is important to keep in mind. In the 2016 U.S. election, polls conducted by many news media suggested a staggering difference from the actual results. You can read about how response bias played a role here, and about similar issues in the medical field.
