Statistics#
Hypothesis Testing#
Make informed decisions from data
Steps:
Formulate a hypothesis
Collect a sample
Use the sample to reject or fail to reject the null hypothesis
Make a decision based on test results
Types of Hypothesis:
Null Hypothesis: No effect
Alternative Hypothesis: Effect or difference is observed
Instead of trying to prove that our hypothesis is correct, we try to reject the null hypothesis. If the data would be very unlikely under the null hypothesis, we reject it in favor of the alternative hypothesis. This counts as evidence for the alternative, not as proof of it.
How to do a test?
Test Statistic: We compute a numerical value that measures the difference between groups
p-value: The probability of observing a test statistic at least as extreme as the one we computed, assuming that the null hypothesis is true. A small p-value means the observed data would be unlikely if the null hypothesis held.
Significance level: We choose a significance level before running the test, e.g. 0.05
If p-value <= 0.05, we reject the null hypothesis, so the observed difference between groups is statistically significant.
If p-value > 0.05, we fail to reject the null hypothesis. The data do not provide enough evidence of a difference, which is not the same as showing that there is no difference.
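The decision rule above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function names `two_sided_p_value` and `decide` and the z-statistic of 2.5 are made up for the example, and the p-value helper assumes the test statistic follows a standard normal distribution under the null hypothesis.

```python
from statistics import NormalDist


def two_sided_p_value(z: float) -> float:
    """Probability of a standard-normal statistic at least as extreme as z."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the decision rule: reject H0 only when p_value <= alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"


# Hypothetical z-statistic, chosen only to exercise the two helpers
p = two_sided_p_value(2.5)
print(round(p, 4), decide(p))
```

Note that the second outcome is worded as "fail to reject", since a large p-value does not show that the null hypothesis is true.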
Common Types of Hypothesis Testing#
t-test: Compares the means of two groups, or compares the mean of a sample with a known population mean
z-test: Compares a sample proportion to a population proportion, or the proportions of two groups (typically with large samples)
ANOVA-test: Compares means of three or more groups
t-test#
Let’s assume a company produces the same product in two manufacturing units, A and B, and measures the length of the product at both units.
Now, if we want to check whether the product lengths differ significantly between the two units, we can do a two-sample t-test.
import numpy as np
from scipy import stats
plant_a = [7.1, 7.3, 7.2, 7.4, 7.1, 7.5, 7.3, 7.2, 7.4, 7.6]
plant_b = [7.5, 7.6, 7.4, 7.7, 7.6, 7.5, 7.8, 7.7, 7.6, 7.9]
# Two-sided independent two-sample t-test (assumes equal variances by default)
t_stat, p_val = stats.ttest_ind(plant_a, plant_b)
print("t-statistic:", t_stat)
print("p-value:", p_val)
t-statistic: -4.525483399593919
p-value: 0.0002618655396325686
Null Hypothesis: The mean product length is the same at both units.
Alternative Hypothesis: The mean product length differs between the two units.
Since the p-value is less than 0.05, we reject the null hypothesis. The difference in mean length between the products from units A and B is statistically significant.
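To see where the t-statistic comes from, here is a small sketch that recomputes it by hand with the pooled-variance formula, which is what an equal-variance two-sample t-test uses. The helper name `pooled_t_statistic` is invented for this example, and only the standard library is used, so no p-value is computed here.

```python
from math import sqrt
from statistics import mean, variance

plant_a = [7.1, 7.3, 7.2, 7.4, 7.1, 7.5, 7.3, 7.2, 7.4, 7.6]
plant_b = [7.5, 7.6, 7.4, 7.7, 7.6, 7.5, 7.8, 7.7, 7.6, 7.9]


def pooled_t_statistic(x, y):
    """t-statistic for two independent samples, assuming equal variances."""
    n_x, n_y = len(x), len(y)
    # Pooled sample variance: weighted average of the two sample variances
    pooled_var = ((n_x - 1) * variance(x) + (n_y - 1) * variance(y)) / (n_x + n_y - 2)
    # Standard error of the difference between the two sample means
    se = sqrt(pooled_var * (1 / n_x + 1 / n_y))
    return (mean(x) - mean(y)) / se


t_stat = pooled_t_statistic(plant_a, plant_b)
print(round(t_stat, 4))
```

The result should agree with the t-statistic reported by `stats.ttest_ind` above. The negative sign only reflects that the mean of unit A is smaller than the mean of unit B.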
z-test#
Let’s say a company wants to test a change to its website. They rolled out the new version to 300 of 1000 users; the remaining 700 users are shown the old website. They then compare the conversion rate of the new and old versions and would like to know if the change led to an increase in conversion.
Steps:
Compute the click rate (proportion of users who clicked) in both variants
Compute the standard error and the z-statistic
Compute p-value
new_web_users = 300
old_web_users = 700
new_web_clicks = 150
old_web_clicks = 250
prob_clicks_new = new_web_clicks / new_web_users
prob_clicks_old = old_web_clicks / old_web_users
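# Standard error of the difference: each variance term is divided by
# the number of users in that variant (the sample size), not by the clicks
se = np.sqrt(prob_clicks_new * (1 - prob_clicks_new) / new_web_users
             + prob_clicks_old * (1 - prob_clicks_old) / old_web_users)
z_stat = (prob_clicks_new - prob_clicks_old) / se
print(f"z-statistic: {z_stat:.2f}")
z-statistic: 4.19
# One-sided p-value: probability of a z-statistic at least this large under the null
p_val = stats.norm.sf(abs(z_stat))
print(f"p-value: {p_val:.1e}")
p-value: 1.4e-05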
Since the p-value is far below 0.05, the observed difference between the groups is statistically significant. The data support the conclusion that the new change led to an increase in click rate.
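A common variant of this test pools the two samples when estimating the standard error, because under the null hypothesis both variants share one click rate. The sketch below is one way to write that variant with the standard library only; the function name `pooled_two_proportion_z` is invented for the example, and the pooled estimate is a substitution for the unpooled standard error used above, not the method of the original cells.

```python
from math import sqrt
from statistics import NormalDist


def pooled_two_proportion_z(clicks_a, users_a, clicks_b, users_b):
    """One-sided two-proportion z-test (is A's rate higher than B's?)."""
    p_a = clicks_a / users_a
    p_b = clicks_b / users_b
    # Under H0 both variants share one rate, estimated from all users together
    p_pool = (clicks_a + clicks_b) / (users_a + users_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / users_a + 1 / users_b))
    z = (p_a - p_b) / se
    # Upper-tail probability of the standard normal distribution
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# New website: 150 clicks from 300 users; old website: 250 clicks from 700 users
z, p_value = pooled_two_proportion_z(150, 300, 250, 700)
print(f"z = {z:.2f}, one-sided p-value = {p_value:.1e}")
```

With samples this large, the pooled and unpooled versions lead to the same decision at the 0.05 level.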
ANOVA#
An extension of the t-test to more than two groups.
Used to test whether there is any difference between the means of different groups.
For example, we consider three groups spread across three countries and observe their click rate.
G1, G2, G3
Null Hypothesis: Means G1 == G2 == G3
Alternative Hypothesis: At least one group mean differs from the others.
Computation:
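Sum of Squares Between: Sum over groups of the group size times the squared difference between the group mean and the overall mean
Sum of Squares Within: Sum of squared differences between each point and its group mean
Compute Mean sum of squares Between: Sum of Squares Between / (number of groups - 1)
Compute Mean sum of squares Within: Sum of Squares Within / (number of points - number of groups)
F-ratio: Mean Between / Mean Within
p-value: Probability of observing an F-ratio at least this large if the null hypothesis is true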
g1 = [100, 120, 110, 130, 140, 120, 110, 130, 140, 150]
g2 = [90, 100, 110, 120, 130, 140, 150, 160, 170, 180]
g3 = [80, 90, 100, 110, 120, 130, 140, 150, 160, 170]
f_stat, p_val = stats.f_oneway(g1, g2, g3)
print("F-Ratio:", f_stat)
print("p-value:", p_val)
F-Ratio: 0.48
p-value: 0.623963320632972
Since p-value > 0.05, we cannot reject the null hypothesis. This does not prove that the group means are equal; it only means the country-specific differences that we see are not statistically significant.
np.mean(g1), np.mean(g2), np.mean(g3)
(np.float64(125.0), np.float64(135.0), np.float64(125.0))
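The computation steps listed earlier can be traced by hand on the same three groups. The sketch below uses only the standard library and stops at the F-ratio, since the p-value needs the F distribution; the function name `f_ratio` is invented for this example.

```python
from statistics import mean

g1 = [100, 120, 110, 130, 140, 120, 110, 130, 140, 150]
g2 = [90, 100, 110, 120, 130, 140, 150, 160, 170, 180]
g3 = [80, 90, 100, 110, 120, 130, 140, 150, 160, 170]


def f_ratio(*groups):
    """One-way ANOVA F-ratio: mean square between / mean square within."""
    all_points = [x for g in groups for x in g]
    grand_mean = mean(all_points)
    group_means = [mean(g) for g in groups]
    k, n = len(groups), len(all_points)
    # Between-group variation, weighted by the size of each group
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-group variation around each group's own mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


print(round(f_ratio(g1, g2, g3), 2))
```

The F-ratio is below 1 here because the spread inside each group is much larger than the spread between the group means, which matches the large p-value.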
How to communicate results of ANOVA to a stakeholder who isn’t well versed in statistics?
Show interactions
Bar chart between groups
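One possible way to prepare such a bar chart is sketched below: compute each group's mean and standard error, then draw bars with error bars so the overlap between groups is visible. The helper names `summarize` and `plot_group_means`, the output file name, and the axis labels are assumptions made for this example; the plotting helper needs matplotlib, so its call is left commented out.

```python
from math import sqrt
from statistics import mean, stdev


def summarize(groups):
    """Return {name: (mean, standard error of the mean)} for each group."""
    return {name: (mean(v), stdev(v) / sqrt(len(v))) for name, v in groups.items()}


def plot_group_means(summary, path="group_means.png"):
    """Bar chart of group means with standard-error bars (needs matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without opening a window
    import matplotlib.pyplot as plt

    names = list(summary)
    means = [summary[n][0] for n in names]
    errors = [summary[n][1] for n in names]
    fig, ax = plt.subplots()
    ax.bar(names, means, yerr=errors, capsize=4)
    ax.set_ylabel("Mean click rate")
    ax.set_title("Group means with standard-error bars")
    fig.savefig(path)
    plt.close(fig)


groups = {
    "G1": [100, 120, 110, 130, 140, 120, 110, 130, 140, 150],
    "G2": [90, 100, 110, 120, 130, 140, 150, 160, 170, 180],
    "G3": [80, 90, 100, 110, 120, 130, 140, 150, 160, 170],
}
summary = summarize(groups)
for name, (m, se) in summary.items():
    print(f"{name}: mean={m:.1f}, standard error={se:.1f}")
# plot_group_means(summary)  # uncomment if matplotlib is installed
```

Heavily overlapping error bars give a non-technical audience a visual sense of why the differences between groups are not statistically significant.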