Hypothesis Testing of frequency-based samples
Part 4 of our Introduction to Hypothesis Testing series.
by Daniel Bray, posted 25/08/2020
In part one of this series, we introduced the idea of hypothesis testing, along with a full description of the different elements that go into using these tools. It ended with a cheatsheet to help you choose which test to use based on the kind of data you’re testing.
Part two outlined some code samples for how to perform z-tests on proportion-based samples.
Part three outlined some code samples for how to perform t-tests on mean-based samples.
This post will now go into more detail for frequency-based samples.
If any of these terms – Null Hypothesis, Alternative Hypothesis, p-value – are new to you, then I’d suggest reviewing the first part of this series before carrying on with this one.
What is a frequency-based sample?
In these cases we’re interested in checking frequencies, i.e. counts of how often each category occurs. For example: I’m expecting my result set to follow a given distribution. Does it?
Are the differences between the distributions of two samples big enough that we should take notice? Are the counts across two variables in a single sample distributed in a way that suggests the variables might depend on each other?
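As a minimal sketch of what "frequencies" means in practice, the snippet below turns a list of raw observations into the per-category counts that the tests in this post work with. The category labels and the observations are made up for illustration.

```python
from collections import Counter

# Hypothetical raw observations: one category label per sampled item
observations = ["D", "A", "D", "C", "D", "B", "D", "D", "C", "D"]

# Tally how often each category occurs
counts = Counter(observations)

# Fix the category order so the counts line up with any expected distribution
categories = ["A", "B", "C", "D"]
observed_counts = [counts[c] for c in categories]

# Convert counts to proportions for easier comparison by eye
sample_size = sum(observed_counts)
observed_proportions = [n / sample_size for n in observed_counts]

print(observed_counts)
print(observed_proportions)
```

The tests below take the counts, not the proportions, because the sample size matters to the result.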
Requirements for the quality of the sample
For these tests the following sampling rules are required:
Random: The sample must be a random sample from the entire population.
Normal: The sample must be normal, for these tests either:

Independent: The sample must be independent. For these tests a good rule of thumb is that the sample size should be less than 10% of the total population.
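These checks are easy to automate before running a test. The sketch below is one possible helper, not a standard library function: `check_sample_conditions` and its parameters are hypothetical names. The 10% check comes from the rule of thumb above. The "expected count of at least 5 per category" check is an assumption on my part: it is a commonly quoted textbook condition for chi-squared tests and may not be the exact condition intended above. Randomness can't be checked from the counts alone, so it is left out.

```python
def check_sample_conditions(expected_counts, sample_size, population_size, min_expected=5):
    """Return rule-of-thumb checks for a chi-squared test (hypothetical helper)."""
    return {
        # rule of thumb from the text: sample is less than 10% of the population
        "independent_10_percent": sample_size < 0.10 * population_size,
        # ASSUMPTION: common textbook rule that every expected count is at least 5
        "large_expected_counts": all(e >= min_expected for e in expected_counts),
    }


# Made-up numbers: a sample of 650 from a population of 20,000
checks = check_sample_conditions(
    expected_counts=[32.5, 65.0, 97.5, 455.0],
    sample_size=650,
    population_size=20000,
)
print(checks)
```

If either check comes back False, treat the result of the test with more caution.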
Tests for frequency-based samples
All of these code samples are available in this git repository
Chi-squared goodness-of-fit
Compare the counts for some variables in a sample to an expected distribution
In this test we have an expected distribution of data across a category, and we want to check if the sample matches that.
For example, suppose a network was sized for the expected distribution of classes of service below, and a sample observed the following counts:
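| Class of Service | Expected Distribution | Observed Count in sample (size 650) |
|------------------|-----------------------|-------------------------------------|
| A                | 5%                    | 27                                  |
| B                | 10%                   | 73                                  |
| C                | 15%                   | 82                                  |
| D                | 70%                   | 468                                 |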
Given a null hypothesis that the distribution is as expected, the following Python code derives the p-value: the probability of seeing a sample at least this far from the expected distribution if the null hypothesis were true.
    from scipy.stats import chisquare

    # can we assume anything from our sample
    significance = 0.05

    # what do we expect to see in proportions?
    expected_proportions = [.05, .1, .15, .7]

    # what counts did we see in our sample?
    observed_counts = [27, 73, 82, 468]

    ########################
    # how big was our sample
    sample_size = sum(observed_counts)

    # we derive our comparison counts here for our expected proportions, based on the sample size
    expected_counts = [float(sample_size) * x for x in expected_proportions]

    # Get the stat data
    (chi_stat, p_value) = chisquare(observed_counts, f_exp=expected_counts)

    # report
    print('chi_stat: %0.5f, p_value: %0.5f' % (chi_stat, p_value))

    if p_value > significance:
        print("Fail to reject the null hypothesis - we have nothing else to say")
    else:
        print("Reject the null hypothesis - suggest the alternative hypothesis is true")
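To see what chisquare is doing with these numbers, here is a sketch of the same calculation done by hand. Each category contributes (observed - expected)^2 / expected to the statistic, and the p-value is the upper tail of a chi-squared distribution with (number of categories - 1) degrees of freedom. The variable names are my own, and the last lines cross-check the result against SciPy.

```python
from scipy.stats import chi2, chisquare

expected_proportions = [.05, .1, .15, .7]
observed_counts = [27, 73, 82, 468]

# expected counts for a sample of this size
sample_size = sum(observed_counts)
expected_counts = [sample_size * p for p in expected_proportions]

# each category contributes (observed - expected)^2 / expected
contributions = [
    (o - e) ** 2 / e for o, e in zip(observed_counts, expected_counts)
]
chi_stat = sum(contributions)

# degrees of freedom for a goodness-of-fit test: number of categories - 1
degrees_of_freedom = len(observed_counts) - 1

# p-value: chance of a statistic at least this large under the null hypothesis
p_value = chi2.sf(chi_stat, degrees_of_freedom)

# cross-check against SciPy's own implementation
scipy_stat, scipy_p = chisquare(observed_counts, f_exp=expected_counts)

print('by hand: chi_stat: %0.5f, p_value: %0.5f' % (chi_stat, p_value))
print('scipy:   chi_stat: %0.5f, p_value: %0.5f' % (scipy_stat, scipy_p))
```

Printing the contributions list is a quick way to see which category is furthest from its expected count.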
Chi-squared (homogeneity)
Compare the counts for some variables between two samples
In this case, the test is similar to the goodness-of-fit test (above), but rather than deriving the expected counts from an expected distribution, the test compares two sets of sampled counts to see if their frequencies are different enough to suggest that the underlying populations have different distributions.
Neither sample is a known expected distribution here, and the two samples need not be the same size (these two total 644 and 650), so we can’t simply pass one sample in as the expected counts for the other. Instead we put both sets of counts into one table, one row per sample, and let chi2_contingency estimate the expected counts from the pooled totals.

    from scipy.stats import chi2_contingency
    import numpy as np

    # can we assume anything from our sample
    significance = 0.05

    # what counts did we see in our samples?
    observed_counts_A = [32, 65, 97, 450]
    observed_counts_B = [27, 73, 82, 468]

    ########################
    # one row per sample, one column per category
    observed = np.array([observed_counts_A, observed_counts_B])

    # Get the stat data
    (chi_stat, p_value, degrees_of_freedom, expected) = chi2_contingency(observed)

    # report
    print('chi_stat: %0.5f, p_value: %0.5f' % (chi_stat, p_value))

    if p_value > significance:
        print("Fail to reject the null hypothesis - we have nothing else to say")
    else:
        print("Reject the null hypothesis - suggest the alternative hypothesis is true")
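One way to see what a homogeneity test compares: if both populations share one distribution, the best estimate of that distribution is the pooled proportions across both samples. The sketch below, with variable names of my own choosing, builds the expected counts for each sample from those pooled proportions, computes the statistic by hand, and cross-checks it against SciPy's chi2_contingency.

```python
import numpy as np
from scipy.stats import chi2, chi2_contingency

observed = np.array([
    [32, 65, 97, 450],   # sample A
    [27, 73, 82, 468],   # sample B
])

# size of each sample, and the pooled proportion in each category
sample_sizes = observed.sum(axis=1)
pooled_proportions = observed.sum(axis=0) / observed.sum()

# expected counts if both samples followed the pooled distribution
expected = np.outer(sample_sizes, pooled_proportions)

# same statistic as before, summed over every cell of the table
chi_stat = ((observed - expected) ** 2 / expected).sum()

# degrees of freedom: (rows - 1) * (columns - 1)
degrees_of_freedom = (observed.shape[0] - 1) * (observed.shape[1] - 1)
p_value = chi2.sf(chi_stat, degrees_of_freedom)

# cross-check against SciPy
scipy_stat, scipy_p, scipy_dof, scipy_expected = chi2_contingency(observed)

print('by hand: chi_stat: %0.5f, p_value: %0.5f' % (chi_stat, p_value))
print('scipy:   chi_stat: %0.5f, p_value: %0.5f' % (scipy_stat, scipy_p))
```

Note that the expected counts differ slightly between the two rows, because the samples are not the same size.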
Chi-squared (independence)
Check a single sample to see if its discrete variables are independent
In this case you have a sample from a population, over two discrete variables, and you want to tell if these two discrete variables have some kind of relationship – or if they are independent.
NOTE: this is for discrete variables (i.e. categories). If you wanted to check if numeric variables are independent you’d want to consider using something like a linear regression.
Suppose we had a pivot table showing how people from different area types (town/country) voted for three different political parties.
The question we are asking is whether there is likely to be a connection between these two variables (i.e. do town or country people have a strong preference for a given party).
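| Voter Type \ Party | Cocktail Party | Garden Party | Mouse Party |
|--------------------|----------------|--------------|-------------|
| Town               | 200            | 150          | 50          |
| Country            | 250            | 300          | 50          |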
The Python code to check this is:
    from scipy.stats import chi2_contingency
    import numpy as np

    # can we assume anything from our sample
    significance = 0.05

    pivot = np.array([
        # town votes
        [200, 150, 50],
        # country votes
        [250, 300, 50]
    ])

    ########################
    # Get the stat data
    (chi_stat, p_value, degrees_of_freedom, expected) = chi2_contingency(pivot)

    # report
    print('chi_stat: %0.5f, p_value: %0.5f' % (chi_stat, p_value))

    if p_value > significance:
        print("Fail to reject the null hypothesis - we have nothing else to say")
    else:
        print("Reject the null hypothesis - suggest the alternative hypothesis is true")
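The expected array returned by chi2_contingency is worth a look. If the two variables were independent, each cell would hold (row total × column total) / grand total. The sketch below rebuilds those counts by hand and then computes Pearson residuals, (observed - expected) / sqrt(expected), as one rough way of seeing which cells deviate most. The residual step is my own addition rather than part of the test itself.

```python
import numpy as np
from scipy.stats import chi2_contingency

pivot = np.array([
    [200, 150, 50],   # town votes
    [250, 300, 50],   # country votes
])

row_totals = pivot.sum(axis=1)
col_totals = pivot.sum(axis=0)
grand_total = pivot.sum()

# expected counts under independence: row total * column total / grand total
expected_by_hand = np.outer(row_totals, col_totals) / grand_total

chi_stat, p_value, degrees_of_freedom, expected = chi2_contingency(pivot)

# Pearson residuals: positive means more votes than independence would predict
residuals = (pivot - expected) / np.sqrt(expected)

print(expected_by_hand)
print(np.round(residuals, 2))
print('chi_stat: %0.5f, p_value: %0.5f, dof: %d' % (chi_stat, p_value, degrees_of_freedom))
```

Squaring the residuals and summing them gives the chi-squared statistic back, so the largest residuals are the cells contributing most to the result.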
Where do we go next?
Thank you for reading the final part of our introduction to hypothesis testing. I hope you found it a useful introduction to the world of statistical analysis. If you would like to look deeper into this field, I’d suggest the following.
 I’ve not touched on issues of power or effect size in this series. For that I would direct you to Robert Coe’s It’s the effect size, stupid: what effect size is and why it is important, which is always worth reading.
 If you have more complex types of data to examine, then I’d suggest reading more into:
   Analysis Of Variance – for when you have means from more than two groups to compare, and using multiple t-tests would waste your power.
   Linear Regression – for when you want to predict the value of one continuous variable based on the values of some other continuous variable, or just want to see if different continuous variables are, in fact, related.
 If our previous post – Quantitative analysis is as subjective as qualitative analysis – is making you doubt whether you can trust stats at all, then check out how meta-analysis can be used to collect the results of multiple different analyses and produce a single overall measure of whether the underlying tests show a significant interaction.
If you would like to know more or have any suggestions, please don’t hesitate to reach out to us!