Central Limit Theorem

Pollock and Edwards (2020) describe the central limit theorem in the following manner: "The central limit theorem is an established statistical rule that tells us that if we were to take an infinite number of samples of size n from a population of N members, the sample means will follow a normal distribution."
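In notation: for samples of size n drawn from a population with mean \(\mu\) and finite variance \(\sigma^2\), the theorem says that for large n the sample mean is approximately normally distributed,

\[\bar{X}_n \approx N(\mu,\ \sigma^2/n)\]

regardless of the shape of the underlying population.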

As you might remember from class, I programmed a simulation of rolling a six-sided die 1,000 times and demonstrated that, as the number of rolls increases, the running mean of the simulation gets closer and closer to the true mean of a six-sided die.

If we look at two different simulations, we can see that each converges on 3.5, but the paths they take to get there, and where they end up after 1,000 rolls, are slightly different.

[Figures: Simulating Rolling a Die 1,000 Times, simulation 1 and simulation 2]
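For readers who want to reproduce a plot like these, here is a minimal sketch of one such simulation in base R. Only the setup of 1,000 rolls of a fair six-sided die comes from the text above; the seed, object names, and plot labels are my own.

```r
set.seed(1)  # assumed seed, only so the example is reproducible

rolls <- sample(1:6, size = 1000, replace = TRUE)   # 1,000 rolls of a fair die
running_mean <- cumsum(rolls) / seq_along(rolls)    # mean after each roll

plot(running_mean, type = "l",
     xlab = "Number of rolls", ylab = "Running mean",
     main = "Simulating Rolling a Die 1,000 Times")
abline(h = 3.5, lty = 2)  # true mean of a fair six-sided die
```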

This hints that the central limit theorem may apply to our simulations. To verify this, I ran 10,000 simulations of "Roll a Six-Sided Die 1,000 Times" and captured the mean value of each simulation. The distribution of those 10,000 means looked like the following.

[Figure: Simulating Rolling a Die 1,000 Times, distribution of the 10,000 simulation means]

Here we can see that my simulations have generated a distribution centered on the true mean, with means ranging from approximately 3.3 to approximately 3.9. We expect results like this from any repeated sampling process, whether that process is rolling a six-sided die or drawing respondents from a population in order to study public opinion.
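A similar sketch, again in base R with an assumed seed and assumed labels, collects the mean of each of 10,000 simulated runs of 1,000 rolls and plots their distribution:

```r
set.seed(1)  # assumed seed, only so the example is reproducible

# Mean of 1,000 rolls, repeated 10,000 times
sim_means <- replicate(10000, mean(sample(1:6, size = 1000, replace = TRUE)))

hist(sim_means, breaks = 50,
     xlab = "Mean of 1,000 rolls",
     main = "Distribution of Simulation Means")
abline(v = 3.5, lty = 2)  # true mean of a fair six-sided die
```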


The Normal Distribution

Mathematical Notation for T-Statistic Analysis


When testing hypotheses, it is important to remember that we assess how significant our findings are by comparing our results to a "null" hypothesis. That is to say, we ask how likely it is that we would observe an effect like the one in our analysis if the true state of affairs were "no effect," or null.

To find the t-statistic when you know the coefficient and its standard error, you divide the "beta" coefficient by the standard error of that coefficient.

\[t = \hat{\beta}/\hat{S}_{\hat{\beta}}\]

To find the "beta" coefficient when you know the t-statistic and the standard error, you multiply the t-statistic by the standard error.

\[\hat{\beta} = t \times \hat{S}_{\hat{\beta}}\]

To find the standard error when you know the "beta" coefficient and the t-statistic, you divide the "beta" coefficient by the t-statistic.

\[\hat{S}_{\hat{\beta}} = \hat{\beta}/t\]
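As a quick illustration in R, using the viewership coefficient and standard error reported in column (1) of the regression table below (the object names are mine; the numbers come from that table):

```r
beta <- 20.021   # viewership coefficient, regression column (1) below
se   <- 0.385    # standard error of that coefficient

t_stat    <- beta / se       # t = beta / SE, roughly 52
beta_back <- t_stat * se     # beta = t * SE, recovers 20.021
se_back   <- beta / t_stat   # SE = beta / t, recovers 0.385
```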


Cable Revenue

Statistic     N      Mean  St. Dev.      Min  Pctl(25)  Pctl(75)       Max
Viewership   10     4,604     2,889    2,173     2,740     6,320     9,534
Revenue      10   140,631    99,441   65,714    78,224   189,685   319,484


Network Revenue

Statistic     N      Mean  St. Dev.      Min  Pctl(25)  Pctl(75)       Max
Viewership   11    18,181    13,691      700     8,750    22,000    43,700
Revenue      11   320,456   274,577    7,000   127,795   383,780   853,898
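These summary tables resemble stargazer-style text output. A minimal sketch of how tables like them could be produced, assuming data frames named cable and network with Viewership and Revenue columns (the package choice and all names here are assumptions):

```r
library(stargazer)  # assumed package, based on the table formatting

# Summary statistics for each data set; type = "text" prints plain-text tables
stargazer(cable,   type = "text", title = "Cable Revenue")
stargazer(network, type = "text", title = "Network Revenue")
```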


Regression

Results
==========================================================================
                                      Dependent variable: Revenue
                             ---------------------------------------------
                                       (1)                    (2)
--------------------------------------------------------------------------
Viewership                          20.021***              34.324***
                                     (0.385)                (0.903)
Constant                         -43,564.810***         -17,426.190***
                                   (8,615.287)            (4,839.045)
--------------------------------------------------------------------------
Observations                            11                     10
R2                                    0.997                  0.994
Adjusted R2                           0.996                  0.994
Residual Std. Error            16,665.040 (df = 9)    7,826.720 (df = 8)
F Statistic                2,705.667*** (df = 1; 9)  1,444.835*** (df = 1; 8)
--------------------------------------------------------------------------
Note:                             *p<0.1; **p<0.05; ***p<0.01
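A minimal sketch of how regressions like these could be estimated and reported in R, again assuming data frames named network and cable (column (1) has 11 observations and column (2) has 10, matching the network and cable summary tables above; the package choice and all object names are assumptions):

```r
library(stargazer)  # assumed package, based on the table formatting

# Bivariate regressions of revenue on viewership for each data set
model_network <- lm(Revenue ~ Viewership, data = network)
model_cable   <- lm(Revenue ~ Viewership, data = cable)

# Side-by-side text table similar to the one above
stargazer(model_network, model_cable, type = "text", title = "Results")
```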