\(1\). Consider Example \(2\) of Lecture \(3\) (page \(6\)). Suppose that \(n\) Hershey’s Kisses were tossed and each outcome was recorded (as opposed to simply recording the total number of Kisses that landed on their base).
Based on the information provided, \(y_1, \dots, y_n\) follow a Bernoulli distribution. The Bernoulli assumptions are: each toss has a binary outcome (the Kiss either lands on its base or it does not), the tosses are independent of one another, and every toss has the same probability \(\pi\) of landing on its base.
Then the likelihood function is:
\[L(\mathbf{y}; \pi) = \left\{ \begin{array}{ll} \prod^n_{i=1} \pi^{y_i} \left(1-\pi \right)^{1-y_i} = \pi^{\sum^n_{i=1} y_i} \left( 1-\pi\right)^{n-\sum^n_{i=1}y_i}, \hspace{0.5cm} &\text{$y_i=1$ or $0$; $0<\pi<1$}\\ 0, \hspace{0.5cm} &\text{otherwise} \\ \end{array} \right.\]
where \[y_i = \left\{ \begin{array}{ll} 1, &\text{the $i$th Kiss lands on its base} \\ 0, &\text{otherwise} \end{array} \right.\]
Since \(y_1, \dots , y_n\) are independent and identically distributed as Bernoulli, \[i_n(\theta) = nE\left[ -\frac{\partial^2}{\partial \theta^2} \log p_\theta \left( Y_1 \right) \right]\]
It therefore suffices to compute the Fisher information for a single observation: \[i_1(\theta) = E\left[ -\frac{\partial^2}{\partial \theta^2} \log p_\theta \left( Y_1 \right) \right]\]
\[\begin{equation} \begin{split} \text{For a single observation ($n=1$):} \hspace{0.5cm} \frac{\partial}{\partial \pi} \log p_\pi \left(Y \right) &= \frac{ y}{\pi}- \frac{1-y}{1-\pi} \\ \frac{\partial^2}{\partial \pi^2} \log p_\pi \left(Y \right) &= - \left[\frac{ y }{\pi^2}+\frac{1-y}{\left( 1-\pi \right)^2} \right] \\ i_1\left( \pi \right) &= E\left[ \frac{y }{\pi^2} + \frac{1-y}{\left( 1-\pi \right)^2} \right] \\ &= \frac{1}{\pi^2}E\left(y\right) + \frac{1}{\left(1-\pi\right)^2}-\frac{1}{\left(1-\pi\right)^2}E\left(y\right) \\ &= \frac{\pi}{\pi^2} + \frac{1}{\left( 1-\pi\right)^2} - \frac{\pi}{\left(1-\pi\right)^2} \\ &= \frac{1}{\pi} + \frac{1}{\left( 1-\pi\right)^2} - \frac{\pi}{\left(1-\pi\right)^2} \\ &= \frac{\left(1-\pi\right)^2+\pi-\pi^2}{\pi\left(1-\pi\right)^2} \\ &= \frac{1-2\pi+\pi^2+\pi-\pi^2}{\pi\left(1-\pi\right)^2} \\ &= \frac{1}{\pi \left(1-\pi\right)}, \\ \end{split} \end{equation}\] so for the full sample \(i_n(\pi) = n\, i_1(\pi) = \frac{n}{\pi\left(1-\pi\right)}\).
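As a quick numerical check of this result (not part of the assigned derivation), the identity \(i_1(\pi)=1/\{\pi(1-\pi)\}\) can be verified by Monte Carlo, averaging the negative second derivative of the log-likelihood over simulated Bernoulli draws; the value \(\pi = 0.3\) below is an arbitrary illustration.

```r
# Monte Carlo check of i_1(pi) = 1/(pi*(1-pi)) at an arbitrarily chosen pi #
set.seed(1)
pi_true <- 0.3
y_sim <- rbinom(1e6, size = 1, prob = pi_true)
# average of the negative second derivative of the Bernoulli log-likelihood #
mean(y_sim / pi_true^2 + (1 - y_sim) / (1 - pi_true)^2)
# closed form for comparison: about 4.762 #
1 / (pi_true * (1 - pi_true))
```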
\(2\). [Adapted from problem \(1.10\) of Agresti, second edition] A sample of \(100\) women suffering from dysmenorrhea was studied. A new analgesic is claimed to provide greater relief than a standard one. After using each analgesic in a crossover experiment, \(40\) reported greater relief with the standard analgesic and \(60\) reported greater relief with the new one.
\(\pi=\) the proportion of women who experience greater relief using the new analgesic than the standard analgesic
In fact, the binomial distribution is a special case of the multinomial distribution.
All of the binomial assumptions are plausibly satisfied here; the main concern is independence, since the women (the observations) could conceivably influence one another's responses.
\[L(\pi) = \left\{ \begin{array}{ll} \pi^{60} \left(1-\pi \right)^{40}, \hspace{1cm} &\text{$0<\pi<1$} \\ 0, \hspace{1cm} &\text{otherwise.} \\ \end{array} \right.\]
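As a quick sanity check (not required by the problem), maximizing the log of this likelihood numerically in R recovers the MLE \(\hat{\pi} = 60/100 = 0.6\) that is used throughout the calculations below.

```r
# numerically maximize l(pi) = 60*log(pi) + 40*log(1-pi) #
loglik_pi <- function(pi) 60 * log(pi) + 40 * log(1 - pi)
optimize(loglik_pi, interval = c(0.001, 0.999), maximum = TRUE)$maximum
# returns approximately 0.6 #
```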
\[\mathbb{H_0}: \pi=0.5 \hspace{0.5cm} \text{versus} \hspace{0.5cm} \mathbb{H_a}: \pi>0.5\]
In English, the null hypothesis is that the new analgesic does not provide greater relief than the standard analgesic. The alternative hypothesis is that the new analgesic does provide greater relief than the standard analgesic.
As we discussed in lecture, all three testing procedures are based on the likelihood function in one way or another, and they are asymptotically equivalent. However, the Wald test statistic is easy to calculate (we only need the estimated parameter and its standard error; in this case, only \(\hat{\pi}\) and \(\hat{se}\left(\hat{\pi}\right)\)). Also, its confidence interval has a closed form, while the other two do not.
# set up variables #
x <- 60
n <- 100
pi_0 <- 0.5
# pi_hat #
pi_hat <- x/n
pi_hat
## [1] 0.6
se_hat_pi_hat <- sqrt(pi_hat*(1-pi_hat))/sqrt(n)
se_hat_pi_hat
## [1] 0.04898979
# test statistics #
tw <- (pi_hat - pi_0) / se_hat_pi_hat
tw
## [1] 2.041241
# p-value #
pnorm(tw, lower.tail=F)
## [1] 0.02061342
# test statistic #
tlr <- -2*(log(pi_0^x*(1-pi_0)^(n-x))-log(pi_hat^x*(1-pi_hat)^(n-x)))
tlr
## [1] 4.027103
sqrt(tlr)
## [1] 2.006764
# p-value #
pchisq(tlr, df=1, lower.tail=F)
## [1] 0.04477477
# u(pi) #
upi <- x/pi_0 - (n-x)/(1-pi_0)
upi
## [1] 40
# i(pi) #
ipi <- n / (pi_0*(1-pi_0))
ipi
## [1] 400
# ts #
ts <- upi / sqrt(ipi)
ts
## [1] 2
# p-value #
pnorm(ts, lower.tail=F)
## [1] 0.02275013
wup <- pi_hat+qnorm(0.05,lower.tail=F) / (sqrt(n/(pi_hat*(1-pi_hat))))
wup
## [1] 0.680581
qchisq(.05, df=1, lower.tail=F)
## [1] 3.841459
z <- qnorm(0.05,lower.tail=F)
z
## [1] 1.644854
pi_t <- (x+0.5*z^2) / (n + z^2)
pi_t
## [1] 0.5973657
sup <- pi_t + (z*sqrt(n) / (n+z^2)) * sqrt(pi_hat * (1-pi_hat) + z^2 / (n*4))
sup
## [1] 0.6769219
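The object `tbt` passed to `kable()` below is not constructed in the chunks shown above; here is a minimal sketch of how it could be assembled from the quantities already computed. The likelihood-ratio confidence bound has no closed form and is obtained by numerically inverting the LR statistic, so the reported value \(0.693\) is hard-coded.

```r
# assemble the summary table for kable(); 0.693 is the numerically obtained LR upper bound #
tbt <- rbind("Wald Test"             = c(tw,  pnorm(tw, lower.tail=F),         0, wup),
             "Likelihood Ratio Test" = c(tlr, pchisq(tlr, df=1, lower.tail=F), 0, 0.693),
             "Score Test"            = c(ts,  pnorm(ts, lower.tail=F),         0, sup))
colnames(tbt) <- c("Test Statistic", "P-value", "Lower Bound", "Upper Bound")
```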
knitr::kable(tbt, col.names=colnames(tbt), digits = 3, align="c")
|  | Test Statistic | P-value | Lower Bound | Upper Bound |
|---|---|---|---|---|
| Wald Test | 2.041 | 0.021 | 0 | 0.681 |
| Likelihood Ratio Test | 4.027 | 0.045 | 0 | 0.693 |
| Score Test | 2.000 | 0.023 | 0 | 0.677 |
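As a cross-check (not part of the required solution), base R's `binom.test()` gives the exact binomial test and `prop.test()` without continuity correction reproduces the score test: its X-squared statistic is the square of \(T_s = 2\), with the same one-sided p-value \(\approx 0.023\) found above.

```r
# exact binomial test and score test via base R, using x, n, pi_0 defined earlier #
binom.test(x, n, p = pi_0, alternative = "greater")
prop.test(x, n, p = pi_0, alternative = "greater", correct = FALSE)
```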
\(3\). [Adapted from problem \(1.17\) of Agresti, second edition] Assume that \(y_1, \dots , y_n\) are independently drawn from a Poisson(\(\mu\)) distribution.
\[\begin{equation} \begin{split} \text{For the Poisson, the likelihood function is} \\ L(\mu; y) &=\prod^n_{i=1} \frac{\mu^{y_i} e^{-\mu}}{y_i !}\\ &= \frac{\mu^{\sum^n_{i=1}y_i}e^{-n\mu}}{y_1 ! y_2 ! \dots y_n !}\\ \text{and the log-likelihood (dropping the constant $-\sum^n_{i=1}\log y_i!$) is} \\ l\left(\mu ; y \right) &= \sum^n_{i=1} y_i \log \mu - n\mu \\ \text{Setting the score to zero gives the Maximum Likelihood Estimator:} \\ \frac{\partial}{\partial \mu} l\left(\mu ; y \right) &= \frac{\sum^n_{i=1} y_i}{\mu} - n = 0 \\ \hat{\mu} &= \frac{\sum^n_{i=1} y_i}{n} \\ \end{split} \end{equation}\]
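As a quick numerical check with simulated data (the sample below is hypothetical, not part of the problem), maximizing the Poisson log-likelihood does recover the sample mean.

```r
# check that the Poisson log-likelihood is maximized at mu = ybar (simulated data) #
set.seed(1)
y_sim <- rpois(30, lambda = 4)
loglik_mu <- function(mu) sum(y_sim) * log(mu) - length(y_sim) * mu
optimize(loglik_mu, interval = c(0.01, 20), maximum = TRUE)$maximum
mean(y_sim)
# the two values agree up to the optimizer's tolerance #
```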
\[\mathbb{H_0}: \mu=\mu_0 \hspace{0.5cm} \text{versus} \hspace{0.5cm} \mathbb{H_1}: \mu \ne \mu_0\]
The Wald Test: \[\begin{equation} \begin{split} \hat{\mu} &= \frac{\sum^n_{i=1} y_i}{n} \\ &= \bar{y}\\ u(\hat{\mu}) &= \frac{\sum^n_{i=1} y_i}{\hat{\mu}} - n \\ i(\mu) &= -E\left[-\frac{\sum^n_{i=1} y_i}{\mu^2} \right] \\ &=\frac{1}{\mu^2} nE\left[y_i \right] \\ &= \frac{1}{\mu^2} n \mu \\ &= \frac{n}{\mu} \\ \text{evaluated at the MLE:}\hspace{0.5cm} i(\hat{\mu}) &= \frac{n}{\hat{\mu}} \\ &= \frac{n}{\bar{y}} \\ \sqrt{\frac{1}{i\left(\hat{\mu}\right)}} &= \sqrt{\frac{\hat{\mu}}{n}} \\ &= \sqrt{\frac{\bar{y}}{n}} \\ \text{Wald Statistic:} \hspace{0.5cm} T_w &\approx \frac{\bar{y} - \mu_0} {\sqrt{\frac{\bar{y}}{n}} } \\ \end{split} \end{equation}\]
Score Test: \[\begin{equation} \begin{split} u(\mu_0) &= \frac{\sum^n_{i=1} y_i}{\mu_0} - n \\ i(\mu_0) &= -E\left[-\frac{\sum^n_{i=1} y_i}{\mu_0^2} \right] \\ &=\frac{1}{\mu_0^2} nE\left[y_i \right] \\ &= \frac{1}{\mu_0^2} n \mu \\ &= \frac{n}{\mu_0} \\ T_s &= \frac{\frac{\sum^n_{i=1} y_i}{\mu_0} - n }{\sqrt{\frac{n}{\mu_0}}} \\ &= \frac{\frac{n\bar{y}}{\mu_0}-n}{\sqrt{\frac{n}{\mu_0}}} \\ &= \frac{\frac{n^2\bar{y}-n^2\mu_0}{n\mu_0}}{\sqrt{\frac{n}{\mu_0}}} \\ &= \frac{n\times\left(\bar{y}-\mu_0\right)}{\mu_0} \times \frac{\sqrt{\mu_0}}{\sqrt{n}} \\ &= \frac{\bar{y} - \mu_0} {\sqrt{\frac{\mu_0}{n}} } \\ \end{split} \end{equation}\]
Likelihood Ratio Test: \[\begin{equation} \begin{split} l\left(\mu_0 \right) &= \sum^n_{i=1} y_i \log \mu_0 - n \mu_0 \\ l\left(\hat{\mu}\right) &= \sum^n_{i=1} y_i \log \left(\frac{\sum^n_{i=1} y_i }{n} \right) - \sum^n_{i=1} y_i \\ T_{LR} &= -2\left[ l\left(\mu_0\right) - l\left(\hat{\mu}\right)\right] \\ &= -2 \left[ \left(\sum^n_{i=1} y_i \log \mu_0 - n \mu_0\right) - \left( \sum^n_{i=1} y_i \log \left(\frac{\sum^n_{i=1} y_i }{n} \right) - \sum^n_{i=1} y_i\right)\right] \\ &= -2\left[n\bar{y}\log\mu_0-n\mu_0-n\bar{y}\log\bar{y}+n\bar{y} \right] \end{split} \end{equation}\]
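For concreteness, here is a small sketch (again with hypothetical simulated data and an arbitrary \(\mu_0\)) evaluating the three statistics derived above. Note that \(T_w\) and \(T_s\) are on the standard normal scale while \(T_{LR}\) is on the \(\chi^2_1\) scale, so \(T_{LR}\) is roughly the square of the other two.

```r
# Wald, score, and LR statistics for H0: mu = mu_0 on simulated Poisson data #
set.seed(2)
mu_0   <- 3
y_sim  <- rpois(50, lambda = 3.5)
n_sim  <- length(y_sim)
ybar   <- mean(y_sim)
tw_pois  <- (ybar - mu_0) / sqrt(ybar / n_sim)                       # Wald
ts_pois  <- (ybar - mu_0) / sqrt(mu_0 / n_sim)                       # Score
tlr_pois <- -2 * (n_sim * ybar * log(mu_0) - n_sim * mu_0 -
                  n_sim * ybar * log(ybar) + n_sim * ybar)           # LR
c(Wald = tw_pois, Score = ts_pois, LR = tlr_pois)
```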
\(4\). According to Benford’s Law, the leading digit in lists of numbers from many naturally-occurring data sets is distributed in a non-uniform manner. Specifically, if \(D_1\) denotes the leading digit of a number, then
\[P(D_1 = i) = \log_{10} \left(\frac{i+1}{i} \right), \hspace{1cm} i=1,\dots,9.\]
Does Benford’s law hold true for stock market data? In each of the following, clearly state your hypotheses (all notation should be defined), the test statistic used and its distribution under the null hypothesis, and your conclusion in the context of the problem.
| Digit | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| Count | 735 | 432 | 273 | 266 | 200 | 175 | 169 | 148 | 126 |
Is there evidence of deviation from Benford’s Law?
The test hypotheses: \[H_0: \text{The leading digits on days with nonzero returns from 1986 to 1995 follow Benford's Law}\] \[H_a: \text{The leading digits on days with nonzero returns from 1986 to 1995 do not follow Benford's Law}\]
Let \(O_i\) denote the observed count and \(E_i = 2524 \times P(D_1=i)\) the expected count for leading digit \(i\), where \(2524\) is the total number of days with nonzero returns. Under \(H_0\), the goodness-of-fit statistic \(\chi^2 = \sum_i (O_i-E_i)^2/E_i\) has an approximate \(\chi^2_8\) distribution (\(9-1=8\) degrees of freedom).

\[\begin{equation} \begin{split} P(D_1=1) &= \log_{10}2 = 0.301 \hspace{1cm} &E_1 = 2524\log_{10}2 = 759.80 \\ P(D_1=2) &= 0.176 \hspace{1cm} &E_2 = 444.45 \\ P(D_1=3) &= 0.125 \hspace{1cm} &E_3 = 315.34 \\ P(D_1=4) &= 0.097 \hspace{1cm} &E_4 = 244.60 \\ P(D_1=5) &= 0.079 \hspace{1cm} &E_5 = 199.85 \\ P(D_1=6) &= 0.067 \hspace{1cm} &E_6 = 168.97 \\ P(D_1=7) &= 0.058 \hspace{1cm} &E_7 = 146.37 \\ P(D_1=8) &= 0.051 \hspace{1cm} &E_8 = 129.11 \\ P(D_1=9) &= 0.046 \hspace{1cm} &E_9 = 115.49 \\ \text{$\chi^2$ Test Statistic:} \hspace{1cm} \chi^2 &= \sum^9_{i=1} \frac{\left(O_i-E_i \right)^2}{E_i} \\ &= \text{Calculated by "hand" in R shown below} \\ &= 16.15 \\ \text{P-value} &\approx 0.04 \\ \end{split} \end{equation}\]

Conclusion: the p-value \(\approx 0.04 < 0.05\), so at the \(5\)% significance level we reject \(H_0\); there is evidence that the leading digits of the nonzero-return days deviate from Benford's Law.
# probability and expected nonzero returns for i=1 to 9 #
pd1 <- function(x) {
out_mat <- matrix(,ncol=2, nrow=9)
for (i in 1:9){
out_mat[i,1] <- log10((x[i]+1)/x[i])
out_mat[i,2] <- out_mat[i,1]*2524   # 2524 = total number of days with nonzero returns
}
obs <- matrix(c(735, 432, 273, 266, 200, 175, 169, 148, 126), ncol=1)
chi2 <- (obs-out_mat[,2])^2 / out_mat[,2]
out_mat <- cbind(x,out_mat,obs, chi2)
colnames(out_mat) <- c("Digit","Probability", "Expected Return Days", "Observation Return Days","X Square")
return(out_mat)
}
x <- c(1,2,3,4,5,6,7,8,9)
knitr:: kable(pd1(x), digits=3, align="c")
| Digit | Probability | Expected Return Days | Observation Return Days | X Square |
|---|---|---|---|---|
| 1 | 0.301 | 759.800 | 735 | 0.809 |
| 2 | 0.176 | 444.454 | 432 | 0.349 |
| 3 | 0.125 | 315.345 | 273 | 5.686 |
| 4 | 0.097 | 244.601 | 266 | 1.872 |
| 5 | 0.079 | 199.853 | 200 | 0.000 |
| 6 | 0.067 | 168.974 | 175 | 0.215 |
| 7 | 0.058 | 146.372 | 169 | 3.498 |
| 8 | 0.051 | 129.109 | 148 | 2.764 |
| 9 | 0.046 | 115.492 | 126 | 0.956 |
# chi square score #
sum(pd1(x)[,5])
## [1] 16.15026
# P-value #
pchisq(sum(pd1(x)[,5]), df=9-1, lower.tail=F)
## [1] 0.0402792
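As a cross-check of the hand calculation, base R's `chisq.test()` with the Benford probabilities should reproduce the same statistic (\(\approx 16.15\)), \(8\) degrees of freedom, and p-value (\(\approx 0.040\)).

```r
# cross-check with chisq.test(); the Benford probabilities sum to 1 by telescoping #
obs_counts <- c(735, 432, 273, 266, 200, 175, 169, 148, 126)
chisq.test(obs_counts, p = log10((1:9 + 1) / (1:9)))
```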
| Digit | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| \(|Return|\) \(\ge\) 0.1% | 698 | 387 | 232 | 228 | 162 | 136 | 124 | 108 | 92 |
| \(|Return|\) < 0.1% | 37 | 45 | 41 | 38 | 38 | 39 | 45 | 40 | 34 |
Does Benford’s law hold for days with absolute returns no less than 0.1%? Is there evidence to support the conjecture that the distribution of leading digits for very small returns is uniform?
The test hypotheses: \[H_0: \text{The leading digits on days with absolute returns no less than $0.1$% from 1986 to 1995 follow Benford's Law}\] \[H_a: \text{The leading digits on days with absolute returns no less than $0.1$% from 1986 to 1995 do not follow Benford's Law}\]
The total number of days with absolute returns no less than \(0.1\)% is: \[698+387+232+228+162+136+124+108+92 = 2167\]
The new expected counts are \(E_i = 2167 \times P(D_1=i)\), and under \(H_0\) the \(\chi^2\) statistic again has an approximate \(\chi^2_8\) distribution: \[\begin{equation} \begin{split} E_1&=2167\log_{10}2=652.332 \\ E_2&=381.590 \\ E_3&=270.742 \\ E_4&=210.004 \\ E_5&=171.586 \\ E_6&=145.074 \\ E_7&=125.669 \\ E_8&=110.848 \\ E_9&=99.156 \\ \text{$\chi^2$ Test Statistic:} \hspace{1cm} \chi^2 &= \sum^9_{i=1} \frac{\left(O_i-E_i \right)^2}{E_i} \\ &= \text{Calculated by "hand" in R shown below} \\ &= 12.075 \\ \text{P-value} &\approx 0.148 \\ \end{split} \end{equation}\]
Conclusion: the p-value of the goodness-of-fit test is \(\approx 0.148\), so at the \(5\)% significance level we fail to reject the null hypothesis. We do not have enough evidence to conclude that the leading digits on days with absolute returns no less than \(0.1\)% from 1986 to 1995 deviate from Benford's Law.
# probability and expected nonzero returns for i=1 to 9 #
pd1 <- function(x) {
out_mat <- matrix(,ncol=2, nrow=9)
o <- c(698, 387, 232, 228, 162, 136, 124, 108, 92)
for (i in 1:9){
out_mat[i,1] <- log10((x[i]+1)/x[i])
out_mat[i,2] <- out_mat[i,1]*sum(o)
}
obs <- matrix(o, ncol=1)
chi2 <- (out_mat[,2]-obs)^2 / out_mat[,2]
out_mat <- cbind(x, out_mat,obs, chi2)
colnames(out_mat) <- c("digit","Probability", "Expected Return Days", "Observation Return Days","X Square")
return(out_mat)
}
x <- c(1,2,3,4,5,6,7,8,9)
knitr:: kable(pd1(x), digits=3, align="c")
| digit | Probability | Expected Return Days | Observation Return Days | X Square |
|---|---|---|---|---|
| 1 | 0.301 | 652.332 | 698 | 3.197 |
| 2 | 0.176 | 381.590 | 387 | 0.077 |
| 3 | 0.125 | 270.742 | 232 | 5.544 |
| 4 | 0.097 | 210.004 | 228 | 1.542 |
| 5 | 0.079 | 171.586 | 162 | 0.536 |
| 6 | 0.067 | 145.074 | 136 | 0.568 |
| 7 | 0.058 | 125.669 | 124 | 0.022 |
| 8 | 0.051 | 110.848 | 108 | 0.073 |
| 9 | 0.046 | 99.156 | 92 | 0.517 |
# chi square score #
sum(pd1(x)[,5])
## [1] 12.07466
# P-value #
pchisq(sum(pd1(x)[,5]), df=9-1, lower.tail=F)
## [1] 0.1479035
For the uniformity conjecture, let \(\pi_j\) denote the probability that the leading digit on a day with a very small return (\(|Return| < 0.1\)%) equals \(j\). The hypotheses are \(H_0: \left( \pi_1, \dots, \pi_9 \right) = \left(\frac{1}{9}, \dots, \frac{1}{9}\right)\) versus \(H_a:\) the \(\pi_j\)'s are not all equal to \(\frac{1}{9}\). Under \(H_0\), the \(\chi^2\) statistic has an approximate \(\chi^2_8\) distribution.

\[\begin{equation} \begin{split} \text{Total count} &= 37+45+41+38+38+39+45+40+34 = 357 \\ E_1 &= \dots = E_9 = \frac{1}{9} \times 357 = 39.667 \\ \text{$\chi^2$ Test Statistic:} \hspace{1cm} \chi^2 &= \sum^9_{i=1} \frac{\left(O_i-E_i \right)^2}{E_i} \\ &= \text{Calculated by "hand" in R shown below} \\ &= 2.622 \\ \text{P-value} &\approx 0.956 \\ \end{split} \end{equation}\]

Conclusion: the p-value \(\approx 0.956\), so we fail to reject \(H_0\); the data are consistent with a uniform distribution of leading digits on days with very small returns.
# probability and expected nonzero returns for i=1 to 9 #
pd1 <- function(x) {
out_mat <- matrix(,ncol=2, nrow=9)
o <- c(37, 45, 41, 38, 38, 39, 45, 40, 34)
for (i in 1:9){
out_mat[i,1] <- 1/length(x)
out_mat[i,2] <- 1/9*sum(o)
}
obs <- matrix(o, ncol=1)
chi2 <- (out_mat[,2]-obs)^2 / out_mat[,2]
out_mat <- cbind(x, out_mat,obs, chi2)
colnames(out_mat) <- c("digit","Probability", "Expected Return Days", "Observation Return Days","X Square")
return(out_mat)
}
x <- c(1,2,3,4,5,6,7,8,9)
knitr:: kable(pd1(x), digits=3, align="c")
| digit | Probability | Expected Return Days | Observation Return Days | X Square |
|---|---|---|---|---|
| 1 | 0.111 | 39.667 | 37 | 0.179 |
| 2 | 0.111 | 39.667 | 45 | 0.717 |
| 3 | 0.111 | 39.667 | 41 | 0.045 |
| 4 | 0.111 | 39.667 | 38 | 0.070 |
| 5 | 0.111 | 39.667 | 38 | 0.070 |
| 6 | 0.111 | 39.667 | 39 | 0.011 |
| 7 | 0.111 | 39.667 | 45 | 0.717 |
| 8 | 0.111 | 39.667 | 40 | 0.003 |
| 9 | 0.111 | 39.667 | 34 | 0.810 |
# chi square score #
sum(pd1(x)[,5])
## [1] 2.621849
# P-value #
pchisq(sum(pd1(x)[,5]), df=9-1, lower.tail=F)
## [1] 0.9558066
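As a cross-check, `chisq.test()` with its default equal cell probabilities tests exactly this uniform null and should give the same result (\(\chi^2 \approx 2.62\), \(8\) df, p-value \(\approx 0.956\)).

```r
# chisq.test() defaults to equal expected probabilities, i.e. the uniform null #
chisq.test(c(37, 45, 41, 38, 38, 39, 45, 40, 34))
```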
\[\begin{equation} \begin{split} H_0: &\text{The leading digits on days with very small returns from 1986 to 1995 follow a uniform distribution} \\ H_a: &\text{The leading-digit probabilities on days with very small returns from 1986 to 1995 are linearly ordered} \end{split} \end{equation}\]
\[H_0: \left( \pi_1, \dots, \pi_9 \right) = \left(\frac{1}{9}, \dots, \frac{1}{9}\right)\]
\[H_a: \text{$\pi_j$'s are linearly ordered}\]
Under \(H_0\), \(X^2_1\) has an approximate \(\chi^2_1\) distribution.
Calculations:
\[\begin{equation} \begin{split} \mu^0 &= \frac{1}{9} \times (1+\dots+9) = 5 \\ \mu^0_2 &= \frac{1}{9} \times \left[ (1-5)^2+\dots+(9-5)^2 \right]= 6.667 \\ X^2_1 &= \text{Calculated by "hand" in R below} \\ &= 0.136 \\ \text{P-value} &\approx 0.712 \\ \end{split} \end{equation}\]
\[\begin{equation} \begin{split} X^2-X^2_1 &= 2.622 - 0.136 = 2.486 \hspace{0.5cm} (\text{compared to a } \chi^2_7 \text{ distribution}) \\ \text{P-value} &\approx 0.928 \end{split} \end{equation}\]

Conclusion: neither the linear-trend component \(X^2_1\) (p-value \(\approx 0.712\)) nor the remaining lack of fit \(X^2-X^2_1\) (p-value \(\approx 0.928\)) is significant, so there is no evidence that the leading-digit probabilities for very small returns are linearly ordered; the uniform distribution appears adequate.
# X^2_1: single-df statistic for a linear trend in the digit probabilities #
mumu <- function(x, o){
  mu0 <- 1/9 * sum(x)                      # mean digit score under H0
  mu00 <- c()
  gsj <- c()
  ngsj <- c()
  for (i in 1:length(x)){
    mu00[i] <- (x[i]-mu0)^2
  }
  mu02 <- 1/9*sum(mu00)                    # variance of the digit score under H0
  for (j in 1:length(x)){
    gsj[j] <- (x[j]-mu0) / sqrt(mu02)      # standardized score for digit j
    ngsj[j] <- o[j]*gsj[j]                 # weighted by the observed count
  }
  return((1/sqrt(sum(o)) * sum(ngsj))^2)
}
x <- c(1,2,3,4,5,6,7,8,9)
o <- c(37, 45, 41, 38, 38, 39, 45, 40, 34)
# X^2_1 #
mumu(x, o)
## [1] 0.1361345
# P-value #
pchisq(mumu(x,o),df=1, lower.tail=F)
## [1] 0.7121545
#X^2 - X^2_1#
sum(pd1(x)[,5])-mumu(x,o)
## [1] 2.485714
#p-value#
pchisq(sum(pd1(x)[,5])-mumu(x,o), df=9-2, lower.tail=F)
## [1] 0.9281689