July, 2019

Parametric & Non Parametric Hypothesis

  • A hypothesis is parametric if the family of distributions it specifies can be put into one-to-one correspondence with a subset of a finite-dimensional Euclidean space.
  • Otherwise, it is a non parametric hypothesis.
  • Every simple hypothesis is a parametric hypothesis.
  • Consequently, every non parametric hypothesis is composite.

Examples

  • \(X \sim Bin(n,p)\). To test \(H_0 : p = 0.5\) v/s \(H_1 : p = 0.75\). Both are parametric hypotheses.
  • \(X \sim Bin(n,p)\). To test \(H_0 : p = 0.5\) v/s \(H_1 : p > 0.5\). Here, \(\Theta_1 = \{p : p > 0.5\}\). Both are parametric hypotheses.
  • \((X,Y) \sim F(x,y)\), where \(F\) is bivariate normal with marginals \(N(0,1)\) and \(N(1,1)\). To test \(H_0 : \rho = \rho_0\) v/s \(H_1 : \rho > \rho_0\). Here \(\Theta_1 = \{\rho : \rho > \rho_0\}\). Both are parametric hypotheses.
  • \(X \sim F\), where \(F\) belongs to a location family with location parameter \(\theta\) and is otherwise unspecified. To test \(H_0 : \theta = 0\) v/s \(H_1 : \theta = 0.5\). Both are non parametric hypotheses, since each leaves the form of \(F\) open.

Distribution Free

  • Suppose \((X_1,X_2,...,X_n) \sim F\).
  • \(F \in \mathcal{F}\), a class of distributions.
  • \(T(X_1,X_2,...,X_n)\) is distribution free if its distribution is the same for all \(F \in \mathcal{F}\).
  • Will see examples of distribution free statistics later on.
  • Non parametric and distribution free do not have the same meaning.
  • Will see why later on.

Non Parametrics

  • No distributional assumptions.
  • Testing problems analogous to parametrics.
  • Test statistics are easy to calculate.
  • These need not depend on the actual magnitudes of the observations (for example, they may use only signs or ranks).
  • In many cases these would be discrete random variables taking 'few' values.
  • Limiting distributions will be normal or chi-square.
  • Robust procedures.

Test for One Sample Location Family

  • Sign test.
  • Wilcoxon signed-rank test.

Sign Test

  • \(X_1,X_2,...,X_n\) IID \(F(x-\theta)\).
  • \(F \in \mathcal{F}_0 = \{F : F \text{ is absolutely continuous and } F(0) = \frac{1}{2}\}\).
  • To test \(H_0\) against one of the alternatives \(H_1\), \(H_2\), \(H_3\), where
    • \(H_0 : \theta = 0\),
    • \(H_1 : \theta > 0\),
    • \(H_2 : \theta < 0\),
    • \(H_3 : \theta \neq 0\).

Remarks

  • To test \(H_0^* : \theta = \theta_0\) (known) v/s the corresponding alternative hypotheses, consider \(Y_i = X_i - \theta_0, \hspace{1mm} i = 1(1)n\).
  • All the hypotheses are non parametric.
  • One can make appropriate adjustments to test for quantiles.
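
The shift and the quantile adjustment in the remarks above can be sketched in R. This is a minimal illustration under stated assumptions, not a definitive implementation: the data `x`, the null value `theta0` and the quantile order `q` are made up, and the sketch relies on the fact that, for continuous \(F\), an observation exceeds the \(q\)-th quantile with probability \(1-q\).

```r
# Hypothetical data and null value (assumptions for illustration only)
x      <- c(4.1, 5.3, 2.8, 6.0, 5.7, 3.9, 6.4, 5.1, 4.8, 6.2)
theta0 <- 4.5     # hypothesised value of the quantile
q      <- 0.25    # order of the quantile being tested (q = 0.5 is the median)

y     <- x - theta0      # shift so that the null value becomes 0
n     <- length(y)
s_pos <- sum(y > 0)      # number of observations above theta0

# Under H0, each observation exceeds the q-th quantile with probability 1 - q,
# so the count of positive signs is Bin(n, 1 - q)
res <- binom.test(s_pos, n, p = 1 - q, alternative = "greater")
res$p.value
```

With `q = 0.5` this reduces to the median case, where the null success probability is \(\frac{1}{2}\).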

Sign Statistic

  • \(S = \sum_{i=1}^{n} I_{(0,\infty)}(X_i)\).
  • \(B = \sum_{i=1}^{n} I_{(-\infty,0)}(X_i)\).
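  • \(S\) counts the number of sample observations which are positive, and \(B\) the number which are negative. Since \(F\) is continuous, \(S + B = n\) with probability 1.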
  • \(S \sim Bin(n,p)\).
  • \(p=P(X_1>0)=1-P(X_1 \leq 0)=1-F(-\theta)\).
  • In general \(S\) is not distribution free, since \(p\) depends on \(F\).
  • But, under \(H_0\), \(S \sim Bin(n,\frac{1}{2})\) for every \(F \in \mathcal{F}_0\). The sign statistic is distribution free only under \(H_0\).
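
The dependence of \(p = 1 - F(-\theta)\) on \(F\) can be checked numerically. The sketch below is only an illustration: the shift `theta = 0.5` and the three choices of \(F\) (standard normal, Cauchy, logistic, each with \(F(0) = \frac{1}{2}\)) are assumptions, not part of the notes.

```r
theta <- 0.5   # hypothetical shift under the alternative

# p = 1 - F(-theta) for three choices of F, each with F(0) = 1/2
p_alt <- c(normal   = 1 - pnorm(-theta),
           cauchy   = 1 - pcauchy(-theta),
           logistic = 1 - plogis(-theta))
p_alt     # the three values differ, so the law of S depends on F

# Under H0 (theta = 0) every such F gives p = 1/2
p_null <- c(normal   = 1 - pnorm(0),
            cauchy   = 1 - pcauchy(0),
            logistic = 1 - plogis(0))
p_null
```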

Rejection rules

  • Reject \(H_0\) in favour of \(H_1\) for large values of \(S\).
  • Reject \(H_0\) in favour of \(H_2\) for small values of \(S\).
  • Reject \(H_0\) in favour of \(H_3\) for small or large values of \(S\).

Testing of \(H_0\) v/s \(H_1\) with respect to a given size \(\alpha\)

The test function is \[ \phi(S) = \begin{cases} 1, & \text{if } S > s,\\ a, & \text{if } S = s,\\ 0, & \text{if } S < s, \end{cases} \] where the critical value \(s\) is such that \(P_{H_0}(S>s) \leq \alpha < P_{H_0}(S \geq s)\) and \(a \in [0,1)\) is such that \(E_{H_0}\phi(S)=\alpha\).
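
For the upper-tailed test above, the critical value \(s\) and the randomisation probability \(a\) can be found by a direct search over the \(Bin(n,\frac{1}{2})\) tail probabilities. The following is a minimal sketch; `n = 10` and `alpha = 0.05` are hypothetical values chosen only for illustration.

```r
n     <- 10      # hypothetical sample size
alpha <- 0.05    # hypothetical size of the test

s_vals <- 0:n
upper  <- pbinom(s_vals, n, 0.5, lower.tail = FALSE)   # P(S > s) under H0

# Smallest s with P(S > s) <= alpha; by minimality, P(S >= s) > alpha
s <- min(s_vals[upper <= alpha])

# Randomisation probability a solving P(S > s) + a * P(S = s) = alpha
a <- (alpha - pbinom(s, n, 0.5, lower.tail = FALSE)) / dbinom(s, n, 0.5)

# Check that the randomised test has size alpha
size <- pbinom(s, n, 0.5, lower.tail = FALSE) + a * dbinom(s, n, 0.5)
c(s = s, a = a, size = size)
```

Because \(S\) is discrete, a non-randomised test generally cannot attain size \(\alpha\) exactly; the value \(a\) makes up the difference on the boundary \(S = s\).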

Testing of \(H_0\) v/s \(H_2\) with respect to a given size \(\alpha\)

The test function is \[ \phi(S) = \begin{cases} 1, & \text{if } S < s,\\ a, & \text{if } S = s,\\ 0, & \text{if } S > s, \end{cases} \] where the critical value \(s\) is such that \(P_{H_0}(S<s) \leq \alpha < P_{H_0}(S \leq s)\) and \(a \in [0,1)\) is such that \(E_{H_0}\phi(S)=\alpha\).

Testing of \(H_0\) v/s \(H_3\) with respect to a given size \(\alpha\)

The test function is \[ \phi(S) = \begin{cases} 1, & \text{if } S < s_1 \text{ or } S>s_2,\\ a_1, & \text{if } S = s_1,\\ a_2, & \text{if } S = s_2,\\ 0, & \text{if } s_1<S<s_2, \end{cases} \] where \(s_1, s_2\) are such that \(P_{H_0}(S<s_1) \leq \alpha_1 < P_{H_0}(S \leq s_1)\), \(P_{H_0}(S>s_2) \leq \alpha_2 < P_{H_0}(S \geq s_2)\), and \(a_1, a_2 \in [0,1)\) are such that \(P_{H_0}(S<s_1)+a_1P_{H_0}(S=s_1)=\alpha_1\), \(P_{H_0}(S>s_2)+a_2P_{H_0}(S=s_2)=\alpha_2\), with \(\alpha_1, \alpha_2 > 0\) and \(\alpha_1+\alpha_2=\alpha<1\).
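
The two-sided cut-offs can be computed in the same way, one tail at a time. This sketch assumes an equal split \(\alpha_1 = \alpha_2 = \alpha/2\), which is only one admissible choice, and uses hypothetical values `n = 10` and `alpha = 0.05`.

```r
n      <- 10
alpha  <- 0.05
alpha1 <- alpha / 2   # equal split is an assumption; any alpha1 + alpha2 = alpha works
alpha2 <- alpha / 2

s_vals <- 0:n

# Upper cut-off: smallest s2 with P(S > s2) <= alpha2
s2 <- min(s_vals[pbinom(s_vals, n, 0.5, lower.tail = FALSE) <= alpha2])
a2 <- (alpha2 - pbinom(s2, n, 0.5, lower.tail = FALSE)) / dbinom(s2, n, 0.5)

# Lower cut-off: largest s1 with P(S < s1) <= alpha1
s1 <- max(s_vals[pbinom(s_vals - 1, n, 0.5) <= alpha1])
a1 <- (alpha1 - pbinom(s1 - 1, n, 0.5)) / dbinom(s1, n, 0.5)

# Total size of the randomised two-sided test
size <- pbinom(s1 - 1, n, 0.5) + a1 * dbinom(s1, n, 0.5) +
        pbinom(s2, n, 0.5, lower.tail = FALSE) + a2 * dbinom(s2, n, 0.5)
c(s1 = s1, a1 = a1, s2 = s2, a2 = a2, size = size)
```

With the equal split, the symmetry of \(Bin(n,\frac{1}{2})\) gives \(s_1 = n - s_2\) and \(a_1 = a_2\).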

Limiting distribution

  • By CLT, \(T = \frac{S-\frac{n}{2}}{\sqrt{\frac{n}{4}}} \Longrightarrow Z, Z \sim N(0,1)\), under \(H_0\).
  • For testing \(H_0\) v/s \(H_1\), the test function is \[ \phi(t) = \begin{cases} 1, \text{if} \hspace{1mm} t>\tau_{\alpha},\\ 0, \text{Otherwise}. \end{cases} \]
  • For testing \(H_0\) v/s \(H_2\), the test function is \[ \phi(t) = \begin{cases} 1, \text{if} \hspace{1mm} t<-\tau_{\alpha},\\ 0, \text{Otherwise}. \end{cases} \]

  • For testing \(H_0\) v/s \(H_3\), the test function is \[ \phi(t) = \begin{cases} 1, \text{if} \hspace{1mm} |t|>\tau_{\alpha/2},\\ 0, \text{Otherwise}. \end{cases} \]
  • \(\tau_{\alpha}\) is the upper \(\alpha\) point of a \(N(0,1)\) distribution, that is, \(P(Z > \tau_{\alpha}) = \alpha\).
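
The large-sample rules above can be sketched as follows. The values `n = 100` and `S = 60` are hypothetical, and `qnorm(1 - alpha)` plays the role of \(\tau_{\alpha}\).

```r
n     <- 100   # hypothetical sample size
S     <- 60    # hypothetical observed value of the sign statistic
alpha <- 0.05

t_obs <- (S - n / 2) / sqrt(n / 4)   # standardised statistic: (60 - 50) / 5

reject_H1 <- t_obs >  qnorm(1 - alpha)           # H0 v/s H1 (theta > 0)
reject_H2 <- t_obs < -qnorm(1 - alpha)           # H0 v/s H2 (theta < 0)
reject_H3 <- abs(t_obs) > qnorm(1 - alpha / 2)   # H0 v/s H3 (theta != 0)

list(t = t_obs, H1 = reject_H1, H2 = reject_H2, H3 = reject_H3)
```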

Remark

  • Suppose \((X_1,Y_1),(X_2,Y_2),...,(X_n,Y_n)\) is a random sample of size \(n\) from a bivariate population.
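  • Define the differences \(D_i = X_i - Y_i, \hspace{1mm} i = 1(1)n\).
  • The sign test applied to \(D_1, D_2, ..., D_n\) can be used to test for the location of \(D = X - Y\).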
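
A minimal sketch of the paired use of the sign test, in base R. The paired measurements below are made up for illustration, and dropping zero differences is a common convention assumed here rather than something stated in the notes.

```r
# Hypothetical paired measurements (made up for illustration)
x <- c(12.1, 11.4, 13.0, 10.8, 12.6, 11.9, 13.3, 12.2)
y <- c(11.5, 11.6, 12.1, 10.1, 12.0, 11.2, 12.5, 12.4)

d <- x - y
d <- d[d != 0]     # drop zero differences (assumed convention)
n <- length(d)
S <- sum(d > 0)    # number of positive differences

# Under H0 (median of D is 0), S ~ Bin(n, 1/2)
p_value <- binom.test(S, n, p = 0.5, alternative = "two.sided")$p.value
p_value
```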

Sign test in R

library(nonpar)
x = c(1.8, 3.3, 5.65, 2.25, 2.5, 3.5, 2.75, 3.25, 3.10, 2.70, 3, 4.75, 3.4)
signtest(x, m = 3.5, alpha = 0.05, alternative = 'two.sided', 
         conf.level = 0.95, exact = TRUE)
## 
##  Large Sample Approximation for the Sign Test 
##  
##  H0: The population median is =  3.5 
##  HA: The population median is not equal to  3.5 
##  
##  B = 10 
##  
##  Significance Level = 0.05 
##  The p-value is  0.043308142810792 
##  There is enough evidence to conclude that the population median is different than 3.5 at a significance level of  0.05 
##  
##  The  95 % confidence interval is [ 2.25 ,  3.3 ]. 
## 
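
As a cross-check, the counts and p-values for the same data can be recomputed in base R. This sketch does not use `nonpar`. It assumes that observations equal to 3.5 are dropped and that the approximation uses a continuity correction of 0.5. Under those assumptions the count below 3.5 is 10, which agrees with the printed `B`, and the continuity-corrected normal approximation gives about 0.0433, which agrees with the printed p-value. The exact binomial p-value is about 0.0386. The sketch does not establish why the output above reports the large-sample approximation.

```r
x <- c(1.8, 3.3, 5.65, 2.25, 2.5, 3.5, 2.75, 3.25, 3.10, 2.70, 3, 4.75, 3.4)
m <- 3.5

d     <- x - m
d     <- d[d != 0]      # one observation equals 3.5 and is dropped (assumed convention)
n     <- length(d)      # effective sample size
n_neg <- sum(d < 0)     # observations below m
n_pos <- sum(d > 0)     # observations above m

# Exact two-sided p-value from Bin(n, 1/2)
p_exact <- binom.test(n_neg, n, p = 0.5)$p.value

# Normal approximation with a continuity correction of 0.5 (assumption)
z        <- (abs(n_neg - n / 2) - 0.5) / sqrt(n / 4)
p_approx <- 2 * pnorm(z, lower.tail = FALSE)

c(n = n, below = n_neg, above = n_pos, exact = p_exact, approx = p_approx)
```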

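# Exact sign test of H0: the `quantile`-th quantile of the population equals `theta`.
sign.test = function(data, quantile = 0.5, theta = 0, alternative = 'two.sided')
{
  y = data - theta
  y = y[y != 0]      # drop observations equal to theta; they carry no sign
  n = length(y)      # effective sample size after dropping ties
  S = sum(y > 0)     # sign statistic: number of observations above theta
  # Under H0, S ~ Bin(n, 1 - quantile)
  b = binom.test(S, n, p = 1 - quantile, alternative = alternative)
  list('Exact sign test', 'Quantile' = quantile, 'Theta' = theta,
       'Alternative hypothesis' = alternative, 'Value of the sign statistic' = S,
       'p-value of the test' = b$p.value)
}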