Libraries used:
require(car) # for qqPlot()
require(kableExtra) # for printing tables
require(MASS) # for generating bivariate normal samples
require(fMultivar) # for bivariate t samples
require(copula) # for generating samples from copulas
require(extraDistr) # for the Laplace distribution

Our main aim: given paired observations, we want to test whether the two corresponding variables are independent.
\((X_1,Y_1),(X_2,Y_2),\ldots,(X_n,Y_n)\overset{iid}{\sim} F(x,y)\), where \(F\) is continuous.
Kendall’s \(\tau\) is defined as
\[\tau = \dfrac{1}{\binom n2}\sum_{i = 1}^{n-1}\sum_{j = i+1}^{n}\operatorname{sign}(X_i - X_j)\,\operatorname{sign}(Y_i - Y_j) \] where \[ \operatorname{sign}(u)= \begin{cases} 0 & \text{if } u = 0\\ \dfrac{\left|u\right|}{u} & \text{if } u \neq 0 \end{cases} \]
Let A := number of concordant pairs and B := number of discordant pairs. Then
\[ \tau = \dfrac{A - B}{\binom n2} \] In general:
\[ -1 \leq \tau \leq 1\]
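The definition above can be checked numerically. The following is a minimal sketch in base R, assuming no ties in the data; the helper name `kendall_tau` is hypothetical and is not part of the original analysis. It sums the sign products over all pairs with i < j and compares the result with R's built-in `cor(..., method = "kendall")`.

```r
# Kendall's tau computed directly from the definition (no ties assumed).
# `kendall_tau` is a hypothetical helper written for illustration only.
kendall_tau <- function(x, y) {
  n <- length(x)
  stopifnot(length(y) == n, n >= 2)
  # sign(X_i - X_j) and sign(Y_i - Y_j) for every ordered pair (i, j)
  sx <- sign(outer(x, x, "-"))
  sy <- sign(outer(y, y, "-"))
  prod_signs <- sx * sy
  # keep each unordered pair once (i < j), i.e. A - B
  s <- sum(prod_signs[upper.tri(prod_signs)])
  s / choose(n, 2)
}

set.seed(1)  # arbitrary seed
x <- rnorm(10)
y <- rnorm(10)
kendall_tau(x, y)
cor(x, y, method = "kendall")  # agrees with the line above when there are no ties
```

With perfectly concordant data such as `kendall_tau(1:5, 1:5)` the value is 1, and with perfectly discordant data such as `kendall_tau(1:5, 5:1)` it is -1, matching the bounds stated above.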
We present a visual explanation of choosing such a function as a measure of dependency.
Here we draw 1000 samples of size 10 each, with X and Y independent, compute Kendall’s \(\tau\) for each sample, and compare the mean and variance of these 1000 values with their theoretical values under independence.
| | Theoretical | Observed |
|---|---|---|
| mean | 0.0000000 | -0.0005778 |
| variance | 0.0617284 | 0.0618136 |
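A simulation of this kind can be sketched as follows. This is only an illustration under stated assumptions: the source does not say which distribution or seed was used, so standard normal draws and an arbitrary seed are assumed here, and the observed values will differ slightly from the table. The theoretical variance under independence is \(2(2n+5)/(9n(n-1))\), which equals 0.0617284 for \(n = 10\).

```r
set.seed(123)   # arbitrary seed, not the one behind the table above
n <- 10         # sample size
reps <- 1000    # number of simulated samples

# Kendall's tau for `reps` independent samples (normal margins are an assumption)
taus <- replicate(reps, cor(rnorm(n), rnorm(n), method = "kendall"))

# Variance of tau under independence
theo_var <- 2 * (2 * n + 5) / (9 * n * (n - 1))

c(mean = mean(taus), variance = var(taus), theoretical_variance = theo_var)
```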
We know that Kendall’s \(\tau\) test statistic is distribution-free under \(H_0\). We verify this visually: we generate independent data from various distributions and plot the histogram of Kendall’s \(\tau\) in each case.
\(\sqrt{\dfrac{9n(n-1)}{2(2n+5)}}\tau \overset{\mathcal{d}}{\Longrightarrow} N(0,1)\) as \(n\rightarrow \infty\)
Comment: Good fit
We take \(n\) i.i.d. samples from \(\mathcal{N}_2(0,0,1,1,\rho)\) and test \[H_0: \rho = 0 \hspace{0.3cm}vs\hspace{0.3cm}H_a: \rho\not=0\] Test statistic: \[\sqrt{\dfrac{9n(n-1)}{2(2n+5)}}\hat\tau \] Test rule: reject \(H_0\) at size 0.05 if \(\left|\sqrt{\dfrac{9n(n-1)}{2(2n+5)}}\hat\tau\right|>z_{0.025}\), where \(z_{0.025}\) is the upper 2.5% point of \(N(0,1)\).
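The test rule above can be written as a small function. This is a sketch in base R, not the code used for the report; the name `kendall_z_test` and its return structure are hypothetical. The usage example generates bivariate normal pairs as `y = rho * x + sqrt(1 - rho^2) * e`, which is a base R substitute for `MASS::mvrnorm` and is an assumption of this sketch.

```r
# Large-sample two-sided test of independence based on Kendall's tau.
# `kendall_z_test` is a hypothetical helper written for illustration only.
kendall_z_test <- function(x, y, alpha = 0.05) {
  n <- length(x)
  tau_hat <- cor(x, y, method = "kendall")
  # standardise tau-hat by its null standard deviation
  z <- sqrt(9 * n * (n - 1) / (2 * (2 * n + 5))) * tau_hat
  crit <- qnorm(1 - alpha / 2)   # z_{alpha/2}
  list(tau = tau_hat, z = z, reject = abs(z) > crit)
}

set.seed(10)   # arbitrary seed
rho <- 0.5     # illustrative value of the correlation
x <- rnorm(30)
y <- rho * x + sqrt(1 - rho^2) * rnorm(30)
kendall_z_test(x, y)
```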
For a fixed sample size (n = 30) we plot the power functions of the tests based on Spearman’s \(\rho\) and Kendall’s \(\tau\) against the correlation parameter \(\rho\).
From this plot, too, we can conclude that the tests based on these two statistics are essentially equivalent.
\((X_1,Y_1),(X_2,Y_2),\ldots(X_n,Y_n)\overset{iid}{\sim}\mathcal{N}_2(0,0,1,1,\rho)\)
Test for: \[H_0: \rho = 0 \hspace{0.3cm}vs\hspace{0.3cm}H_a: \rho\not=0\] Test statistic and its distribution under \(H_0\):
\[T = \dfrac{r\sqrt{n-2}}{\sqrt{1-r^2}}\overset{H_0}\sim t_{n-2} \] where \(r\) is the sample correlation coefficient. Rejection rule: reject \(H_0\) at size \(\alpha = 0.05\) if \(\left|T_{obs}\right|>t_{0.025,n-2}\)
Samples are drawn from the bivariate normal distribution, and the power functions of the t-test and the test based on Kendall’s \(\tau\) are plotted to compare them.
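A Monte Carlo estimate of the two power functions can be sketched as below. The sample size, number of replications, grid of \(\rho\) values and seed are all assumptions chosen for illustration, and the helper `power_at` is hypothetical. The t-test is taken from `cor.test()` (Pearson is its default method), and the Kendall test uses the large-sample normal approximation.

```r
# Estimated power of the Pearson t-test and the Kendall z-test at a given rho.
# `power_at` is a hypothetical helper; n, reps and alpha are illustrative choices.
power_at <- function(rho, n = 30, reps = 500, alpha = 0.05) {
  scale_n <- sqrt(9 * n * (n - 1) / (2 * (2 * n + 5)))
  rej <- replicate(reps, {
    x <- rnorm(n)
    y <- rho * x + sqrt(1 - rho^2) * rnorm(n)  # bivariate normal with correlation rho
    z_stat <- scale_n * cor(x, y, method = "kendall")
    c(t_test  = cor.test(x, y)$p.value < alpha,
      kendall = abs(z_stat) > qnorm(1 - alpha / 2))
  })
  rowMeans(rej)  # proportion of rejections for each test
}

set.seed(2024)   # arbitrary seed
rhos <- c(0, 0.3, 0.6)
power_table <- sapply(rhos, power_at)
colnames(power_table) <- paste0("rho=", rhos)
power_table
```

At `rho = 0` both rows estimate the size of the test, so they should be close to 0.05.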
\(\textbf{Comment}\): t-test performs better
\(\textbf{Comment}\): Kendall’s \(\tau\) performs better
\(C(u_1, u_2;\theta) = u_1u_2\left(1+\theta(1-u_1)(1-u_2)\right)\)
where \(u_i \in(0,1)\) for \(i=1,2\), and \(\theta\in[-1,1]\). Here, \(\theta\) is the dependence parameter; \(\theta = 0\) corresponds to independence.
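For illustration, pairs from this copula can be drawn in base R by conditional inversion. This is a swapped-in technique and an assumption of this sketch: the report itself loads the `copula` package for sampling, and the helper name `rfgm` is hypothetical. The conditional distribution of the second coordinate given the first is \(v\left(1+\theta(1-2u)(1-v)\right)\), a quadratic in \(v\) that can be solved in closed form. For this copula the population value of Kendall's \(\tau\) is \(2\theta/9\).

```r
# Draw n pairs (u, v) from the copula C(u, v) = u v (1 + theta (1 - u)(1 - v))
# by conditional inversion. `rfgm` is a hypothetical helper for illustration.
rfgm <- function(n, theta) {
  stopifnot(abs(theta) <= 1)
  u <- runif(n)
  w <- runif(n)                 # uniform level for the conditional CDF of v given u
  a <- theta * (1 - 2 * u)
  # Solve v * (1 + a * (1 - v)) = w for the root lying in [0, 1]
  v <- 2 * w / (1 + a + sqrt((1 + a)^2 - 4 * a * w))
  cbind(u = u, v = v)
}

set.seed(42)   # arbitrary seed
uv <- rfgm(2000, theta = 1)
tau_hat <- cor(uv[, "u"], uv[, "v"], method = "kendall")
c(sample_tau = tau_hat, population_tau = 2 * 1 / 9)
```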
\(C(u_1, u_2, \ldots, u_d;\theta)\) = \(\exp\left(-\left(\sum_{i=1}^{d} (-\ln u_i)^{\theta}\right)^{1/\theta}\right)\)
where \(u_i \in(0,1)\) for \(i=1,2,\ldots,d\), and \(\theta\geq 1\) (\(\theta = 1\) gives independence). Here \(\theta\) is the dependence parameter controlling the tail dependence.
\[\sqrt{\dfrac{9n}{4}}\tau\overset{\mathcal{d}}{\Longrightarrow}N(0,1)\hspace{0.4cm} as\hspace{0.4cm} n \rightarrow\infty\]
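This scaling agrees with the finite-sample standardisation \(\sqrt{\dfrac{9n(n-1)}{2(2n+5)}}\) used earlier, because the ratio of the two squared scale factors tends to 1:
\[\dfrac{9n(n-1)}{2(2n+5)}\Big/\dfrac{9n}{4} = \dfrac{2(n-1)}{2n+5}\longrightarrow 1 \hspace{0.4cm} as\hspace{0.4cm} n \rightarrow\infty\]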
Generating samples from Cauchy(0,1) and Laplace(0,1)
\(H_0:\rho=0\) vs \(H_1:\rho>0\)
Our very first assumption was the continuous setup. We now see what happens when that assumption is violated.
\(x_i\overset{iid}{\sim} U(-1,1)\hspace{1cm}i = 1,2,\ldots,30\)
Kendall's rank correlation tau
data: x and y
T = 170, p-value = 0.09369
alternative hypothesis: true tau is not equal to 0
sample estimates:
tau
-0.2183908
Comment: the test does not reject independence at the 5% level (independent sample)
Kendall's rank correlation tau
data: c(x1, x2) and c(1 - abs(x1), abs(x2) - 1)
z = 0.30614, p-value = 0.7595
alternative hypothesis: true tau is not equal to 0
sample estimates:
tau
0.02711864
Comment: the test does not reject independence at the 5% level (independent sample)
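The `data:` line in the output above shows how the pairs were built from `x1` and `x2`. The sketch below reconstructs that setup under assumptions: `x1` and `x2` are taken to be 30 draws each from \(U(-1,1)\) and the seed is arbitrary, so the numbers will not match the output above. Every point lies on the diamond \(|x| + |y| = 1\), so `y` is fully determined by `x` up to sign, yet the relationship is not monotone.

```r
set.seed(7)                        # arbitrary seed; output above used an unknown seed
x1 <- runif(30, -1, 1)             # assumed sample size (not stated in the source)
x2 <- runif(30, -1, 1)             # assumed sample size (not stated in the source)

x <- c(x1, x2)
y <- c(1 - abs(x1), abs(x2) - 1)   # upper and lower halves of the diamond |x| + |y| = 1

# Every pair satisfies |x| + |y| = 1, so x and y are strongly dependent
all.equal(abs(x) + abs(y), rep(1, length(x)))

# Kendall's test only detects monotone association
cor.test(x, y, method = "kendall")
```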
\(x_i\overset{iid}{\sim} U(-\pi,\pi)\hspace{1cm}i = 1,2,\ldots,40\)
Kendall's rank correlation tau
data: x and y
T = 517, p-value = 0.002807
alternative hypothesis: true tau is not equal to 0
sample estimates:
tau
0.325641
Comment: the test rejects independence at the 5% level (dependent sample)
We want to say a big thank you to everyone who helped make this project possible. Special thanks to Subhrangsu, Sourav, Subhendu and to Isha Dewan ma'am for helping us.
Indian Statistical Institute