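The following R code was used to generate the results table below:

    library(dplyr)    # %>%, group_by(), summarize()
    library(knitr)    # kable()
    library(moments)  # skewness(); assumed here, e1071 also provides one
    kable(stats %>%
      group_by(Area) %>%
      summarize(FDMean = mean(FD, na.rm = FALSE),
                FD_SD = sd(FD),
                FD_Skewness = skewness(FD),
                PWMean = mean(PW, na.rm = FALSE),
                PW_SD = sd(PW),
                PW_Skewness = skewness(PW)))
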
| Area | FDMean | FD_SD | FD_Skewness | PWMean | PW_SD | PW_Skewness |
|---|---|---|---|---|---|---|
| City | 14.825 | 1.737778 | 0.5194790 | 252.300 | 26.13888 | -0.4094382 |
| Metropolitan | 25.225 | 3.206464 | 0.8386320 | 403.050 | 72.06014 | -0.3264501 |
| Town | 6.450 | 1.852926 | 0.3599267 | 148.525 | 15.05884 | -0.1930091 |
The Fixed Deposit (FD) distribution in each area is positively skewed, whereas the Personal Wealth (PW) distribution in each area is negatively skewed.
The null hypothesis is that the mean Fixed Deposits of the different areas are equal.
\(H_0\) : \(\mu_{0}\) = \(\mu_{1}\) = \(\mu_{2}\)
Broken down, the null hypothesis \(H_0\) comprises three pairwise comparisons:
\(\mu_{0}\) = \(\mu_{1}\) or \(\mu_{0}\) - \(\mu_{1}\) = 0
\(\mu_{0}\) = \(\mu_{2}\) or \(\mu_{0}\) - \(\mu_{2}\) = 0
\(\mu_{1}\) = \(\mu_{2}\) or \(\mu_{1}\) - \(\mu_{2}\) = 0
The alternative hypothesis is that the mean Fixed Deposits of the different areas are not all equal, that is, at least one pair of means differs.
\(H_1\) : \(\mu_{i}\) \(\neq\) \(\mu_{j}\) for at least one pair \(i \neq j\)
The essential statistics for each sample are shown below:
| Sample 0 (City): | Sample 1 (Metropolitan): | Sample 2 (Town): |
|---|---|---|
| Sample mean, \(\bar{x}_0\) = 14.825 | Sample mean, \(\bar{x}_1\) = 25.225 | Sample mean, \(\bar{x}_2\) = 6.450 |
| Sample size, n = 40 | Sample size, n = 40 | Sample size, n = 40 |
| Sample standard deviation, s = 1.737778 | Sample standard deviation, s = 3.206464 | Sample standard deviation, s = 1.852926 |
| Standard error of the mean, \(\sigma_{m}\) = 0.2747668 | Standard error of the mean, \(\sigma_{m}\) = 0.5069864 | Standard error of the mean, \(\sigma_{m}\) = 0.2929733 |
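
The standard errors in the last row follow from \(s / \sqrt{n}\). A minimal sketch that reproduces them from the reported standard deviations (the vector names are illustrative, not from the original analysis):

```r
# Sample standard deviations copied from the table above.
s <- c(City = 1.737778, Metropolitan = 3.206464, Town = 1.852926)
n <- 40  # every sample has 40 observations

# Standard error of the mean: s / sqrt(n).
se <- s / sqrt(n)
print(round(se, 7))
```
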
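Before comparing means, use a two-tailed F-test to check whether the variances of each pair of samples are equal.
\(H_0\) : No difference in variances
\(H_a\) : A difference in variances exists
Significance level: \(\alpha\) = 0.05, i.e. 0.025 in each tail for a two-tailed test.
All samples have size 40, so the exact degrees of freedom are (39, 39); the tabulated value for (40, 40) is used here as a close approximation.
Critical F(40,40): 1.875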
| Set 0 (City Vs. Metro): | Set 1 (City Vs. Town): | Set 2 (Metro Vs. Town): |
|---|---|---|
| \(s^2_{0}\) = 3.0198724 | \(s^2_{0}\) = 3.0198724 | \(s^2_{1}\) = 10.2814114 |
| \(s^2_{1}\) = 10.2814114 | \(s^2_{2}\) = 3.4333348 | \(s^2_{2}\) = 3.4333348 |
| F Stat = \(s^2_{1}\) / \(s^2_{0}\) = 3.4045847 | F Stat = \(s^2_{2}\) / \(s^2_{0}\) = 1.1369139 | F Stat = \(s^2_{1}\) / \(s^2_{2}\) = 2.9945846 |
| 3.4045847 > 1.875 | 1.1369139 < 1.875 | 2.9945846 > 1.875 |
| Reject the null hypothesis: the variances are different. | Fail to reject the null hypothesis: no evidence that the variances differ. | Reject the null hypothesis: the variances are different. |
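
The variance-ratio comparison above can be sketched in R as follows. This is an illustration under stated assumptions: it places the larger variance in the numerator and uses \(n - 1 = 39\) degrees of freedom per sample, so its critical value differs slightly from the tabulated 1.875. The helper `f_ratio` is hypothetical.

```r
# Sample variances (s^2) copied from the table above.
v <- c(City = 3.0198724, Metro = 10.2814114, Town = 3.4333348)
n <- 40
alpha <- 0.05

# Hypothetical helper: larger variance over smaller, so F >= 1.
f_ratio <- function(a, b) max(a, b) / min(a, b)

f_stats <- c(
  city_metro = f_ratio(v[["City"]], v[["Metro"]]),
  city_town  = f_ratio(v[["City"]], v[["Town"]]),
  metro_town = f_ratio(v[["Metro"]], v[["Town"]])
)

# Upper alpha/2 critical value with n - 1 degrees of freedom per sample.
f_crit <- qf(1 - alpha / 2, df1 = n - 1, df2 = n - 1)

print(round(f_stats, 7))
print(f_stats > f_crit)  # TRUE means "reject equal variances"
```
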
Population variance is unknown, so use a t test.
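Significance level \(\alpha\) = 0.05, corresponding to a 95% confidence level.
Critical t value for the equal-variance comparison (Set 1), \(t_{0.025,78}\) ≈ 1.991. The confidence intervals reported for Sets 0 and 2 are consistent with Welch's unequal-variance test, whose smaller degrees of freedom (about 60 and 62) give critical values close to 2.00.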
| Comparison | t statistic | Standard Error | 95% Confidence Interval (difference in means) | p-value |
|---|---|---|---|---|
| Set 0 (City vs Metro) | -18.035 | 0.576656 | (-11.553448, -9.246552) | < 2.2e-16 |
| Set 1 (City vs Town) | 20.851 | 0.40166 | (7.575358, 9.174642) | < 2.2e-16 |
| Set 2 (Metro vs Town) | 32.064 | 0.5855499 | (17.60466, 19.94534) | < 2.2e-16 |
In each of the comparisons:
1. the absolute value of the t statistic is greater than the critical value of t
2. the p-value is far below the chosen significance level
Thus, we reject the null hypothesis for each comparison: the population means in each pair of areas are not equal.
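
The t statistics and intervals in the table can be recomputed from the summary statistics alone. The sketch below is one way to do so, assuming Welch's test for the two pairs whose variances differ and a pooled test for City vs Town, which is what the reported intervals appear consistent with. The function `t_from_summary` is a hypothetical helper, not part of the original analysis.

```r
# Two-sample t test from summary statistics (equal sample sizes n).
# pooled = TRUE gives the equal-variance test; FALSE gives Welch's test.
t_from_summary <- function(m1, s1, m2, s2, n, pooled = FALSE) {
  v1 <- s1^2 / n
  v2 <- s2^2 / n
  se <- sqrt(v1 + v2)  # with equal n, pooled and Welch standard errors coincide
  df <- if (pooled) 2 * n - 2 else (v1 + v2)^2 / ((v1^2 + v2^2) / (n - 1))
  t_stat <- (m1 - m2) / se
  half <- qt(0.975, df) * se  # half-width of the 95% interval
  list(t = t_stat, se = se, df = df,
       ci = c(m1 - m2 - half, m1 - m2 + half),
       p = 2 * pt(-abs(t_stat), df))
}

city_metro <- t_from_summary(14.825, 1.737778, 25.225, 3.206464, n = 40)
city_town  <- t_from_summary(14.825, 1.737778, 6.450, 1.852926, n = 40,
                             pooled = TRUE)
metro_town <- t_from_summary(25.225, 3.206464, 6.450, 1.852926, n = 40)

str(city_metro)
```

With raw data available, `t.test(x, y, var.equal = TRUE)` or `t.test(x, y)` would give the pooled and Welch results directly.
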
Linear regression of Personal Wealth on Fixed Deposits was conducted for each area.
Interpretation: There is a negative correlation between PW and FD.
Interpretation: There is a negative correlation between PW and FD, although the slope is less significant than in the Metro area.
Interpretation: There is a negative correlation between PW and FD, although the slope is less significant than in the Metro area.
The linear regression equation, with \(y\) as Personal Wealth and \(x\) as Fixed Deposits, is:
\(y = \beta_{0} + \beta_{1}x\)
The values were obtained by fitting each sample with R’s lm function.
| Area | Intercept, \(\beta_{0}\) | Slope, \(\beta_{1}\) | Equation |
|---|---|---|---|
| Metro | 661.131679 | -10.2311865 | \(y\) = 661.131679 - 10.2311865 (FD) |
| City | 271.168754 | -1.2727659 | \(y\) = 271.168754 - 1.2727659 (FD) |
| Town | 162.3233757 | -2.139283 | \(y\) = 162.3233757 - 2.139283 (FD) |
Interpretation:
For the Metro area, the intercept of RM 661,132 is the predicted PW when FD is zero, that is, the portion of PW not explained by FD. On average, Personal Wealth decreases by RM 10,231 for every additional RM 1,000 in FD.
For the City area, the intercept of RM 271,169 is the predicted PW when FD is zero. On average, Personal Wealth decreases by RM 1,273 for every additional RM 1,000 in FD.
For the Town area, the intercept of RM 162,323 is the predicted PW when FD is zero. On average, Personal Wealth decreases by RM 2,139 for every additional RM 1,000 in FD.
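
To make the intercept and slope readings concrete, the fitted lines can be evaluated directly. This is a small sketch using the reported coefficients; `predict_pw` is a hypothetical helper, and both variables are assumed to be recorded in thousands of RM, as the interpretation above implies.

```r
# Reported coefficients for each area (PW regressed on FD).
coefs <- data.frame(
  area      = c("Metro", "City", "Town"),
  intercept = c(661.131679, 271.168754, 162.3233757),
  slope     = c(-10.2311865, -1.2727659, -2.139283)
)

# Hypothetical helper: predicted PW for a given area and FD value.
predict_pw <- function(area, fd) {
  row <- coefs[coefs$area == area, ]
  row$intercept + row$slope * fd
}

# Predicted PW at FD = 25, and the change for one more unit of FD.
pw_25 <- predict_pw("Metro", 25)
step  <- predict_pw("Metro", 26) - pw_25
print(c(pw_25 = pw_25, step = step))
```

The one-unit change equals the slope by construction, which is the "decrease per additional RM 1,000" quoted in the interpretation.
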
| Area | Pearson coefficient, \(r\) | Coefficient of Determination, \(r^2\) | Spearman rank coefficient, \(r_s\) |
|---|---|---|---|
| Metro | -0.4552576 | 0.2072595 | -0.5277471 |
| City | -0.0846166 | 0.00716 | -0.0893667 |
| Town | -0.2632296 | 0.0692898 | -0.27018 |
Interpretation: The Metro area has the highest \(r^2\) value: 20.73% of the variation in Personal Wealth is explained by the variation in Fixed Deposits. The City and Town areas have markedly lower values, 0.72% and 6.93% respectively.
All areas show a negative association. The Metro area has the strongest negative linear correlation between FD and PW, followed by Town and then City.
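
For a simple linear regression the coefficient of determination is the square of the Pearson coefficient, so the \(r^2\) column can be checked from the \(r\) column. A brief sketch (variable names are illustrative):

```r
# Pearson coefficients copied from the table above.
r <- c(Metro = -0.4552576, City = -0.0846166, Town = -0.2632296)

# Coefficient of determination and the share of variance explained, in percent.
r_squared   <- r^2
pct_explain <- round(100 * r_squared, 2)

# Areas ordered from strongest to weakest linear association.
ranking <- names(sort(abs(r), decreasing = TRUE))

print(pct_explain)
print(ranking)
```
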
Critical t-value for \(\alpha\) = 0.05 (two-tailed, df = 38): \(\pm\)2.0244
Critical F value for df = (1, 38) and \(\alpha\) = 0.05: 4.09817166
| Area | t value | F statistic | 95% Confidence Interval for the slope \(\beta_{1}\) |
|---|---|---|---|
| Metro | -3.152 | 9.935 | (-16.8022788, -3.6600941) |
| City | -0.523 | 0.274 | (-6.1946981, 3.6491664) |
| Town | -1.682 | 2.829 | (-4.7140858, 0.4355197) |
Interpretation: For Metro, as the t value is below the lower critical t value and F exceeds the critical F value, the slope of the regression model is considered significant and there is sufficient evidence of a linear relationship.
For City, as the t value is not in the rejection region and F < Critical F value, there is no significant regression fit and there is insufficient evidence of a linear relationship.
For Town, as the t value is not in the rejection region and F < Critical F value, there is no significant regression fit and there is insufficient evidence of a linear relationship.
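
The slope tests above can be cross-checked from the correlation coefficients, because in simple linear regression \(t = r\sqrt{n-2} / \sqrt{1-r^2}\) and \(F = t^2\). A minimal sketch, assuming \(n = 40\) per area as stated earlier:

```r
# Pearson coefficients reported for each area.
r <- c(Metro = -0.4552576, City = -0.0846166, Town = -0.2632296)
n  <- 40
df <- n - 2  # residual degrees of freedom in simple linear regression

# t statistic for the slope, and the equivalent F statistic.
t_stat <- r * sqrt(df) / sqrt(1 - r^2)
f_stat <- t_stat^2

# Critical values at alpha = 0.05.
t_crit <- qt(0.975, df)      # two-tailed t
f_crit <- qf(0.95, 1, df)    # upper-tail F

significant <- abs(t_stat) > t_crit
print(round(cbind(t_stat, f_stat), 3))
print(significant)
```
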
The Metro area shows a statistically significant negative linear relationship between Personal Wealth and Fixed Deposits, although FD explains only about 21% of the variation in PW.
The Town and City areas did not meet the criteria for a significant linear relationship, so no such relationship can be concluded for them.