Question 1

The following R code was used to generate the results table below:

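# Assumed packages: knitr (kable), dplyr (%>%, group_by, summarize) and
# moments (skewness); the original report does not show its library() calls.
library(knitr)
library(dplyr)
library(moments)

# Per-area mean, standard deviation and skewness of FD and PW.
kable(stats %>%
    group_by(Area) %>%
    summarize(FDMean = mean(FD, na.rm = FALSE),
              FD_SD = sd(FD),
              FD_Skewness = skewness(FD),
              PWMean = mean(PW, na.rm = FALSE),
              PW_SD = sd(PW),
              PW_Skewness = skewness(PW)))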

| Area | FDMean | FD_SD | FD_Skewness | PWMean | PW_SD | PW_Skewness |
|---|---|---|---|---|---|---|
| City | 14.825 | 1.737778 | 0.5194790 | 252.300 | 26.13888 | -0.4094382 |
| Metropolitan | 25.225 | 3.206464 | 0.8386320 | 403.050 | 72.06014 | -0.3264501 |
| Town | 6.450 | 1.852926 | 0.3599267 | 148.525 | 15.05884 | -0.1930091 |

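Interpretation

In every area the Fixed Deposit (FD) distribution is positively skewed (skewness between 0.36 and 0.84), whereas the Personal Wealth (PW) distribution is negatively skewed (skewness between -0.41 and -0.19).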
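
As a rough check on how the sign of the skewness is read, the moment-based coefficient can be computed by hand. This is a minimal sketch: the helper name `moment_skewness` and the two toy vectors are illustrative only, and it assumes that the `skewness()` call above uses the moment coefficient \(m_3 / m_2^{3/2}\) (as the moments package does); other packages apply small-sample corrections and give slightly different values.

```r
# Moment coefficient of skewness: m3 / m2^(3/2), with central moments
# computed using divisor n. Hypothetical helper, not part of the original report.
moment_skewness <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  m2 <- mean(d^2)
  m3 <- mean(d^3)
  m3 / m2^(3 / 2)
}

# Toy data (illustrative only): one long right tail, and its mirror image.
right_tailed <- c(1, 2, 2, 3, 3, 3, 10)
left_tailed <- -right_tailed

print(moment_skewness(right_tailed))  # positive: tail extends to the right
print(moment_skewness(left_tailed))   # negative: tail extends to the left
```

A positive value, as for FD, means the longer tail lies above the mean; a negative value, as for PW, means it lies below.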


Question 2

The null hypothesis is that the mean Fixed Deposits of the different areas are equal.
\(H_0: \mu_0 = \mu_1 = \mu_2\)

Breaking it down, the null hypothesis \(H_0\) consists of three pairwise comparisons:
\(\mu_0 = \mu_1\), or \(\mu_0 - \mu_1 = 0\)
\(\mu_0 = \mu_2\), or \(\mu_0 - \mu_2 = 0\)
\(\mu_1 = \mu_2\), or \(\mu_1 - \mu_2 = 0\)

The alternative hypothesis is that the mean Fixed Deposits of the different areas are not all equal, that is, at least one pair of means differs.
\(H_1: \mu_i \neq \mu_j\) for at least one pair \(i \neq j\)

The essential statistics for each of the samples are as below:

| Statistic | Sample 0 (City) | Sample 1 (Metropolitan) | Sample 2 (Town) |
|---|---|---|---|
| Sample mean | \(\bar{x}_0\) = 14.825 | \(\bar{x}_1\) = 25.225 | \(\bar{x}_2\) = 6.450 |
| Sample size, n | 40 | 40 | 40 |
| Sample standard deviation, s | 1.737778 | 3.206464 | 1.852926 |
| Standard error of the mean, \(\sigma_{m}\) | 0.2747668 | 0.5069864 | 0.2929733 |
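
The standard errors above follow from \(\sigma_m = s / \sqrt{n}\). A minimal sketch of that arithmetic, using only the summary statistics reported above (the variable names are illustrative):

```r
# Sample standard deviations and common sample size, copied from the summary above.
s <- c(City = 1.737778, Metropolitan = 3.206464, Town = 1.852926)
n <- 40

# Standard error of the mean: sample SD divided by the square root of n.
se <- s / sqrt(n)
print(round(se, 7))
```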

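To test for the equality of the population variances

Use a two-tailed F-test for each pair of areas.

\(H_0\): the two population variances are equal
\(H_a\): the two population variances differ
Significance level: \(\alpha\) = 0.05, that is, 0.025 in each tail for a two-tailed test.
Each sample has size n = 40, so the degrees of freedom are (n - 1, n - 1) = (39, 39).
Critical F: 1.875, the tabulated value for (40, 40), used here as a close approximation for (39, 39).
Each F statistic places the larger sample variance in the numerator, so only the upper critical value is needed.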

| | Set 0 (City vs Metro) | Set 1 (City vs Town) | Set 2 (Metro vs Town) |
|---|---|---|---|
| First sample variance | \(s^2_{0}\) = 3.0198724 | \(s^2_{0}\) = 3.0198724 | \(s^2_{1}\) = 10.2814114 |
| Second sample variance | \(s^2_{1}\) = 10.2814114 | \(s^2_{2}\) = 3.4333348 | \(s^2_{2}\) = 3.4333348 |
| F statistic | \(s^2_{1} / s^2_{0}\) = 3.4045847 | \(s^2_{2} / s^2_{0}\) = 1.1369139 | \(s^2_{1} / s^2_{2}\) = 2.9945846 |
| Comparison with critical F | 3.4045847 > 1.875 | 1.1369139 < 1.875 | 2.9945846 > 1.875 |
| Decision | Reject \(H_0\): the variances differ. | Fail to reject \(H_0\): no evidence that the variances differ. | Reject \(H_0\): the variances differ. |
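
The three variance-ratio tests can be reproduced from the reported standard deviations. The sketch below is illustrative rather than the original computation: the helper `f_ratio` is hypothetical, and the critical value is taken from `qf()` with n - 1 = 39 degrees of freedom per sample, which is an assumption that differs slightly from the tabulated value quoted above. With the raw data available, `var.test()` would perform the same test directly.

```r
# Sample variances from the reported standard deviations.
s2 <- c(City = 1.737778, Metropolitan = 3.206464, Town = 1.852926)^2
n <- 40
alpha <- 0.05

# Hypothetical helper: larger variance over smaller, so F >= 1.
f_ratio <- function(a, b) max(s2[[a]], s2[[b]]) / min(s2[[a]], s2[[b]])

f_stats <- c(
  City_vs_Metro = f_ratio("City", "Metropolitan"),
  City_vs_Town = f_ratio("City", "Town"),
  Metro_vs_Town = f_ratio("Metropolitan", "Town")
)

# Upper critical value for a two-tailed test (alpha / 2 in the upper tail).
f_crit <- qf(1 - alpha / 2, df1 = n - 1, df2 = n - 1)
reject <- f_stats > f_crit

print(data.frame(F = round(f_stats, 4), reject_H0 = reject))
```

The decisions match the ones above: equal variances are rejected for City vs Metro and Metro vs Town, but not for City vs Town.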

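To test for the equality of the population means

The population variances are unknown, so use a two-sample t test for each pair of areas.
Significance level, \(\alpha\) = 0.05, corresponding to a confidence level of 95%.
Critical t value for the pooled-variance test, \(t_{0.025,78}\) \(\approx\) 1.991
Where the F-test rejected equal variances (Sets 0 and 2), the unequal-variance (Welch) form of the test applies; its degrees of freedom are lower (roughly 60 to 62), giving a critical value of about 2.00.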

| Comparison | t statistic | Standard error | 95% confidence interval for the difference | p-value |
|---|---|---|---|---|
| Set 0 (City vs Metro) | -18.035 | 0.576656 | (-11.553448, -9.246552) | < 2.2e-16 |
| Set 1 (City vs Town) | 20.851 | 0.401659 | (7.575358, 9.174642) | < 2.2e-16 |
| Set 2 (Metro vs Town) | 32.064 | 0.5855499 | (17.60466, 19.94534) | < 2.2e-16 |
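
The t statistics can be recomputed from the summary statistics alone. This is a sketch under stated assumptions, not the original code: `compare_means` is a hypothetical helper, and because all three samples have the same size, the pooled and unequal-variance standard errors coincide, so only the degrees of freedom differ between the two forms of the test. With the raw data, `t.test(x, y, var.equal = TRUE)` or `t.test(x, y)` would give the same statistics.

```r
# Sample means and variances reported for FD, by area.
m <- c(City = 14.825, Metropolitan = 25.225, Town = 6.450)
s2 <- c(City = 1.737778, Metropolitan = 3.206464, Town = 1.852926)^2
n <- 40

# Hypothetical helper: difference in means (a minus b), its standard error,
# the t statistic, and the Welch-Satterthwaite degrees of freedom.
compare_means <- function(a, b) {
  va <- s2[[a]] / n
  vb <- s2[[b]] / n
  se <- sqrt(va + vb)
  t_stat <- (m[[a]] - m[[b]]) / se
  df_welch <- (va + vb)^2 / (va^2 / (n - 1) + vb^2 / (n - 1))
  c(t = t_stat, se = se, df_welch = df_welch)
}

results <- rbind(
  City_vs_Metro = compare_means("City", "Metropolitan"),
  City_vs_Town = compare_means("City", "Town"),
  Metro_vs_Town = compare_means("Metropolitan", "Town")
)
print(round(results, 4))

# Pooled-variance test uses 2n - 2 degrees of freedom.
print(qt(0.975, df = 2 * n - 2))
```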

Conclusion

In each of the comparisons:
1. the absolute value of the t statistic exceeds the critical value of t
2. the p-value is far below the chosen significance level of 0.05

Thus, we reject the null hypothesis for each comparison: the population mean Fixed Deposits differ between every pair of areas.

Question 3

Linear regression of Personal Wealth (PW) on Fixed Deposits (FD), conducted separately for each area.

Section (i)

Interpretation: There is a negative correlation between PW and FD.

Interpretation: There is a negative correlation between PW and FD, although it is weaker than in the Metro area.

Interpretation: There is a negative correlation between PW and FD, although it is weaker than in the Metro area.


Section (ii)

The linear regression equation is given as:
\(y = \beta_{0} + \beta_{1} x\)
where \(y\) is Personal Wealth (PW) and \(x\) is Fixed Deposits (FD). The coefficients were obtained by fitting each area's sample with R's lm() function.

| Area | Intercept, \(\beta_{0}\) | Slope, \(\beta_{1}\) | Equation |
|---|---|---|---|
| Metro | 661.131679 | -10.2311865 | \(y\) = 661.131679 - 10.2311865 (FD) |
| City | 271.168754 | -1.2727659 | \(y\) = 271.168754 - 1.2727659 (FD) |
| Town | 162.3233757 | -2.139283 | \(y\) = 162.3233757 - 2.139283 (FD) |

Interpretation:
For the Metro area, the intercept of RM 661,132 is the estimated PW when FD is zero, that is, the portion of PW not explained by FD. On average, Personal Wealth decreases by RM 10,231 for every additional RM 1,000 in FD.

For the City area, the intercept of RM 271,169 is the estimated PW when FD is zero. On average, Personal Wealth decreases by RM 1,273 for every additional RM 1,000 in FD.

For the Town area, the intercept of RM 162,323 is the estimated PW when FD is zero. On average, Personal Wealth decreases by RM 2,139 for every additional RM 1,000 in FD.
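
To make the reading of the intercept and slope concrete, the fitted Metro equation can be evaluated at chosen FD values. This is a minimal sketch: `predict_pw` is a hypothetical helper, the two FD values are arbitrary, and it assumes (as the interpretation above does) that both variables are recorded in RM thousands.

```r
# Fitted coefficients for the Metro area, copied from the table above (RM '000).
b0 <- 661.131679
b1 <- -10.2311865

# Hypothetical helper: fitted Personal Wealth for a given Fixed Deposit level.
predict_pw <- function(fd) b0 + b1 * fd

fd_values <- c(25, 26)           # two FD levels one unit (RM 1,000) apart
pw_hat <- predict_pw(fd_values)

print(pw_hat)        # fitted PW at each FD level
print(diff(pw_hat))  # change in fitted PW per extra unit of FD: the slope b1
print(predict_pw(0)) # fitted PW at FD = 0: the intercept b0
```

The intercept is an extrapolation whenever FD = 0 lies outside the observed range, so it is best read as a fitting constant rather than a realistic wealth level.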


Section (iii)

| Area | Pearson coefficient, \(r\) | Coefficient of determination, \(r^2\) | Spearman rank coefficient, \(r_s\) |
|---|---|---|---|
| Metro | -0.4552576 | 0.2072595 | -0.5277471 |
| City | -0.0846166 | 0.00716 | -0.0893667 |
| Town | -0.2632296 | 0.0692898 | -0.27018 |

Interpretation: The Metro area has the highest \(r^2\) value, indicating that 20.73% of the variation in Personal Wealth is explained by the variation in Fixed Deposits. The City and Town areas have markedly lower values: 0.72% and 6.93% respectively.

All areas display a negative association. The Metro area has the strongest negative correlation between FD and PW, followed by Town and then City.

The Metro data therefore show the strongest linear relationship, followed by Town and lastly City.
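
The coefficient of determination in a simple linear regression is the square of the Pearson coefficient, which gives a quick consistency check on the table above. A minimal sketch using the reported \(r\) values (variable names are illustrative); with the raw data, `cor(x, y, method = "pearson")` and `cor(x, y, method = "spearman")` would return the two coefficients.

```r
# Pearson coefficients copied from the table above.
r <- c(Metro = -0.4552576, City = -0.0846166, Town = -0.2632296)

# Coefficient of determination: share of PW variation explained by FD.
r_squared <- r^2
print(round(100 * r_squared, 2))  # as percentages

# Areas ordered from strongest to weakest linear association.
strength_order <- names(sort(abs(r), decreasing = TRUE))
print(strength_order)
```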


Section (iv)

Critical t value for \(\alpha\) = 0.05 (two-tailed, df = 38): \(\pm\) 2.0244
Critical F value for df = (1, 38) and \(\alpha\) = 0.05: 4.09817166

| Area | t value | F statistic | 95% confidence interval for the slope |
|---|---|---|---|
| Metro | -3.152 | 9.935 | (-16.8022788, -3.6600941) |
| City | -0.523 | 0.274 | (-6.1946981, 3.6491664) |
| Town | -1.682 | 2.829 | (-4.7140858, 0.4355197) |

Interpretation: For Metro, the t value (-3.152) is below the lower critical t value (-2.0244), the F statistic (9.935) exceeds the critical F value, and the confidence interval for the slope excludes zero. The slope of the regression model is therefore significant, and there is sufficient evidence of a linear relationship.

For City, the t value (-0.523) is not in the rejection region, the F statistic (0.274) is below the critical F value, and the confidence interval contains zero. The regression is not significant, and there is insufficient evidence of a linear relationship.

For Town, the t value (-1.682) is not in the rejection region, the F statistic (2.829) is below the critical F value, and the confidence interval contains zero. The regression is not significant, and there is insufficient evidence of a linear relationship.
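
The critical values and decisions can be checked with base R. The sketch below is illustrative and uses only the reported t values; it relies on the fact that in a simple linear regression with one predictor the F statistic equals the square of the slope's t statistic, and on n = 40 observations per area (so 38 residual degrees of freedom).

```r
n <- 40
df_resid <- n - 2      # residual degrees of freedom for one predictor
alpha <- 0.05

# Critical values for the two-tailed t test and the F test.
t_crit <- qt(1 - alpha / 2, df = df_resid)
f_crit <- qf(1 - alpha, df1 = 1, df2 = df_resid)

# Slope t values copied from the table above.
t_values <- c(Metro = -3.152, City = -0.523, Town = -1.682)
f_values <- t_values^2                 # F = t^2 with a single predictor
significant <- abs(t_values) > t_crit  # same decision as f_values > f_crit

print(c(t_crit = t_crit, f_crit = f_crit))
print(data.frame(t = t_values, F = round(f_values, 3), significant))
```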


Conclusion

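The Metro area displayed a statistically significant negative linear relationship between Personal Wealth and Fixed Deposits, although Fixed Deposits explain only about 21% of the variation in Personal Wealth.

The Town and City areas did not meet the criteria for significance, so a linear relationship between the two variables cannot be concluded for them.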