Hewayalage Vishva Lahiru Kantha Abeyrathne (s3735195)
Kodithuwakku Arachchige Iresh Udara Kaushalya (s3704769)
Last updated: 28 October, 2018
Descriptive statistics will be used to summarise the weights in both situations (before and after the diet) and the difference between the two.
A line plot will be used to compare the paired weights visually.
A box plot of the weight differences will then be created to identify possible outliers before the paired-samples t-test.
A Q-Q plot will be used to check whether the differences are approximately normally distributed, which the paired-samples t-test assumes.
Finally, a paired-samples t-test will be applied, with the relevant null and alternative hypotheses specified, in order to reach a decision.
#Descriptive Statistics for weight6weeks and pre.weight
Diet %>% summarise(
  Mean_weight6weeks = mean(weight6weeks, na.rm = TRUE),
  SD_weight6weeks = sd(weight6weeks, na.rm = TRUE),
  Mean_pre.weight = mean(pre.weight, na.rm = TRUE),
  SD_pre.weight = sd(pre.weight, na.rm = TRUE),
  Mean_Difference = mean(weight6weeks - pre.weight, na.rm = TRUE),
  SD_Difference = sd(weight6weeks - pre.weight, na.rm = TRUE),
  n = n()
) -> table1
knitr::kable(table1)

| Mean_weight6weeks | SD_weight6weeks | Mean_pre.weight | SD_pre.weight | Mean_Difference | SD_Difference | n |
|---|---|---|---|---|---|---|
| 68.68077 | 8.924504 | 72.52564 | 8.723344 | -3.844872 | 2.551478 | 78 |
A “weight_difference” column was created with the mutate() function as “weight6weeks” minus “pre.weight”, so negative values indicate weight loss.
Descriptive statistics were then generated for the “weight_difference” variable.
#Create differences Column (weight_difference)
Diet <- Diet %>% mutate(weight_difference = weight6weeks - pre.weight)
#Descriptive Statistics for weight_difference
Diet %>% summarise(
  Min = min(weight_difference, na.rm = TRUE),
  Q1 = quantile(weight_difference, probs = .25, na.rm = TRUE),
  Median = median(weight_difference, na.rm = TRUE),
  Q3 = quantile(weight_difference, probs = .75, na.rm = TRUE),
  Max = max(weight_difference, na.rm = TRUE),
  Mean = mean(weight_difference, na.rm = TRUE),
  SD = sd(weight_difference, na.rm = TRUE),
  IQR = IQR(weight_difference, na.rm = TRUE),
  n = n(),
  Missing = sum(is.na(weight_difference))
) -> table2
knitr::kable(table2)

| Min | Q1 | Median | Q3 | Max | Mean | SD | IQR | n | Missing |
|---|---|---|---|---|---|---|---|---|---|
| -9.2 | -5.55 | -3.6 | -2 | 2.1 | -3.844872 | 2.551478 | 3.55 | 78 | 0 |
#Line plot visualization
matplot(t(data.frame(Diet$weight6weeks, Diet$pre.weight)),
        type = "b",
        pch = 19,
        col = 1,
        lty = 1,
        xlab = "",
        ylab = "Weight",
        xaxt = "n"
)
axis(1, at = 1:2, labels = c("After 6 weeks", "Before"))
#Outliers
boxplot(Diet$weight_difference)
#Check normality of the differences using Q-Q plot (qqPlot() is from the car package)
qqPlot(Diet$weight_difference, dist="norm")
## [1] 17 77
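As a rough numerical companion to the box plot, the 1.5 * IQR rule can be applied to the quartiles reported in the summary table for “weight_difference”. This is only a sketch under assumptions: the variable names are illustrative, and boxplot() derives its whiskers from hinges, which can differ slightly from the quantile() values used here.

```r
# Quartiles and extremes of weight_difference, copied from the summary table
q1 <- -5.55
q3 <- -2
min_diff <- -9.2
max_diff <- 2.1

# 1.5 * IQR fences (Tukey's rule); names here are illustrative
iqr <- q3 - q1
lower_fence <- q1 - 1.5 * iqr
upper_fence <- q3 + 1.5 * iqr

# TRUE if the smallest or largest difference lies beyond a fence
has_outliers <- min_diff < lower_fence || max_diff > upper_fence

cat("Lower fence:", lower_fence, "\n")
cat("Upper fence:", upper_fence, "\n")
cat("Outliers beyond the fences:", has_outliers, "\n")
```

With these values both the minimum and the maximum fall inside the fences, so this rule flags none of the differences as outliers.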
The significance level is set at 0.05.
Null Hypothesis :
\[H_0: \mu_1 = \mu_2 \]
Alternative Hypothesis :
\[H_A: \mu_1 \ne \mu_2\]
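Because the measurements are paired, the same hypotheses can be restated in terms of the mean difference. The following is a sketch that assumes \(\mu_1\) is the population mean weight after 6 weeks and \(\mu_2\) the population mean weight before the diet (the order in which the columns are passed to t.test()); the symbols \(\mu_d\), \(\bar{d}\) and \(s_d\) are notation introduced here for the population mean, sample mean and sample standard deviation of the differences.

\[\mu_d = \mu_1 - \mu_2, \qquad H_0: \mu_d = 0, \qquad H_A: \mu_d \ne 0\]
\[t = \frac{\bar{d}}{s_d / \sqrt{n}}, \qquad df = n - 1\]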
#Calculation of the paired sample t-test
t.test(Diet$weight6weeks, Diet$pre.weight,
       paired = TRUE,
       alternative = "two.sided")
##
## Paired t-test
##
## data: Diet$weight6weeks and Diet$pre.weight
## t = -13.309, df = 77, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -4.420141 -3.269602
## sample estimates:
## mean of the differences
## -3.844872
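The test output can be cross-checked by hand from the descriptive statistics of the differences. The sketch below assumes only the mean, standard deviation and sample size reported in the tables; the variable names are illustrative and not part of the original analysis.

```r
# Summary values of the differences, copied from the descriptive statistics
mean_diff <- -3.844872
sd_diff   <- 2.551478
n         <- 78

# Standard error of the mean difference and degrees of freedom
se_diff <- sd_diff / sqrt(n)
df      <- n - 1

# Paired t statistic: mean difference divided by its standard error
t_stat <- mean_diff / se_diff

# 95% confidence interval for the mean difference
t_crit <- qt(0.975, df)
ci <- c(mean_diff - t_crit * se_diff, mean_diff + t_crit * se_diff)

cat("t =", round(t_stat, 3), " df =", df, "\n")
cat("95% CI:", round(ci, 3), "\n")
```

Up to rounding of the inputs, these values should agree with the t statistic, degrees of freedom and confidence interval printed by t.test().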
The main finding of the investigation is that weights before and after the diet plan differ significantly: the paired t-test gave t = -13.309 with 77 degrees of freedom and p < 0.001, so the null hypothesis is rejected at the 0.05 level.
On average, weight was 3.84 lower after 6 weeks, and the 95% confidence interval for the mean difference (-4.42 to -3.27) lies entirely below zero.
Therefore, it can be stated that the diet plan has an impact in reducing people's weight.
The answer to the question “Does diet plan reduce the weight of a person?” is therefore ‘Yes’.
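The research question is directional (does the diet reduce weight?), whereas the reported test is two-sided. As a sketch, the one-sided p-value for the alternative that the mean difference is below zero can be obtained from the reported t statistic with pt(). The variable names are illustrative; with the data loaded, the same result should come from calling t.test() with alternative = "less".

```r
# t statistic and degrees of freedom, copied from the paired t-test output
t_stat <- -13.309
df     <- 77
alpha  <- 0.05

# Lower-tail probability P(T <= t_stat) under the null hypothesis
p_one_sided <- pt(t_stat, df)

# Decision at the 0.05 significance level
reject_h0 <- p_one_sided < alpha

cat("One-sided p-value:", format(p_one_sided, digits = 3), "\n")
cat("Reject H0 in favour of a decrease:", reject_h0, "\n")
```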
A major strength of the investigation is the accuracy of the dataset: all the weights were taken from a real-world sample.
However, the investigation still has limitations, and further improvements might be required.
One limitation is the sample size of 78 records. A larger sample would make the sampling distribution of the mean difference closer to normal, so the t-test would rely less on the normality assumption.