Instruction
1. Save this file as
“Math_2305_data_science_assignment_your_first_name_last_name.Rmd” (e.g.,
assignment4_jonny_appleseed.Rmd).
2. For each task, provide appropriate R command(s) in the code
chunk, and execute the code chunk to generate an outcome.
3. After completing all tasks, save your Rmd file and produce
an HTML report.
3a. Make sure to delete all intermediate code chunks
before creating an HTML report.
4. Submit your Rmd file and the rendered HTML report to D2L by its
due date.
1. Load the data file (5 points)
- Load
Math 2305 Data Science Assignment_data.csv and
save it in an R object so that you can use it in the subsequent analysis.
Use the tidyverse package in this Rmd file, as you will use data
science tools and techniques when analyzing the data.
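A minimal sketch of the loading step, assuming the readr package (part of the tidyverse, so `library(tidyverse)` would attach it as well). The temporary file and its three rows are made-up stand-ins so the example runs on its own; with the real data, the path to the assignment's CSV file would take the place of `demo_path`.

```r
library(readr)  # part of the tidyverse; library(tidyverse) attaches it too

# Hypothetical stand-in for the assignment's CSV (made-up values),
# written to a temporary file so this sketch is self-contained.
demo_path <- tempfile(fileext = ".csv")
writeLines(c("teachers,compstu,testscr",
             "10.9,0.34,690.8",
             "11.1,0.42,661.2",
             "82.9,0.11,643.6"),
           demo_path)

# Read the CSV into an R object (a tibble) for the later tasks.
# With the real data, replace demo_path with the path to the CSV file.
df <- read_csv(demo_path, show_col_types = FALSE)
df
```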
Examine the loaded data set
2. How many rows and columns does it have? (3 points)
## [1] 77 17
3. Examine the first several rows of the data sets (3 points)
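Tasks 2 and 3 can be sketched with base R's `dim()` and `head()`. The small data frame below is a hypothetical stand-in with made-up scores; on the real data, `dim(df)` is the kind of call that produces the `77 17` output shown above.

```r
# Hypothetical toy data frame standing in for the loaded data set.
df <- data.frame(readscr = c(650, 662, 671, 655),
                 mathscr = c(648, 659, 675, 650))

dim(df)      # number of rows, then number of columns
head(df, 3)  # first three rows; head(df) shows the first six by default
```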
4. Compute the mean and standard deviation of teachers,
compstu, and testscr values (4 points)
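One possible way to get all six summary values in a single call is dplyr's `summarise()` with `across()`. This is only a sketch on made-up numbers; the column names come from the task, while the values and the `stats` object name are assumptions for illustration.

```r
library(dplyr)  # part of the tidyverse

# Hypothetical toy values; the real columns come from the assignment's data.
df <- tibble(teachers = c(10, 20, 30),
             compstu  = c(0.1, 0.2, 0.3),
             testscr  = c(640, 650, 660))

# One row with a mean and an sd column per variable,
# named teachers_mean, teachers_sd, and so on.
stats <- df %>%
  summarise(across(c(teachers, compstu, testscr),
                   list(mean = mean, sd = sd)))
stats
```

If the real columns contain missing values, `mean()` and `sd()` would need `na.rm = TRUE` to return a number rather than `NA`.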
5. Create a scatter plot of readscr
vs. mathscr (5 points)
- Use
readscr for the X axis
- Use “Reading score” for the X axis label
- Use
mathscr for the Y axis
- Use “Math score” for the Y axis label
- Show a best fit line
## `geom_smooth()` using formula = 'y ~ x'
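A sketch of a plot that meets the requirements listed above, assuming ggplot2 and a linear best-fit line via `geom_smooth(method = "lm")`. The scores are made up so the example runs without the assignment's data. Because no `formula` argument is given, ggplot2 prints the `using formula = 'y ~ x'` message seen above.

```r
library(ggplot2)  # part of the tidyverse

# Hypothetical toy scores so the sketch runs without the assignment's CSV.
df <- data.frame(readscr = c(640, 650, 655, 662, 671, 680),
                 mathscr = c(642, 648, 660, 659, 675, 678))

p <- ggplot(df, aes(x = readscr, y = mathscr)) +
  geom_point() +                               # one point per row
  geom_smooth(method = "lm") +                 # linear best-fit line with a confidence band
  labs(x = "Reading score", y = "Math score")  # axis labels from the task
p
```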

6. Compute a correlation between readscr and
mathscr (30 points)
##
## Pearson's product-moment correlation
##
## data: df$readscr and df$mathscr
## t = 21.843, df = 75, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.8911715 0.9547822
## sample estimates:
## cor
## 0.9295991
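Output of this form comes from base R's `cor.test()`, which runs a two-sided Pearson test by default. The sketch below uses made-up scores, so its numbers will differ from the output above; the individual pieces of the result can be pulled out by name.

```r
# Hypothetical toy scores; with the real data, df is the loaded data set.
df <- data.frame(readscr = c(640, 650, 655, 662, 671, 680),
                 mathscr = c(642, 648, 660, 659, 675, 678))

result <- cor.test(df$readscr, df$mathscr)  # Pearson, two-sided by default
result

result$estimate   # sample correlation r
result$statistic  # t statistic
result$parameter  # degrees of freedom (n - 2)
result$p.value    # two-sided p-value
result$conf.int   # 95% confidence interval for the correlation
```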
- State null hypothesis and alternative hypothesis
My Response:
Ho: There is no linear correlation between Reading Scores and Math
Scores (the population correlation is zero).
Ha: There is a nonzero (positive or negative) linear correlation between
Reading Scores and Math Scores.
- Report the test statistic, degrees of freedom, and statistical
significance
My Response:
The t-statistic for the observed correlation is t = 21.843, with 75
degrees of freedom. The p-value is extremely small: R reports it as
less than 2.2e-16. Given this p-value, there is compelling evidence
against the null hypothesis, so we reject Ho, which states that there is
no linear correlation between reading and math scores. The observed
strong positive correlation between readscr and mathscr is therefore
statistically significant and is very unlikely to be a result of random
variation in our sample.
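As a check on the reported numbers, the t statistic can be recomputed from the correlation and the degrees of freedom using t = r * sqrt(df) / sqrt(1 - r^2). The values of `r` and `df` below are copied from the output above; the variable names are only illustrative.

```r
r  <- 0.9295991  # sample correlation from the cor.test output
df <- 75         # degrees of freedom: n - 2 with n = 77 rows

t_stat  <- r * sqrt(df) / sqrt(1 - r^2)
p_value <- 2 * pt(-abs(t_stat), df)  # two-sided p-value

t_stat                         # about 21.84, matching the reported t
p_value < .Machine$double.eps  # TRUE
```

R prints `p-value < 2.2e-16` because 2.2e-16 is the machine epsilon it uses as the reporting floor, so the actual p-value is smaller than that figure rather than equal to it.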
- Describe the meaning of p-value in words
My Response:
The p-value is the probability of observing a correlation between
readscr and mathscr at least as strong as the one in our sample if the
null hypothesis were true, that is, if there were no linear correlation
in the population. A small p-value (for example, less than 0.05) means
that such a result would be very unlikely under the null hypothesis, so
the observed relationship is statistically significant and is unlikely
to be just a result of random variation in our sample.
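One way to make the idea of a p-value concrete is a permutation sketch: shuffling one variable breaks any real link between the two, so the shuffled correlations show what "no relationship" looks like. The data here are simulated, not the assignment's, and this is an illustration of the concept rather than the t-based method `cor.test()` uses.

```r
set.seed(1)  # make the simulation reproducible

# Simulated data with a built-in positive relationship (not the assignment's data).
x <- rnorm(30)
y <- 0.5 * x + rnorm(30)

r_obs <- cor(x, y)  # observed correlation

# Correlations after shuffling y, i.e. in a world where the null hypothesis holds.
r_null <- replicate(2000, cor(x, sample(y)))

# Share of shuffled correlations at least as extreme as the observed one:
# an approximate two-sided p-value.
p_perm <- mean(abs(r_null) >= abs(r_obs))
p_perm
```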
7. Report the findings from your correlation analysis (30
points)
- Write a sentence describing the findings
- Make sure to include all important information
My Response:
From our analysis, the test statistic is t = 21.843 with 75 degrees of
freedom, and the p-value is less than 2.2e-16. With a 95% confidence
level, our significance level (alpha) is 0.05. Since the p-value is much
smaller than alpha, we reject the null hypothesis (Ho) in favor of the
alternative hypothesis (Ha). There is a statistically significant, strong
positive linear relationship between reading scores and math scores
(r = 0.93, 95% CI [0.891, 0.955], t(75) = 21.843, p < 2.2e-16).
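The 95% confidence interval in the output can be reproduced with the Fisher z transformation, which is the method `cor.test()` uses for Pearson correlations. The inputs below are taken from the output above; the `sprintf()` line is just one possible way to assemble a reporting sentence, not a required format.

```r
r <- 0.9295991  # sample correlation from the cor.test output
n <- 77         # number of rows in the data set

z  <- atanh(r)         # Fisher z transformation of r
se <- 1 / sqrt(n - 3)  # standard error on the z scale
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)  # back-transform the interval

round(ci, 3)  # 0.891 0.955

# Assemble the key numbers into one reporting string.
sprintf("r(%d) = %.2f, 95%% CI [%.3f, %.3f]", n - 2, r, ci[1], ci[2])
```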
8. Create an HTML report of your correlation analysis
Note: The professor, not the student, reserves the right to decide which
answers, code, and step-by-step processes are correct. Once the student
submits the assignment, they are not able to resubmit for a higher grade,
and all grades are final when the professor enters them in D2L.
Convert to HTML for 20 points and submit both the R Markdown file
(.Rmd) and the HTML report to the assignment folder in D2L.
You will have a total of 100 points for this assignment.