This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
summary(cars)
##      speed           dist
##  Min.   : 4.0   Min.   :  2.00
##  1st Qu.:12.0   1st Qu.: 26.00
##  Median :15.0   Median : 36.00
##  Mean   :15.4   Mean   : 42.98
##  3rd Qu.:19.0   3rd Qu.: 56.00
##  Max.   :25.0   Max.   :120.00
You can also embed plots, for example:
Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.
This analysis has four sections.
DatasetA <- read_excel("/Users/ha113ab/Desktop/datasets/DatasetA.xlsx") reads the dataset from the folder where it was saved, using its full file path. read_excel() comes from the readxl package.
In the first section we compute descriptive statistics: means and standard deviations.
mean(DatasetA$StudyHours) finds the average study hours, and sd(DatasetA$StudyHours) finds how spread out the study hours are. The same applies to ExamScore.
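As a minimal sketch of the step above, the mean and standard deviation of both columns can be computed in one call. The data frame here is a small made-up stand-in for DatasetA (the real values live in the Excel file), and the name stats is only illustrative.

```r
# Hypothetical stand-in for DatasetA; the real values come from DatasetA.xlsx.
DatasetA <- data.frame(
  StudyHours = c(2, 4, 5, 7, 9),
  ExamScore  = c(50, 58, 65, 72, 85)
)

# Apply mean() and sd() to each column; the result is a 2 x 2 matrix
# with rows "mean" and "sd" and one column per variable.
stats <- sapply(
  DatasetA[c("StudyHours", "ExamScore")],
  function(x) c(mean = mean(x), sd = sd(x))
)

print(round(stats, 2))
```

Note that sd() divides by n - 1 (the sample standard deviation), which is usually what is wanted for data like these.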
We then displayed the distributions with
hist(DatasetA$StudyHours, main = "StudyHours", breaks = 20, col = "orange", border = "black", cex.main = 1, cex.axis = 1, cex.lab = 1)
and
hist(DatasetA$ExamScore, main = "ExamScore", breaks = 20, col = "grey", border = "white", cex.main = 1, cex.axis = 1, cex.lab = 1)
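One detail worth checking is that a single number passed as breaks is only a suggestion: R rounds the bin edges to "pretty" values, so the number of bars may not be exactly 20. The sketch below inspects the bins that hist() actually picks, using a made-up vector (study_hours is hypothetical, not the real data).

```r
# Hypothetical stand-in values; the real ones come from DatasetA$StudyHours.
study_hours <- c(1.5, 2, 2.5, 3, 3, 4, 4.5, 5, 5, 5.5, 6, 6, 6.5, 7, 8, 9.5)

# plot = FALSE returns the histogram object without drawing anything.
h <- hist(study_hours, breaks = 20, plot = FALSE)

print(h$breaks)          # bin edges R actually chose
print(h$counts)          # number of observations in each bin
print(length(h$counts))  # number of bins, which may differ from the 20 requested

# Sanity check: every observation falls into exactly one bin.
print(sum(h$counts) == length(study_hours))
```

Dropping plot = FALSE and adding the main, col, and border arguments draws the chart as in the calls above.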
In the second section we run normality tests: we check whether the data are normally distributed (bell-shaped). shapiro_study <- shapiro.test(DatasetA$StudyHours) tests whether study hours follow a normal distribution; a small p-value is evidence that they do not. The same test is used for exam scores.
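The sketch below shows one way to run the test on both variables and read off the results. The data are simulated stand-ins for DatasetA, and the 0.05 cutoff (alpha), the name shapiro_exam, and the results table are assumptions for illustration, not part of the original analysis.

```r
# Simulated stand-in for DatasetA; the real values come from DatasetA.xlsx.
set.seed(42)
DatasetA <- data.frame(
  StudyHours = rnorm(50, mean = 6, sd = 2),
  ExamScore  = rnorm(50, mean = 65, sd = 10)
)

# Shapiro-Wilk test: the null hypothesis is that the data are normal.
shapiro_study <- shapiro.test(DatasetA$StudyHours)
shapiro_exam  <- shapiro.test(DatasetA$ExamScore)

alpha <- 0.05  # assumed significance level

p_values <- c(shapiro_study$p.value, shapiro_exam$p.value)
results <- data.frame(
  variable = c("StudyHours", "ExamScore"),
  W = unname(c(shapiro_study$statistic, shapiro_exam$statistic)),
  p_value = p_values,
  reject_normality = p_values < alpha
)

print(results)
```

If reject_normality is TRUE for either variable, a rank-based method such as Spearman's correlation is a reasonable choice for the next step, because it does not rely on normality.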
In the third section we run a correlation analysis: we check the relationship between the variables. cor.test(DatasetA$StudyHours, DatasetA$ExamScore, method = "spearman") tests whether study hours and exam scores are associated using Spearman's rank correlation, which does not assume a straight-line relationship.
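To illustrate what "does not assume a straight-line relationship" means, here is a small sketch with made-up numbers (x and y are hypothetical, not the real data). The relationship is perfectly increasing but curved, so Spearman's rho is 1 while Pearson's r is below 1.

```r
# Made-up data: y always rises with x, but along a curve rather than a line.
x <- 1:10
y <- x^3

spearman <- cor.test(x, y, method = "spearman")
pearson  <- cor.test(x, y, method = "pearson")

print(unname(spearman$estimate))  # rho: based on ranks only
print(unname(pearson$estimate))   # r: penalised by the curvature

# Spearman's rho is Pearson's r computed on the ranks of the data.
rho_from_ranks <- cor(rank(x), rank(y))
print(all.equal(unname(spearman$estimate), rho_from_ranks))
```

With real data that contain tied values, cor.test() with method = "spearman" warns that it cannot compute an exact p-value; passing exact = FALSE requests the approximate p-value directly.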
In the fourth and final section we visualize the data. hist() creates histograms, bar-style charts that show distributions. The first histogram displays how many students studied for different numbers of hours. The second histogram shows how many students received different exam scores. ggscatter(), from the ggpubr package, creates a scatterplot with a dot for each student and a trend line.
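For readers without ggpubr installed, the same kind of picture can be sketched in base R. This is a stand-in for ggscatter(), not the call used in the analysis; the data frame is made up, and the output file name and colours are arbitrary choices.

```r
# Hypothetical stand-in for DatasetA; the real values come from DatasetA.xlsx.
DatasetA <- data.frame(
  StudyHours = c(2, 4, 5, 7, 9),
  ExamScore  = c(50, 58, 65, 72, 85)
)

# Least-squares line used as the trend line.
fit <- lm(ExamScore ~ StudyHours, data = DatasetA)

# Write the plot to a temporary PDF so the sketch also runs non-interactively.
out_file <- file.path(tempdir(), "study_vs_score.pdf")
pdf(out_file)
plot(
  DatasetA$StudyHours, DatasetA$ExamScore,
  xlab = "StudyHours", ylab = "ExamScore",
  main = "Exam score vs. study hours", pch = 19
)
abline(fit, col = "red")  # add the fitted trend line
dev.off()

print(coef(fit))  # intercept and slope of the trend line
```

In an interactive session, the pdf() and dev.off() lines can be dropped so the plot appears in the plotting window instead.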