5_Capstone_Analysis_[Coy Johnson]

(Introduction) I have a data set from my class that lists each student's ID and their scores on 10 assignments. I need to organize and display this data so that I can identify students who need help and assignments that are causing problems. Teachers, students, and administrators would all benefit from this analysis.

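# Load the packages used below: glimpse() comes from dplyr, the charts use ggplot2
library(dplyr)
library(ggplot2)

data_hm <- read.csv("data/student_assignment_scores.csv")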

head(data_hm)
  Student_ID Assignment_1 Assignment_2 Assignment_3 Assignment_4 Assignment_5
1  Student_1           98           77           52           98           54
2  Student_2           86           89           70           67           58
3  Student_3           50           52           51           54           65
4  Student_4           74           82           87           95           75
5  Student_5           59           91           59           89           92
6  Student_6           85           73           89           89           91
  Assignment_6 Assignment_7 Assignment_8 Assignment_9 Assignment_10
1           87           54           80           82            79
2           80           84           57           87            54
3           55           63           65           92            71
4           69           63           52           90            69
5           64           77           86           71            99
6           83           82           62           75            90
glimpse(data_hm)
Rows: 30
Columns: 11
$ Student_ID    <chr> "Student_1", "Student_2", "Student_3", "Student_4", "Stu…
$ Assignment_1  <int> 98, 86, 50, 74, 59, 85, 67, 98, 96, 73, 56, 85, 74, 86, …
$ Assignment_2  <int> 77, 89, 52, 82, 91, 73, 79, 92, 64, 71, 57, 85, 53, 71, …
$ Assignment_3  <int> 52, 70, 51, 87, 59, 89, 54, 82, 98, 88, 85, 94, 91, 58, …
$ Assignment_4  <int> 98, 67, 54, 95, 89, 89, 70, 85, 85, 88, 91, 67, 76, 62, …
$ Assignment_5  <int> 54, 58, 65, 75, 92, 91, 91, 82, 78, 74, 100, 81, 66, 95,…
$ Assignment_6  <int> 87, 80, 55, 69, 64, 83, 91, 72, 61, 75, 90, 50, 51, 73, …
$ Assignment_7  <int> 54, 84, 63, 63, 77, 82, 86, 84, 75, 80, 94, 92, 81, 66, …
$ Assignment_8  <int> 80, 57, 65, 52, 86, 62, 61, 50, 79, 100, 80, 86, 96, 55,…
$ Assignment_9  <int> 82, 87, 92, 90, 71, 75, 69, 94, 86, 92, 64, 70, 83, 74, …
$ Assignment_10 <int> 79, 54, 71, 69, 99, 90, 60, 77, 87, 91, 58, 72, 75, 100,…

This data set contains 30 students and 10 assignments, with one grade per student per assignment, as shown above. Every assignment column is read in as an integer, and no data quality issues were found.
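A data quality check like the one described above could be sketched as follows. This is only an illustration, not the check that was actually run: `check_scores()` is a hypothetical helper, the 0 to 100 score range is an assumption, and `scores_demo` is a small stand-in built from the first three rows and first three assignments of the `head()` output.

```r
# Stand-in for data_hm with the same column layout
# (first three students and first three assignments from head(data_hm)).
scores_demo <- data.frame(
  Student_ID   = c("Student_1", "Student_2", "Student_3"),
  Assignment_1 = c(98L, 86L, 50L),
  Assignment_2 = c(77L, 89L, 52L),
  Assignment_3 = c(52L, 70L, 51L)
)

# Hypothetical helper: counts common data quality problems.
# Assumes valid scores lie between min_score and max_score.
check_scores <- function(df, id_col = "Student_ID",
                         min_score = 0, max_score = 100) {
  score_cols <- setdiff(names(df), id_col)
  scores <- df[score_cols]
  list(
    n_missing      = sum(is.na(scores)),
    n_out_of_range = sum(scores < min_score | scores > max_score, na.rm = TRUE),
    n_duplicate_id = sum(duplicated(df[[id_col]])),
    all_numeric    = all(vapply(scores, is.numeric, logical(1)))
  )
}

quality <- check_scores(scores_demo)
str(quality)
```

Calling `check_scores(data_hm)` on the full data set would run the same four checks across all 30 students.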

average_score_table <- data.frame(
  assignment = names(data_hm)[-1],
  score = colMeans(data_hm[-1], na.rm = TRUE)
)

average_score_table
                 assignment    score
Assignment_1   Assignment_1 78.80000
Assignment_2   Assignment_2 74.16667
Assignment_3   Assignment_3 74.10000
Assignment_4   Assignment_4 76.03333
Assignment_5   Assignment_5 79.96667
Assignment_6   Assignment_6 74.60000
Assignment_7   Assignment_7 73.53333
Assignment_8   Assignment_8 70.86667
Assignment_9   Assignment_9 78.33333
Assignment_10 Assignment_10 77.63333
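# Keep the bars in assignment order (otherwise Assignment_10 sorts right after Assignment_1)
ggplot(average_score_table, aes(x = factor(assignment, levels = assignment), y = score)) +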
  geom_col(fill = "#1D9E75", color = "white") +
  labs(
    title = "Average Score per Assignment",
    x = "Assignment",
    y = "Average Score"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(size = 16, face = "bold", hjust = 0.5),
    axis.text.x = element_text(angle = 45, hjust = 1)
  )

Average score across all 30 students per assignment

The bar chart above lets me quickly compare the assignments and see whether any of them have a much lower average than the others. None do: the averages range from about 70.9 (Assignment_8) to about 80.0 (Assignment_5), so it can be inferred that all of the assignments are most likely doable.
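The same comparison can be made numerically. The sketch below copies the averages from the table above into a named vector (the name `avg_scores` is only for illustration) and pulls out the lowest and highest assignment and the gap between them.

```r
# Assignment averages copied from the average_score_table output above.
avg_scores <- c(
  Assignment_1  = 78.80000,
  Assignment_2  = 74.16667,
  Assignment_3  = 74.10000,
  Assignment_4  = 76.03333,
  Assignment_5  = 79.96667,
  Assignment_6  = 74.60000,
  Assignment_7  = 73.53333,
  Assignment_8  = 70.86667,
  Assignment_9  = 78.33333,
  Assignment_10 = 77.63333
)

# Assignment with the lowest and the highest class average.
lowest  <- names(avg_scores)[which.min(avg_scores)]
highest <- names(avg_scores)[which.max(avg_scores)]

# Gap in points between the highest and lowest averages.
spread <- max(avg_scores) - min(avg_scores)

cat("Lowest:", lowest, "| Highest:", highest,
    "| Spread:", round(spread, 1), "points\n")
```

A spread of roughly 9 points supports the reading of the chart: no single assignment stands far below the rest.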


The next visual shows each student's individual average. With this visual I can spot students who need more attention or extra help.

library(dplyr)
library(ggplot2)

student_avg <- data_hm %>%
  mutate(Average = rowMeans(select(., starts_with("Assignment")))) %>%
  arrange(Average)

ggplot(student_avg,
       aes(x = reorder(Student_ID, Average),
           y = Average)) +
  geom_col(fill = "steelblue") +
  coord_flip() +
  labs(
    title = "Student Average Scores",
    x = "Student",
    y = "Average Score"
  )
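Students who need attention can also be listed in a table instead of being read off the chart. The sketch below uses base R and a 70% cutoff, the same mark used in the summary. To stay self-contained it runs only on the six students shown in the `head()` output, so `first_six`, `pass_mark`, and `flagged` are illustrative names and the result covers only those six rows, not the whole class.

```r
# The six rows printed by head(data_hm), re-entered by column.
first_six <- data.frame(
  Student_ID    = paste0("Student_", 1:6),
  Assignment_1  = c(98, 86, 50, 74, 59, 85),
  Assignment_2  = c(77, 89, 52, 82, 91, 73),
  Assignment_3  = c(52, 70, 51, 87, 59, 89),
  Assignment_4  = c(98, 67, 54, 95, 89, 89),
  Assignment_5  = c(54, 58, 65, 75, 92, 91),
  Assignment_6  = c(87, 80, 55, 69, 64, 83),
  Assignment_7  = c(54, 84, 63, 63, 77, 82),
  Assignment_8  = c(80, 57, 65, 52, 86, 62),
  Assignment_9  = c(82, 87, 92, 90, 71, 75),
  Assignment_10 = c(79, 54, 71, 69, 99, 90)
)

pass_mark <- 70  # cutoff assumed from the 70% mark in the summary

# Average each student's scores across the assignment columns.
score_cols <- grep("^Assignment_", names(first_six), value = TRUE)
first_six$Average <- rowMeans(first_six[score_cols])

# Keep only the students whose average is below the cutoff.
flagged <- first_six[first_six$Average < pass_mark, c("Student_ID", "Average")]
flagged
```

Among these six students only Student_3 is flagged, with an average of 61.8. Swapping `first_six` for `data_hm` would apply the same filter to all 30 students.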


(Summary)

Analyzing the students' assignment scores leads to a few conclusions. The class average on every assignment was above 70%, which tells me that all of the assignments are doable. Looking at the individual student averages, 3 students fall under the 70% mark. This tells me that the majority of the class is passing, but I should give these three students some extra assistance.