Introduction

For this project I chose to find the five most popular history classes offered here at Sewanee. I am very interested in history, as I am thinking of minoring in it. This is important to me because it shows me which classes I should consider when it comes time to register.

Data

-I acquired all of this data from the University Registrar as a CSV file. This is the code I used to load the tidyverse, which provides read_csv, and to read in the data from the registrar:

library(tidyverse)
course_data_raw <- read_csv("data/course_data.csv")

Data Wrangling

For this project I had to find a way to keep only the columns and rows I needed.

Filter

-I used the filter function to keep only the rows for one subject in the data, History (HIST).

more_filter <- filter(course_data_raw , subj == "HIST")
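
As a small illustration of what filter does, here is a sketch on a made-up table. The data frame toy_courses and its values are invented for this example and are not registrar data; only the column name subj comes from the code above.

```r
library(dplyr)

# Hypothetical stand-in for the registrar data (all values are made up)
toy_courses <- data.frame(
  title    = c("Class A", "Class B", "Class C", "Class D"),
  subj     = c("HIST", "MATH", "HIST", "BIOL"),
  enrolled = c(20, 25, 18, 30)
)

# Keep only the rows whose subject code is exactly "HIST"
toy_history <- filter(toy_courses, subj == "HIST")

# Keep rows that match any of several subject codes
toy_two_subjects <- filter(toy_courses, subj %in% c("HIST", "MATH"))
```

The comparison is case-sensitive, so "hist" would not match "HIST".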

Select

-I also realized I had far too many columns, so I used select to size down my data table and keep only the variables I needed.

ten_select <- select(more_filter , title, subj , limit, enrolled , available)

Arrange and Slice

-I then used the arrange and slice functions to cut the data set down to only the top classes (the first ten rows at this stage). I also needed to see the data in order, so I used desc to make the data more organized and readable. (I tried two different variables for arrange, which is why there are two calls.)

# Sort by enrollment, largest first
more_arrange <- arrange(ten_select, desc(enrolled))
# Re-sort by available seats, largest first; this ordering takes priority
most_arrange <- arrange(more_arrange, desc(available))

ten_slice <- slice(most_arrange, 1:10)
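
For comparison, arrange can also take both sort columns in a single call, in which case the first column decides the order and the second only breaks ties. This is a sketch on an invented table (toy_sections and its numbers are assumptions for illustration), and slice_head is one way to keep the first few rows.

```r
library(dplyr)

# Hypothetical sections (all values are made up)
toy_sections <- data.frame(
  title     = c("Class A", "Class B", "Class C", "Class D"),
  enrolled  = c(20, 25, 20, 12),
  available = c(0, 3, 5, 8)
)

# Sort by enrolled (largest first); available only breaks ties in enrolled
toy_sorted <- arrange(toy_sections, desc(enrolled), desc(available))

# Keep the first two rows of the sorted table
toy_top_two <- slice_head(toy_sorted, n = 2)
```

Here Class A and Class C tie on enrolled, so the one with more available seats is listed first among the two.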

Mutate

-After I got an easier-to-read data set, I chose to mutate my data a little to combine two columns into a new column with the final values I would need.

ten_mutate <- ten_slice %>%
  mutate(empty_seats = limit - enrolled)

mutate_ten <- ten_mutate %>% select(-available)
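
mutate can add more than one column at a time. The sketch below uses an invented table (toy_seats and its values are made up) and adds a hypothetical percent_full column next to empty_seats, which would be one way to check how full each class is.

```r
library(dplyr)

# Hypothetical classes (all values are made up)
toy_seats <- data.frame(
  title    = c("Class A", "Class B"),
  limit    = c(20, 30),
  enrolled = c(18, 30)
)

toy_seats <- toy_seats %>%
  mutate(
    empty_seats  = limit - enrolled,        # seats still open
    percent_full = 100 * enrolled / limit   # share of the limit that is taken
  )
```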

Group by and Summarize

-Finally I ran into a problem. I was originally trying to find the 10 most popular classes, but rows in my data set were being combined and added together, which made my graph inaccurate. I had to group by and summarize, as well as cut my data down, so that I could have good data. Usually you would put this step between select and arrange, but since I had not run into this problem yet, it happened to be the last tweak I made to my data. (For group by I first used the data set final_ten, but the document would not knit until I changed it to course_data_raw.)

final_ten <- course_data_raw |>
  group_by(title) |>
  summarize(limit_max = max(limit))
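
Because the summary above starts from course_data_raw, it covers every subject rather than only history. Here is a minimal sketch of how the same steps could be chained so that the result is five history classes, assuming the goal is to rank classes by total enrollment. The table toy_registrar, its values, and the column name total_enrolled are invented for illustration.

```r
library(dplyr)

# Hypothetical registrar rows; "Hist A" has two sections (all values are made up)
toy_registrar <- data.frame(
  title    = c("Hist A", "Hist A", "Hist B", "Hist C", "Hist D", "Hist E", "Hist F", "Math A"),
  subj     = c("HIST", "HIST", "HIST", "HIST", "HIST", "HIST", "HIST", "MATH"),
  limit    = c(20, 20, 25, 18, 30, 15, 22, 40),
  enrolled = c(19, 20, 25, 10, 28, 15, 12, 40)
)

toy_top_five <- toy_registrar %>%
  filter(subj == "HIST") %>%              # history rows only
  group_by(title) %>%                     # one group per class title
  summarize(
    limit_max      = max(limit),          # largest section limit for the class
    total_enrolled = sum(enrolled)        # students across all sections
  ) %>%
  arrange(desc(total_enrolled)) %>%       # most students first
  slice_head(n = 5)                       # keep the first five classes
```

Grouping before arranging means each class appears once, so sections of the same class are not counted as separate entries in the ranking.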

Visualization

Bar Chart

-I was finally able to start on my bar chart! My bar chart features the names of the classes as well as the maximum number of students each class has space for. The higher the number a class shows, the more people are in the class, since all classes shown were almost entirely full. In other words, the higher the number, the more popular the class.

# Class titles go on the x axis, ordered by their limit; bar height is the limit
ggplot(final_ten, aes(x = reorder(title, limit_max), y = limit_max, fill = title)) +
  geom_bar(stat = "identity") +
  labs(
    x = "Name of the classes",
    y = "Maximum amount the classes hold",
    title = "Top 5 most popular classes at Sewanee",
    subtitle = "Limit each class holds, assuming each class is filled to the limit",
    caption = "Data was extracted in csv format from University Registrar"
  ) +
  theme(text = element_text(size = 12))
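
A sketch of the same kind of chart on invented data, using geom_col (which draws bars from the values in the data, like geom_bar(stat = "identity")) and coord_flip so that long class names are easier to read. The table toy_limits and its numbers are assumptions for illustration.

```r
library(ggplot2)

# Hypothetical class limits (all values are made up)
toy_limits <- data.frame(
  title     = c("Class A", "Class B", "Class C"),
  limit_max = c(20, 35, 28)
)

# reorder(title, limit_max) orders the class names by their limit
toy_chart <- ggplot(toy_limits, aes(x = reorder(title, limit_max), y = limit_max)) +
  geom_col() +
  coord_flip() +   # swap the axes so class names read horizontally
  labs(x = "Name of the classes", y = "Maximum amount the classes hold")
```

Printing toy_chart (or calling print(toy_chart)) draws the plot.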

Conclusion

In conclusion, this was a very insightful project. As a first-year student, this is important and very helpful because I have not yet gone through the process of registering for classes. It gives me a little more insight and helps prepare me, so that I feel equipped to choose the best classes for me. As I previously stated, I am interested in history classes, so this shows me the most popular ones, which may be taught by a good professor or may just be very interesting. If I am interested in taking any of these classes, I now know which ones to take first.