Overview

This file contains a set of tasks that you need to complete in R for the lab assignment. The tasks may require you to add a code chunk, type code into a chunk, and/or execute code. Some tasks may also ask you to answer specific questions. Don’t forget that you need to acknowledge any resources you used beyond class materials and any help you received to complete the assignment.

Additional information and examples relevant to this assignment can be found in the file “PlayingWithDataTutorial.html”.

The data set you will use is different from the one used in the instructions. Pay attention to the differences in the Excel file’s name, any variable names, and/or object names. You will need to adjust your code accordingly.

Once you have completed the assignment, you will need to knit this R Markdown file to produce an html file. You will then need to upload the .html file and this .Rmd file to AsULearn. Additionally, for this assignment you will upload the Excel file you created.

1. Add your name and the date

The first thing you need to do in this file is to add your name and date in the lines underneath this document’s title (see the code in lines 10 and 11).

getwd()

## [1] "/Users/Saniyaaa/Desktop/PlayingWithDataFall2025"

setwd("/Users/Saniyaaa/Desktop/PlayingWithDataFall2025")

library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

2. Getting started

Insert a chunk of code in this section to identify and set your working directory and load packages. We will use the same three packages we did in the last lab: openxlsx, dplyr and tidyverse.

library(openxlsx)
library(dplyr)
library(tidyverse)

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats   1.0.0     ✔ readr     2.1.5
## ✔ ggplot2   3.5.2     ✔ stringr   1.5.1
## ✔ lubridate 1.9.4     ✔ tibble    3.3.0
## ✔ purrr     1.1.0     ✔ tidyr     1.3.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

3. Load Two Data Sets

Insert a chunk of code in this section to load your data. The Excel file for this assignment has two sheets: grades and attendance. Sheet 1 contains the grades data and Sheet 2 contains the attendance data. You will want to load each sheet into R as a separate data object. The name of the Excel file differs from the one in the instructions, so you will need to adjust the code to read in the Excel file that was downloaded as part of the zip file.
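As a minimal sketch of loading two sheets into separate objects, the example below first writes a small two-sheet workbook to a temporary file and then reads each sheet back. The file, the sheet names, and the objects `grades_demo` and `attendance_demo` are made up for illustration and are not the assignment's data.

```r
library(openxlsx)

# Hypothetical two-sheet workbook, written to a temporary file for illustration only
tmp <- tempfile(fileext = ".xlsx")
write.xlsx(
  list(
    grades = data.frame(Name = c("A", "B"), Final = c(25, 30)),
    attendance = data.frame(Name = c("A", "B"), `1` = c(1, 0), check.names = FALSE)
  ),
  file = tmp
)

# Read each sheet into its own object; `sheet` accepts a sheet name or an index
grades_demo <- read.xlsx(tmp, sheet = "grades")
attendance_demo <- read.xlsx(tmp, sheet = 2)

# Inspect the structure of each object
str(grades_demo)
str(attendance_demo)
```

Referring to a sheet by name rather than by position keeps the code working if the sheets are ever reordered.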

GradeBook <- read.xlsx("GradeBook.xlsx" , sheet = 1)
head(GradeBook, 10)

##         X1 Midterm.1 Midterm.2 Assignment.1 Assignment.2 Assignment.3    Final
## 1     Noah  15.00000  12.00000     5.000000     8.000000     5.661442 30.00000
## 2     Jack  11.00478  15.00000     6.771172    10.000000     8.000000 26.00000
## 3    Emily  20.00000  20.00000     8.000000     8.000000     8.154995 20.00000
## 4    Colin  20.00000  17.00000     8.000000     5.000000     8.673615 25.00000
## 5   Hannah  10.00000  17.00000     6.802136     9.604730    10.000000 20.00000
## 6   Aubrie  20.00000  14.00000     5.000000     6.000000     6.000000 17.78453
## 7   Olivia  14.00000  17.72971    10.000000     7.000000     6.000000 26.00000
## 8   Duncan   9.62783  16.00000     7.000000     8.065708     8.000000 18.95910
## 9    Katie  19.00000  12.00000     9.000000     8.000000     8.967217 20.00000
## 10 Jackson  17.00000  15.00000     8.000000     6.000000     2.549882 25.00000

Attendance <- read.xlsx("GradeBook.xlsx" , sheet = 2)
head(Attendance, 10)

##       Name 1 2 3 4 5
## 1     Noah 1 1 1 1 1
## 2     Jack 0 1 1 1 1
## 3    Emily 1 1 0 0 1
## 4    Colin 1 0 1 1 1
## 5   Hannah 1 1 0 1 1
## 6   Aubrie 1 1 1 1 1
## 7   Olivia 1 1 1 1 1
## 8   Duncan 0 1 0 0 1
## 9    Katie 1 1 1 1 1
## 10 Jackson 1 0 1 1 1

4. Take a look at your data

Insert a chunk of code in this section and display the first 15 observations of each data set.

GradeBook <- read.xlsx("GradeBook.xlsx" , sheet = 1)
head(GradeBook, 15)

##          X1 Midterm.1 Midterm.2 Assignment.1 Assignment.2 Assignment.3    Final
## 1      Noah  15.00000  12.00000     5.000000     8.000000     5.661442 30.00000
## 2      Jack  11.00478  15.00000     6.771172    10.000000     8.000000 26.00000
## 3     Emily  20.00000  20.00000     8.000000     8.000000     8.154995 20.00000
## 4     Colin  20.00000  17.00000     8.000000     5.000000     8.673615 25.00000
## 5    Hannah  10.00000  17.00000     6.802136     9.604730    10.000000 20.00000
## 6    Aubrie  20.00000  14.00000     5.000000     6.000000     6.000000 17.78453
## 7    Olivia  14.00000  17.72971    10.000000     7.000000     6.000000 26.00000
## 8    Duncan   9.62783  16.00000     7.000000     8.065708     8.000000 18.95910
## 9     Katie  19.00000  12.00000     9.000000     8.000000     8.967217 20.00000
## 10  Jackson  17.00000  15.00000     8.000000     6.000000     2.549882 25.00000
## 11 Victoria  11.00000   9.93236     8.000000    10.000000     6.701154 26.00000
## 12  Matthew  10.00000  13.00000    10.000000     9.000000    10.000000 26.00000
## 13  Michael   7.00000  11.00000     8.000000    10.000000    10.000000 18.00000
## 14   Olivia  14.00000  12.00000     6.000000     7.000000     8.000000 29.00000
## 15 Samantha   6.00000  10.00000     6.000000     8.000000     6.000000 19.00000

Attendance <- read.xlsx("GradeBook.xlsx", sheet = 2)
head(Attendance, 15)

##        Name 1 2 3 4 5
## 1      Noah 1 1 1 1 1
## 2      Jack 0 1 1 1 1
## 3     Emily 1 1 0 0 1
## 4     Colin 1 0 1 1 1
## 5    Hannah 1 1 0 1 1
## 6    Aubrie 1 1 1 1 1
## 7    Olivia 1 1 1 1 1
## 8    Duncan 0 1 0 0 1
## 9     Katie 1 1 1 1 1
## 10  Jackson 1 0 1 1 1
## 11 Victoria 1 1 1 1 1
## 12  Matthew 1 1 0 1 0
## 13  Michael 0 1 1 1 1
## 14   Olivia 1 1 1 1 1
## 15 Samantha 1 0 1 1 1

5. Rename Variables

You will need to insert chunks of code and rename variables in your data sets in this section. I recommend trying to do only one thing per chunk of code.

In the attendance data set, you will need to rename the variables that are currently numbers into text. In the instructions, I called each variable Class and then the number of that class, for example Class1. Instead of using the same variable name as I did, you should call each variable a Meeting.
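A small sketch of two ways to rename columns whose names are numbers, assuming a made-up data frame `attendance_demo` with columns named `1` and `2`. Backticks let `rename()` refer to a non-syntactic name, and `rename_with()` applies one naming rule to several columns at once.

```r
library(dplyr)

# Hypothetical attendance data with numeric column names (illustration only)
attendance_demo <- data.frame(
  Name = c("A", "B"),
  `1` = c(1, 0),
  `2` = c(1, 1),
  check.names = FALSE
)

# Option 1: rename a single column, quoting the old name with backticks
renamed_one <- attendance_demo %>%
  rename(Meeting1 = `1`)

# Option 2: prefix every column except Name with "Meeting"
renamed_all <- attendance_demo %>%
  rename_with(~ paste0("Meeting", .x), .cols = -Name)

names(renamed_one)
names(renamed_all)
```

Option 2 scales better when there are many meetings, while Option 1 makes each new name explicit.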

Attendance %>%
  rename(Meeting1 = "1",
         Meeting2 = "2",
         Meeting3 = "3",
         Meeting4 = "4",
         Meeting5 = "5") -> Attendance

head(Attendance, 10)

##       Name Meeting1 Meeting2 Meeting3 Meeting4 Meeting5
## 1     Noah        1        1        1        1        1
## 2     Jack        0        1        1        1        1
## 3    Emily        1        1        0        0        1
## 4    Colin        1        0        1        1        1
## 5   Hannah        1        1        0        1        1
## 6   Aubrie        1        1        1        1        1
## 7   Olivia        1        1        1        1        1
## 8   Duncan        0        1        0        0        1
## 9    Katie        1        1        1        1        1
## 10 Jackson        1        0        1        1        1

In the grade book data set, rename the variables so that they do not have a . in their names.

GradeBook %>%
  rename(Name = "X1", 
         Midterm1 = "Midterm.1",
         Midterm2 = "Midterm.2",
         Assignment1 = "Assignment.1",
         Assignment2 = "Assignment.2",
         Assignment3 = "Assignment.3",
         Final = "Final") -> GradeBook

After renaming the variables, look at the first 15 observations for each data set.

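head(GradeBook, 15)

##        Name Midterm1 Midterm2 Assignment1 Assignment2 Assignment3    Final
## 1      Noah 15.00000 12.00000    5.000000    8.000000    5.661442 30.00000
## 2      Jack 11.00478 15.00000    6.771172   10.000000    8.000000 26.00000
## 3     Emily 20.00000 20.00000    8.000000    8.000000    8.154995 20.00000
## 4     Colin 20.00000 17.00000    8.000000    5.000000    8.673615 25.00000
## 5    Hannah 10.00000 17.00000    6.802136    9.604730   10.000000 20.00000
## 6    Aubrie 20.00000 14.00000    5.000000    6.000000    6.000000 17.78453
## 7    Olivia 14.00000 17.72971   10.000000    7.000000    6.000000 26.00000
## 8    Duncan  9.62783 16.00000    7.000000    8.065708    8.000000 18.95910
## 9     Katie 19.00000 12.00000    9.000000    8.000000    8.967217 20.00000
## 10  Jackson 17.00000 15.00000    8.000000    6.000000    2.549882 25.00000
## 11 Victoria 11.00000  9.93236    8.000000   10.000000    6.701154 26.00000
## 12  Matthew 10.00000 13.00000   10.000000    9.000000   10.000000 26.00000
## 13  Michael  7.00000 11.00000    8.000000   10.000000   10.000000 18.00000
## 14   Olivia 14.00000 12.00000    6.000000    7.000000    8.000000 29.00000
## 15 Samantha  6.00000 10.00000    6.000000    8.000000    6.000000 19.00000

head(Attendance, 15)

##        Name Meeting1 Meeting2 Meeting3 Meeting4 Meeting5
## 1      Noah        1        1        1        1        1
## 2      Jack        0        1        1        1        1
## 3     Emily        1        1        0        0        1
## 4     Colin        1        0        1        1        1
## 5    Hannah        1        1        0        1        1
## 6    Aubrie        1        1        1        1        1
## 7    Olivia        1        1        1        1        1
## 8    Duncan        0        1        0        0        1
## 9     Katie        1        1        1        1        1
## 10  Jackson        1        0        1        1        1
## 11 Victoria        1        1        1        1        1
## 12  Matthew        1        1        0        1        0
## 13  Michael        0        1        1        1        1
## 14   Olivia        1        1        1        1        1
## 15 Samantha        1        0        1        1        1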

6. Creating New Attendance Variables

In this section, insert chunks and create the following variables in your attendance data set.

Total number of classes attended.
Total number of classes absent. There are a total of 5 classes that students could potentially attend.
Total number of unexcused absences. Students are allowed up to 2 excused absences.
Penalty on grade for unexcused absences. For each unexcused absence, a student’s grade will be penalized 0.5 points. Based on the number of unexcused absences, calculate the total penalty that should be applied to their grade.
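The four calculations listed above can be sketched as follows. This is one possible approach on a made-up data frame `attendance_demo`, not the assignment's data. The names `Absent`, `Unexcused`, and `Penalty` are illustrative, and the sketch assumes the first 2 absences count as excused, so only absences beyond 2 are unexcused.

```r
library(dplyr)

# Hypothetical attendance records: 1 = present, 0 = absent (illustration only)
attendance_demo <- data.frame(
  Name = c("A", "B", "C"),
  Meeting1 = c(1, 0, 0),
  Meeting2 = c(1, 1, 0),
  Meeting3 = c(1, 0, 0),
  Meeting4 = c(1, 0, 0),
  Meeting5 = c(1, 1, 1)
)

attendance_demo <- attendance_demo %>%
  mutate(
    # Classes attended: sum of the 0/1 meeting columns (no quotes around the expression)
    Present = Meeting1 + Meeting2 + Meeting3 + Meeting4 + Meeting5,
    # Classes missed out of the 5 possible meetings
    Absent = 5 - Present,
    # Assumption: absences beyond the 2 allowed are unexcused; never below 0
    Unexcused = pmax(Absent - 2, 0),
    # Each unexcused absence costs 0.5 points
    Penalty = Unexcused * 0.5
  )

attendance_demo
```

`pmax()` compares element by element, so a student with 0, 1, or 2 absences gets 0 unexcused absences rather than a negative number.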

Attendance %>%
  mutate(Present = Meeting1 + Meeting2 + Meeting3 + Meeting4 + Meeting5) -> Attendance

head(Attendance, 15)

##        Name Meeting1 Meeting2 Meeting3 Meeting4 Meeting5 Present
## 1      Noah        1        1        1        1        1       5
## 2      Jack        0        1        1        1        1       4
## 3     Emily        1        1        0        0        1       3
## 4     Colin        1        0        1        1        1       4
## 5    Hannah        1        1        0        1        1       4
## 6    Aubrie        1        1        1        1        1       5
## 7    Olivia        1        1        1        1        1       5
## 8    Duncan        0        1        0        0        1       2
## 9     Katie        1        1        1        1        1       5
## 10  Jackson        1        0        1        1        1       4
## 11 Victoria        1        1        1        1        1       5
## 12  Matthew        1        1        0        1        0       3
## 13  Michael        0        1        1        1        1       4
## 14   Olivia        1        1        1        1        1       5
## 15 Samantha        1        0        1        1        1       4

After you have completed these calculations, take a look at the first 15 observations in your data set.

7. Create New Grade Variables

In this section, insert chunks and create the following variables in your grade book data set.

Each assignment in class is worth 10 points. Create a new variable for each assignment where you convert the raw score into a percentage.
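As an alternative to writing one `mutate()` per assignment, the conversion can be applied to several columns in one step with `across()`. This is a sketch on a made-up data frame `grades_demo`. The `Per_` prefix in the new column names is an illustrative choice, not a required naming scheme.

```r
library(dplyr)

# Hypothetical raw assignment scores out of 10 (illustration only)
grades_demo <- data.frame(
  Assignment1 = c(5, 8),
  Assignment2 = c(10, 7.5)
)

# Convert every column starting with "Assignment" to a percentage,
# keeping the raw columns and adding new ones named Per_<column>
grades_demo <- grades_demo %>%
  mutate(across(starts_with("Assignment"), ~ .x / 10 * 100, .names = "Per_{.col}"))

grades_demo
```

The same pattern would work for the midterms and the final by changing the column selection and the divisor (20 and 30 points respectively).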

GradeBook %>% 
  mutate(PerA.1 = (Assignment1/10)*100) -> GradeBook

print(GradeBook$Assignment1)

##  [1]  5.000000  6.771172  8.000000  8.000000  6.802136  5.000000 10.000000
##  [8]  7.000000  9.000000  8.000000  8.000000 10.000000  8.000000  6.000000
## [15]  6.000000  9.000000

GradeBook %>% 
  mutate(PerA.2 = (Assignment2/10)*100) -> GradeBook

print(GradeBook$Assignment2)

##  [1]  8.000000 10.000000  8.000000  5.000000  9.604730  6.000000  7.000000
##  [8]  8.065708  8.000000  6.000000 10.000000  9.000000 10.000000  7.000000
## [15]  8.000000  7.000000

GradeBook %>% 
  mutate(PerA.3 = (Assignment3/10)*100) -> GradeBook

print(GradeBook$Assignment3)

##  [1]  5.661442  8.000000  8.154995  8.673615 10.000000  6.000000  6.000000
##  [8]  8.000000  8.967217  2.549882  6.701154 10.000000 10.000000  8.000000
## [15]  6.000000  7.000000

Each midterm in the class is worth 20 points. Create a new variable for each midterm where you convert the raw score into a percentage.

GradeBook %>% 
  mutate(PerMT.1 = (Midterm1/20)*100) -> GradeBook

print(GradeBook$Midterm1)

##  [1] 15.00000 11.00478 20.00000 20.00000 10.00000 20.00000 14.00000  9.62783
##  [9] 19.00000 17.00000 11.00000 10.00000  7.00000 14.00000  6.00000 11.00000

GradeBook %>% 
  mutate(PerMT.2 = (Midterm2/20)*100) -> GradeBook

print(GradeBook$Midterm2)

##  [1] 12.00000 15.00000 20.00000 17.00000 17.00000 14.00000 17.72971 16.00000
##  [9] 12.00000 15.00000  9.93236 13.00000 11.00000 12.00000 10.00000 12.00000

print(GradeBook$PerMT.1)

##  [1]  75.00000  55.02392 100.00000 100.00000  50.00000 100.00000  70.00000
##  [8]  48.13915  95.00000  85.00000  55.00000  50.00000  35.00000  70.00000
## [15]  30.00000  55.00000

The final exam is worth 30 points. Create a new variable for the final where you convert the raw score into a percentage.

GradeBook %>%
  mutate(PerF = (Final/30)*100) -> GradeBook

print(GradeBook$PerF)

##  [1] 100.00000  86.66667  66.66667  83.33333  66.66667  59.28176  86.66667
##  [8]  63.19701  66.66667  83.33333  86.66667  86.66667  60.00000  96.66667
## [15]  63.33333  93.33333

There are multiple ways one can calculate the overall grade for the class. You are going to calculate the final grade in two different ways.

GradeBook %>%
  mutate(OverallGrade = ((PerA.1+PerA.2+PerA.3+PerMT.1+PerMT.2+PerF)/600*100)) -> GradeBook

You should provide equal weight to each item in the class regardless of the number of points it was originally worth. To do this, add together the percentage grades that you calculated and divide by 600 (there are 6 graded items: 3 assignments, 2 midterms, and the final, and each is worth up to 100 points once the grades are converted to percentages).
You should weight items based on the number of points each was originally worth. The most straightforward way to do this is to add together the raw scores for each item and then divide by the total number of points possible. You already have the information you need to calculate the total number of points possible because you know how many points each type of assignment is worth and you know how many of each type of assignment is in the grade book.
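The two approaches described above can be compared side by side. This is a sketch on a made-up data frame `grades_demo`, and the names `EqualWeight` and `PointWeight` are illustrative. The total points possible follow from the point values stated in this section: 3 assignments at 10 points, 2 midterms at 20 points, and a 30-point final.

```r
library(dplyr)

# Hypothetical raw scores (illustration only)
grades_demo <- data.frame(
  Assignment1 = c(10, 5), Assignment2 = c(10, 5), Assignment3 = c(10, 5),
  Midterm1 = c(20, 20), Midterm2 = c(20, 20), Final = c(30, 15)
)

# Total points possible: 3 * 10 + 2 * 20 + 30 = 100
total_points <- 3 * 10 + 2 * 20 + 30

grades_demo <- grades_demo %>%
  mutate(
    # Way 1: every item counts equally (average of the six percentages)
    EqualWeight = (Assignment1 / 10 + Assignment2 / 10 + Assignment3 / 10 +
                     Midterm1 / 20 + Midterm2 / 20 + Final / 30) / 6 * 100,
    # Way 2: items count in proportion to their points (raw total over points possible)
    PointWeight = (Assignment1 + Assignment2 + Assignment3 +
                     Midterm1 + Midterm2 + Final) / total_points * 100
  )

grades_demo[, c("EqualWeight", "PointWeight")]
```

The second made-up student scores half marks on the assignments and the final but full marks on both midterms, so the two methods give different overall grades. That difference is what the comparison is meant to reveal.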

After you have completed these calculations, take a look at the first 15 observations in your data set.

head(GradeBook, 15)

##        Name Midterm1 Midterm2 Assignment1 Assignment2 Assignment3    Final
## 1      Noah 15.00000 12.00000    5.000000    8.000000    5.661442 30.00000
## 2      Jack 11.00478 15.00000    6.771172   10.000000    8.000000 26.00000
## 3     Emily 20.00000 20.00000    8.000000    8.000000    8.154995 20.00000
## 4     Colin 20.00000 17.00000    8.000000    5.000000    8.673615 25.00000
## 5    Hannah 10.00000 17.00000    6.802136    9.604730   10.000000 20.00000
## 6    Aubrie 20.00000 14.00000    5.000000    6.000000    6.000000 17.78453
## 7    Olivia 14.00000 17.72971   10.000000    7.000000    6.000000 26.00000
## 8    Duncan  9.62783 16.00000    7.000000    8.065708    8.000000 18.95910
## 9     Katie 19.00000 12.00000    9.000000    8.000000    8.967217 20.00000
## 10  Jackson 17.00000 15.00000    8.000000    6.000000    2.549882 25.00000
## 11 Victoria 11.00000  9.93236    8.000000   10.000000    6.701154 26.00000
## 12  Matthew 10.00000 13.00000   10.000000    9.000000   10.000000 26.00000
## 13  Michael  7.00000 11.00000    8.000000   10.000000   10.000000 18.00000
## 14   Olivia 14.00000 12.00000    6.000000    7.000000    8.000000 29.00000
## 15 Samantha  6.00000 10.00000    6.000000    8.000000    6.000000 19.00000
##       PerA.1    PerA.2    PerA.3   PerMT.1   PerMT.2      PerF OverallGrade
## 1   50.00000  80.00000  56.61442  75.00000  60.00000 100.00000     70.26907
## 2   67.71172 100.00000  80.00000  55.02392  75.00000  86.66667     77.40038
## 3   80.00000  80.00000  81.54995 100.00000 100.00000  66.66667     84.70277
## 4   80.00000  50.00000  86.73615 100.00000  85.00000  83.33333     80.84491
## 5   68.02136  96.04730 100.00000  50.00000  85.00000  66.66667     77.62255
## 6   50.00000  60.00000  60.00000 100.00000  70.00000  59.28176     66.54696
## 7  100.00000  70.00000  60.00000  70.00000  88.64856  86.66667     79.21920
## 8   70.00000  80.65708  80.00000  48.13915  80.00000  63.19701     70.33221
## 9   90.00000  80.00000  89.67217  95.00000  60.00000  66.66667     80.22314
## 10  80.00000  60.00000  25.49882  85.00000  75.00000  83.33333     68.13869
## 11  80.00000 100.00000  67.01154  55.00000  49.66180  86.66667     73.05667
## 12 100.00000  90.00000 100.00000  50.00000  65.00000  86.66667     81.94444
## 13  80.00000 100.00000 100.00000  35.00000  55.00000  60.00000     71.66667
## 14  60.00000  70.00000  80.00000  70.00000  60.00000  96.66667     72.77778
## 15  60.00000  80.00000  60.00000  30.00000  50.00000  63.33333     57.22222

8. Create Objects Containing a Single Value

In this section, insert chunks and calculate the mean, minimum, and maximum for 3 different variables (midterm 2, assignment 3, and the final exam) in the grade book data set. Use the variables that report the scores as a percentage that you created.

mean_PerMT1 <- mean(GradeBook$PerMT.1)

print(mean_PerMT1)

## [1] 67.07269

min_PerMT1 <- min(GradeBook$PerMT.1)
print(min_PerMT1)

## [1] 30

max_PerMT1 <- max(GradeBook$PerMT.1)
print(max_PerMT1)

## [1] 100
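The same pattern extends to the other percentage variables named in this section. The sketch below uses a made-up data frame `grades_demo` whose column names mirror the percentage variables created earlier. The values and the object names such as `mean_PerMT2` are illustrative only.

```r
# Hypothetical percentage scores (illustration only)
grades_demo <- data.frame(
  PerMT.2 = c(60, 80, 100),
  PerA.3 = c(50, 80, 80),
  PerF = c(100, 70, 40)
)

# Midterm 2: each object stores a single value
mean_PerMT2 <- mean(grades_demo$PerMT.2)
min_PerMT2 <- min(grades_demo$PerMT.2)
max_PerMT2 <- max(grades_demo$PerMT.2)

# Assignment 3
mean_PerA3 <- mean(grades_demo$PerA.3)
min_PerA3 <- min(grades_demo$PerA.3)
max_PerA3 <- max(grades_demo$PerA.3)

# Final exam
mean_PerF <- mean(grades_demo$PerF)
min_PerF <- min(grades_demo$PerF)
max_PerF <- max(grades_demo$PerF)

print(c(mean_PerMT2, min_PerMT2, max_PerMT2))
```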

9. Create Objects Containing Multiple Values

In this section, insert chunks and produce the following objects that will contain values for each variable in the data set.

Using the attendance data, create two objects. One object should contain the total number of students attending a class session. One object should contain the mean number of students attending a class session.
Using the grade book data, create three objects for the mean, min, and max grades for each of the variables in the data set.
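A minimal sketch of producing one value per column, using a made-up data frame `attendance_demo`. The object names `attend_total` and `attend_rate` are illustrative. Giving each summary its own distinct name matters here, because reusing a name silently overwrites the earlier object.

```r
# Hypothetical attendance records: 1 = present, 0 = absent (illustration only)
attendance_demo <- data.frame(
  Name = c("A", "B", "C", "D"),
  Meeting1 = c(1, 0, 1, 1),
  Meeting2 = c(1, 1, 0, 0)
)

meeting_cols <- c("Meeting1", "Meeting2")

# sapply() applies the function to each selected column and returns a named vector
attend_total <- sapply(attendance_demo[, meeting_cols], sum)
attend_rate <- sapply(attendance_demo[, meeting_cols], mean)

# colSums() gives the same totals for numeric columns
same_totals <- all.equal(attend_total, colSums(attendance_demo[, meeting_cols]))

print(attend_total)
print(attend_rate)
```

Because the columns hold 0/1 values, the mean of a column is the share of students who attended that meeting.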

attendsum <- sapply(Attendance[ , c('Meeting1', 'Meeting2','Meeting3', 'Meeting4', 'Meeting5')], sum)

attendmean <- sapply(Attendance[ , c('Meeting1', 'Meeting2','Meeting3', 'Meeting4', 'Meeting5')], mean)

grademean <- sapply(GradeBook[ , c('Assignment1', 'Assignment2', 'Assignment3', 'Midterm1', 'Midterm2', 'Final')], mean)

grademin <- sapply(GradeBook[ , c('Assignment1', 'Assignment2', 'Assignment3', 'Midterm1', 'Midterm2', 'Final')], min)

grademax <- sapply(GradeBook[ , c('Assignment1', 'Assignment2', 'Assignment3', 'Midterm1', 'Midterm2', 'Final')], max)

10. Combining Objects

In this section, insert chunks of code that will combine objects together.

Combine the two objects you created using the attendance data set in the last question into a single object. Print the object where you stored these objects.

AttedanceSummary <- rbind(attendsum, attendmean)

print(AttedanceSummary)

##            Meeting1 Meeting2 Meeting3 Meeting4 Meeting5
## attendsum   13.0000  13.0000    12.00   14.000  15.0000
## attendmean   0.8125   0.8125     0.75    0.875   0.9375

Combine the three objects you created using the grade book data set in the last question into a single object. Print the object where you stored these objects.

GradeBookSummary <- rbind(grademin, grademean, grademax)

print(GradeBookSummary)

##           Assignment1 Assignment2 Assignment3 Midterm1 Midterm2    Final
## grademin     5.000000    5.000000    2.549882  6.00000  9.93236 17.78453
## grademean    7.535832    7.916902    7.481769 13.41454 13.97888 23.42148
## grademax    10.000000   10.000000   10.000000 20.00000 20.00000 30.00000

11. Export Data Sets

In this section, insert a chunk of code to export the grade book data, the attendance data, the summary grade book, and the summary attendance as one Excel file. Make sure to name your data file something different than the Excel file that had the original data that you loaded into R for this assignment.
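As a sketch of exporting several objects to one workbook, a named list passed to `write.xlsx()` produces one sheet per list element, with the list names used as sheet names. The objects and sheet names below are made up for illustration, and the file is written to a temporary location.

```r
library(openxlsx)

# Hypothetical data and summary objects (illustration only)
grades_demo <- data.frame(Name = c("A", "B"), Final = c(25, 30))
summary_demo <- rbind(min = c(Final = 25), max = c(Final = 30))

# Turn the summary matrix into a data frame and keep its row labels as a column
summary_df <- data.frame(Statistic = rownames(summary_demo), summary_demo)

# List elements must be the objects themselves, not their names in quotes
sheets_demo <- list(Grades = grades_demo, GradesSummary = summary_df)

out_file <- tempfile(fileext = ".xlsx")
write.xlsx(sheets_demo, file = out_file)

# Confirm which sheets ended up in the workbook
getSheetNames(out_file)
```

Storing the row labels in an ordinary column is one way to make sure they appear in the exported sheet.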

write.xlsx(AttedanceSummary, file = "AttendanceSummary.xlsx")

write.xlsx(GradeBookSummary, file = "GradesSummary.xlsx")

sheets <- list("Grades" = GradeBook, "Attendance" = Attendance, "GradesSummary" = GradeBookSummary, "AttendanceSummary" = AttedanceSummary)

write.xlsx(sheets, file = "combined.xlsx")

12. Did you receive help?

Enter the names of anyone who assisted you with completing this lab. If no one helped you complete the assignment, just type out that no one helped you.

Caitlyn Fiocchi

13. Did you provide anyone help with completing this lab?

Enter the names of anyone that you assisted with completing this lab. If you did not help anyone, then just type out that you didn’t help anyone.

14. Knit the Document

Click the “Knit” button to publish your work as an html document. This file will appear in the folder specified by your working directory. To get all of the lab points for this week, you will need to upload this R Markdown file, the html file it produces, and the Excel file that you exported when completing the assignment to AsULearn.

PS/CJ3115 Fall 2025: Playing with Data

Saniya Boger

September 24, 2025