data-collection-and-prep

‘— title: ’Introduction to R: Basic Syntax and Data Manipulation’ subtitle: “Learning Analytics I” author: Nicole Jayne date: today format: html: toc: true toc-depth: 4 toc-location: right theme: light: simplex dark: cyborg editor: markdown: wrap: 72 —

Learning Objectives

By the end of this lesson, you will be able to:

- Use R basic syntax.

- Load and save files in R.

- Clean and manipulate data.

Part 1: Setup and Loading Data

First, we need a library called tidyverse to work on our project. If you get an error message when you click play button , that means you don’t have tidyverse in your work environment. You can then write install.packages("tidyverse") in the Console area below before leading the library.

Using the dataset(s) provided in the data folder in the Files pane, your goal for this module is to familiarize yourself with your data.

Or alternatively, you may use your own data set. If you do decide to use your own data set you must : Load data to start working with data. To do so, you will need to load it into R.

Part 2: Wrangle

For the practice with me we will use sci-online-classes.csv.

In the chunk called read-data below:

Import the sci-online-classes.csv from the data folder (check the files pane on the right side->) and save as a new object called data.
Then inspect your data using a function of your choice.

Also, in R, a pound sign(#) indicates a comment that will not be evaluated. You can use as many times as you want.

# Import/load the dataset
data <- read_csv("data/sci-online-classes.csv")

Rows: 603 Columns: 30
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (6): course_id, subject, semester, section, Gradebook_Item, Gender
dbl (23): student_id, total_points_possible, total_points_earned, percentage...
lgl  (1): Grade_Category

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

# Inspect your data (feel free to use head(), str(), or summary())
head(data)

# A tibble: 6 × 30
  student_id course_id     total_points_possible total_points_earned
       <dbl> <chr>                         <dbl>               <dbl>
1      43146 FrScA-S216-02                  3280                2220
2      44638 OcnA-S116-01                   3531                2672
3      47448 FrScA-S216-01                  2870                1897
4      47979 OcnA-S216-01                   4562                3090
5      48797 PhysA-S116-01                  2207                1910
6      51943 FrScA-S216-03                  4208                3596
# ℹ 26 more variables: percentage_earned <dbl>, subject <chr>, semester <chr>,
#   section <chr>, Gradebook_Item <chr>, Grade_Category <lgl>,
#   FinalGradeCEMS <dbl>, Points_Possible <dbl>, Points_Earned <dbl>,
#   Gender <chr>, q1 <dbl>, q2 <dbl>, q3 <dbl>, q4 <dbl>, q5 <dbl>, q6 <dbl>,
#   q7 <dbl>, q8 <dbl>, q9 <dbl>, q10 <dbl>, TimeSpent <dbl>,
#   TimeSpent_hours <dbl>, TimeSpent_std <dbl>, int <dbl>, pc <dbl>, uv <dbl>

# TRY OTHER FUNCTIONS BELOW (e.g., head(data))
head(data)

# A tibble: 6 × 30
  student_id course_id     total_points_possible total_points_earned
       <dbl> <chr>                         <dbl>               <dbl>
1      43146 FrScA-S216-02                  3280                2220
2      44638 OcnA-S116-01                   3531                2672
3      47448 FrScA-S216-01                  2870                1897
4      47979 OcnA-S216-01                   4562                3090
5      48797 PhysA-S116-01                  2207                1910
6      51943 FrScA-S216-03                  4208                3596
# ℹ 26 more variables: percentage_earned <dbl>, subject <chr>, semester <chr>,
#   section <chr>, Gradebook_Item <chr>, Grade_Category <lgl>,
#   FinalGradeCEMS <dbl>, Points_Possible <dbl>, Points_Earned <dbl>,
#   Gender <chr>, q1 <dbl>, q2 <dbl>, q3 <dbl>, q4 <dbl>, q5 <dbl>, q6 <dbl>,
#   q7 <dbl>, q8 <dbl>, q9 <dbl>, q10 <dbl>, TimeSpent <dbl>,
#   TimeSpent_hours <dbl>, TimeSpent_std <dbl>, int <dbl>, pc <dbl>, uv <dbl>

Remember, once the chunk is ready, click play button on the top right of the chunk.

Exploring Data

Before cleaning, let’s explore the dataset.

# Check the structure of the data.
str(data)

spc_tbl_ [603 × 30] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ student_id           : num [1:603] 43146 44638 47448 47979 48797 ...
 $ course_id            : chr [1:603] "FrScA-S216-02" "OcnA-S116-01" "FrScA-S216-01" "OcnA-S216-01" ...
 $ total_points_possible: num [1:603] 3280 3531 2870 4562 2207 ...
 $ total_points_earned  : num [1:603] 2220 2672 1897 3090 1910 ...
 $ percentage_earned    : num [1:603] 0.677 0.757 0.661 0.677 0.865 ...
 $ subject              : chr [1:603] "FrScA" "OcnA" "FrScA" "OcnA" ...
 $ semester             : chr [1:603] "S216" "S116" "S216" "S216" ...
 $ section              : chr [1:603] "02" "01" "01" "01" ...
 $ Gradebook_Item       : chr [1:603] "POINTS EARNED & TOTAL COURSE POINTS" "ATTEMPTED" "POINTS EARNED & TOTAL COURSE POINTS" "POINTS EARNED & TOTAL COURSE POINTS" ...
 $ Grade_Category       : logi [1:603] NA NA NA NA NA NA ...
 $ FinalGradeCEMS       : num [1:603] 93.5 81.7 88.5 81.9 84 ...
 $ Points_Possible      : num [1:603] 5 10 10 5 438 5 10 10 443 5 ...
 $ Points_Earned        : num [1:603] NA 10 NA 4 399 NA NA 10 425 2.5 ...
 $ Gender               : chr [1:603] "M" "F" "M" "M" ...
 $ q1                   : num [1:603] 5 4 5 5 4 NA 5 3 4 NA ...
 $ q2                   : num [1:603] 4 4 4 5 3 NA 5 3 3 NA ...
 $ q3                   : num [1:603] 4 3 4 3 3 NA 3 3 3 NA ...
 $ q4                   : num [1:603] 5 4 5 5 4 NA 5 3 4 NA ...
 $ q5                   : num [1:603] 5 4 5 5 4 NA 5 3 4 NA ...
 $ q6                   : num [1:603] 5 4 4 5 4 NA 5 4 3 NA ...
 $ q7                   : num [1:603] 5 4 4 4 4 NA 4 3 3 NA ...
 $ q8                   : num [1:603] 5 5 5 5 4 NA 5 3 4 NA ...
 $ q9                   : num [1:603] 4 4 3 5 NA NA 5 3 2 NA ...
 $ q10                  : num [1:603] 5 4 5 5 3 NA 5 3 5 NA ...
 $ TimeSpent            : num [1:603] 1555 1383 860 1599 1482 ...
 $ TimeSpent_hours      : num [1:603] 25.9 23 14.3 26.6 24.7 ...
 $ TimeSpent_std        : num [1:603] -0.181 -0.308 -0.693 -0.148 -0.235 ...
 $ int                  : num [1:603] 5 4.2 5 5 3.8 4.6 5 3 4.2 NA ...
 $ pc                   : num [1:603] 4.5 3.5 4 3.5 3.5 4 3.5 3 3 NA ...
 $ uv                   : num [1:603] 4.33 4 3.67 5 3.5 ...
 - attr(*, "spec")=
  .. cols(
  ..   student_id = col_double(),
  ..   course_id = col_character(),
  ..   total_points_possible = col_double(),
  ..   total_points_earned = col_double(),
  ..   percentage_earned = col_double(),
  ..   subject = col_character(),
  ..   semester = col_character(),
  ..   section = col_character(),
  ..   Gradebook_Item = col_character(),
  ..   Grade_Category = col_logical(),
  ..   FinalGradeCEMS = col_double(),
  ..   Points_Possible = col_double(),
  ..   Points_Earned = col_double(),
  ..   Gender = col_character(),
  ..   q1 = col_double(),
  ..   q2 = col_double(),
  ..   q3 = col_double(),
  ..   q4 = col_double(),
  ..   q5 = col_double(),
  ..   q6 = col_double(),
  ..   q7 = col_double(),
  ..   q8 = col_double(),
  ..   q9 = col_double(),
  ..   q10 = col_double(),
  ..   TimeSpent = col_double(),
  ..   TimeSpent_hours = col_double(),
  ..   TimeSpent_std = col_double(),
  ..   int = col_double(),
  ..   pc = col_double(),
  ..   uv = col_double()
  .. )
 - attr(*, "problems")=<externalptr>

# Summary statistics
summary(data)

   student_id     course_id         total_points_possible total_points_earned
 Min.   :43146   Length:603         Min.   :  840         Min.   :  651      
 1st Qu.:85612   Class :character   1st Qu.: 2810         1st Qu.: 2050      
 Median :88340   Mode  :character   Median : 3583         Median : 2757      
 Mean   :86070                      Mean   : 4274         Mean   : 3245      
 3rd Qu.:92730                      3rd Qu.: 5069         3rd Qu.: 3875      
 Max.   :97441                      Max.   :15552         Max.   :12208      
                                                                             
 percentage_earned   subject            semester           section         
 Min.   :0.3384    Length:603         Length:603         Length:603        
 1st Qu.:0.7047    Class :character   Class :character   Class :character  
 Median :0.7770    Mode  :character   Mode  :character   Mode  :character  
 Mean   :0.7577                                                            
 3rd Qu.:0.8262                                                            
 Max.   :0.9106                                                            
                                                                           
 Gradebook_Item     Grade_Category FinalGradeCEMS   Points_Possible 
 Length:603         Mode:logical   Min.   :  0.00   Min.   :  5.00  
 Class :character   NA's:603       1st Qu.: 71.25   1st Qu.: 10.00  
 Mode  :character                  Median : 84.57   Median : 10.00  
                                   Mean   : 77.20   Mean   : 76.87  
                                   3rd Qu.: 92.10   3rd Qu.: 30.00  
                                   Max.   :100.00   Max.   :935.00  
                                   NA's   :30                       
 Points_Earned       Gender                q1              q2       
 Min.   :  0.00   Length:603         Min.   :1.000   Min.   :1.000  
 1st Qu.:  7.00   Class :character   1st Qu.:4.000   1st Qu.:3.000  
 Median : 10.00   Mode  :character   Median :4.000   Median :4.000  
 Mean   : 68.63                      Mean   :4.296   Mean   :3.629  
 3rd Qu.: 26.12                      3rd Qu.:5.000   3rd Qu.:4.000  
 Max.   :828.20                      Max.   :5.000   Max.   :5.000  
 NA's   :92                          NA's   :123     NA's   :126    
       q3              q4              q5              q6       
 Min.   :1.000   Min.   :1.000   Min.   :2.000   Min.   :1.000  
 1st Qu.:3.000   1st Qu.:4.000   1st Qu.:4.000   1st Qu.:4.000  
 Median :3.000   Median :4.000   Median :4.000   Median :4.000  
 Mean   :3.327   Mean   :4.268   Mean   :4.191   Mean   :4.008  
 3rd Qu.:4.000   3rd Qu.:5.000   3rd Qu.:5.000   3rd Qu.:5.000  
 Max.   :5.000   Max.   :5.000   Max.   :5.000   Max.   :5.000  
 NA's   :123     NA's   :125     NA's   :127     NA's   :127    
       q7              q8              q9             q10       
 Min.   :1.000   Min.   :1.000   Min.   :1.000   Min.   :1.000  
 1st Qu.:3.000   1st Qu.:4.000   1st Qu.:3.000   1st Qu.:4.000  
 Median :4.000   Median :4.000   Median :4.000   Median :4.000  
 Mean   :3.907   Mean   :4.289   Mean   :3.487   Mean   :4.101  
 3rd Qu.:4.750   3rd Qu.:5.000   3rd Qu.:4.000   3rd Qu.:5.000  
 Max.   :5.000   Max.   :5.000   Max.   :5.000   Max.   :5.000  
 NA's   :129     NA's   :129     NA's   :129     NA's   :129    
   TimeSpent       TimeSpent_hours    TimeSpent_std          int       
 Min.   :   0.45   Min.   :  0.0075   Min.   :-1.3280   Min.   :2.000  
 1st Qu.: 851.90   1st Qu.: 14.1983   1st Qu.:-0.6996   1st Qu.:3.900  
 Median :1550.91   Median : 25.8485   Median :-0.1837   Median :4.200  
 Mean   :1799.75   Mean   : 29.9959   Mean   : 0.0000   Mean   :4.219  
 3rd Qu.:2426.09   3rd Qu.: 40.4348   3rd Qu.: 0.4623   3rd Qu.:4.700  
 Max.   :8870.88   Max.   :147.8481   Max.   : 5.2188   Max.   :5.000  
 NA's   :5         NA's   :5          NA's   :5         NA's   :76     
       pc              uv       
 Min.   :1.500   Min.   :1.000  
 1st Qu.:3.000   1st Qu.:3.333  
 Median :3.500   Median :3.667  
 Mean   :3.608   Mean   :3.719  
 3rd Qu.:4.000   3rd Qu.:4.167  
 Max.   :5.000   Max.   :5.000  
 NA's   :75      NA's   :75

Question: What have you found from the dataset? Share one or two things you found interesting (e.g., missing data, data they collected, type of the variables, repetitive data)

[Some things that I found interesting within the data set is that there is missing data. The missing data is represented by NA. Another thing that I found interesting was the data gathered. We are only focusing on using 1st Qu. and 3rd Qu.information as variables.]

Cleaning Data

Let’s handle missing values and remove any irrelevant columns.

# Remove rows with missing values
data_clean <- na.omit(data)

# Check the first couple of rows
head(data_clean)

# A tibble: 0 × 30
# ℹ 30 variables: student_id <dbl>, course_id <chr>,
#   total_points_possible <dbl>, total_points_earned <dbl>,
#   percentage_earned <dbl>, subject <chr>, semester <chr>, section <chr>,
#   Gradebook_Item <chr>, Grade_Category <lgl>, FinalGradeCEMS <dbl>,
#   Points_Possible <dbl>, Points_Earned <dbl>, Gender <chr>, q1 <dbl>,
#   q2 <dbl>, q3 <dbl>, q4 <dbl>, q5 <dbl>, q6 <dbl>, q7 <dbl>, q8 <dbl>,
#   q9 <dbl>, q10 <dbl>, TimeSpent <dbl>, TimeSpent_hours <dbl>, …

Question: What did you discover from the ‘data_clean’? Why do you think this doesn’t work?

[When I tried to execute the ‘data_clean’ nothing happened. The reason I think that nothing occurred was because the information was not specific enough.]

Let’s try again with a specific variable. Select one variable from the dataset that you think should be deleted if the value of the variable is missing.

# For example, when you look at the value of FinalGradeCEMS, there are many 'NA's. 
# Let's remove rows where FinalGradeCEMS is missing
data_clean <- data %>%
  filter(!is.na(FinalGradeCEMS))

# Check the result and see whether FinalGradeCEMS has any more missing data
# TYPE YOUR CODE BELOW. USE HEAD() FUNCTION.
head(data)

# A tibble: 6 × 30
  student_id course_id     total_points_possible total_points_earned
       <dbl> <chr>                         <dbl>               <dbl>
1      43146 FrScA-S216-02                  3280                2220
2      44638 OcnA-S116-01                   3531                2672
3      47448 FrScA-S216-01                  2870                1897
4      47979 OcnA-S216-01                   4562                3090
5      48797 PhysA-S116-01                  2207                1910
6      51943 FrScA-S216-03                  4208                3596
# ℹ 26 more variables: percentage_earned <dbl>, subject <chr>, semester <chr>,
#   section <chr>, Gradebook_Item <chr>, Grade_Category <lgl>,
#   FinalGradeCEMS <dbl>, Points_Possible <dbl>, Points_Earned <dbl>,
#   Gender <chr>, q1 <dbl>, q2 <dbl>, q3 <dbl>, q4 <dbl>, q5 <dbl>, q6 <dbl>,
#   q7 <dbl>, q8 <dbl>, q9 <dbl>, q10 <dbl>, TimeSpent <dbl>,
#   TimeSpent_hours <dbl>, TimeSpent_std <dbl>, int <dbl>, pc <dbl>, uv <dbl>

Or, delete unnecessary columns.

# Remove unnecessary columns by using a minus sign (-) followed by the column names.
data_delete <- select(data, -Grade_Category)

# Check the data again to see if the column has been deleted.
# TYPE YOUR CODE
head(data_clean)

# A tibble: 6 × 30
  student_id course_id     total_points_possible total_points_earned
       <dbl> <chr>                         <dbl>               <dbl>
1      43146 FrScA-S216-02                  3280                2220
2      44638 OcnA-S116-01                   3531                2672
3      47448 FrScA-S216-01                  2870                1897
4      47979 OcnA-S216-01                   4562                3090
5      48797 PhysA-S116-01                  2207                1910
6      52326 AnPhA-S216-01                  4325                2255
# ℹ 26 more variables: percentage_earned <dbl>, subject <chr>, semester <chr>,
#   section <chr>, Gradebook_Item <chr>, Grade_Category <lgl>,
#   FinalGradeCEMS <dbl>, Points_Possible <dbl>, Points_Earned <dbl>,
#   Gender <chr>, q1 <dbl>, q2 <dbl>, q3 <dbl>, q4 <dbl>, q5 <dbl>, q6 <dbl>,
#   q7 <dbl>, q8 <dbl>, q9 <dbl>, q10 <dbl>, TimeSpent <dbl>,
#   TimeSpent_hours <dbl>, TimeSpent_std <dbl>, int <dbl>, pc <dbl>, uv <dbl>

Part 3: Manipulating Data

We’ll now perform some basic data manipulation, such as selecting, filtering, and creating new variables.

Task: Use the select() function to select student_id, subject, semester, and FinalGradeCEMS. Assign the result to a new object.

Hint If you want to ONLY select and save student_id and FinalGradeCEMS, you can use, select (data, student_id, FinalGradeCEMS)

# Now TYPE YOUR CODE here for the task mentioned above.
selected_data <- select(data, student_id)

# INSPECT YOUR DATA (USE SUMMARY() FUNCTION)
summary(data, student_id)

   student_id     course_id         total_points_possible total_points_earned
 Min.   :43146   Length:603         Min.   :  840         Min.   :  651      
 1st Qu.:85612   Class :character   1st Qu.: 2810         1st Qu.: 2050      
 Median :88340   Mode  :character   Median : 3583         Median : 2757      
 Mean   :86070                      Mean   : 4274         Mean   : 3245      
 3rd Qu.:92730                      3rd Qu.: 5069         3rd Qu.: 3875      
 Max.   :97441                      Max.   :15552         Max.   :12208      
                                                                             
 percentage_earned   subject            semester           section         
 Min.   :0.3384    Length:603         Length:603         Length:603        
 1st Qu.:0.7047    Class :character   Class :character   Class :character  
 Median :0.7770    Mode  :character   Mode  :character   Mode  :character  
 Mean   :0.7577                                                            
 3rd Qu.:0.8262                                                            
 Max.   :0.9106                                                            
                                                                           
 Gradebook_Item     Grade_Category FinalGradeCEMS   Points_Possible 
 Length:603         Mode:logical   Min.   :  0.00   Min.   :  5.00  
 Class :character   NA's:603       1st Qu.: 71.25   1st Qu.: 10.00  
 Mode  :character                  Median : 84.57   Median : 10.00  
                                   Mean   : 77.20   Mean   : 76.87  
                                   3rd Qu.: 92.10   3rd Qu.: 30.00  
                                   Max.   :100.00   Max.   :935.00  
                                   NA's   :30                       
 Points_Earned       Gender                q1              q2       
 Min.   :  0.00   Length:603         Min.   :1.000   Min.   :1.000  
 1st Qu.:  7.00   Class :character   1st Qu.:4.000   1st Qu.:3.000  
 Median : 10.00   Mode  :character   Median :4.000   Median :4.000  
 Mean   : 68.63                      Mean   :4.296   Mean   :3.629  
 3rd Qu.: 26.12                      3rd Qu.:5.000   3rd Qu.:4.000  
 Max.   :828.20                      Max.   :5.000   Max.   :5.000  
 NA's   :92                          NA's   :123     NA's   :126    
       q3              q4              q5              q6       
 Min.   :1.000   Min.   :1.000   Min.   :2.000   Min.   :1.000  
 1st Qu.:3.000   1st Qu.:4.000   1st Qu.:4.000   1st Qu.:4.000  
 Median :3.000   Median :4.000   Median :4.000   Median :4.000  
 Mean   :3.327   Mean   :4.268   Mean   :4.191   Mean   :4.008  
 3rd Qu.:4.000   3rd Qu.:5.000   3rd Qu.:5.000   3rd Qu.:5.000  
 Max.   :5.000   Max.   :5.000   Max.   :5.000   Max.   :5.000  
 NA's   :123     NA's   :125     NA's   :127     NA's   :127    
       q7              q8              q9             q10       
 Min.   :1.000   Min.   :1.000   Min.   :1.000   Min.   :1.000  
 1st Qu.:3.000   1st Qu.:4.000   1st Qu.:3.000   1st Qu.:4.000  
 Median :4.000   Median :4.000   Median :4.000   Median :4.000  
 Mean   :3.907   Mean   :4.289   Mean   :3.487   Mean   :4.101  
 3rd Qu.:4.750   3rd Qu.:5.000   3rd Qu.:4.000   3rd Qu.:5.000  
 Max.   :5.000   Max.   :5.000   Max.   :5.000   Max.   :5.000  
 NA's   :129     NA's   :129     NA's   :129     NA's   :129    
   TimeSpent       TimeSpent_hours    TimeSpent_std          int       
 Min.   :   0.45   Min.   :  0.0075   Min.   :-1.3280   Min.   :2.000  
 1st Qu.: 851.90   1st Qu.: 14.1983   1st Qu.:-0.6996   1st Qu.:3.900  
 Median :1550.91   Median : 25.8485   Median :-0.1837   Median :4.200  
 Mean   :1799.75   Mean   : 29.9959   Mean   : 0.0000   Mean   :4.219  
 3rd Qu.:2426.09   3rd Qu.: 40.4348   3rd Qu.: 0.4623   3rd Qu.:4.700  
 Max.   :8870.88   Max.   :147.8481   Max.   : 5.2188   Max.   :5.000  
 NA's   :5         NA's   :5          NA's   :5         NA's   :76     
       pc              uv       
 Min.   :1.500   Min.   :1.000  
 1st Qu.:3.000   1st Qu.:3.333  
 Median :3.500   Median :3.667  
 Mean   :3.608   Mean   :3.719  
 3rd Qu.:4.000   3rd Qu.:4.167  
 Max.   :5.000   Max.   :5.000  
 NA's   :75      NA's   :75

What do you notice about FinalGradeCEMS? (Hint: NAs?)

[There is NA values and missing informatin within the data set
FinalGradeCEMS. With knowledge of this informatin, we can either locate the missing information or delete it from the dataset.] {possible answer: I notice NA values indicating missing data. This requires handling either by imputation or removal depending on the analysis requirements.}

Task: In code chunk named select-2 select all columns except subjectand section. Assign to a new object. Inspect your data frame.

Hint: You can use - symbol with the variable name(s) you want to exclude. For example, select(data, -semester)

# TYPE YOUR CODE HERE
reduced_data <- select(data, -semester)
select(data, course_id, total_points_possible, total_points_earned,
percentage_earned, Gradebook_Item, Grade_Category, FinalGradeCEMS, Points_Possible,
Points_Earned, Gender, q1, q2, q3, q4, q5, q6, q7, q8, q9, q10, TimeSpent,
TimeSpent_hours, TimeSpent_std, int, pc, uv)

# A tibble: 603 × 26
   course_id     total_points_possible total_points_earned percentage_earned
   <chr>                         <dbl>               <dbl>             <dbl>
 1 FrScA-S216-02                  3280                2220             0.677
 2 OcnA-S116-01                   3531                2672             0.757
 3 FrScA-S216-01                  2870                1897             0.661
 4 OcnA-S216-01                   4562                3090             0.677
 5 PhysA-S116-01                  2207                1910             0.865
 6 FrScA-S216-03                  4208                3596             0.855
 7 AnPhA-S216-01                  4325                2255             0.521
 8 PhysA-S116-01                  2086                1719             0.824
 9 FrScA-S116-01                  4655                3149             0.676
10 FrScA-S116-02                  1710                1402             0.820
# ℹ 593 more rows
# ℹ 22 more variables: Gradebook_Item <chr>, Grade_Category <lgl>,
#   FinalGradeCEMS <dbl>, Points_Possible <dbl>, Points_Earned <dbl>,
#   Gender <chr>, q1 <dbl>, q2 <dbl>, q3 <dbl>, q4 <dbl>, q5 <dbl>, q6 <dbl>,
#   q7 <dbl>, q8 <dbl>, q9 <dbl>, q10 <dbl>, TimeSpent <dbl>,
#   TimeSpent_hours <dbl>, TimeSpent_std <dbl>, int <dbl>, pc <dbl>, uv <dbl>

# INSPECT DATA
head(data)

# A tibble: 6 × 30
  student_id course_id     total_points_possible total_points_earned
       <dbl> <chr>                         <dbl>               <dbl>
1      43146 FrScA-S216-02                  3280                2220
2      44638 OcnA-S116-01                   3531                2672
3      47448 FrScA-S216-01                  2870                1897
4      47979 OcnA-S216-01                   4562                3090
5      48797 PhysA-S116-01                  2207                1910
6      51943 FrScA-S216-03                  4208                3596
# ℹ 26 more variables: percentage_earned <dbl>, subject <chr>, semester <chr>,
#   section <chr>, Gradebook_Item <chr>, Grade_Category <lgl>,
#   FinalGradeCEMS <dbl>, Points_Possible <dbl>, Points_Earned <dbl>,
#   Gender <chr>, q1 <dbl>, q2 <dbl>, q3 <dbl>, q4 <dbl>, q5 <dbl>, q6 <dbl>,
#   q7 <dbl>, q8 <dbl>, q9 <dbl>, q10 <dbl>, TimeSpent <dbl>,
#   TimeSpent_hours <dbl>, TimeSpent_std <dbl>, int <dbl>, pc <dbl>, uv <dbl>

Task: In the code chunk named filter-3, Filter the data frame for students in OcnA courses. Assign to a new object. Use the head() function to examine your data frame.

# TYPE YOUR CODE
ocna_students <- filter(data, subject == "")

# EXAMINE YOUR DATA
head(data)

# A tibble: 6 × 30
  student_id course_id     total_points_possible total_points_earned
       <dbl> <chr>                         <dbl>               <dbl>
1      43146 FrScA-S216-02                  3280                2220
2      44638 OcnA-S116-01                   3531                2672
3      47448 FrScA-S216-01                  2870                1897
4      47979 OcnA-S216-01                   4562                3090
5      48797 PhysA-S116-01                  2207                1910
6      51943 FrScA-S216-03                  4208                3596
# ℹ 26 more variables: percentage_earned <dbl>, subject <chr>, semester <chr>,
#   section <chr>, Gradebook_Item <chr>, Grade_Category <lgl>,
#   FinalGradeCEMS <dbl>, Points_Possible <dbl>, Points_Earned <dbl>,
#   Gender <chr>, q1 <dbl>, q2 <dbl>, q3 <dbl>, q4 <dbl>, q5 <dbl>, q6 <dbl>,
#   q7 <dbl>, q8 <dbl>, q9 <dbl>, q10 <dbl>, TimeSpent <dbl>,
#   TimeSpent_hours <dbl>, TimeSpent_std <dbl>, int <dbl>, pc <dbl>, uv <dbl>

Question: How many rows does the head() function display? Hint: Check the dimensions of your tibble in the console.
- [The head function displays 6 rows of data.] {Possible answer: The head function displays 5 rows of data*}

Task: In the code chunk named filter-4, filter the data frame to remove any rows where total_points_possible is NA. Assign the result to a new object. Then, use glimpse() to examine all columns of your newly filtered data frame.

# TYPE YOUR CODE
no_na_points <- filter(data, !is.na(total_points_possible))
# INSPECT YOUR DATA
glimpse(data)

Rows: 603
Columns: 30
$ student_id            <dbl> 43146, 44638, 47448, 47979, 48797, 51943, 52326,…
$ course_id             <chr> "FrScA-S216-02", "OcnA-S116-01", "FrScA-S216-01"…
$ total_points_possible <dbl> 3280, 3531, 2870, 4562, 2207, 4208, 4325, 2086, …
$ total_points_earned   <dbl> 2220, 2672, 1897, 3090, 1910, 3596, 2255, 1719, …
$ percentage_earned     <dbl> 0.6768293, 0.7567261, 0.6609756, 0.6773345, 0.86…
$ subject               <chr> "FrScA", "OcnA", "FrScA", "OcnA", "PhysA", "FrSc…
$ semester              <chr> "S216", "S116", "S216", "S216", "S116", "S216", …
$ section               <chr> "02", "01", "01", "01", "01", "03", "01", "01", …
$ Gradebook_Item        <chr> "POINTS EARNED & TOTAL COURSE POINTS", "ATTEMPTE…
$ Grade_Category        <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ FinalGradeCEMS        <dbl> 93.45372, 81.70184, 88.48758, 81.85260, 84.00000…
$ Points_Possible       <dbl> 5, 10, 10, 5, 438, 5, 10, 10, 443, 5, 12, 10, 5,…
$ Points_Earned         <dbl> NA, 10.00, NA, 4.00, 399.00, NA, NA, 10.00, 425.…
$ Gender                <chr> "M", "F", "M", "M", "F", "F", "M", "F", "F", "M"…
$ q1                    <dbl> 5, 4, 5, 5, 4, NA, 5, 3, 4, NA, NA, 4, 3, 5, NA,…
$ q2                    <dbl> 4, 4, 4, 5, 3, NA, 5, 3, 3, NA, NA, 5, 3, 3, NA,…
$ q3                    <dbl> 4, 3, 4, 3, 3, NA, 3, 3, 3, NA, NA, 3, 3, 5, NA,…
$ q4                    <dbl> 5, 4, 5, 5, 4, NA, 5, 3, 4, NA, NA, 5, 3, 5, NA,…
$ q5                    <dbl> 5, 4, 5, 5, 4, NA, 5, 3, 4, NA, NA, 5, 4, 5, NA,…
$ q6                    <dbl> 5, 4, 4, 5, 4, NA, 5, 4, 3, NA, NA, 5, 3, 5, NA,…
$ q7                    <dbl> 5, 4, 4, 4, 4, NA, 4, 3, 3, NA, NA, 5, 3, 5, NA,…
$ q8                    <dbl> 5, 5, 5, 5, 4, NA, 5, 3, 4, NA, NA, 4, 3, 5, NA,…
$ q9                    <dbl> 4, 4, 3, 5, NA, NA, 5, 3, 2, NA, NA, 5, 2, 2, NA…
$ q10                   <dbl> 5, 4, 5, 5, 3, NA, 5, 3, 5, NA, NA, 4, 4, 5, NA,…
$ TimeSpent             <dbl> 1555.1667, 1382.7001, 860.4335, 1598.6166, 1481.…
$ TimeSpent_hours       <dbl> 25.91944500, 23.04500167, 14.34055833, 26.643610…
$ TimeSpent_std         <dbl> -0.18051496, -0.30780313, -0.69325954, -0.148446…
$ int                   <dbl> 5.0, 4.2, 5.0, 5.0, 3.8, 4.6, 5.0, 3.0, 4.2, NA,…
$ pc                    <dbl> 4.50, 3.50, 4.00, 3.50, 3.50, 4.00, 3.50, 3.00, …
$ uv                    <dbl> 4.333333, 4.000000, 3.666667, 5.000000, 3.500000…

Question: What does ! mean in R in general?

[An “!” in R coding nullifies the logical value it is being used on.]

Task: In code chunk named filter-5, filter the data frame to filter any rows with Final_Grade over 85. Assign to a new object. Use glimpse() to examine all columns of your data frame.

# We can make a condition within the filter() function.
# TYPE YOUR CODE
data_over85 <- filter(data)

# INSPECT YOUR DATA glimpse(data, data_over85)


# Let's practice with another variable 'TimeSpent'
# Filter rows where a variable 'TimeSpent' is greater than 50
# TYPE YOUR CODE
data_timover50 <- filter(data, )

# INSPECT YOUR DATA 
glimpse(data)

Rows: 603
Columns: 30
$ student_id            <dbl> 43146, 44638, 47448, 47979, 48797, 51943, 52326,…
$ course_id             <chr> "FrScA-S216-02", "OcnA-S116-01", "FrScA-S216-01"…
$ total_points_possible <dbl> 3280, 3531, 2870, 4562, 2207, 4208, 4325, 2086, …
$ total_points_earned   <dbl> 2220, 2672, 1897, 3090, 1910, 3596, 2255, 1719, …
$ percentage_earned     <dbl> 0.6768293, 0.7567261, 0.6609756, 0.6773345, 0.86…
$ subject               <chr> "FrScA", "OcnA", "FrScA", "OcnA", "PhysA", "FrSc…
$ semester              <chr> "S216", "S116", "S216", "S216", "S116", "S216", …
$ section               <chr> "02", "01", "01", "01", "01", "03", "01", "01", …
$ Gradebook_Item        <chr> "POINTS EARNED & TOTAL COURSE POINTS", "ATTEMPTE…
$ Grade_Category        <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ FinalGradeCEMS        <dbl> 93.45372, 81.70184, 88.48758, 81.85260, 84.00000…
$ Points_Possible       <dbl> 5, 10, 10, 5, 438, 5, 10, 10, 443, 5, 12, 10, 5,…
$ Points_Earned         <dbl> NA, 10.00, NA, 4.00, 399.00, NA, NA, 10.00, 425.…
$ Gender                <chr> "M", "F", "M", "M", "F", "F", "M", "F", "F", "M"…
$ q1                    <dbl> 5, 4, 5, 5, 4, NA, 5, 3, 4, NA, NA, 4, 3, 5, NA,…
$ q2                    <dbl> 4, 4, 4, 5, 3, NA, 5, 3, 3, NA, NA, 5, 3, 3, NA,…
$ q3                    <dbl> 4, 3, 4, 3, 3, NA, 3, 3, 3, NA, NA, 3, 3, 5, NA,…
$ q4                    <dbl> 5, 4, 5, 5, 4, NA, 5, 3, 4, NA, NA, 5, 3, 5, NA,…
$ q5                    <dbl> 5, 4, 5, 5, 4, NA, 5, 3, 4, NA, NA, 5, 4, 5, NA,…
$ q6                    <dbl> 5, 4, 4, 5, 4, NA, 5, 4, 3, NA, NA, 5, 3, 5, NA,…
$ q7                    <dbl> 5, 4, 4, 4, 4, NA, 4, 3, 3, NA, NA, 5, 3, 5, NA,…
$ q8                    <dbl> 5, 5, 5, 5, 4, NA, 5, 3, 4, NA, NA, 4, 3, 5, NA,…
$ q9                    <dbl> 4, 4, 3, 5, NA, NA, 5, 3, 2, NA, NA, 5, 2, 2, NA…
$ q10                   <dbl> 5, 4, 5, 5, 3, NA, 5, 3, 5, NA, NA, 4, 4, 5, NA,…
$ TimeSpent             <dbl> 1555.1667, 1382.7001, 860.4335, 1598.6166, 1481.…
$ TimeSpent_hours       <dbl> 25.91944500, 23.04500167, 14.34055833, 26.643610…
$ TimeSpent_std         <dbl> -0.18051496, -0.30780313, -0.69325954, -0.148446…
$ int                   <dbl> 5.0, 4.2, 5.0, 5.0, 3.8, 4.6, 5.0, 3.0, 4.2, NA,…
$ pc                    <dbl> 4.50, 3.50, 4.00, 3.50, 3.50, 4.00, 3.50, 3.00, …
$ uv                    <dbl> 4.333333, 4.000000, 3.666667, 5.000000, 3.500000…

Task: This time, we’ll mutate the dataset the way we want it to be presented. For example, we will create/add a new variable called added_gender in the current dataset based on the gender information.

# Create a new variable 'added_gender'
data_gender <- mutate(data, added_gender = ifelse(Gender == "F", "Female", "Male"))

# Inspect data 
glimpse(data_gender)

Rows: 603
Columns: 31
$ student_id            <dbl> 43146, 44638, 47448, 47979, 48797, 51943, 52326,…
$ course_id             <chr> "FrScA-S216-02", "OcnA-S116-01", "FrScA-S216-01"…
$ total_points_possible <dbl> 3280, 3531, 2870, 4562, 2207, 4208, 4325, 2086, …
$ total_points_earned   <dbl> 2220, 2672, 1897, 3090, 1910, 3596, 2255, 1719, …
$ percentage_earned     <dbl> 0.6768293, 0.7567261, 0.6609756, 0.6773345, 0.86…
$ subject               <chr> "FrScA", "OcnA", "FrScA", "OcnA", "PhysA", "FrSc…
$ semester              <chr> "S216", "S116", "S216", "S216", "S116", "S216", …
$ section               <chr> "02", "01", "01", "01", "01", "03", "01", "01", …
$ Gradebook_Item        <chr> "POINTS EARNED & TOTAL COURSE POINTS", "ATTEMPTE…
$ Grade_Category        <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ FinalGradeCEMS        <dbl> 93.45372, 81.70184, 88.48758, 81.85260, 84.00000…
$ Points_Possible       <dbl> 5, 10, 10, 5, 438, 5, 10, 10, 443, 5, 12, 10, 5,…
$ Points_Earned         <dbl> NA, 10.00, NA, 4.00, 399.00, NA, NA, 10.00, 425.…
$ Gender                <chr> "M", "F", "M", "M", "F", "F", "M", "F", "F", "M"…
$ q1                    <dbl> 5, 4, 5, 5, 4, NA, 5, 3, 4, NA, NA, 4, 3, 5, NA,…
$ q2                    <dbl> 4, 4, 4, 5, 3, NA, 5, 3, 3, NA, NA, 5, 3, 3, NA,…
$ q3                    <dbl> 4, 3, 4, 3, 3, NA, 3, 3, 3, NA, NA, 3, 3, 5, NA,…
$ q4                    <dbl> 5, 4, 5, 5, 4, NA, 5, 3, 4, NA, NA, 5, 3, 5, NA,…
$ q5                    <dbl> 5, 4, 5, 5, 4, NA, 5, 3, 4, NA, NA, 5, 4, 5, NA,…
$ q6                    <dbl> 5, 4, 4, 5, 4, NA, 5, 4, 3, NA, NA, 5, 3, 5, NA,…
$ q7                    <dbl> 5, 4, 4, 4, 4, NA, 4, 3, 3, NA, NA, 5, 3, 5, NA,…
$ q8                    <dbl> 5, 5, 5, 5, 4, NA, 5, 3, 4, NA, NA, 4, 3, 5, NA,…
$ q9                    <dbl> 4, 4, 3, 5, NA, NA, 5, 3, 2, NA, NA, 5, 2, 2, NA…
$ q10                   <dbl> 5, 4, 5, 5, 3, NA, 5, 3, 5, NA, NA, 4, 4, 5, NA,…
$ TimeSpent             <dbl> 1555.1667, 1382.7001, 860.4335, 1598.6166, 1481.…
$ TimeSpent_hours       <dbl> 25.91944500, 23.04500167, 14.34055833, 26.643610…
$ TimeSpent_std         <dbl> -0.18051496, -0.30780313, -0.69325954, -0.148446…
$ int                   <dbl> 5.0, 4.2, 5.0, 5.0, 3.8, 4.6, 5.0, 3.0, 4.2, NA,…
$ pc                    <dbl> 4.50, 3.50, 4.00, 3.50, 3.50, 4.00, 3.50, 3.00, …
$ uv                    <dbl> 4.333333, 4.000000, 3.666667, 5.000000, 3.500000…
$ added_gender          <chr> "Male", "Female", "Male", "Male", "Female", "Fem…

Question: How does ifelse() work? Hint: You can google ifelse() in R to learn about the function.

[Ifelse works by performing element-wise conditional operations.]

Task: In the code chunk called arrange-1, Arrange data by subject then percentage_earned in descending order. Assign to a new object. Use the arrage() function. Then, use desc('variable name'). Use the str() function to examine the data type of each column in your data frame.

arranged_classes <- arrange(data, , desc('subject'))

# INSPECT DATA
str(data)

spc_tbl_ [603 × 30] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ student_id           : num [1:603] 43146 44638 47448 47979 48797 ...
 $ course_id            : chr [1:603] "FrScA-S216-02" "OcnA-S116-01" "FrScA-S216-01" "OcnA-S216-01" ...
 $ total_points_possible: num [1:603] 3280 3531 2870 4562 2207 ...
 $ total_points_earned  : num [1:603] 2220 2672 1897 3090 1910 ...
 $ percentage_earned    : num [1:603] 0.677 0.757 0.661 0.677 0.865 ...
 $ subject              : chr [1:603] "FrScA" "OcnA" "FrScA" "OcnA" ...
 $ semester             : chr [1:603] "S216" "S116" "S216" "S216" ...
 $ section              : chr [1:603] "02" "01" "01" "01" ...
 $ Gradebook_Item       : chr [1:603] "POINTS EARNED & TOTAL COURSE POINTS" "ATTEMPTED" "POINTS EARNED & TOTAL COURSE POINTS" "POINTS EARNED & TOTAL COURSE POINTS" ...
 $ Grade_Category       : logi [1:603] NA NA NA NA NA NA ...
 $ FinalGradeCEMS       : num [1:603] 93.5 81.7 88.5 81.9 84 ...
 $ Points_Possible      : num [1:603] 5 10 10 5 438 5 10 10 443 5 ...
 $ Points_Earned        : num [1:603] NA 10 NA 4 399 NA NA 10 425 2.5 ...
 $ Gender               : chr [1:603] "M" "F" "M" "M" ...
 $ q1                   : num [1:603] 5 4 5 5 4 NA 5 3 4 NA ...
 $ q2                   : num [1:603] 4 4 4 5 3 NA 5 3 3 NA ...
 $ q3                   : num [1:603] 4 3 4 3 3 NA 3 3 3 NA ...
 $ q4                   : num [1:603] 5 4 5 5 4 NA 5 3 4 NA ...
 $ q5                   : num [1:603] 5 4 5 5 4 NA 5 3 4 NA ...
 $ q6                   : num [1:603] 5 4 4 5 4 NA 5 4 3 NA ...
 $ q7                   : num [1:603] 5 4 4 4 4 NA 4 3 3 NA ...
 $ q8                   : num [1:603] 5 5 5 5 4 NA 5 3 4 NA ...
 $ q9                   : num [1:603] 4 4 3 5 NA NA 5 3 2 NA ...
 $ q10                  : num [1:603] 5 4 5 5 3 NA 5 3 5 NA ...
 $ TimeSpent            : num [1:603] 1555 1383 860 1599 1482 ...
 $ TimeSpent_hours      : num [1:603] 25.9 23 14.3 26.6 24.7 ...
 $ TimeSpent_std        : num [1:603] -0.181 -0.308 -0.693 -0.148 -0.235 ...
 $ int                  : num [1:603] 5 4.2 5 5 3.8 4.6 5 3 4.2 NA ...
 $ pc                   : num [1:603] 4.5 3.5 4 3.5 3.5 4 3.5 3 3 NA ...
 $ uv                   : num [1:603] 4.33 4 3.67 5 3.5 ...
 - attr(*, "spec")=
  .. cols(
  ..   student_id = col_double(),
  ..   course_id = col_character(),
  ..   total_points_possible = col_double(),
  ..   total_points_earned = col_double(),
  ..   percentage_earned = col_double(),
  ..   subject = col_character(),
  ..   semester = col_character(),
  ..   section = col_character(),
  ..   Gradebook_Item = col_character(),
  ..   Grade_Category = col_logical(),
  ..   FinalGradeCEMS = col_double(),
  ..   Points_Possible = col_double(),
  ..   Points_Earned = col_double(),
  ..   Gender = col_character(),
  ..   q1 = col_double(),
  ..   q2 = col_double(),
  ..   q3 = col_double(),
  ..   q4 = col_double(),
  ..   q5 = col_double(),
  ..   q6 = col_double(),
  ..   q7 = col_double(),
  ..   q8 = col_double(),
  ..   q9 = col_double(),
  ..   q10 = col_double(),
  ..   TimeSpent = col_double(),
  ..   TimeSpent_hours = col_double(),
  ..   TimeSpent_std = col_double(),
  ..   int = col_double(),
  ..   pc = col_double(),
  ..   uv = col_double()
  .. )
 - attr(*, "problems")=<externalptr>

Task: In the code chunk named final-wrangle, use data and the %\>% or |> pipe operator:

Select student_id, subject, semester, FinalGradeCEMS.
Filter for students in OcnA courses.
Arrange grades by section in descending order.
Assign to a new object.
Examine the contents using a method of your choosing.

# TYPE YOUR CODE
final_data <- data %>%
  select(student_id, subject, semester, FinalGradeCEMS) %>%
  filter(subject == "OcnA") %>%
  arrange(desc(FinalGradeCEMS)) %>%

# CHECK THE MAIPULATED DATA

# Also, try print()
print()

# A tibble: 111 × 4
   student_id subject semester FinalGradeCEMS
        <dbl> <chr>   <chr>             <dbl>
 1      66740 OcnA    S116               99.3
 2      91163 OcnA    S216               97.4
 3      94744 OcnA    S216               96.8
 4      91818 OcnA    S116               96.5
 5      90090 OcnA    S116               96.3
 6      88168 OcnA    S116               96.0
 7      89114 OcnA    S116               95.0
 8      86758 OcnA    S116               94.6
 9      68476 OcnA    S116               94.6
10      79893 OcnA    T116               94.5
# ℹ 101 more rows

Final Tasks: Practice the following by writing your own code in the provided code chunks.

If you choose to use your own dataset instead, make sure to load it correctly. To load data, use the read_csv() function.

# Replace 'your_data.csv' with the name of your dataset.
your_data <- read_csv("data/sci-online-classes.csv")

Rows: 603 Columns: 30
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (6): course_id, subject, semester, section, Gradebook_Item, Gender
dbl (23): student_id, total_points_possible, total_points_earned, percentage...
lgl  (1): Grade_Category

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

# Inspect your data (feel free to use head(), str(), or summary() functions)
head(your_data)

# A tibble: 6 × 30
  student_id course_id     total_points_possible total_points_earned
       <dbl> <chr>                         <dbl>               <dbl>
1      43146 FrScA-S216-02                  3280                2220
2      44638 OcnA-S116-01                   3531                2672
3      47448 FrScA-S216-01                  2870                1897
4      47979 OcnA-S216-01                   4562                3090
5      48797 PhysA-S116-01                  2207                1910
6      51943 FrScA-S216-03                  4208                3596
# ℹ 26 more variables: percentage_earned <dbl>, subject <chr>, semester <chr>,
#   section <chr>, Gradebook_Item <chr>, Grade_Category <lgl>,
#   FinalGradeCEMS <dbl>, Points_Possible <dbl>, Points_Earned <dbl>,
#   Gender <chr>, q1 <dbl>, q2 <dbl>, q3 <dbl>, q4 <dbl>, q5 <dbl>, q6 <dbl>,
#   q7 <dbl>, q8 <dbl>, q9 <dbl>, q10 <dbl>, TimeSpent <dbl>,
#   TimeSpent_hours <dbl>, TimeSpent_std <dbl>, int <dbl>, pc <dbl>, uv <dbl>

Show two different ways to use the select() function with your/current data, inspect, and save as a new object.

# YOUR CODE HERE
select(data_gender)

# A tibble: 603 × 0

select(data)

# A tibble: 603 × 0

Show one way to use the filter() function with your data, inspect, and save as a new object.

# YOUR CODE HERE
filter(data)

# A tibble: 603 × 30
   student_id course_id     total_points_possible total_points_earned
        <dbl> <chr>                         <dbl>               <dbl>
 1      43146 FrScA-S216-02                  3280                2220
 2      44638 OcnA-S116-01                   3531                2672
 3      47448 FrScA-S216-01                  2870                1897
 4      47979 OcnA-S216-01                   4562                3090
 5      48797 PhysA-S116-01                  2207                1910
 6      51943 FrScA-S216-03                  4208                3596
 7      52326 AnPhA-S216-01                  4325                2255
 8      52446 PhysA-S116-01                  2086                1719
 9      53447 FrScA-S116-01                  4655                3149
10      53475 FrScA-S116-02                  1710                1402
# ℹ 593 more rows
# ℹ 26 more variables: percentage_earned <dbl>, subject <chr>, semester <chr>,
#   section <chr>, Gradebook_Item <chr>, Grade_Category <lgl>,
#   FinalGradeCEMS <dbl>, Points_Possible <dbl>, Points_Earned <dbl>,
#   Gender <chr>, q1 <dbl>, q2 <dbl>, q3 <dbl>, q4 <dbl>, q5 <dbl>, q6 <dbl>,
#   q7 <dbl>, q8 <dbl>, q9 <dbl>, q10 <dbl>, TimeSpent <dbl>,
#   TimeSpent_hours <dbl>, TimeSpent_std <dbl>, int <dbl>, pc <dbl>, uv <dbl>

Show one way to use the arrange() function with your data, inspect, and save as a new object.

# YOUR CODE HERE
'arranged(grades)'

[1] "arranged(grades)"

Render & Submit

Congratulations, you’ve completed the first module!

To receive full score, you will need to render this document and publish via a method such as: Quarto Pub, Posit Cloud, RPubs , GitHub Pages, or other methods. Once you have shared a link to you published document with me and I have reviewed your work, you will be officially done with the current module.

Complete the following steps to submit your work for review by:

First, change the name of the author: in the YAML header at the very top of this document to your name. The YAML header controls the style and feel for knitted document but doesn’t actually display in the final output.
Next, click the “Render” button in the toolbar above to “render” your R Markdown document to a HTML file that will be saved in your R Project folder. You should see a formatted webpage appear in your Viewer tab in the lower right pan or in a new browser window. Let me know if you run into any issues with rendering.
Finally, publish. To do publish, follow the step from https://docs.posit.co/cloud/guide/publish/#publish-from-a-cloud-project

If you have any questions about this module, or run into any technical issues, don’t hesitate to contact me.

Once I have checked your link, you will be notified!