LA Foundations badge

LASER Institute Foundation Learning Lab 1

Author

Andy Accettola

Published

July 18, 2025

The final activity for each learning lab provides space to work with data and to reflect on how the concepts and techniques introduced in each lab might apply to your own research.

To earn a badge for each lab, you are required to respond to a set of prompts for two parts:

Part I: Reflect and Plan

Use the institutional library (e.g. NCSU Library), Google Scholar, or a search engine to locate a research article, presentation, or resource that applies learning analytics to an educational context or topic of interest. More specifically, locate a study that makes use of one of the data structures we learned today. You are also welcome to select one of your own research papers.

  1. Provide an APA citation for your selected study.

    • Pardo, A, Jovanovic, J, Dawson, S, Gasevic, D & Mirriahi, N 2017, ‘Using learning analytics to scale the provision of personalised feedback’, British Journal of Educational Technology, pp. 1-11. https://doi.org/10.1111/bjet.12592
  2. What types of data are associated with LA?

    • Traces of student feedback engagement behavior in a blended learning context.
  3. What type of data structures are analyzed in the educational context?

    • Three types of interactions were recorded: watching a video, completion of multiple-choice questions, and engagement with a sequence of summative exercises.
  4. How might this article be used to better understand a dataset or educational context of personal or professional interest to you?

    • The second data source was derived from the institutional student evaluation of teaching survey, which was made available to the students during the last three weeks of the semester in the 2013, 2014, and 2015 editions. Students answered a set of questions based on a Likert-like scale with values “strongly disagree”, “disagree”, “neutral”, “agree”, and “strongly agree”. Question 6 of the survey was selected as relevant for the study since it specifically addressed how the students perceived the provision of feedback.
  5. Finally, how do these processes compare with what teachers and educational organizations already do to support and assess student learning?

    • There is little debate regarding the importance of student feedback for improving the learning process. However, there remain significant workload barriers for instructors that impede their capacity to provide timely and meaningful feedback. The increasing role technology is playing in the education space may provide novel solutions to this impediment. As students interact with the various learning technologies in their course of study, they create digital traces that can be captured and analysed. These digital traces form the new kind of data that are frequently used in learning analytics to develop actionable recommendations that can support student learning.

Draft a research question guided by techniques and data sources that you are potentially interested in exploring in more depth.

  1. What data source(s) should be analyzed or discussed?

    • Traces of student feedback behavior as well as student perception data via survey.

    What is the purpose of your article?

    • The ultimate objective of collecting and analysing such data is to produce actionable knowledge connected with the learning environment that can be used to inform learning and teaching practice.
  2. Explain the analytical level at which these data would need to be collected and analyzed.

    • The response-to-feedback traces will need to be collected from the learning platform. Analysis consists of examining the number of students that received a message, the number of activities considered for the message, and the number of unique emails sent (as a percentage of the number of students). A one-way between-subjects ANOVA was conducted to compare the effects of the year on the level of student satisfaction with feedback reported by the 2013, 2014, and 2015 cohorts.
  3. How, if at all, will your article touch upon the application(s) of LA to “understand and improve learning and the contexts in which learning occurs?”

    • The study described in this paper highlights the potential of combining digital traces typically captured by technology mediation in a learning environment with instructor knowledge to provide frequent and personalised feedback messages for large student cohorts. The approach included a learning design that used cycles that repeat throughout the length of the experience, a set of messages created by the instructor for each task depending on four levels of engagement, and a mechanism to combine these messages into personal emails. The instructor produced 138 text snippets with advice on 37 tasks that were combined based on the observed indicators.
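
The mechanism described above (instructor-written messages chosen by engagement level and combined into one email per student) can be pictured with a small R sketch. This is only an illustration under assumptions, not the authors' implementation: the `engagement` and `snippets` tables, their column names, and the message text are all hypothetical.

```r
library(dplyr)

# Hypothetical engagement levels (1 = lowest, 4 = highest) per student and task
engagement <- tibble(
  student = c("s1", "s1", "s2", "s2"),
  task    = c("video", "quiz", "video", "quiz"),
  level   = c(1L, 4L, 3L, 2L)
)

# Hypothetical instructor-written snippets: one message per task and level
snippets <- tibble(
  task    = rep(c("video", "quiz"), each = 4),
  level   = rep(1:4, times = 2),
  message = paste("Advice for", task, "at level", level)
)

# Look up the snippet that matches each observed level,
# then paste each student's snippets together into one email body
emails <- engagement %>%
  left_join(snippets, by = c("task", "level")) %>%
  group_by(student) %>%
  summarise(email_body = paste(message, collapse = "\n"))

emails
```

Keeping the snippets in their own table means the instructor's wording can change without touching the code that assembles the emails.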

Part II: Data Product

In our Learning Analytics code-along, we scratched the surface on the number of ways that we can wrangle the data.

Using one of the data sets provided in the data folder, your goal for this lab is to extend the Learning Analytics Workflow from our code-along by preparing and wrangling different data.

Alternatively, you may use your own data set in the workflow. If you decide to use your own data set, you must:

  • Show two different ways of using the select function with your data, then inspect and save the result as a new object.

  • Show one way of using the filter function with your data, then inspect and save the result as a new object.

  • Show one way of using the arrange function with your data, then inspect and save the result as a new object.

  • Use the pipe operator to bring it all together.
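
If you take the own-data route, the requirements above might look roughly like the sketch below. The `my_data` tibble and its column names are hypothetical placeholders, so swap in your own import and variables.

```r
library(dplyr)

# Hypothetical stand-in for your own data set; replace with your own import
my_data <- tibble(
  learner = c("a01", "a02", "a03", "a04", "a05"),
  course  = c("Bio", "Chem", "Bio", "Chem", "Bio"),
  term    = c("F24", "F24", "S25", "S25", "S25"),
  score   = c(88, 72, 95, NA, 61),
  minutes = c(340, 210, 515, 120, 95)
)

# select, way 1: name the columns to keep
picked <- select(my_data, learner, course, score)
glimpse(picked)

# select, way 2: drop columns with a minus sign
dropped <- select(my_data, -term, -minutes)
head(dropped)

# filter: keep only the rows that meet a condition
bio_only <- filter(my_data, course == "Bio")
bio_only

# arrange: sort rows, here from highest to lowest score
by_score <- arrange(my_data, desc(score))
by_score

# pipe: chain the same steps into one readable sequence
wrangled <- my_data %>%
  select(learner, course, score) %>%
  filter(course == "Bio") %>%
  arrange(desc(score))
wrangled
```

Each step is saved under a new name so the original `my_data` stays untouched and every intermediate result can be inspected on its own.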

Feel free to create a new script in your lab 2 to work through the following problems. Then when satisfied add the code in the code chunks below. Don’t forget to run the code to make sure it works.

Instructions:

  1. Add your name to the document in author.

  2. Set up the first phase (or the first two phases, if using an Introduction) of the LA workflow below. I’ve added the wrangle section for you. You will need to Prepare the libraries necessary to wrangle the data.

  3. In the chunk called read-data: Import the sci-online-classes.csv from the data folder and save as a new object called sci_classes. Then inspect your data using a function of your choice.

# Type your code here
#load tidyverse
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.2     ✔ tibble    3.3.0
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.4     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
#import
sci_classes <- read_csv("data/sci-online-classes.csv")
Rows: 603 Columns: 30
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (6): course_id, subject, semester, section, Gradebook_Item, Gender
dbl (23): student_id, total_points_possible, total_points_earned, percentage...
lgl  (1): Grade_Category

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#inspect your data
glimpse(sci_classes)
Rows: 603
Columns: 30
$ student_id            <dbl> 43146, 44638, 47448, 47979, 48797, 51943, 52326,…
$ course_id             <chr> "FrScA-S216-02", "OcnA-S116-01", "FrScA-S216-01"…
$ total_points_possible <dbl> 3280, 3531, 2870, 4562, 2207, 4208, 4325, 2086, …
$ total_points_earned   <dbl> 2220, 2672, 1897, 3090, 1910, 3596, 2255, 1719, …
$ percentage_earned     <dbl> 0.6768293, 0.7567261, 0.6609756, 0.6773345, 0.86…
$ subject               <chr> "FrScA", "OcnA", "FrScA", "OcnA", "PhysA", "FrSc…
$ semester              <chr> "S216", "S116", "S216", "S216", "S116", "S216", …
$ section               <chr> "02", "01", "01", "01", "01", "03", "01", "01", …
$ Gradebook_Item        <chr> "POINTS EARNED & TOTAL COURSE POINTS", "ATTEMPTE…
$ Grade_Category        <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ FinalGradeCEMS        <dbl> 93.45372, 81.70184, 88.48758, 81.85260, 84.00000…
$ Points_Possible       <dbl> 5, 10, 10, 5, 438, 5, 10, 10, 443, 5, 12, 10, 5,…
$ Points_Earned         <dbl> NA, 10.00, NA, 4.00, 399.00, NA, NA, 10.00, 425.…
$ Gender                <chr> "M", "F", "M", "M", "F", "F", "M", "F", "F", "M"…
$ q1                    <dbl> 5, 4, 5, 5, 4, NA, 5, 3, 4, NA, NA, 4, 3, 5, NA,…
$ q2                    <dbl> 4, 4, 4, 5, 3, NA, 5, 3, 3, NA, NA, 5, 3, 3, NA,…
$ q3                    <dbl> 4, 3, 4, 3, 3, NA, 3, 3, 3, NA, NA, 3, 3, 5, NA,…
$ q4                    <dbl> 5, 4, 5, 5, 4, NA, 5, 3, 4, NA, NA, 5, 3, 5, NA,…
$ q5                    <dbl> 5, 4, 5, 5, 4, NA, 5, 3, 4, NA, NA, 5, 4, 5, NA,…
$ q6                    <dbl> 5, 4, 4, 5, 4, NA, 5, 4, 3, NA, NA, 5, 3, 5, NA,…
$ q7                    <dbl> 5, 4, 4, 4, 4, NA, 4, 3, 3, NA, NA, 5, 3, 5, NA,…
$ q8                    <dbl> 5, 5, 5, 5, 4, NA, 5, 3, 4, NA, NA, 4, 3, 5, NA,…
$ q9                    <dbl> 4, 4, 3, 5, NA, NA, 5, 3, 2, NA, NA, 5, 2, 2, NA…
$ q10                   <dbl> 5, 4, 5, 5, 3, NA, 5, 3, 5, NA, NA, 4, 4, 5, NA,…
$ TimeSpent             <dbl> 1555.1667, 1382.7001, 860.4335, 1598.6166, 1481.…
$ TimeSpent_hours       <dbl> 25.91944500, 23.04500167, 14.34055833, 26.643610…
$ TimeSpent_std         <dbl> -0.18051496, -0.30780313, -0.69325954, -0.148446…
$ int                   <dbl> 5.0, 4.2, 5.0, 5.0, 3.8, 4.6, 5.0, 3.0, 4.2, NA,…
$ pc                    <dbl> 4.50, 3.50, 4.00, 3.50, 3.50, 4.00, 3.50, 3.00, …
$ uv                    <dbl> 4.333333, 4.000000, 3.666667, 5.000000, 3.500000…
  1. In the select-1 code chunk: Use the ‘select’ function to select student_id, subject, semester, FinalGradeCEMS. Assign to a new object with a different name (you choose the name).
# Type your code here
Data_1 <- select(sci_classes, student_id, subject, semester, FinalGradeCEMS)


#inspect your data
Data_1
# A tibble: 603 × 4
   student_id subject semester FinalGradeCEMS
        <dbl> <chr>   <chr>             <dbl>
 1      43146 FrScA   S216               93.5
 2      44638 OcnA    S116               81.7
 3      47448 FrScA   S216               88.5
 4      47979 OcnA    S216               81.9
 5      48797 PhysA   S116               84  
 6      51943 FrScA   S216               NA  
 7      52326 AnPhA   S216               83.6
 8      52446 PhysA   S116               97.8
 9      53447 FrScA   S116               96.1
10      53475 FrScA   S116               NA  
# ℹ 593 more rows

What do you notice about FinalGradeCEMS? (Hint: NAs?)

  • Answer here {possible answer: The NA values in FinalGradeCEMS need to be addressed, either by imputation (for example, replacing them with the mean) or by dropping those cases altogether.}
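
As a rough sketch of the two options named in the possible answer, the code below uses a small hypothetical `grades` tibble rather than the real sci_classes data. Mean imputation is only one of several possible strategies and is used here purely for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical grades; the NA entries mimic the missing FinalGradeCEMS values
grades <- tibble(
  student_id     = c(1, 2, 3, 4, 5),
  FinalGradeCEMS = c(93.5, NA, 88.5, 81.9, NA)
)

# Count the missing values before deciding what to do with them
sum(is.na(grades$FinalGradeCEMS))

# Option 1: drop every row with a missing final grade
grades_dropped <- grades %>%
  drop_na(FinalGradeCEMS)

# Option 2: replace each missing grade with the mean of the observed grades
grades_imputed <- grades %>%
  mutate(FinalGradeCEMS = replace_na(FinalGradeCEMS,
                                     mean(FinalGradeCEMS, na.rm = TRUE)))

grades_dropped
grades_imputed
```

Dropping rows shrinks the sample, while mean imputation keeps every student but reduces the spread of the grades, so the better choice depends on the analysis you plan to run.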
  1. In code chunk named select-2 select all columns except subject and section. Assign to a new object with a different name. Inspect your data frame with a different function.
# Type your code here
# A minus sign drops a column, so every column except subject and section is kept
Data_2 <- select(sci_classes, -subject, -section)


#inspect data with a different function
glimpse(Data_2)
  1. In the code chunk named filter-1, Filter the sci_classes data frame for students in OcnA courses. Assign to a new object with a different name. Use the head() function to examine your data frame.
#Type your code here

filter_1 <- filter(sci_classes, subject == "OcnA")

#inspect your data
filter_1
# A tibble: 111 × 30
   student_id course_id    total_points_possible total_points_earned
        <dbl> <chr>                        <dbl>               <dbl>
 1      44638 OcnA-S116-01                  3531                2672
 2      47979 OcnA-S216-01                  4562                3090
 3      54066 OcnA-S116-01                  4641                3429
 4      54282 OcnA-S116-02                  3581                2777
 5      54342 OcnA-S116-02                  3256                2876
 6      54346 OcnA-S116-01                  4471                3773
 7      54567 OcnA-S216-02                  3871                3286
 8      57981 OcnA-S116-01                  3587                2879
 9      58178 OcnA-S116-01                  3940                3348
10      62175 OcnA-S216-01                  3169                2249
# ℹ 101 more rows
# ℹ 26 more variables: percentage_earned <dbl>, subject <chr>, semester <chr>,
#   section <chr>, Gradebook_Item <chr>, Grade_Category <lgl>,
#   FinalGradeCEMS <dbl>, Points_Possible <dbl>, Points_Earned <dbl>,
#   Gender <chr>, q1 <dbl>, q2 <dbl>, q3 <dbl>, q4 <dbl>, q5 <dbl>, q6 <dbl>,
#   q7 <dbl>, q8 <dbl>, q9 <dbl>, q10 <dbl>, TimeSpent <dbl>,
#   TimeSpent_hours <dbl>, TimeSpent_std <dbl>, int <dbl>, pc <dbl>, uv <dbl>
head(filter_1)
# A tibble: 6 × 30
  student_id course_id    total_points_possible total_points_earned
       <dbl> <chr>                        <dbl>               <dbl>
1      44638 OcnA-S116-01                  3531                2672
2      47979 OcnA-S216-01                  4562                3090
3      54066 OcnA-S116-01                  4641                3429
4      54282 OcnA-S116-02                  3581                2777
5      54342 OcnA-S116-02                  3256                2876
6      54346 OcnA-S116-01                  4471                3773
# ℹ 26 more variables: percentage_earned <dbl>, subject <chr>, semester <chr>,
#   section <chr>, Gradebook_Item <chr>, Grade_Category <lgl>,
#   FinalGradeCEMS <dbl>, Points_Possible <dbl>, Points_Earned <dbl>,
#   Gender <chr>, q1 <dbl>, q2 <dbl>, q3 <dbl>, q4 <dbl>, q5 <dbl>, q6 <dbl>,
#   q7 <dbl>, q8 <dbl>, q9 <dbl>, q10 <dbl>, TimeSpent <dbl>,
#   TimeSpent_hours <dbl>, TimeSpent_std <dbl>, int <dbl>, pc <dbl>, uv <dbl>

Q: How many rows does the head() function display? Hint: Check the dimensions of your tibble in the console.

  • Answer here

{Possible answer: 6}
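
To check the row counts yourself, a sketch along these lines may help. The `demo` tibble is a hypothetical stand-in for filter_1, so the same calls should work on your own object.

```r
library(dplyr)

# Hypothetical 20-row tibble standing in for filter_1
demo <- tibble(row_id = 1:20, value = row_id * 2)

dim(demo)             # rows and columns in the full tibble
nrow(head(demo))      # head() keeps the first 6 rows by default
nrow(head(demo, 10))  # the n argument changes how many rows are kept
```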

  1. In code chunk named filter-2, filter the sci_classes data frame so rows with NA for points earned are removed. Assign to a new object with a different name. Use glimpse() to examine all columns of your data frame.
# Type your code here
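# Keep only rows where Points_Earned is recorded
# (assumption: Points_Earned is the column meant here, since it shows NA values in the glimpse() output above)
filter_2 <- sci_classes %>%
  filter(!is.na(Points_Earned))



#inspect data 
glimpse(filter_2)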
  1. In the code chunk called arrange-1, Arrange sci_classes data by subject then percentage_earned in descending order. Assign to a new object. Use the str() function to examine the data type of each column in your data frame.
# Type your code here
sci_sorted <- sci_classes %>%
  arrange(subject, desc(percentage_earned))

#inspect data
str(sci_sorted)
spc_tbl_ [603 × 30] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ student_id           : num [1:603] 70192 86488 96690 91175 86267 ...
 $ course_id            : chr [1:603] "AnPhA-S116-02" "AnPhA-S116-01" "AnPhA-S216-01" "AnPhA-S116-02" ...
 $ total_points_possible: num [1:603] 1936 3342 4804 3199 3045 ...
 $ total_points_earned  : num [1:603] 1763 3033 4309 2867 2705 ...
 $ percentage_earned    : num [1:603] 0.911 0.908 0.897 0.896 0.888 ...
 $ subject              : chr [1:603] "AnPhA" "AnPhA" "AnPhA" "AnPhA" ...
 $ semester             : chr [1:603] "S116" "S116" "S216" "S116" ...
 $ section              : chr [1:603] "02" "01" "01" "02" ...
 $ Gradebook_Item       : chr [1:603] "POINTS EARNED & TOTAL COURSE POINTS" "POINTS EARNED & TOTAL COURSE POINTS" "POINTS EARNED & TOTAL COURSE POINTS" "POINTS EARNED & TOTAL COURSE POINTS" ...
 $ Grade_Category       : logi [1:603] NA NA NA NA NA NA ...
 $ FinalGradeCEMS       : num [1:603] 96 87.4 64.8 82.2 35.1 ...
 $ Points_Possible      : num [1:603] 10 28 10 5 50 15 10 10 353 460 ...
 $ Points_Earned        : num [1:603] 7 26 3 5 50 11 8 10 330 452 ...
 $ Gender               : chr [1:603] "F" "M" "F" "F" ...
 $ q1                   : num [1:603] 4 4 4 5 5 4 5 4 NA NA ...
 $ q2                   : num [1:603] 3 4 3 3 5 2 4 4 NA NA ...
 $ q3                   : num [1:603] 3 2 2 3 3 3 4 3 NA NA ...
 $ q4                   : num [1:603] 4 3 5 5 5 4 5 4 NA NA ...
 $ q5                   : num [1:603] 4 3 4 5 5 4 5 4 NA NA ...
 $ q6                   : num [1:603] 3 3 4 4 5 3 5 4 NA NA ...
 $ q7                   : num [1:603] 3 3 3 3 4 4 5 4 NA NA ...
 $ q8                   : num [1:603] 5 2 4 5 5 4 4 4 NA NA ...
 $ q9                   : num [1:603] 2 3 3 3 5 1 4 4 NA NA ...
 $ q10                  : num [1:603] 5 3 2 5 5 2 5 4 NA NA ...
 $ TimeSpent            : num [1:603] 1537 3600 1970 1315 406 ...
 $ TimeSpent_hours      : num [1:603] 25.62 60 32.83 21.92 6.77 ...
 $ TimeSpent_std        : num [1:603] -0.194 1.328 0.125 -0.358 -1.029 ...
 $ int                  : num [1:603] 4.4 3 3.8 5 5 3.9 4.6 4 4.8 4.6 ...
 $ pc                   : num [1:603] 3 2.5 2.5 3 3.5 3.5 3.75 3.5 3.5 4.5 ...
 $ uv                   : num [1:603] 2.67 3.33 3.33 3.33 5 ...
 - attr(*, "spec")=
  .. cols(
  ..   student_id = col_double(),
  ..   course_id = col_character(),
  ..   total_points_possible = col_double(),
  ..   total_points_earned = col_double(),
  ..   percentage_earned = col_double(),
  ..   subject = col_character(),
  ..   semester = col_character(),
  ..   section = col_character(),
  ..   Gradebook_Item = col_character(),
  ..   Grade_Category = col_logical(),
  ..   FinalGradeCEMS = col_double(),
  ..   Points_Possible = col_double(),
  ..   Points_Earned = col_double(),
  ..   Gender = col_character(),
  ..   q1 = col_double(),
  ..   q2 = col_double(),
  ..   q3 = col_double(),
  ..   q4 = col_double(),
  ..   q5 = col_double(),
  ..   q6 = col_double(),
  ..   q7 = col_double(),
  ..   q8 = col_double(),
  ..   q9 = col_double(),
  ..   q10 = col_double(),
  ..   TimeSpent = col_double(),
  ..   TimeSpent_hours = col_double(),
  ..   TimeSpent_std = col_double(),
  ..   int = col_double(),
  ..   pc = col_double(),
  ..   uv = col_double()
  .. )
 - attr(*, "problems")=<externalptr> 
  1. In the code chunk named final-wrangle, use the sci_classes data and the %>% pipe operator:
  • Select student_id, subject, semester, FinalGradeCEMS.
  • Filter for students in OcnA courses.
  • Arrange grades by section in descending order.
  • Assign to a new object.
  • Examine the contents using a method of your choosing.
#Type your code here
final_wrangle <- sci_classes %>%
  select(student_id, subject, semester, FinalGradeCEMS) %>%
  filter(subject == "OcnA") %>%
  arrange(desc(FinalGradeCEMS))


str(final_wrangle)
tibble [111 × 4] (S3: tbl_df/tbl/data.frame)
 $ student_id    : num [1:111] 66740 91163 94744 91818 90090 ...
 $ subject       : chr [1:111] "OcnA" "OcnA" "OcnA" "OcnA" ...
 $ semester      : chr [1:111] "S116" "S216" "S216" "S116" ...
 $ FinalGradeCEMS: num [1:111] 99.3 97.4 96.8 96.5 96.3 ...
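
The instructions mention arranging by section, while the chunk above sorts by FinalGradeCEMS. If sorting by section is what you want, one possible variant keeps section in the select() step so that arrange() can still see it. The `toy_classes` tibble below is hypothetical and only mirrors a few sci_classes columns.

```r
library(dplyr)

# Hypothetical rows that mirror a few sci_classes columns
toy_classes <- tibble(
  student_id     = c(101, 102, 103, 104, 105),
  subject        = c("OcnA", "OcnA", "FrScA", "OcnA", "PhysA"),
  semester       = c("S116", "S216", "S216", "S116", "S116"),
  section        = c("01", "02", "01", "02", "01"),
  FinalGradeCEMS = c(81.7, 96.8, 88.5, 77.2, 84.0)
)

# Keep section so it is still available when arrange() runs,
# then sort by section (descending) and by grade within each section
by_section <- toy_classes %>%
  select(student_id, subject, semester, section, FinalGradeCEMS) %>%
  filter(subject == "OcnA") %>%
  arrange(desc(section), desc(FinalGradeCEMS))

by_section
```

A column dropped by select() is no longer available to later steps in the pipe, which is why section has to be kept here.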

Render & Submit

Congratulations, you’ve completed Foundations Learning Badge 1!

To receive the Foundations Badge, you will need to render this document and publish it via a method designated by your instructor, such as Quarto Pub, Posit Cloud, RPubs, GitHub Pages, or another method. Once you have shared a link to your published document with your instructor and they have reviewed your work, you will be provided a physical or digital version of the badge pictured at the top of this document!

If you have any questions about this badge, or run into any technical issues, don’t hesitate to contact your instructor.

Complete the following steps to submit your work for review:

  1. First, change the name after author: in the YAML header at the very top of this document to your name. The YAML header controls the style and feel of the knitted document but doesn’t actually display in the final output.

  2. Next, click the knit button in the toolbar above to “knit” your R Markdown document to an HTML file that will be saved in your R Project folder. You should see a formatted webpage appear in your Viewer tab in the lower right pane or in a new browser window. Let us know if you run into any issues with knitting.

  3. Finally, publish.