The final activity for each learning lab provides space to work with data and to reflect on how the concepts and techniques introduced in each lab might apply to your own research.

To earn a badge for each lab, you are required to respond to a set of prompts in two parts:

Part I: Reflect and Plan

Use your institutional library (e.g., NCSU Library), Google Scholar, or a search engine to locate a research article, presentation, or resource that applies learning analytics to an educational context or topic of interest. More specifically, locate a study that makes use of the Learning Analytics Workflow we learned today. You are also welcome to select one of your own research papers.

  1. Provide an APA citation for your selected study.

    • Nawaz, S., Kennedy, G., Bailey, J., & Mead, C. (2020). Moments of Confusion in Simulation-Based Learning Environments. Journal of Learning Analytics, 7(3), 118-137.
  2. What educational issue, “problem of practice,” and/or questions were addressed?

    • Students’ confusion in simulation-based learning environments
  3. Briefly describe any steps of the data-intensive research workflow that are detailed in your article or presentation.

    • Prepare-explore-model-communicate
  4. What were the key findings or conclusions? What value, if any, might education practitioners find in these results?

    • Confidence in prior knowledge is an important factor that can contribute to students' confusion. Students mostly struggled when they discovered a mismatch between the subjective and objective correctness of their responses.
  5. Finally, how, if at all, were educators in your self-selected article involved prior to wrangling and analysis?

    • No specific info

Draft a new research question guided by the phases of the Learning Analytics Workflow, or use one of your current research questions.

  1. What educational issue, “problem of practice,” and/or questions are addressed?

    • What strategies improve learners’ SDL skills in online learning?
  2. Briefly describe any steps of the data-intensive research workflow that can be detailed in your article or presentation.

    • Prepare-wrangle-explore-model-communicate
  3. How, if at all, will your article touch upon the application(s) of LA to “understand and improve learning and the contexts in which learning occurs?”

    • Yes.

Part II: Data Product

In our Learning Analytics code-along, we scratched the surface on the number of ways that we can wrangle the data.

Using one of the data sets provided in the data folder, your goal for this lab is to extend the Learning Analytics Workflow from our code-along by preparing and wrangling different data.

Alternatively, you may use your own data set in the workflow. If you do decide to use your own data set you must include:

Feel free to create a new script in your lab 2 to work through the following problems. Then, when satisfied, add the code to the code chunks below. Don't forget to run the code to make sure it works.

Instructions:

  1. Add your name to the document in author.

  2. Set up the first phase (or the first two, if using an Introduction) of the LA workflow below. I’ve added the Wrangle section for you. You will need to prepare the libraries necessary to wrangle the data.

Wrangle

  1. In the chunk called read-data: Import the sci-online-classes.csv from the data folder and save as a new object called sci_classes. Then inspect your data using a function of your choice.
# Type your code here

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✔ ggplot2 3.3.5     ✔ purrr   0.3.4
## ✔ tibble  3.1.7     ✔ dplyr   1.0.9
## ✔ tidyr   1.2.0     ✔ stringr 1.4.0
## ✔ readr   2.1.2     ✔ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
sci_classess <- read_csv("data/sci-online-classes.csv")
## Rows: 603 Columns: 30
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (6): course_id, subject, semester, section, Gradebook_Item, Gender
## dbl (23): student_id, total_points_possible, total_points_earned, percentage...
## lgl  (1): Grade_Category
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
sci_classess
## # A tibble: 603 × 30
##    student_id course_id     total_points_poss… total_points_ea… percentage_earn…
##         <dbl> <chr>                      <dbl>            <dbl>            <dbl>
##  1      43146 FrScA-S216-02               3280             2220            0.677
##  2      44638 OcnA-S116-01                3531             2672            0.757
##  3      47448 FrScA-S216-01               2870             1897            0.661
##  4      47979 OcnA-S216-01                4562             3090            0.677
##  5      48797 PhysA-S116-01               2207             1910            0.865
##  6      51943 FrScA-S216-03               4208             3596            0.855
##  7      52326 AnPhA-S216-01               4325             2255            0.521
##  8      52446 PhysA-S116-01               2086             1719            0.824
##  9      53447 FrScA-S116-01               4655             3149            0.676
## 10      53475 FrScA-S116-02               1710             1402            0.820
## # … with 593 more rows, and 25 more variables: subject <chr>, semester <chr>,
## #   section <chr>, Gradebook_Item <chr>, Grade_Category <lgl>,
## #   FinalGradeCEMS <dbl>, Points_Possible <dbl>, Points_Earned <dbl>,
## #   Gender <chr>, q1 <dbl>, q2 <dbl>, q3 <dbl>, q4 <dbl>, q5 <dbl>, q6 <dbl>,
## #   q7 <dbl>, q8 <dbl>, q9 <dbl>, q10 <dbl>, TimeSpent <dbl>,
## #   TimeSpent_hours <dbl>, TimeSpent_std <dbl>, int <dbl>, pc <dbl>, uv <dbl>
  1. In the select-1 code chunk: Use the ‘select’ function to select student_id, subject, semester, FinalGradeCEMS. Assign to a new object with a different name (you choose the name).
# Type your code here

new_sci_classess <- sci_classess %>% 
  select("student_id","subject","semester","FinalGradeCEMS")

What do you notice about FinalGradeCEMS? (Hint: NAs?)
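One way to follow up on the hint is to count the missing values directly. The sketch below is a minimal illustration on a small made-up tibble; `toy_grades` and its values are hypothetical stand-ins, not rows from sci-online-classes.csv, and it assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical toy data standing in for the selected columns
toy_grades <- tibble(
  student_id     = c(1, 2, 3, 4),
  FinalGradeCEMS = c(93.5, NA, 88.5, NA)
)

# Count the missing final grades with base R
n_missing <- sum(is.na(toy_grades$FinalGradeCEMS))
n_missing

# The same count computed inside a dplyr pipeline
na_summary <- toy_grades %>%
  summarise(n_missing = sum(is.na(FinalGradeCEMS)))
na_summary
```

Applied to your own selected object, the same `sum(is.na(...))` call would report how many students have no final grade recorded.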

  1. In code chunk named select-2 select all columns except subject and section. Assign to a new object with a different name. Examine your data frame with a different function.
# Type your code here

new2_sci_classess <- sci_classess %>% 
  select(-c("subject","section"))

glimpse(new2_sci_classess)
## Rows: 603
## Columns: 28
## $ student_id            <dbl> 43146, 44638, 47448, 47979, 48797, 51943, 52326,…
## $ course_id             <chr> "FrScA-S216-02", "OcnA-S116-01", "FrScA-S216-01"…
## $ total_points_possible <dbl> 3280, 3531, 2870, 4562, 2207, 4208, 4325, 2086, …
## $ total_points_earned   <dbl> 2220, 2672, 1897, 3090, 1910, 3596, 2255, 1719, …
## $ percentage_earned     <dbl> 0.6768293, 0.7567261, 0.6609756, 0.6773345, 0.86…
## $ semester              <chr> "S216", "S116", "S216", "S216", "S116", "S216", …
## $ Gradebook_Item        <chr> "POINTS EARNED & TOTAL COURSE POINTS", "ATTEMPTE…
## $ Grade_Category        <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ FinalGradeCEMS        <dbl> 93.45372, 81.70184, 88.48758, 81.85260, 84.00000…
## $ Points_Possible       <dbl> 5, 10, 10, 5, 438, 5, 10, 10, 443, 5, 12, 10, 5,…
## $ Points_Earned         <dbl> NA, 10.00, NA, 4.00, 399.00, NA, NA, 10.00, 425.…
## $ Gender                <chr> "M", "F", "M", "M", "F", "F", "M", "F", "F", "M"…
## $ q1                    <dbl> 5, 4, 5, 5, 4, NA, 5, 3, 4, NA, NA, 4, 3, 5, NA,…
## $ q2                    <dbl> 4, 4, 4, 5, 3, NA, 5, 3, 3, NA, NA, 5, 3, 3, NA,…
## $ q3                    <dbl> 4, 3, 4, 3, 3, NA, 3, 3, 3, NA, NA, 3, 3, 5, NA,…
## $ q4                    <dbl> 5, 4, 5, 5, 4, NA, 5, 3, 4, NA, NA, 5, 3, 5, NA,…
## $ q5                    <dbl> 5, 4, 5, 5, 4, NA, 5, 3, 4, NA, NA, 5, 4, 5, NA,…
## $ q6                    <dbl> 5, 4, 4, 5, 4, NA, 5, 4, 3, NA, NA, 5, 3, 5, NA,…
## $ q7                    <dbl> 5, 4, 4, 4, 4, NA, 4, 3, 3, NA, NA, 5, 3, 5, NA,…
## $ q8                    <dbl> 5, 5, 5, 5, 4, NA, 5, 3, 4, NA, NA, 4, 3, 5, NA,…
## $ q9                    <dbl> 4, 4, 3, 5, NA, NA, 5, 3, 2, NA, NA, 5, 2, 2, NA…
## $ q10                   <dbl> 5, 4, 5, 5, 3, NA, 5, 3, 5, NA, NA, 4, 4, 5, NA,…
## $ TimeSpent             <dbl> 1555.1667, 1382.7001, 860.4335, 1598.6166, 1481.…
## $ TimeSpent_hours       <dbl> 25.91944500, 23.04500167, 14.34055833, 26.643610…
## $ TimeSpent_std         <dbl> -0.18051496, -0.30780313, -0.69325954, -0.148446…
## $ int                   <dbl> 5.0, 4.2, 5.0, 5.0, 3.8, 4.6, 5.0, 3.0, 4.2, NA,…
## $ pc                    <dbl> 4.50, 3.50, 4.00, 3.50, 3.50, 4.00, 3.50, 3.00, …
## $ uv                    <dbl> 4.333333, 4.000000, 3.666667, 5.000000, 3.500000…
  1. In the code chunk named filter-1, filter the sci_classes data frame for students in OcnA courses. Assign to a new object with a different name. Use the head() function to examine your data frame.
#Type your code here

filter_1 <- sci_classess %>%
  filter(subject == "OcnA")

head(filter_1)
## # A tibble: 6 × 30
##   student_id course_id    total_points_possib… total_points_ea… percentage_earn…
##        <dbl> <chr>                       <dbl>            <dbl>            <dbl>
## 1      44638 OcnA-S116-01                 3531             2672            0.757
## 2      47979 OcnA-S216-01                 4562             3090            0.677
## 3      54066 OcnA-S116-01                 4641             3429            0.739
## 4      54282 OcnA-S116-02                 3581             2777            0.775
## 5      54342 OcnA-S116-02                 3256             2876            0.883
## 6      54346 OcnA-S116-01                 4471             3773            0.844
## # … with 25 more variables: subject <chr>, semester <chr>, section <chr>,
## #   Gradebook_Item <chr>, Grade_Category <lgl>, FinalGradeCEMS <dbl>,
## #   Points_Possible <dbl>, Points_Earned <dbl>, Gender <chr>, q1 <dbl>,
## #   q2 <dbl>, q3 <dbl>, q4 <dbl>, q5 <dbl>, q6 <dbl>, q7 <dbl>, q8 <dbl>,
## #   q9 <dbl>, q10 <dbl>, TimeSpent <dbl>, TimeSpent_hours <dbl>,
## #   TimeSpent_std <dbl>, int <dbl>, pc <dbl>, uv <dbl>

Q: How many rows does the head() function display? Hint: Check the dimensions of your tibble in the console.
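To explore this question, it can help to compare the full dimensions of a data frame with what head() returns. The following is a minimal sketch on a hypothetical 20-row tibble (`toy_ocna` is made up for illustration), assuming dplyr is installed. By default head() returns the first six rows, and its `n` argument changes that number.

```r
library(dplyr)

# Hypothetical toy data: 20 rows, 2 columns
toy_ocna <- tibble(student_id = 1:20, subject = "OcnA")

# Full dimensions: number of rows, then number of columns
dim(toy_ocna)

# head() keeps the first six rows unless n is supplied
first_rows <- head(toy_ocna)
first_ten  <- head(toy_ocna, n = 10)

nrow(first_rows)
nrow(first_ten)
```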

  1. In code chunk named filter-2, filter the sci_classes data frame so rows with NA for points earned are removed. Assign to a new object with a different name. Use glimpse() to examine all columns of your data frame.
# Type your code here
filter_2 <- sci_classess %>%  
  drop_na(Points_Earned)

head(filter_2)
## # A tibble: 6 × 30
##   student_id course_id     total_points_possi… total_points_ea… percentage_earn…
##        <dbl> <chr>                       <dbl>            <dbl>            <dbl>
## 1      44638 OcnA-S116-01                 3531             2672            0.757
## 2      47979 OcnA-S216-01                 4562             3090            0.677
## 3      48797 PhysA-S116-01                2207             1910            0.865
## 4      52446 PhysA-S116-01                2086             1719            0.824
## 5      53447 FrScA-S116-01                4655             3149            0.676
## 6      53475 FrScA-S116-02                1710             1402            0.820
## # … with 25 more variables: subject <chr>, semester <chr>, section <chr>,
## #   Gradebook_Item <chr>, Grade_Category <lgl>, FinalGradeCEMS <dbl>,
## #   Points_Possible <dbl>, Points_Earned <dbl>, Gender <chr>, q1 <dbl>,
## #   q2 <dbl>, q3 <dbl>, q4 <dbl>, q5 <dbl>, q6 <dbl>, q7 <dbl>, q8 <dbl>,
## #   q9 <dbl>, q10 <dbl>, TimeSpent <dbl>, TimeSpent_hours <dbl>,
## #   TimeSpent_std <dbl>, int <dbl>, pc <dbl>, uv <dbl>
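The prompt asks for glimpse(), whereas the chunk above inspects the result with head(). As a hedged sketch of the alternative, the example below uses a small hypothetical tibble (`toy_points` is made up for illustration) to show that `drop_na()` from tidyr and `filter(!is.na(...))` from dplyr keep the same rows, and then examines the result with glimpse(). It assumes both dplyr and tidyr are installed.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data with missing points
toy_points <- tibble(
  student_id    = c(1, 2, 3, 4),
  Points_Earned = c(NA, 10, NA, 4)
)

# Two ways to remove rows where Points_Earned is NA
with_drop_na <- toy_points %>% drop_na(Points_Earned)
with_filter  <- toy_points %>% filter(!is.na(Points_Earned))

# Both approaches keep the same students
all(with_drop_na$student_id == with_filter$student_id)

# glimpse() lists every column with its type and first values
glimpse(with_drop_na)
```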
  1. In the code chunk called arrange-1, arrange the sci_classes data by subject, then by percentage_earned in descending order. Assign to a new object. Use the str() function to examine the data type of each column in your data frame.

    sci_classess %>% 
      arrange(desc(subject),desc(percentage_earned))
    ## # A tibble: 603 × 30
    ##    student_id course_id     total_points_poss… total_points_ea… percentage_earn…
    ##         <dbl> <chr>                      <dbl>            <dbl>            <dbl>
    ##  1      92733 PhysA-S116-01               2829             2549            0.901
    ##  2      62576 PhysA-S116-01               2215             1931            0.872
    ##  3      94189 PhysA-S216-01               3682             3187            0.866
    ##  4      48797 PhysA-S116-01               2207             1910            0.865
    ##  5      87171 PhysA-S116-01               6318             5466            0.865
    ##  6      86353 PhysA-S116-01               5736             4953            0.863
    ##  7      92726 PhysA-S116-01               2739             2356            0.860
    ##  8      90326 PhysA-S116-01               2966             2539            0.856
    ##  9      85953 PhysA-S116-01               6564             5614            0.855
    ## 10      96027 PhysA-S216-01               2981             2534            0.850
    ## # … with 593 more rows, and 25 more variables: subject <chr>, semester <chr>,
    ## #   section <chr>, Gradebook_Item <chr>, Grade_Category <lgl>,
    ## #   FinalGradeCEMS <dbl>, Points_Possible <dbl>, Points_Earned <dbl>,
    ## #   Gender <chr>, q1 <dbl>, q2 <dbl>, q3 <dbl>, q4 <dbl>, q5 <dbl>, q6 <dbl>,
    ## #   q7 <dbl>, q8 <dbl>, q9 <dbl>, q10 <dbl>, TimeSpent <dbl>,
    ## #   TimeSpent_hours <dbl>, TimeSpent_std <dbl>, int <dbl>, pc <dbl>, uv <dbl>
    #%>%  arrange(desc(percentage_earned))
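The prompt also asks for the arranged data to be assigned to a new object and examined with str(). A minimal sketch of those two steps is shown here on a hypothetical tibble (`toy_classes` and `arranged_classes` are made-up names), assuming dplyr is installed. It reads the prompt as subject ascending and percentage_earned descending; the chunk above sorts both columns in descending order, which is another plausible reading.

```r
library(dplyr)

# Hypothetical toy data
toy_classes <- tibble(
  subject           = c("OcnA", "PhysA", "OcnA", "PhysA"),
  percentage_earned = c(0.68, 0.90, 0.76, 0.82)
)

# Sort by subject (A to Z), then by percentage_earned (high to low),
# and keep the result in a new object
arranged_classes <- toy_classes %>%
  arrange(subject, desc(percentage_earned))

# str() reports the type of each column
str(arranged_classes)
```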
  2. In the code chunk named final-wrangle, use the sci_classes data and the %>% pipe operator:

#Type your code here

final_wrangle <- sci_classess %>%
   select(student_id,subject,semester,FinalGradeCEMS) %>%
   filter(subject=="OcnA") %>%
   arrange(desc(FinalGradeCEMS))
            
glimpse(final_wrangle)
## Rows: 111
## Columns: 4
## $ student_id     <dbl> 66740, 91163, 94744, 91818, 90090, 88168, 89114, 86758,…
## $ subject        <chr> "OcnA", "OcnA", "OcnA", "OcnA", "OcnA", "OcnA", "OcnA",…
## $ semester       <chr> "S116", "S216", "S216", "S116", "S116", "S116", "S116",…
## $ FinalGradeCEMS <dbl> 99.32998, 97.37018, 96.79732, 96.46231, 96.29816, 95.96…

Knit & Submit

Congratulations, you’ve completed your Foundation Badge on Learning Analytics Workflow! Complete the following steps to submit your work for review by

  1. Change the name of the author: in the YAML header at the very top of this document to your name. As noted in Reproducible Research in R, the YAML header controls the style and feel of the knitted document but doesn’t actually display in the final output.

  2. Click the yarn icon above to “knit” your data product to an HTML file that will be saved in your R Project folder.

  3. Commit your changes in GitHub Desktop and push them to your online GitHub repository.

  4. Publish your HTML page to the web using one of the following methods: publish on RPubs by clicking the “Publish” button located in the Viewer pane when you knit your document (note that you will first need to create an RPubs account), or publish on GitHub using either GitHub Pages or the HTML previewer.

  5. Post a new discussion on GitHub to our Foundations Badges forum. In your post, include a link to your published web page and write a short reflection highlighting one thing you learned from this lab and one thing you’d like to explore further.