The final activity for each learning lab provides space to work with data and to reflect on how the concepts and techniques introduced in each lab might apply to your own research.
To earn a badge for each lab, you are required to respond to a set of prompts for two parts:
In Part I, you will reflect on your understanding of key concepts and begin to think about potential next steps for your own study.
In Part II, you will create a simple data product in R that demonstrates your ability to apply a data analysis technique introduced in this learning lab.
Part I: Reflect and Plan
Use the institutional library (e.g. NCSU Library), Google Scholar, or a search engine to locate a research article, presentation, or resource that applies learning analytics to an educational context or topic of interest. More specifically, locate a study that makes use of one of the data structures we learned today. You are also welcome to select one of your own research papers.
Provide an APA citation for your selected study.
Wise, A. F., & Jung, Y. (2019). Teaching with analytics: Towards a situated model of instructional decision-making. Journal of Learning Analytics, 6(2), 53-69.
What types of data are associated with LA?
Clickstream logs, Time spent on tasks, Quiz results, Assignment submissions, Forum participation, Navigation paths within learning management systems (LMS)
What type of data structures are analyzed in the educational context?
Event-based data (e.g., clicks, page views, timestamps); Sequence data (e.g., order of learning actions); Relational data (e.g., interactions among students); Hierarchical data.
How might this article be used to better understand a dataset or educational context of personal or professional interest to you?
It is useful for understanding student engagement patterns; the authors also mention cluster analysis, which was used to detect learning strategies.
Finally, how do these processes compare with what teachers and educational organizations already do to support and assess student learning?
With traditional methods, teachers collect student achievement data, conduct observations, and then evaluate student learning outcomes.
In LA, we can collect real-time data on student learning behavior, which provides a rich source for prediction. It also supports intervention.
Draft a research question guided by techniques and data sources that you are potentially interested in exploring in more depth.
How do teachers’ engagement behaviors in professional learning platforms (e.g., time spent, participation in modules, interaction with peers) predict their self-efficacy?
What is the purpose of your article?
This article aims to explore how teachers’ learning behaviors in online or blended professional development (PD) environments influence their reported teaching self-efficacy.
What data source(s) should be analyzed or discussed?
Digital trace data from PL platforms (e.g., login frequency, module completion, discussion forum participation, video viewing time); Teacher self-efficacy surveys (pre/post training, or periodic check-ins)
Explain the analytical level at which these data would need to be collected and analyzed.
Level 1 (Individual Teacher): Engagement behaviors, efficacy survey responses. Level 2 (School): School PD culture, leadership support, contextual variables. HLM will be used to assess both individual and contextual predictors of teacher efficacy outcomes. Learning Analytics techniques (e.g., clustering, sequence analysis) may be used to identify patterns of PD engagement.
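As a rough illustration of the clustering idea mentioned above, the sketch below groups teachers by engagement behavior with base R's `kmeans()`. The data frame `pd_engagement`, its column names, and every value in it are invented for this example; real platform data would need its own cleaning and a more careful choice of the number of clusters.

```r
# Hypothetical engagement data for six teachers (all values are invented)
pd_engagement <- data.frame(
  teacher_id    = 1:6,
  logins        = c(3, 4, 2, 20, 22, 19),
  forum_posts   = c(0, 1, 0, 8, 10, 9),
  video_minutes = c(15, 20, 10, 120, 140, 110)
)

# Standardize the behavior columns so no single scale dominates the distances
behaviors <- scale(pd_engagement[, c("logins", "forum_posts", "video_minutes")])

# k-means with two clusters; set.seed() makes the random starts repeatable
set.seed(123)
fit <- kmeans(behaviors, centers = 2, nstart = 10)

# Attach the cluster label to each teacher and inspect
pd_engagement$cluster <- fit$cluster
pd_engagement
```

The cluster labels could then serve as a Level 1 predictor in the multilevel model, though that step is outside this sketch.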
How, if at all, will your article touch upon the application(s) of LA to “understand and improve learning and the contexts in which learning occurs?”
This will help me answer: How do teachers learn in digital PD environments? What types of learning behaviors are most effective in promoting professional confidence?
Part II: Data Product
In our Learning Analytics code-along, we scratched the surface on the number of ways that we can wrangle the data.
Using one of the data sets provided in the data folder, your goal for this lab is to extend the Learning Analytics Workflow from our code-along by preparing and wrangling different data.
Alternatively, you may use your own data set in the workflow. If you decide to use your own data set, you must include the following:
Show two different ways to use the select function with your data; inspect and save as a new object.
Show one way to use the filter function with your data; inspect and save as a new object.
Show one way to use the arrange function with your data; inspect and save as a new object.
Use the pipe operator to bring it all together.
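The requirements above might look something like the following sketch, which runs each verb on a small made-up tibble. The object `toy_classes` and its values are hypothetical stand-ins for your own data; only the column names echo sci-online-classes.csv.

```r
library(dplyr)

# Hypothetical toy data standing in for your own data set (values are invented)
toy_classes <- tibble(
  student_id        = c(1, 2, 3, 4),
  subject           = c("OcnA", "BioA", "OcnA", "PhysA"),
  section           = c("01", "02", "01", "01"),
  percentage_earned = c(0.82, 0.91, 0.67, 0.75)
)

# select, way 1: name the columns to keep
toy_keep <- select(toy_classes, student_id, percentage_earned)

# select, way 2: drop a column with a minus sign
toy_drop <- select(toy_classes, -section)

# filter: keep only the rows that meet a condition
toy_ocna <- filter(toy_classes, subject == "OcnA")

# arrange: sort rows, highest percentage first
toy_sorted <- arrange(toy_classes, desc(percentage_earned))

# pipe: chain the same steps into one readable sequence
toy_final <- toy_classes %>%
  select(-section) %>%
  filter(subject == "OcnA") %>%
  arrange(desc(percentage_earned))

# inspect the result
glimpse(toy_final)
```

Saving each intermediate result under its own name keeps the original data untouched, which makes it easier to compare the steps side by side.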
Feel free to create a new script in your lab 2 to work through the following problems. Then when satisfied add the code in the code chunks below. Don’t forget to run the code to make sure it works.
Instructions:
Add your name to the document in author.
Set up the first phase (or the first two, if using an Introduction) of the LA workflow below. I’ve added the wrangle section for you. You will need to Prepare the libraries necessary to wrangle the data.
In the chunk called read-data: Import the sci-online-classes.csv from the data folder and save as a new object called sci_classes. Then inspect your data using a function of your choice.
# Type your code here
# load tidyverse
library(tidyverse)
# import the data from the data folder
sci_classes <- read_csv("data/sci-online-classes.csv")
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.2 ✔ tibble 3.3.0
✔ lubridate 1.9.4 ✔ tidyr 1.3.1
✔ purrr 1.0.4
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Rows: 603 Columns: 30
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (6): course_id, subject, semester, section, Gradebook_Item, Gender
dbl (23): student_id, total_points_possible, total_points_earned, percentage...
lgl (1): Grade_Category
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
In the select-1 code chunk: Use the ‘select’ function to select student_id, subject, semester, FinalGradeCEMS. Assign to a new object with a different name (you choose the name).
# Type your code here
sci_classes %>%
  select(student_id, subject, semester, FinalGradeCEMS)
What do you notice about FinalGradeCEMS? (*Hint: NAs?)
Answer here: I notice NA values indicating missing data. This requires handling either by imputation or removal depending on the analysis requirements.
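The two options named in the answer, removal and imputation, could be sketched as follows. The tibble `toy_grades` and its values are invented for illustration, and mean imputation is only one simple choice among many; whether either approach is appropriate depends on why the grades are missing.

```r
library(dplyr)

# Hypothetical grades with missing values (invented numbers)
toy_grades <- tibble(
  student_id     = c(1, 2, 3, 4),
  FinalGradeCEMS = c(93.5, NA, 88.5, NA)
)

# Count the missing values first
sum(is.na(toy_grades$FinalGradeCEMS))

# Option 1: removal, keeping only rows with a recorded grade
grades_complete <- filter(toy_grades, !is.na(FinalGradeCEMS))

# Option 2: simple mean imputation, replacing each NA with the observed mean
grades_imputed <- mutate(
  toy_grades,
  FinalGradeCEMS = if_else(
    is.na(FinalGradeCEMS),
    mean(FinalGradeCEMS, na.rm = TRUE),
    FinalGradeCEMS
  )
)
```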
In the code chunk named select-2, select all columns except subject and section. Assign to a new object with a different name. Inspect your data frame with a different function.
# Type your code here
sci_data <- select(sci_classes, -subject, -section)
# inspect data
str(sci_data)
tibble [603 × 28] (S3: tbl_df/tbl/data.frame)
$ student_id : num [1:603] 43146 44638 47448 47979 48797 ...
$ course_id : chr [1:603] "FrScA-S216-02" "OcnA-S116-01" "FrScA-S216-01" "OcnA-S216-01" ...
$ total_points_possible: num [1:603] 3280 3531 2870 4562 2207 ...
$ total_points_earned : num [1:603] 2220 2672 1897 3090 1910 ...
$ percentage_earned : num [1:603] 0.677 0.757 0.661 0.677 0.865 ...
$ semester : chr [1:603] "S216" "S116" "S216" "S216" ...
$ Gradebook_Item : chr [1:603] "POINTS EARNED & TOTAL COURSE POINTS" "ATTEMPTED" "POINTS EARNED & TOTAL COURSE POINTS" "POINTS EARNED & TOTAL COURSE POINTS" ...
$ Grade_Category : logi [1:603] NA NA NA NA NA NA ...
$ FinalGradeCEMS : num [1:603] 93.5 81.7 88.5 81.9 84 ...
$ Points_Possible : num [1:603] 5 10 10 5 438 5 10 10 443 5 ...
$ Points_Earned : num [1:603] NA 10 NA 4 399 NA NA 10 425 2.5 ...
$ Gender : chr [1:603] "M" "F" "M" "M" ...
$ q1 : num [1:603] 5 4 5 5 4 NA 5 3 4 NA ...
$ q2 : num [1:603] 4 4 4 5 3 NA 5 3 3 NA ...
$ q3 : num [1:603] 4 3 4 3 3 NA 3 3 3 NA ...
$ q4 : num [1:603] 5 4 5 5 4 NA 5 3 4 NA ...
$ q5 : num [1:603] 5 4 5 5 4 NA 5 3 4 NA ...
$ q6 : num [1:603] 5 4 4 5 4 NA 5 4 3 NA ...
$ q7 : num [1:603] 5 4 4 4 4 NA 4 3 3 NA ...
$ q8 : num [1:603] 5 5 5 5 4 NA 5 3 4 NA ...
$ q9 : num [1:603] 4 4 3 5 NA NA 5 3 2 NA ...
$ q10 : num [1:603] 5 4 5 5 3 NA 5 3 5 NA ...
$ TimeSpent : num [1:603] 1555 1383 860 1599 1482 ...
$ TimeSpent_hours : num [1:603] 25.9 23 14.3 26.6 24.7 ...
$ TimeSpent_std : num [1:603] -0.181 -0.308 -0.693 -0.148 -0.235 ...
$ int : num [1:603] 5 4.2 5 5 3.8 4.6 5 3 4.2 NA ...
$ pc : num [1:603] 4.5 3.5 4 3.5 3.5 4 3.5 3 3 NA ...
$ uv : num [1:603] 4.33 4 3.67 5 3.5 ...
In the code chunk named filter-1, filter the sci_classes data frame for students in OcnA courses. Assign to a new object with a different name. Use the head() function to examine your data frame.
# Type your code here
ocna_students <- filter(sci_classes, subject == "OcnA")
head(ocna_students)
Q: How many rows does the head() function display? Hint: Check the dimensions of your tibble in the console.
Answer here: 6
{Possible answer: The head() function displays 6 rows of data by default}
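A quick way to check this yourself is sketched below with a made-up ten-row data frame (`toy_rows` is a hypothetical name, not part of the lab data). By default `head()` returns the first six rows, and the `n` argument changes that number.

```r
# Hypothetical data frame with ten rows
toy_rows <- data.frame(id = 1:10)

nrow(head(toy_rows))         # rows returned with the default n = 6
nrow(head(toy_rows, n = 3))  # ask for a different number of rows
dim(toy_rows)                # rows and columns of the full data frame
```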
In code chunk named filter-2, filter the sci_classes data frame so rows with NA for points earned are removed. Assign to a new object with a different name. Use glimpse() to examine all columns of your data frame.
# Type your code here
# keep only rows where Points_Earned is not NA
delet_na_points <- filter(sci_classes, !is.na(Points_Earned))
# inspect data
glimpse(delet_na_points)
In the code chunk called arrange-1, arrange sci_classes data by subject then percentage_earned in descending order. Assign to a new object. Use the str() function to examine the data type of each column in your data frame.
# Type your code here
arranged_classes <- arrange(sci_classes, subject, desc(percentage_earned))
# inspect data
str(arranged_classes)
spc_tbl_ [603 × 30] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
$ student_id : num [1:603] 70192 86488 96690 91175 86267 ...
$ course_id : chr [1:603] "AnPhA-S116-02" "AnPhA-S116-01" "AnPhA-S216-01" "AnPhA-S116-02" ...
$ total_points_possible: num [1:603] 1936 3342 4804 3199 3045 ...
$ total_points_earned : num [1:603] 1763 3033 4309 2867 2705 ...
$ percentage_earned : num [1:603] 0.911 0.908 0.897 0.896 0.888 ...
$ subject : chr [1:603] "AnPhA" "AnPhA" "AnPhA" "AnPhA" ...
$ semester : chr [1:603] "S116" "S116" "S216" "S116" ...
$ section : chr [1:603] "02" "01" "01" "02" ...
$ Gradebook_Item : chr [1:603] "POINTS EARNED & TOTAL COURSE POINTS" "POINTS EARNED & TOTAL COURSE POINTS" "POINTS EARNED & TOTAL COURSE POINTS" "POINTS EARNED & TOTAL COURSE POINTS" ...
$ Grade_Category : logi [1:603] NA NA NA NA NA NA ...
$ FinalGradeCEMS : num [1:603] 96 87.4 64.8 82.2 35.1 ...
$ Points_Possible : num [1:603] 10 28 10 5 50 15 10 10 353 460 ...
$ Points_Earned : num [1:603] 7 26 3 5 50 11 8 10 330 452 ...
$ Gender : chr [1:603] "F" "M" "F" "F" ...
$ q1 : num [1:603] 4 4 4 5 5 4 5 4 NA NA ...
$ q2 : num [1:603] 3 4 3 3 5 2 4 4 NA NA ...
$ q3 : num [1:603] 3 2 2 3 3 3 4 3 NA NA ...
$ q4 : num [1:603] 4 3 5 5 5 4 5 4 NA NA ...
$ q5 : num [1:603] 4 3 4 5 5 4 5 4 NA NA ...
$ q6 : num [1:603] 3 3 4 4 5 3 5 4 NA NA ...
$ q7 : num [1:603] 3 3 3 3 4 4 5 4 NA NA ...
$ q8 : num [1:603] 5 2 4 5 5 4 4 4 NA NA ...
$ q9 : num [1:603] 2 3 3 3 5 1 4 4 NA NA ...
$ q10 : num [1:603] 5 3 2 5 5 2 5 4 NA NA ...
$ TimeSpent : num [1:603] 1537 3600 1970 1315 406 ...
$ TimeSpent_hours : num [1:603] 25.62 60 32.83 21.92 6.77 ...
$ TimeSpent_std : num [1:603] -0.194 1.328 0.125 -0.358 -1.029 ...
$ int : num [1:603] 4.4 3 3.8 5 5 3.9 4.6 4 4.8 4.6 ...
$ pc : num [1:603] 3 2.5 2.5 3 3.5 3.5 3.75 3.5 3.5 4.5 ...
$ uv : num [1:603] 2.67 3.33 3.33 3.33 5 ...
- attr(*, "spec")=
.. cols(
.. student_id = col_double(),
.. course_id = col_character(),
.. total_points_possible = col_double(),
.. total_points_earned = col_double(),
.. percentage_earned = col_double(),
.. subject = col_character(),
.. semester = col_character(),
.. section = col_character(),
.. Gradebook_Item = col_character(),
.. Grade_Category = col_logical(),
.. FinalGradeCEMS = col_double(),
.. Points_Possible = col_double(),
.. Points_Earned = col_double(),
.. Gender = col_character(),
.. q1 = col_double(),
.. q2 = col_double(),
.. q3 = col_double(),
.. q4 = col_double(),
.. q5 = col_double(),
.. q6 = col_double(),
.. q7 = col_double(),
.. q8 = col_double(),
.. q9 = col_double(),
.. q10 = col_double(),
.. TimeSpent = col_double(),
.. TimeSpent_hours = col_double(),
.. TimeSpent_std = col_double(),
.. int = col_double(),
.. pc = col_double(),
.. uv = col_double()
.. )
- attr(*, "problems")=<externalptr>
In the code chunk named final-wrangle, use the sci_classes data and the %>% pipe operator:
To receive the Foundations Badge, you will need to render this document and publish it via a method designated by your instructor, such as Quarto Pub, Posit Cloud, RPubs, GitHub Pages, or another method. Once you have shared a link to your published document with your instructor and they have reviewed your work, you will be provided a physical or digital version of the badge pictured at the top of this document!
If you have any questions about this badge, or run into any technical issues, don’t hesitate to contact your instructor. Once your instructor has checked your link, you will be provided a physical version of the badge!
Complete the following steps to submit your work for review:
First, change the name of the author: in the YAML header at the very top of this document to your name. The YAML header controls the style and feel of the knitted document but doesn’t actually display in the final output.
Next, click the knit button in the toolbar above to “knit” your R Markdown document to an HTML file that will be saved in your R Project folder. You should see a formatted webpage appear in your Viewer tab in the lower right pane or in a new browser window. Let us know if you run into any issues with knitting.