The final activity for each learning lab provides space to work with data and to reflect on how the concepts and techniques introduced in each lab might apply to your own research.
To earn a badge for each lab, you are required to respond to a set of prompts for two parts:
In Part I, you will reflect on your understanding of key concepts and begin to think about potential next steps for your own study.
In Part II, you will create a simple data product in R that demonstrates your ability to apply a data analysis technique introduced in this learning lab.
Use the institutional library (e.g. NCSU Library), Google Scholar, or a search engine to locate a research article, presentation, or resource that applies learning analytics to an educational context or topic of interest. More specifically, locate a study that makes use of one of the data structures we learned about today. You are also welcome to select one of your own research papers.
Provide an APA citation for your selected study.
What types of data are associated with LA?
What type of data structures are analyzed in the educational context?
How might this article be used to better understand a dataset or educational context of personal or professional interest to you?
Finally, how do these processes compare with what teachers and educational organizations already do to support and assess student learning?
Draft a research question guided by techniques and data sources that you are potentially interested in exploring in more depth.
Draft RQ: Which supervised learning technique can most accurately predict teachers who facilitate less effective classroom discussions?
What data source(s) should be analyzed or discussed?
What is the purpose of your article?
Explain the analytical level at which these data would need to be collected and analyzed.
The raw data would comprise every teacher's interactions with the platform content. Each teacher would occupy one unique row, and the interaction variables would be placed as columns to be analyzed. The interaction variables would be as follows:
Interaction Variable - Variable Description
session - The number of sessions by the teacher
Time - The total time the teacher has spent on the Moodle LMS
UniqueDay - The number of unique days logged in by the teacher
ResourcePath - The starting and ending path associated with each resource engagement
TotalAction - The number of total activities/resource path events
CourseView - The number of lesson views
ResourceView - The number of lesson resource views
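The teacher-per-row structure described above can be sketched in base R. This is only an illustration under stated assumptions: the `events` log, its column names (`teacher_id`, `session_id`, `day`, `action`), and all of its values are hypothetical, and `Time` and `ResourcePath` are left out because they would depend on how the LMS records timestamps and paths.

```r
# Hypothetical raw event log: one row per logged action (all values made up).
events <- data.frame(
  teacher_id = c("t01", "t01", "t01", "t02", "t02", "t03"),
  session_id = c("s1", "s1", "s2", "s3", "s4", "s5"),
  day        = c("2023-01-09", "2023-01-09", "2023-01-10",
                 "2023-01-09", "2023-01-11", "2023-01-12"),
  action     = c("CourseView", "ResourceView", "CourseView",
                 "CourseView", "ResourceView", "CourseView"),
  stringsAsFactors = FALSE
)

# Helper: number of distinct values in a vector.
n_unique <- function(x) length(unique(x))

# Helper: apply f to x within each teacher, returned in sorted teacher order.
by_teacher <- function(x, f) as.vector(tapply(x, events$teacher_id, f))

# Collapse the log to one row per teacher, one column per interaction variable.
teachers <- data.frame(
  teacher_id   = sort(unique(events$teacher_id)),
  session      = by_teacher(events$session_id, n_unique),
  UniqueDay    = by_teacher(events$day, n_unique),
  TotalAction  = by_teacher(events$action, length),
  CourseView   = by_teacher(events$action == "CourseView", sum),
  ResourceView = by_teacher(events$action == "ResourceView", sum),
  stringsAsFactors = FALSE
)

teachers
```

Each row of `teachers` is then a single observation that a classifier could use, which is the analytical level the prompt asks about.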
The outcome variable (the variable to be predicted) would be discussion effectiveness (EffectiveDiscussion). I would apply five commonly used classification algorithms: k-nearest neighbors (kNN), decision trees (DT), naïve Bayes (NB), random forest (RF), and support vector machines (SVM). I would use performance metrics and confusion matrices for model evaluation.
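The evaluation step could look roughly like the following base R sketch. It does not fit any of the five algorithms. It only shows how a confusion matrix and a few performance metrics might be computed once a model has produced predictions. The `actual` and `predicted` labels are hypothetical, and treating "less_effective" as the positive class is an assumption.

```r
# Hypothetical true labels and model predictions for ten held-out teachers.
lv <- c("effective", "less_effective")
actual <- factor(
  c("less_effective", "less_effective", "less_effective", "less_effective",
    "effective", "effective", "effective", "effective", "effective", "effective"),
  levels = lv
)
predicted <- factor(
  c("less_effective", "less_effective", "less_effective", "effective",
    "less_effective", "effective", "effective", "effective", "effective", "effective"),
  levels = lv
)

# Confusion matrix: rows are predictions, columns are true labels.
conf_mat <- table(Predicted = predicted, Actual = actual)
conf_mat

# Assumption: the class of interest (positive class) is "less_effective".
positive <- "less_effective"
tp <- conf_mat[positive, positive]          # correctly flagged teachers
fp <- sum(conf_mat[positive, ]) - tp        # flagged but actually effective
fn <- sum(conf_mat[, positive]) - tp        # missed less effective teachers

accuracy  <- sum(diag(conf_mat)) / sum(conf_mat)
precision <- tp / (tp + fp)
recall    <- tp / (tp + fn)

c(accuracy = accuracy, precision = precision, recall = recall)
```

Running the same calculation on each model's predictions would give one row of metrics per algorithm, which is one way to compare kNN, DT, NB, RF, and SVM on equal terms.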
How, if at all, will your article touch upon the application(s) of LA to “understand and improve learning and the contexts in which learning occurs?”
After you finish the script file for lab1_badge, add it to the community board.
Create a data frame that includes two columns, one named “students” and the other named “foods.” The first column should be this vector (note the intentional repeated values): Thor, Rogue, Electra, Electra, Wolverine
The second column should be this vector: Bread, Orange, Chocolate, Carrots, Milk
# YOUR FINAL CODE BELOW
df_stuFoods <- data.frame(
  students = c("Thor", "Rogue", "Electra", "Electra", "Wolverine"),
  foods    = c("Bread", "Orange", "Chocolate", "Carrots", "Milk")
)
df_stuFoods
## students foods
## 1 Thor Bread
## 2 Rogue Orange
## 3 Electra Chocolate
## 4 Electra Carrots
## 5 Wolverine Milk
Using the data frame created in Problem 2, use the table() command to create a frequency table for the column called “students”
# YOUR FINAL CODE BELOW
# Pass only the students column so table() returns a one-way frequency table
table(df_stuFoods$students)
## 
##   Electra     Rogue      Thor Wolverine 
##         2         1         1         1
Create a vector of five numbers of your choice between 0 and 10, save that vector to an object, and use the sum() function to calculate the sum of the numbers.
# YOUR FINAL CODE BELOW
numbers <- c(9,4,2,7,1)
sum(numbers)
## [1] 23
Write code to read the data/sci-online-classes.csv file into R using function(s) from the tidyverse package. Save the data as an object called sci_classes.
Examine the contents of sci_classes in your console. Is your object a tibble? How do you know? (Hint: Check the output in the console.)
# YOUR FINAL CODE BELOW
#Answer to a.
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.2 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ ggplot2 3.4.2 ✔ tibble 3.2.1
## ✔ lubridate 1.9.2 ✔ tidyr 1.3.0
## ✔ purrr 1.0.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
sci_classes <- read_csv("data/sci-online-classes.csv")
## Rows: 603 Columns: 30
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (6): course_id, subject, semester, section, Gradebook_Item, Gender
## dbl (23): student_id, total_points_possible, total_points_earned, percentage...
## lgl (1): Grade_Category
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
sci_classes
## # A tibble: 603 × 30
## student_id course_id total_points_possible total_points_earned
## <dbl> <chr> <dbl> <dbl>
## 1 43146 FrScA-S216-02 3280 2220
## 2 44638 OcnA-S116-01 3531 2672
## 3 47448 FrScA-S216-01 2870 1897
## 4 47979 OcnA-S216-01 4562 3090
## 5 48797 PhysA-S116-01 2207 1910
## 6 51943 FrScA-S216-03 4208 3596
## 7 52326 AnPhA-S216-01 4325 2255
## 8 52446 PhysA-S116-01 2086 1719
## 9 53447 FrScA-S116-01 4655 3149
## 10 53475 FrScA-S116-02 1710 1402
## # ℹ 593 more rows
## # ℹ 26 more variables: percentage_earned <dbl>, subject <chr>, semester <chr>,
## # section <chr>, Gradebook_Item <chr>, Grade_Category <lgl>,
## # FinalGradeCEMS <dbl>, Points_Possible <dbl>, Points_Earned <dbl>,
## # Gender <chr>, q1 <dbl>, q2 <dbl>, q3 <dbl>, q4 <dbl>, q5 <dbl>, q6 <dbl>,
## # q7 <dbl>, q8 <dbl>, q9 <dbl>, q10 <dbl>, TimeSpent <dbl>,
## # TimeSpent_hours <dbl>, TimeSpent_std <dbl>, int <dbl>, pc <dbl>, uv <dbl>
#View(sci_classes)
#Answer to b.
# Yes, sci_classes is a tibble: the printed header reads "# A tibble: 603 × 30". The object's class is also shown as spec_tbl_df, which inherits from the tbl_df class.
is_tibble(sci_classes) # Confirms the tibble status by returning TRUE
## [1] TRUE
Using the sci_classes data frame:
Select all columns except subject and section.
Assign to a new object with a different name.
Examine your data frame.
# YOUR FINAL CODE BELOW
# Drop subject and section; keep every other column
new_sci_classes <- select(sci_classes, -c(subject, section))
new_sci_classes
## # A tibble: 603 × 28
## student_id course_id total_points_possible total_points_earned
## <dbl> <chr> <dbl> <dbl>
## 1 43146 FrScA-S216-02 3280 2220
## 2 44638 OcnA-S116-01 3531 2672
## 3 47448 FrScA-S216-01 2870 1897
## 4 47979 OcnA-S216-01 4562 3090
## 5 48797 PhysA-S116-01 2207 1910
## 6 51943 FrScA-S216-03 4208 3596
## 7 52326 AnPhA-S216-01 4325 2255
## 8 52446 PhysA-S116-01 2086 1719
## 9 53447 FrScA-S116-01 4655 3149
## 10 53475 FrScA-S116-02 1710 1402
## # ℹ 593 more rows
## # ℹ 24 more variables: percentage_earned <dbl>, semester <chr>,
## # Gradebook_Item <chr>, Grade_Category <lgl>, FinalGradeCEMS <dbl>,
## # Points_Possible <dbl>, Points_Earned <dbl>, Gender <chr>, q1 <dbl>,
## # q2 <dbl>, q3 <dbl>, q4 <dbl>, q5 <dbl>, q6 <dbl>, q7 <dbl>, q8 <dbl>,
## # q9 <dbl>, q10 <dbl>, TimeSpent <dbl>, TimeSpent_hours <dbl>,
## # TimeSpent_std <dbl>, int <dbl>, pc <dbl>, uv <dbl>
#View(new_sci_classes)
Congratulations, you’ve completed your Data Sources Badge!
Complete the following steps to knit and publish your work:
First, change the name after author: in the YAML header at the very top of this document to your name. The YAML header controls the style and feel of the knitted document but doesn’t actually display in the final output.
Next, click the knit button in the toolbar above to “knit” your R Markdown document to an HTML file that will be saved in your R Project folder. You should see a formatted webpage appear in your Viewer tab in the lower right pane or in a new browser window. Let us know if you run into any issues with knitting.
Finally, publish your webpage on Posit Cloud by clicking the “Publish” button located in the Viewer pane after you knit your document. See screenshot below.
Congratulations, you’ve completed Foundations Learning Badge 1! To receive credit for this assignment and earn an official Foundations LASER Badge, share the link to your published webpage under an empty Badge Artifact column on the 2023 LASER Scholar Information and Documents spreadsheet: https://go.ncsu.edu/laser-sheet. We recommend bookmarking this spreadsheet as we’ll be using it throughout the year to keep track of your progress.
Once your instructor has checked your link, you will be provided a physical version of the badge below!