The final activity for each learning lab provides space to work with data and to reflect on how the concepts and techniques introduced in each lab might apply to your own research.
To earn a badge for each lab, you are required to respond to a set of prompts for two parts:
In Part I, you will reflect on your understanding of key concepts and begin to think about potential next steps for your own study.
In Part II, you will create a simple data product in R that demonstrates your ability to apply a data analysis technique introduced in this learning lab.
Use the institutional library (e.g. NCSU Library), Google Scholar, or a search engine to locate a research article, presentation, or resource that applies learning analytics to an educational context or topic of interest. More specifically, locate a study that makes use of one of the data structures we learned about today. You are also welcome to select one of your own research papers.
Provide an APA citation for your selected study.
What types of data are associated with LA?
What type of data structures are analyzed in the educational context?
How might this article be used to better understand a dataset or educational context of personal or professional interest to you?
Finally, how do these processes compare with what teachers and educational organizations already do to support and assess student learning?
Draft a research question guided by techniques and data sources that you are potentially interested in exploring in more depth.
What data source(s) should be analyzed or discussed?
What is the purpose of your article?
Explain the analytical level at which these data would need to be collected and analyzed.
How, if at all, will your article touch upon the application(s) of LA to “understand and improve learning and the contexts in which learning occurs?”
After you finish the script file for lab1_badge, add it to the community board.
Create a data frame that includes two columns, one named “Students” and the other named “Foods”. The first column should be this vector (note the intentional repeated values): Thor, Rogue, Electra, Electra, Wolverine
The second column should be this vector: Bread, Orange, Chocolate, Carrots, Milk
# YOUR FINAL CODE HERE
Students <- c("Thor", "Rogue", "Electra", "Electra", "Wolverine")
Foods <- c("Bread", "Orange", "Chocolate", "Carrots", "Milk")
df <- data.frame(Students = Students, Foods = Foods)
df
##    Students     Foods
## 1      Thor     Bread
## 2     Rogue    Orange
## 3   Electra Chocolate
## 4   Electra   Carrots
## 5 Wolverine      Milk
table(df)
##            Foods
## Students    Bread Carrots Chocolate Milk Orange
##   Electra       0       1         1    0      0
##   Rogue         0       0         0    0      1
##   Thor          1       0         0    0      0
##   Wolverine     0       0         0    1      0
Using the data frame created in Problem 1, use the table() command to create a frequency table for the column called “Students”.
students_freq <- table(df$Students)
students_freq
##
##   Electra     Rogue      Thor Wolverine
##         2         1         1         1
table(Students)
## Students
##   Electra     Rogue      Thor Wolverine
##         2         1         1         1
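As an optional extension beyond what the badge asks for, a frequency table can also be sorted or converted into a data frame for later use. The sketch below rebuilds the Students vector from Problem 1 so it runs on its own and uses only base R. The object names freq, freq_sorted, and freq_df are made up for illustration.

```r
# Rebuild the vector from Problem 1 so this sketch is self-contained
Students <- c("Thor", "Rogue", "Electra", "Electra", "Wolverine")

# Count how many times each name appears
freq <- table(Students)

# Sort so the most frequent name comes first
freq_sorted <- sort(freq, decreasing = TRUE)

# Convert the table to a data frame with one row per name;
# the counts land in a column called Freq
freq_df <- as.data.frame(freq_sorted, stringsAsFactors = FALSE)
freq_df
```

A data frame of counts is often easier to filter or plot than a table object, which is the main reason to convert.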
Create a vector of five numbers of your choice between 0 and 10, save that vector to an object, and use the sum() function to calculate the sum of the numbers.
# YOUR FINAL CODE HERE
vec <- c(2, 3, 4, 5, 6)
# Store the result under a name other than sum to avoid confusion with the sum() function
total <- sum(vec)
print(total)
## [1] 20
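One way to double-check an answer to this problem is to confirm that the vector meets the stated constraints before adding it up. The following is a minimal sketch under that assumption. The names scores and total are illustrative, and the result is stored under a name other than sum so the object is not confused with the built-in function.

```r
# Five hypothetical numbers between 0 and 10
scores <- c(2, 3, 4, 5, 6)

# Stop with an error unless there are exactly five values, all within 0 to 10
stopifnot(length(scores) == 5, all(scores >= 0 & scores <= 10))

# Add the values and keep the result in a clearly named object
total <- sum(scores)
total
```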
Create code to read the data/sci-online-classes.csv file into R using function(s) from the tidyverse. (Note: this package loads with library(tidyverse).) Save the data as an object called sci_classes.
Examine the contents of sci_classes in your console. Is your object a tibble? How do you know? (Hint: Check the output in the console.)
# YOUR FINAL CODE HERE
library(readr)
sci_online_classes <- read_csv("data/sci-online-classes.csv")
## Rows: 603 Columns: 30
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (6): course_id, subject, semester, section, Gradebook_Item, Gender
## dbl (23): student_id, total_points_possible, total_points_earned, percentage...
## lgl (1): Grade_Category
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
sci_online_classes
## # A tibble: 603 × 30
## student_id course_id total…¹ total…² perce…³ subject semes…⁴ section Grade…⁵
## <dbl> <chr> <dbl> <dbl> <dbl> <chr> <chr> <chr> <chr>
## 1 43146 FrScA-S21… 3280 2220 0.677 FrScA S216 02 POINTS…
## 2 44638 OcnA-S116… 3531 2672 0.757 OcnA S116 01 ATTEMP…
## 3 47448 FrScA-S21… 2870 1897 0.661 FrScA S216 01 POINTS…
## 4 47979 OcnA-S216… 4562 3090 0.677 OcnA S216 01 POINTS…
## 5 48797 PhysA-S11… 2207 1910 0.865 PhysA S116 01 POINTS…
## 6 51943 FrScA-S21… 4208 3596 0.855 FrScA S216 03 POINTS…
## 7 52326 AnPhA-S21… 4325 2255 0.521 AnPhA S216 01 POINTS…
## 8 52446 PhysA-S11… 2086 1719 0.824 PhysA S116 01 POINTS…
## 9 53447 FrScA-S11… 4655 3149 0.676 FrScA S116 01 POINTS…
## 10 53475 FrScA-S11… 1710 1402 0.820 FrScA S116 02 POINTS…
## # … with 593 more rows, 21 more variables: Grade_Category <lgl>,
## # FinalGradeCEMS <dbl>, Points_Possible <dbl>, Points_Earned <dbl>,
## # Gender <chr>, q1 <dbl>, q2 <dbl>, q3 <dbl>, q4 <dbl>, q5 <dbl>, q6 <dbl>,
## # q7 <dbl>, q8 <dbl>, q9 <dbl>, q10 <dbl>, TimeSpent <dbl>,
## # TimeSpent_hours <dbl>, TimeSpent_std <dbl>, int <dbl>, pc <dbl>, uv <dbl>,
## # and abbreviated variable names ¹total_points_possible,
## # ²total_points_earned, ³percentage_earned, ⁴semester, ⁵Gradebook_Item
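Besides reading the "# A tibble" header in the console output, the tibble question can also be checked in code. The sketch below is one possible approach, not the required answer. It writes a tiny made-up CSV to a temporary file (the file contents and the object name demo are assumptions for illustration) so that it runs without the course data.

```r
library(readr)

# Write a small, made-up CSV to a temporary file so the sketch is self-contained
tmp <- tempfile(fileext = ".csv")
writeLines(c("student_id,subject", "1,FrScA", "2,OcnA"), tmp)

# read_csv() returns a tibble; show_col_types = FALSE quiets the column message
demo <- read_csv(tmp, show_col_types = FALSE)

class(demo)               # the class vector of a tibble includes "tbl_df"
inherits(demo, "tbl_df")  # TRUE when the object is a tibble
is.data.frame(demo)       # also TRUE, because a tibble is a kind of data frame
```

The same three checks can be run on the object read from data/sci-online-classes.csv.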
Using the sci_classes data frame:
Select all columns except subject and section.
Assign to a new object with a different name.
Examine your data frame.
# YOUR FINAL CODE HERE
new_df <- sci_online_classes[, !(names(sci_online_classes) %in% c("subject", "section"))]
new_df
## # A tibble: 603 × 28
## student_id course_id total…¹ total…² perce…³ semes…⁴ Grade…⁵ Grade…⁶ Final…⁷
## <dbl> <chr> <dbl> <dbl> <dbl> <chr> <chr> <lgl> <dbl>
## 1 43146 FrScA-S21… 3280 2220 0.677 S216 POINTS… NA 93.5
## 2 44638 OcnA-S116… 3531 2672 0.757 S116 ATTEMP… NA 81.7
## 3 47448 FrScA-S21… 2870 1897 0.661 S216 POINTS… NA 88.5
## 4 47979 OcnA-S216… 4562 3090 0.677 S216 POINTS… NA 81.9
## 5 48797 PhysA-S11… 2207 1910 0.865 S116 POINTS… NA 84
## 6 51943 FrScA-S21… 4208 3596 0.855 S216 POINTS… NA NA
## 7 52326 AnPhA-S21… 4325 2255 0.521 S216 POINTS… NA 83.6
## 8 52446 PhysA-S11… 2086 1719 0.824 S116 POINTS… NA 97.8
## 9 53447 FrScA-S11… 4655 3149 0.676 S116 POINTS… NA 96.1
## 10 53475 FrScA-S11… 1710 1402 0.820 S116 POINTS… NA NA
## # … with 593 more rows, 19 more variables: Points_Possible <dbl>,
## # Points_Earned <dbl>, Gender <chr>, q1 <dbl>, q2 <dbl>, q3 <dbl>, q4 <dbl>,
## # q5 <dbl>, q6 <dbl>, q7 <dbl>, q8 <dbl>, q9 <dbl>, q10 <dbl>,
## # TimeSpent <dbl>, TimeSpent_hours <dbl>, TimeSpent_std <dbl>, int <dbl>,
## # pc <dbl>, uv <dbl>, and abbreviated variable names ¹total_points_possible,
## # ²total_points_earned, ³percentage_earned, ⁴semester, ⁵Gradebook_Item,
## # ⁶Grade_Category, ⁷FinalGradeCEMS
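The answer above drops the two columns with base R indexing. Since the lab draws on the tidyverse, the same result can be sketched with select() from dplyr, where a minus sign excludes a column. The small tibble below is made up to stand in for the course data, and the names demo and demo_trimmed are illustrative.

```r
library(dplyr)

# Small made-up tibble standing in for the course data
demo <- tibble(
  student_id        = c(1, 2),
  subject           = c("FrScA", "OcnA"),
  section           = c("01", "02"),
  percentage_earned = c(0.68, 0.76)
)

# A minus sign drops the named columns and keeps everything else
demo_trimmed <- select(demo, -subject, -section)
names(demo_trimmed)
```

Applied to the course data, the equivalent call would pass the object read from the CSV in place of demo.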
Congratulations, you’ve completed your Data Sources Badge!
Complete the following steps to submit your work for review:
Change the name in the author: field of the YAML header at the very top of this document to your name. As noted in Reproducible Research in R, the YAML header controls the style and feel of the knitted document but doesn’t actually display in the final output.
Click the yarn icon above to “knit” your data product to an HTML file that will be saved in your R Project folder.
Commit your changes in GitHub Desktop and push them to your online GitHub repository.
Publish your HTML page to the web using one of the following publishing methods: Publish on RPubs by clicking the “Publish” button located in the Viewer Pane when you knit your document. Note that you will first need to create an RPubs account. Publish on GitHub using either GitHub Pages or the HTML previewer.
Post a new discussion on GitHub to our Foundations Badges forum. In your post, include a link to your published web page and write a short reflection highlighting one thing you learned from this lab and one thing you’d like to explore further.