# YOUR FINAL CODE HERE
Students <- c("Thor", "Rogue", "Electra", "Electra", "Wolverine")
Foods <- c("Bread", "Orange", "Chocolate", "Carrots", "Milk")
Data Sources Badge
LASER Institute Foundation Learning Lab 1
The final activity for each learning lab provides space to work with data and to reflect on how the concepts and techniques introduced in each lab might apply to your own research.
To earn a badge for each lab, you are required to respond to a set of prompts for two parts:
In Part I, you will reflect on your understanding of key concepts and begin to think about potential next steps for your own study.
In Part II, you will create a simple data product in R that demonstrates your ability to apply a data analysis technique introduced in this learning lab.
Part I: Reflect and Plan
Use the institutional library (e.g., NCSU Library), Google Scholar, or a search engine to locate a research article, presentation, or resource that applies learning analytics to an educational context or topic of interest. More specifically, locate a study that makes use of one of the data structures we learned about today. You are also welcome to select one of your own research papers.
Provide an APA citation for your selected study.
- Brouwer, J., Fernandes, C., Steglich, C., Jansen, E., Hofman, W. H., & Flache, A. (2022). The development of peer networks and academic performance in learning communities in higher education. Learning and Instruction, 80. https://doi.org/10.1016/j.learninstruc.2022.101603
What types of data are associated with LA?
- Student Information Systems: administrative data (structured)
What type of data structures are analyzed in the educational context?
- Friendship nomination, help-seeking nomination, and GPA as academic performance.
How might this article be used to better understand a dataset or educational context of personal or professional interest to you?
- It shows how the authors examined learning communities.
Finally, how do these processes compare with what teachers and educational organizations already do to support and assess student learning?
- This study is about using small groups to improve first-year students’ academic performance and successful transition.
Draft a research question guided by techniques and data sources that you are potentially interested in exploring in more depth.
- Is interaction among peers different across role-based, debate, and case-based discussions?
What data source(s) should be analyzed or discussed?
- LMS discussion board LA
What is the purpose of your article?
- To examine how the network is formed based on the design of online discussions.
Explain the analytical level at which these data would need to be collected and analyzed.
- Which peers students interact with during discussions. Do they choose friends or random peers?
How, if at all, will your article touch upon the application(s) of LA to “understand and improve learning and the contexts in which learning occurs?”
- The article I chose will guide me through the analysis as I craft my own study. I will follow the authors’ steps while analyzing my own dataset.
Part II: Data Product
After you finish the script file for lab1_badge, add it to the community board.
Problem 1
Create a data frame that includes two columns, one named “Students” and the other named “Foods”. The first column should be this vector (note the intentional repeated values): Thor, Rogue, Electra, Electra, Wolverine
The second column should be this vector: Bread, Orange, Chocolate, Carrots, Milk
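The prompt asks for a data frame, so the two vectors still need to be combined. A minimal sketch in base R follows; the object name `hero_foods` is a hypothetical choice for illustration, not something the lab specifies.

```r
# Build the two vectors given in the prompt.
Students <- c("Thor", "Rogue", "Electra", "Electra", "Wolverine")
Foods <- c("Bread", "Orange", "Chocolate", "Carrots", "Milk")

# Combine them into a data frame; `hero_foods` is a hypothetical name.
hero_foods <- data.frame(Students = Students, Foods = Foods)

# Inspect the structure: 5 observations of 2 variables.
str(hero_foods)
```

Naming the columns explicitly in data.frame() keeps the column names independent of the vector names.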
Problem 2
Using the data frame created in Problem 1, use the table() command to create a frequency table for the column called “Students”.
table(Students)
Students
  Electra     Rogue      Thor Wolverine
        2         1         1         1
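The output above comes from calling table() on the standalone Students vector. To tabulate the column of the data frame itself, as the prompt asks, something like the following should work. Here `hero_foods` is a hypothetical name for the Problem 1 data frame, and `student_counts` is likewise just an illustrative name.

```r
# Hypothetical name for the data frame from Problem 1.
hero_foods <- data.frame(
  Students = c("Thor", "Rogue", "Electra", "Electra", "Wolverine"),
  Foods = c("Bread", "Orange", "Chocolate", "Carrots", "Milk")
)

# Count how often each value appears in the Students column.
student_counts <- table(hero_foods$Students)
print(student_counts)
```

The counts match the output shown above: Electra appears twice and every other name once.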
Problem 3
Create a vector of five numbers of your choice between 0 and 10, save that vector to an object, and use the sum() function to calculate the sum of the numbers.
# YOUR FINAL CODE HERE
c(3,5,7,9)
[1] 3 5 7 9
vec <- c(3,5,7,9)
sum(vec)
[1] 24
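The vector above holds four values, while the prompt asks for five. A sketch with five values between 0 and 10 is shown here; the particular numbers and the names `five_nums` and `total` are arbitrary choices for illustration.

```r
# Five arbitrary numbers between 0 and 10 (illustrative values only).
five_nums <- c(1, 3, 5, 7, 9)

# Add them up and print the total.
total <- sum(five_nums)
print(total)
# [1] 25
```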
Problem 4
Create code to read the data/sci-online-classes.csv file into R using function(s) from the tidyverse. (Note: this package loads with library(tidyverse).) Save the data as an object called sci_classes.
Examine the contents of sci_classes in your console. Is your object a tibble? How do you know? (Hint: Check the output in the console.)
# YOUR FINAL CODE HERE
library(readr)
library(tidyverse)
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
✔ ggplot2 3.3.6 ✔ dplyr 1.0.10
✔ tibble 3.1.8 ✔ stringr 1.4.1
✔ tidyr 1.2.1 ✔ forcats 0.5.2
✔ purrr 0.3.4
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
sci_online_classes <- read_csv("data/sci-online-classes.csv")
Rows: 603 Columns: 30
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (6): course_id, subject, semester, section, Gradebook_Item, Gender
dbl (23): student_id, total_points_possible, total_points_earned, percentage...
lgl (1): Grade_Category
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
sci_online_classes
# A tibble: 603 × 30
student_id course_id total…¹ total…² perce…³ subject semes…⁴ section Grade…⁵
<dbl> <chr> <dbl> <dbl> <dbl> <chr> <chr> <chr> <chr>
1 43146 FrScA-S21… 3280 2220 0.677 FrScA S216 02 POINTS…
2 44638 OcnA-S116… 3531 2672 0.757 OcnA S116 01 ATTEMP…
3 47448 FrScA-S21… 2870 1897 0.661 FrScA S216 01 POINTS…
4 47979 OcnA-S216… 4562 3090 0.677 OcnA S216 01 POINTS…
5 48797 PhysA-S11… 2207 1910 0.865 PhysA S116 01 POINTS…
6 51943 FrScA-S21… 4208 3596 0.855 FrScA S216 03 POINTS…
7 52326 AnPhA-S21… 4325 2255 0.521 AnPhA S216 01 POINTS…
8 52446 PhysA-S11… 2086 1719 0.824 PhysA S116 01 POINTS…
9 53447 FrScA-S11… 4655 3149 0.676 FrScA S116 01 POINTS…
10 53475 FrScA-S11… 1710 1402 0.820 FrScA S116 02 POINTS…
# … with 593 more rows, 21 more variables: Grade_Category <lgl>,
# FinalGradeCEMS <dbl>, Points_Possible <dbl>, Points_Earned <dbl>,
# Gender <chr>, q1 <dbl>, q2 <dbl>, q3 <dbl>, q4 <dbl>, q5 <dbl>, q6 <dbl>,
# q7 <dbl>, q8 <dbl>, q9 <dbl>, q10 <dbl>, TimeSpent <dbl>,
# TimeSpent_hours <dbl>, TimeSpent_std <dbl>, int <dbl>, pc <dbl>, uv <dbl>,
# and abbreviated variable names ¹total_points_possible,
# ²total_points_earned, ³percentage_earned, ⁴semester, ⁵Gradebook_Item
glimpse(sci_online_classes)
Rows: 603
Columns: 30
$ student_id <dbl> 43146, 44638, 47448, 47979, 48797, 51943, 52326,…
$ course_id <chr> "FrScA-S216-02", "OcnA-S116-01", "FrScA-S216-01"…
$ total_points_possible <dbl> 3280, 3531, 2870, 4562, 2207, 4208, 4325, 2086, …
$ total_points_earned <dbl> 2220, 2672, 1897, 3090, 1910, 3596, 2255, 1719, …
$ percentage_earned <dbl> 0.6768293, 0.7567261, 0.6609756, 0.6773345, 0.86…
$ subject <chr> "FrScA", "OcnA", "FrScA", "OcnA", "PhysA", "FrSc…
$ semester <chr> "S216", "S116", "S216", "S216", "S116", "S216", …
$ section <chr> "02", "01", "01", "01", "01", "03", "01", "01", …
$ Gradebook_Item <chr> "POINTS EARNED & TOTAL COURSE POINTS", "ATTEMPTE…
$ Grade_Category <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ FinalGradeCEMS <dbl> 93.45372, 81.70184, 88.48758, 81.85260, 84.00000…
$ Points_Possible <dbl> 5, 10, 10, 5, 438, 5, 10, 10, 443, 5, 12, 10, 5,…
$ Points_Earned <dbl> NA, 10.00, NA, 4.00, 399.00, NA, NA, 10.00, 425.…
$ Gender <chr> "M", "F", "M", "M", "F", "F", "M", "F", "F", "M"…
$ q1 <dbl> 5, 4, 5, 5, 4, NA, 5, 3, 4, NA, NA, 4, 3, 5, NA,…
$ q2 <dbl> 4, 4, 4, 5, 3, NA, 5, 3, 3, NA, NA, 5, 3, 3, NA,…
$ q3 <dbl> 4, 3, 4, 3, 3, NA, 3, 3, 3, NA, NA, 3, 3, 5, NA,…
$ q4 <dbl> 5, 4, 5, 5, 4, NA, 5, 3, 4, NA, NA, 5, 3, 5, NA,…
$ q5 <dbl> 5, 4, 5, 5, 4, NA, 5, 3, 4, NA, NA, 5, 4, 5, NA,…
$ q6 <dbl> 5, 4, 4, 5, 4, NA, 5, 4, 3, NA, NA, 5, 3, 5, NA,…
$ q7 <dbl> 5, 4, 4, 4, 4, NA, 4, 3, 3, NA, NA, 5, 3, 5, NA,…
$ q8 <dbl> 5, 5, 5, 5, 4, NA, 5, 3, 4, NA, NA, 4, 3, 5, NA,…
$ q9 <dbl> 4, 4, 3, 5, NA, NA, 5, 3, 2, NA, NA, 5, 2, 2, NA…
$ q10 <dbl> 5, 4, 5, 5, 3, NA, 5, 3, 5, NA, NA, 4, 4, 5, NA,…
$ TimeSpent <dbl> 1555.1667, 1382.7001, 860.4335, 1598.6166, 1481.…
$ TimeSpent_hours <dbl> 25.91944500, 23.04500167, 14.34055833, 26.643610…
$ TimeSpent_std <dbl> -0.18051496, -0.30780313, -0.69325954, -0.148446…
$ int <dbl> 5.0, 4.2, 5.0, 5.0, 3.8, 4.6, 5.0, 3.0, 4.2, NA,…
$ pc <dbl> 4.50, 3.50, 4.00, 3.50, 3.50, 4.00, 3.50, 3.00, …
$ uv <dbl> 4.333333, 4.000000, 3.666667, 5.000000, 3.500000…
as_tibble(sci_online_classes)
# A tibble: 603 × 30
student_id course_id total…¹ total…² perce…³ subject semes…⁴ section Grade…⁵
<dbl> <chr> <dbl> <dbl> <dbl> <chr> <chr> <chr> <chr>
1 43146 FrScA-S21… 3280 2220 0.677 FrScA S216 02 POINTS…
2 44638 OcnA-S116… 3531 2672 0.757 OcnA S116 01 ATTEMP…
3 47448 FrScA-S21… 2870 1897 0.661 FrScA S216 01 POINTS…
4 47979 OcnA-S216… 4562 3090 0.677 OcnA S216 01 POINTS…
5 48797 PhysA-S11… 2207 1910 0.865 PhysA S116 01 POINTS…
6 51943 FrScA-S21… 4208 3596 0.855 FrScA S216 03 POINTS…
7 52326 AnPhA-S21… 4325 2255 0.521 AnPhA S216 01 POINTS…
8 52446 PhysA-S11… 2086 1719 0.824 PhysA S116 01 POINTS…
9 53447 FrScA-S11… 4655 3149 0.676 FrScA S116 01 POINTS…
10 53475 FrScA-S11… 1710 1402 0.820 FrScA S116 02 POINTS…
# … with 593 more rows, 21 more variables: Grade_Category <lgl>,
# FinalGradeCEMS <dbl>, Points_Possible <dbl>, Points_Earned <dbl>,
# Gender <chr>, q1 <dbl>, q2 <dbl>, q3 <dbl>, q4 <dbl>, q5 <dbl>, q6 <dbl>,
# q7 <dbl>, q8 <dbl>, q9 <dbl>, q10 <dbl>, TimeSpent <dbl>,
# TimeSpent_hours <dbl>, TimeSpent_std <dbl>, int <dbl>, pc <dbl>, uv <dbl>,
# and abbreviated variable names ¹total_points_possible,
# ²total_points_earned, ³percentage_earned, ⁴semester, ⁵Gradebook_Item
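The “# A tibble: 603 × 30” header in the printed output is the console hint that the object is a tibble. The same check can be made in code. The sketch below writes a tiny stand-in CSV to a temporary file (made-up rows, not the lab data) so that it runs without data/sci-online-classes.csv. It assumes the readr and tibble packages are installed.

```r
library(readr)
library(tibble)

# Stand-in for the lab file: a small temporary CSV with made-up rows.
tmp_csv <- tempfile(fileext = ".csv")
writeLines(c("student_id,subject,section",
             "1,FrScA,01",
             "2,OcnA,02"), tmp_csv)

# read_csv() returns a tibble; show_col_types = FALSE quiets the column message.
sci_classes <- read_csv(tmp_csv, show_col_types = FALSE)

# TRUE when the object is a tibble.
is_tibble(sci_classes)

# The class vector of a tibble includes "tbl_df".
class(sci_classes)
```

With the real file, the same two checks can be run on the object returned by read_csv("data/sci-online-classes.csv").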
sci_classes <- sci_online_classes
sci_online_classes %>% select(c(!subject, !section))
# A tibble: 603 × 30
student_id course_id total…¹ total…² perce…³ semes…⁴ section Grade…⁵ Grade…⁶
<dbl> <chr> <dbl> <dbl> <dbl> <chr> <chr> <chr> <lgl>
1 43146 FrScA-S21… 3280 2220 0.677 S216 02 POINTS… NA
2 44638 OcnA-S116… 3531 2672 0.757 S116 01 ATTEMP… NA
3 47448 FrScA-S21… 2870 1897 0.661 S216 01 POINTS… NA
4 47979 OcnA-S216… 4562 3090 0.677 S216 01 POINTS… NA
5 48797 PhysA-S11… 2207 1910 0.865 S116 01 POINTS… NA
6 51943 FrScA-S21… 4208 3596 0.855 S216 03 POINTS… NA
7 52326 AnPhA-S21… 4325 2255 0.521 S216 01 POINTS… NA
8 52446 PhysA-S11… 2086 1719 0.824 S116 01 POINTS… NA
9 53447 FrScA-S11… 4655 3149 0.676 S116 01 POINTS… NA
10 53475 FrScA-S11… 1710 1402 0.820 S116 02 POINTS… NA
# … with 593 more rows, 21 more variables: FinalGradeCEMS <dbl>,
# Points_Possible <dbl>, Points_Earned <dbl>, Gender <chr>, q1 <dbl>,
# q2 <dbl>, q3 <dbl>, q4 <dbl>, q5 <dbl>, q6 <dbl>, q7 <dbl>, q8 <dbl>,
# q9 <dbl>, q10 <dbl>, TimeSpent <dbl>, TimeSpent_hours <dbl>,
# TimeSpent_std <dbl>, int <dbl>, pc <dbl>, uv <dbl>, subject <chr>, and
# abbreviated variable names ¹total_points_possible, ²total_points_earned,
# ³percentage_earned, ⁴semester, ⁵Gradebook_Item, ⁶Grade_Category
Problem 5
Using the sci_classes data frame:
Select all columns except subject and section.
Assign to a new object with a different name.
Examine your data frame.
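Note that select(c(!subject, !section)) keeps every column. Inside c() the two negations are combined as a union, and “everything but subject” together with “everything but section” covers all columns, which is why that result still prints as 603 × 30 with subject moved to the end. Negating the combined pair drops both columns. A sketch on a small made-up tibble follows; the data values and the names `kept_all` and `sci_classes_small` are hypothetical, and it assumes dplyr and tibble are installed.

```r
library(dplyr)
library(tibble)

# Made-up stand-in for sci_classes with a few of its column names.
sci_classes <- tibble(
  student_id = c(1, 2),
  subject = c("FrScA", "OcnA"),
  section = c("01", "02"),
  FinalGradeCEMS = c(93.5, 81.7)
)

# Union of two complements: all columns survive (subject moves to the end).
kept_all <- sci_classes %>% select(c(!subject, !section))

# Complement of the pair: subject and section are both dropped.
sci_classes_small <- sci_classes %>% select(!c(subject, section))

names(kept_all)
names(sci_classes_small)
```

An equivalent spelling is select(-subject, -section). Either way, the result is assigned to a new object with a different name, as the prompt requires.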
Knit & Submit
Congratulations, you’ve completed your Data Sources Badge!
Complete the following steps to submit your work for review:
Change the name in the author: field of the YAML header at the very top of this document to your name. As noted in Reproducible Research in R, the YAML header controls the style and feel of the knitted document but doesn’t actually display in the final output.
Click the yarn icon above to “knit” your data product to an HTML file that will be saved in your R Project folder.
Commit your changes in GitHub Desktop and push them to your online GitHub repository.
Publish your HTML page to the web using one of the following publishing methods:
- Publish on RPubs by clicking the “Publish” button located in the Viewer Pane when you knit your document. Note that you will need to quickly create an RPubs account.
- Publish on GitHub using either GitHub Pages or the HTML previewer.
Post a new discussion on GitHub to our Foundations Badges forum. In your post, include a link to your published web page and write a short reflection highlighting one thing you learned from this lab and one thing you’d like to explore further.