The final
activity for each learning lab provides space to work with data and to
reflect on how the concepts and techniques introduced in each lab might
apply to your own research.
To earn a badge for each lab, you are required to respond to a set of prompts for two parts:
In Part I, you will reflect on your understanding of key concepts and begin to think about potential next steps for your own study.
In Part II, you will create a simple data product in R that demonstrates your ability to apply a data analysis technique introduced in this learning lab.
Use the institutional library (e.g. NCSU Library), Google Scholar, or a search engine to locate a research article, presentation, or resource that applies learning analytics to an educational context or topic of interest. More specifically, locate a study that makes use of one of the data structures we learned today. You are also welcome to select one of your own research papers.
Provide an APA citation for your selected study.
Xing, W., Zhu, G., Arslan, O., Shim, O., & Popov, V. (2021). Using learning analytics to explore the multifaceted engagement in collaborative learning. Journal of Computing in Higher Education.
What educational issue, “problem of practice,” and/or questions were addressed?
Measuring multifaceted engagement (behavioral, social, cognitive, emotional, meta-cognitive) when students solved the problem together.
What EDA approaches were used, and what did they entail?
How were data visualization or feature engineering used to support analysis, if at all?
What were the key findings or conclusions?
Finally, what value, if any, might education practitioners find in these results?
Draft a new research question guided by the phases of the Learning Analytics Workflow. Or use one of your current research questions.
What educational issue, “problem of practice,” and/or questions are addressed?
Briefly describe any steps of the EDA approach that will be used.
What elements of EDA might require human judgement and decision making?
In our Learning Analytics code-along, we only scratched the surface on the number of ways that we can wrangle the data.
Using one of the data sets provided in the data folder, your goal for
this lab is to extend the ggplot data visualizations from the Learning
Analytics code-along. You have three options for completing the Data
Product portion:
Create the visualization exercise provided.
Create a visualization of your choice using a data set from the data folder.
Create a visualization using your own data.
I highly recommend creating a new R script in your lab-3 folder to complete this task. When your code is ready to share, use the code chunk below to share the final code for your model and answer the questions that follow.
Exercise 1:
- Using the `sci-online-classes` data, create a basic visualization that:
  + Examines the relationship between two categorical variables.
  + Has an appropriate title.
  + Has a caption that poses a question educators may have about this data that your visualization could help answer.
# YOUR FINAL CODE HERE
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.4.0 ✔ purrr 0.3.4
## ✔ tibble 3.1.8 ✔ dplyr 1.0.10
## ✔ tidyr 1.2.1 ✔ stringr 1.4.1
## ✔ readr 2.1.2 ✔ forcats 0.5.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(readxl)
library(readr)
sci_online_classes <- read_csv("data/sci-online-classes.csv")
## Rows: 603 Columns: 30
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (6): course_id, subject, semester, section, Gradebook_Item, Gender
## dbl (23): student_id, total_points_possible, total_points_earned, percentage...
## lgl (1): Grade_Category
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
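The message above suggests specifying column types or setting `show_col_types = FALSE`. As a minimal sketch of what that could look like, the example below writes a tiny temporary CSV and reads it back with an explicit type for `Grade_Category`, a column that is entirely empty in this data and so defaults to logical. The file contents, the `student_id` values, and the name `toy_classes` are hypothetical and only stand in for `data/sci-online-classes.csv`.

```r
library(readr)

# Hypothetical two-row CSV that mimics an all-empty Grade_Category column.
csv_lines <- c(
  "student_id,subject,Grade_Category",
  "1001,FrScA,",
  "1002,OcnA,"
)
tmp <- tempfile(fileext = ".csv")
writeLines(csv_lines, tmp)

# Declare the type of the empty column explicitly; other columns are guessed.
# show_col_types = FALSE quiets the column specification message.
toy_classes <- read_csv(
  tmp,
  col_types = cols(Grade_Category = col_character()),
  show_col_types = FALSE
)

toy_classes
```

The same two arguments could be passed to the `read_csv()` call for the real file if the message or the logical column type gets in the way.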
view(sci_online_classes)
library(here)
## here() starts at /Users/lolesova/Dropbox (UFL)/LASER2022/Foundation Labs/foundational-skills/foundation_lab_3
library(skimr)
skim(sci_online_classes)
Name | sci_online_classes |
Number of rows | 603 |
Number of columns | 30 |
_______________________ | |
Column type frequency: | |
character | 6 |
logical | 1 |
numeric | 23 |
________________________ | |
Group variables | None |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
---|---|---|---|---|---|---|---|
course_id | 0 | 1 | 12 | 13 | 0 | 26 | 0 |
subject | 0 | 1 | 4 | 5 | 0 | 5 | 0 |
semester | 0 | 1 | 4 | 4 | 0 | 3 | 0 |
section | 0 | 1 | 2 | 2 | 0 | 4 | 0 |
Gradebook_Item | 0 | 1 | 9 | 35 | 0 | 3 | 0 |
Gender | 0 | 1 | 1 | 1 | 0 | 2 | 0 |
Variable type: logical
skim_variable | n_missing | complete_rate | mean | count |
---|---|---|---|---|
Grade_Category | 603 | 0 | NaN | : |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|
student_id | 0 | 1.00 | 86069.54 | 10548.60 | 43146.00 | 85612.50 | 88340.00 | 92730.50 | 97441.00 | ▁▁▁▃▇ |
total_points_possible | 0 | 1.00 | 4274.41 | 2312.74 | 840.00 | 2809.50 | 3583.00 | 5069.00 | 15552.00 | ▇▅▂▁▁ |
total_points_earned | 0 | 1.00 | 3244.69 | 1832.00 | 651.00 | 2050.50 | 2757.00 | 3875.00 | 12208.00 | ▇▅▁▁▁ |
percentage_earned | 0 | 1.00 | 0.76 | 0.09 | 0.34 | 0.70 | 0.78 | 0.83 | 0.91 | ▁▁▃▇▇ |
FinalGradeCEMS | 30 | 0.95 | 77.20 | 22.23 | 0.00 | 71.25 | 84.57 | 92.10 | 100.00 | ▁▁▁▃▇ |
Points_Possible | 0 | 1.00 | 76.87 | 167.51 | 5.00 | 10.00 | 10.00 | 30.00 | 935.00 | ▇▁▁▁▁ |
Points_Earned | 92 | 0.85 | 68.63 | 145.26 | 0.00 | 7.00 | 10.00 | 26.12 | 828.20 | ▇▁▁▁▁ |
q1 | 123 | 0.80 | 4.30 | 0.68 | 1.00 | 4.00 | 4.00 | 5.00 | 5.00 | ▁▁▂▇▇ |
q2 | 126 | 0.79 | 3.63 | 0.93 | 1.00 | 3.00 | 4.00 | 4.00 | 5.00 | ▁▂▆▇▃ |
q3 | 123 | 0.80 | 3.33 | 0.91 | 1.00 | 3.00 | 3.00 | 4.00 | 5.00 | ▁▃▇▅▂ |
q4 | 125 | 0.79 | 4.27 | 0.85 | 1.00 | 4.00 | 4.00 | 5.00 | 5.00 | ▁▁▂▇▇ |
q5 | 127 | 0.79 | 4.19 | 0.68 | 2.00 | 4.00 | 4.00 | 5.00 | 5.00 | ▁▂▁▇▅ |
q6 | 127 | 0.79 | 4.01 | 0.80 | 1.00 | 4.00 | 4.00 | 5.00 | 5.00 | ▁▁▃▇▅ |
q7 | 129 | 0.79 | 3.91 | 0.82 | 1.00 | 3.00 | 4.00 | 4.75 | 5.00 | ▁▁▅▇▅ |
q8 | 129 | 0.79 | 4.29 | 0.68 | 1.00 | 4.00 | 4.00 | 5.00 | 5.00 | ▁▁▂▇▆ |
q9 | 129 | 0.79 | 3.49 | 0.98 | 1.00 | 3.00 | 4.00 | 4.00 | 5.00 | ▁▃▇▇▃ |
q10 | 129 | 0.79 | 4.10 | 0.93 | 1.00 | 4.00 | 4.00 | 5.00 | 5.00 | ▁▂▃▇▇ |
TimeSpent | 5 | 0.99 | 1799.75 | 1354.93 | 0.45 | 851.90 | 1550.91 | 2426.09 | 8870.88 | ▇▅▁▁▁ |
TimeSpent_hours | 5 | 0.99 | 30.00 | 22.58 | 0.01 | 14.20 | 25.85 | 40.43 | 147.85 | ▇▅▁▁▁ |
TimeSpent_std | 5 | 0.99 | 0.00 | 1.00 | -1.33 | -0.70 | -0.18 | 0.46 | 5.22 | ▇▅▁▁▁ |
int | 76 | 0.87 | 4.22 | 0.59 | 2.00 | 3.90 | 4.20 | 4.70 | 5.00 | ▁▁▃▇▇ |
pc | 75 | 0.88 | 3.61 | 0.64 | 1.50 | 3.00 | 3.50 | 4.00 | 5.00 | ▁▁▇▅▂ |
uv | 75 | 0.88 | 3.72 | 0.70 | 1.00 | 3.33 | 3.67 | 4.17 | 5.00 | ▁▁▆▇▅ |
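The `skim()` summary above reports missing values in several columns (for example, 5 in `TimeSpent_hours` and 30 in `FinalGradeCEMS`), which is where the "Removed 5 rows" warnings in the plots come from. The sketch below shows one possible way to count missing values per column and drop incomplete rows before plotting. It uses a small hypothetical data frame, `toy_classes`, whose values are invented for illustration; only the column names come from the data set.

```r
library(dplyr)

# Hypothetical rows; values are made up, column names match sci_online_classes.
toy_classes <- tibble(
  TimeSpent_hours   = c(30.5, NA, 12.0, 45.2),
  percentage_earned = c(0.78, 0.81, 0.65, 0.90),
  FinalGradeCEMS    = c(84.6, NA, 71.3, NA)
)

# Count missing values in every column.
missing_counts <- toy_classes %>%
  summarise(across(everything(), ~ sum(is.na(.x))))

# Keep only rows where the x-axis variable is present.
complete_rows <- toy_classes %>%
  filter(!is.na(TimeSpent_hours))

missing_counts
complete_rows
```

Filtering explicitly is a judgment call: it makes the dropped rows visible in the code rather than leaving them to a plotting warning.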
ggplot(sci_online_classes,
       aes(x = TimeSpent_hours,
           y = percentage_earned,
           color = FinalGradeCEMS)) +
  geom_point() +
  labs(title = "How Time Spent on Course LMS is Related to Percentage Earned in the Course",
       x = "Time Spent (Hours)",
       y = "Percentage of Points Earned")
## Warning: Removed 5 rows containing missing values (`geom_point()`).
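The Exercise 1 prompt asks for two categorical variables and a caption, while the plot above relates two continuous variables. As a hedged sketch of one way the prompt could be approached, the example below uses a dodged bar chart of `subject` by `Gender`. The data frame `toy_classes` and its values are hypothetical placeholders; only the two column names are taken from the data set, and the title and caption are illustrative.

```r
library(ggplot2)

# Hypothetical rows standing in for sci_online_classes.
toy_classes <- data.frame(
  subject = c("BioA", "BioA", "PhysA", "PhysA", "PhysA", "OcnA"),
  Gender  = c("F", "M", "F", "F", "M", "M")
)

# Count students in each subject, with side-by-side bars for each gender.
p_cat <- ggplot(toy_classes, aes(x = subject, fill = Gender)) +
  geom_bar(position = "dodge") +
  labs(
    title = "Enrollment by Subject and Gender",
    caption = "Do enrollment patterns by gender differ across subjects?",
    x = "Subject",
    y = "Number of Students"
  )

p_cat
```

Swapping `toy_classes` for `sci_online_classes` should give the same kind of chart for the full data.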
Exercise 2:
- Using the `sci-online-classes` data, create a basic visualization that:
  + Examines the relationship between two continuous variables (a scatterplot with layers, a log-log or line plot, or one using coord functions).
  + Has an appropriate title.
  + Has a caption that poses a question educators may have about this data that your visualization could help answer.
  + Adds or adjusts aesthetics to improve the readability or visual appeal of your viz.
  + Uses a color scale, if appropriate, to modify the default colors used by ggplot.
  + Adjusts or removes the legend as appropriate.
# YOUR FINAL CODE HERE
sci_online_classes %>%
  ggplot(aes(x = TimeSpent_hours)) +
  geom_histogram(bins = 10,
                 fill = "red",
                 colour = "black") +
  labs(title = "Time Spent on LMS histogram plot",
       x = "Time Spent (hours)",
       y = "Count") +
  theme_classic()
## Warning: Removed 5 rows containing non-finite values (`stat_bin()`).
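The Exercise 2 prompt asks for two continuous variables along with a caption, a color scale, and a legend adjustment, whereas the histogram above shows a single variable. The sketch below is one possible way to cover those items: a scatterplot of `TimeSpent_hours` against `FinalGradeCEMS` with a log-scaled x-axis. The data frame `toy_classes` and its values are hypothetical; the column names come from the data set, and the palette, theme, and caption are illustrative choices.

```r
library(ggplot2)

# Hypothetical rows standing in for sci_online_classes (one missing value).
toy_classes <- data.frame(
  TimeSpent_hours = c(2, 10, 25, 40, 80, NA),
  FinalGradeCEMS  = c(55, 70, 82, 90, 95, 60),
  subject         = c("BioA", "PhysA", "OcnA", "BioA", "PhysA", "OcnA")
)

p_scatter <- ggplot(toy_classes,
                    aes(x = TimeSpent_hours,
                        y = FinalGradeCEMS,
                        color = subject)) +
  geom_point(alpha = 0.7, na.rm = TRUE) +   # na.rm = TRUE drops missing rows silently
  scale_x_log10() +                         # log scale spreads out short time values
  scale_color_brewer(palette = "Dark2") +   # replace ggplot's default colors
  labs(
    title = "Final Grade by Time Spent in the LMS",
    caption = "Do students who spend more time in the LMS earn higher final grades?",
    x = "Time Spent (hours, log scale)",
    y = "Final Grade",
    color = "Subject"
  ) +
  theme_minimal() +
  theme(legend.position = "bottom")         # move the legend below the plot

p_scatter
```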
Congratulations, you’ve completed your Foundation Badge on Learning Analytics Workflow! Complete the following steps to submit your work for review:
Change the name of the `author:` in the YAML header at the very top of this document to your name. As noted in Reproducible Research in R, the YAML header controls the style and feel of knitted documents but doesn’t actually display in the final output.
Click the yarn icon above to “knit” your data product to an HTML file that will be saved in your R Project folder.
Commit your changes in GitHub Desktop and push them to your online GitHub repository.
Publish your HTML page to the web using one of the following publishing methods:
Publish on RPubs by clicking the “Publish” button located in the Viewer Pane when you knit your document. Note that you will need to quickly create an RPubs account.
Publish on GitHub using either GitHub Pages or the HTML previewer.
Post a new discussion on GitHub to our Foundations
Badges forum. In your post, include a link to your published web
page and write
a short reflection highlighting one thing
you learned from this lab and one thing you’d like to explore
further.