INTRODUCTION
My independent analysis for this week examines how a sample of K-8
teachers interpreted computational thinking (“CT”) before and after
participating in a CT-focused intervention. This work is part of a
larger research project that helps teachers develop an understanding of
CT knowledge and practices. Participants completed a pre-survey before
they began learning about CT and a post-survey at the end of the
training. This analysis looks at one open-ended question asked in both
surveys, which asks participants to define CT.
RESEARCH QUESTION
This analysis is guided by the following question: How does
participating teachers’ understanding of the definition of CT differ
between the beginning and the end of their participation in a
CT-focused intervention?
DATA
To answer the research question, I selected a sample of 51 teachers
and used their pre- and post-survey responses to the question “What is
Computational Thinking?” The data wrangling and cleaning process is not
presented here; instead, this analysis focuses on using n-grams and
data visualization to look at the differences between the participants’
responses before (“Time1”) and after (“Time2”) the training.
library(dplyr)
library(tidytext)
library(tidyverse)
library(tidyr)
library(ggplot2)
library(igraph)
library(ggraph)
CT_def <- read_csv("definition.csv")
## Rows: 102 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): Type, definition
## dbl (1): ID
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
ANALYSIS
In this section, I present the steps for analyzing the text data
using n-grams: splitting each response into bigrams, counting them, and
removing bigrams that contain stop words.
CT_bigrams <- CT_def %>% unnest_tokens(bigram,definition,token = "ngrams", n = 2)
CT_bigrams %>% count(bigram, sort = TRUE)
## # A tibble: 1,227 × 2
## bigram n
## <chr> <int>
## 1 computational thinking 62
## 2 thinking is 30
## 3 i think 29
## 4 thinking means 19
## 5 a way 18
## 6 problem solving 16
## 7 way of 16
## 8 a computer 14
## 9 is a 13
## 10 of thinking 13
## # … with 1,217 more rows
bigrams_separated <- CT_bigrams %>%
separate(bigram, c("word1","word2"), sep = " ")
bigrams_filtered <- bigrams_separated %>%
filter(!word1 %in% stop_words$word) %>%
filter(!word2 %in% stop_words$word)
bigram_counts <- bigrams_filtered %>%
count(word1, word2, sort = TRUE)
bigrams_united <- bigrams_filtered %>%
unite(bigram, word1, word2, sep = " ")
VISUALIZATION (Part 1)
In this section, I look at the five most distinctive bigrams at Time 1
and Time 2, ranked by weighted log odds. As shown in the graph below,
at Time 1 teachers did not offer very specific definitions of CT, and
their word combinations suggest that they were not very familiar with
the terms. In contrast, the Time 2 bar graph shows more varied word
combinations, with participants using keywords such as “critical
thinking” and “solving” much more frequently.
library(tidylo)  # provides bind_log_odds()

# Helper: faceted bar chart with bars ordered within each facet
facet_bar <- function(df, y, x, by, nrow = 2, ncol = 2, scales = "free") {
  mapping <- aes(y = reorder_within({{ y }}, {{ x }}, {{ by }}),
                 x = {{ x }},
                 fill = {{ by }})
  facet <- facet_wrap(vars({{ by }}),
                      nrow = nrow,
                      ncol = ncol,
                      scales = scales)
  ggplot(df, mapping = mapping) +
    geom_col(show.legend = FALSE) +
    scale_y_reordered() +
    facet +
    ylab("")
}

# Top five bigrams per time point, ranked by weighted log odds
graph <- bigrams_united %>%
  count(Type, bigram, sort = TRUE) %>%
  bind_log_odds(set = Type, feature = bigram, n = n) %>%
  group_by(Type) %>%
  top_n(5, log_odds_weighted) %>%
  ungroup() %>%
  facet_bar(y = bigram, x = log_odds_weighted, by = Type, nrow = 3)

VISUALIZATION (Part 2)
To look at how words related to one another in participants’
definitions of CT, I created word networks for Time 1 and Time 2. Each
network shows the bigrams that appear more than once after stop words
are removed. As shown in the graphs below, the two networks differ
noticeably.
Time 1 Results:
CT_def_T1 <- bigrams_filtered %>% filter(Type == "Time1")
bigram_counts_T1 <- CT_def_T1 %>%
count(word1, word2, sort = TRUE)
bigram_graph_T1 <- bigram_counts_T1 %>%
graph_from_data_frame()
bigram_graph_filtered_T1 <- bigram_counts_T1 %>%
filter(n > 1) %>%
graph_from_data_frame()
set.seed(588)
a <- grid::arrow(type = "open", length = unit(.2, "inches"))
ggraph(bigram_graph_filtered_T1, layout = "fr") +
geom_edge_link(aes(edge_alpha = n), show.legend = FALSE,
arrow = a, end_cap = circle(.07, 'inches')) +
geom_node_point(color = "red", size = 3) +
geom_node_text(aes(label = name), vjust = 1, hjust = 1) +
theme_void()
## Warning: Using the `size` aesthetic in this geom was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` in the `default_aes` field and elsewhere instead.

Time 2 Results:
CT_def_T2 <- bigrams_filtered %>% filter(Type == "Time2")
bigram_counts_T2 <- CT_def_T2 %>%
count(word1, word2, sort = TRUE)
bigram_graph_T2 <- bigram_counts_T2 %>%
graph_from_data_frame()
bigram_graph_filtered_T2 <- bigram_counts_T2 %>%
filter(n > 1) %>%
graph_from_data_frame()
set.seed(589)
a <- grid::arrow(type = "open", length = unit(.2, "inches"))
ggraph(bigram_graph_filtered_T2, layout = "fr") +
geom_edge_link(aes(edge_alpha = n), show.legend = FALSE,
arrow = a, end_cap = circle(.07, 'inches')) +
geom_node_point(color = "blue", size = 3) +
geom_node_text(aes(label = name), vjust = 1, hjust = 1) +
theme_void()

CONCLUSION
I often conduct qualitative analyses of open-ended questions, which
require considerable time and resources to establish a credible coding
scheme and results. In this analysis, I used n-grams and data
visualizations to examine differences in how teachers defined CT before
and after they participated in a CT-related intervention. The results
show that participants were better able to use CT terms and vocabulary
than they were before the intervention. However, more research is
needed to determine whether this understanding lasts over the long term
and whether participants can apply it in their teaching practice.
---
title: "Exploring K-8 Teachers’ Computational Thinking Understandings Before
  and After a CT-focused Intervention"
output:
  html_document:
    toc: yes
    toc_depth: 3
    toc_float: yes
    code_folding: hide
    code_download: yes
editor_options:
  markdown:
    wrap: 72
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

## INTRODUCTION

My independent analysis for this week examines how a sample of K-8 teachers interpreted computational thinking (“CT”) before and after participating in a CT-focused intervention. This work is part of a larger research project that helps teachers develop an understanding of CT knowledge and practices. Participants completed a pre-survey before they began learning about CT and a post-survey at the end of the training. This analysis looks at one open-ended question asked in both surveys, which asks participants to define CT.

## RESEARCH QUESTION

This analysis is guided by the following question: How does participating teachers’ understanding of the definition of CT differ between the beginning and the end of their participation in a CT-focused intervention?


## DATA

To answer the research question, I selected a sample of 51 teachers and used their pre- and post-survey responses to the question “What is Computational Thinking?” The data wrangling and cleaning process is not presented here; instead, this analysis focuses on using n-grams and data visualization to look at the differences between the participants’ responses before (“Time1”) and after (“Time2”) the training.

```{r load-packages, message=FALSE}
library(dplyr)
library(tidytext)
library(tidyverse)
library(tidyr)
library(ggplot2)
library(igraph)
library(ggraph)
```

```{r read-csv}
CT_def <- read_csv("definition.csv")
```

## ANALYSIS

In this section, I present the steps for analyzing the text data using n-grams: splitting each response into bigrams, counting them, and removing bigrams that contain stop words.
```{r tokenize}
CT_bigrams <- CT_def %>% unnest_tokens(bigram,definition,token = "ngrams", n = 2)
```
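
To make the tokenization step concrete, the sketch below applies the same `unnest_tokens()` call to a single made-up response. The `toy` tibble and its text are hypothetical and are not part of the survey data.

```{r toy-bigrams}
library(dplyr)
library(tibble)
library(tidytext)

# Hypothetical one-row data set with the same columns as CT_def
toy <- tibble(ID = 1, Type = "Time1",
              definition = "Thinking like a computer")

# The text is lowercased and split into overlapping two-word sequences
toy_bigrams <- toy %>%
  unnest_tokens(bigram, definition, token = "ngrams", n = 2)

toy_bigrams$bigram
```

A four-word response yields three bigrams, and each bigram keeps the `ID` and `Type` of the response it came from.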

```{r count-bigrams}
CT_bigrams %>%  count(bigram, sort = TRUE)
```

```{r remove-stop-words}
bigrams_separated <- CT_bigrams %>% 
          separate(bigram, c("word1","word2"), sep = " ")
bigrams_filtered <- bigrams_separated %>% 
          filter(!word1 %in% stop_words$word) %>% 
          filter(!word2 %in% stop_words$word)
bigram_counts <- bigrams_filtered %>% 
          count(word1, word2, sort = TRUE)
bigrams_united <- bigrams_filtered %>%
  unite(bigram, word1, word2, sep = " ")
```
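
The filtering above drops a bigram when either of its two words is a stop word. A minimal sketch of that rule on made-up bigrams (the `toy_pairs` tibble is hypothetical) might look like this:

```{r toy-stop-words}
library(dplyr)
library(tidyr)
library(tibble)
library(tidytext)

# Hypothetical bigrams, not taken from the survey responses
toy_pairs <- tibble(bigram = c("coding algorithms",
                               "the algorithms",
                               "algorithms of"))

toy_filtered <- toy_pairs %>%
  separate(bigram, c("word1", "word2"), sep = " ") %>%
  # Keep a bigram only when neither word is in the stop-word lexicon
  filter(!word1 %in% stop_words$word,
         !word2 %in% stop_words$word)

toy_filtered
```

Only "coding algorithms" remains, because "the" and "of" are both in the `stop_words` lexicon.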

## VISUALIZATION (Part 1)
In this section, I look at the five most distinctive bigrams at Time 1 and Time 2, ranked by weighted log odds. As shown in the graph below, at Time 1 teachers did not offer very specific definitions of CT, and their word combinations suggest that they were not very familiar with the terms. In contrast, the Time 2 bar graph shows more varied word combinations, with participants using keywords such as “critical thinking” and “solving” much more frequently.

```{r visual-T1-and-T2}
library(tidylo)  # provides bind_log_odds()

# Helper: faceted bar chart with bars ordered within each facet
facet_bar <- function(df, y, x, by, nrow = 2, ncol = 2, scales = "free") {
  mapping <- aes(y = reorder_within({{ y }}, {{ x }}, {{ by }}),
                 x = {{ x }},
                 fill = {{ by }})

  facet <- facet_wrap(vars({{ by }}),
                      nrow = nrow,
                      ncol = ncol,
                      scales = scales)

  ggplot(df, mapping = mapping) +
    geom_col(show.legend = FALSE) +
    scale_y_reordered() +
    facet +
    ylab("")
}

# Top five bigrams per time point, ranked by weighted log odds
graph <- bigrams_united %>%
  count(Type, bigram, sort = TRUE) %>%
  bind_log_odds(set = Type, feature = bigram, n = n) %>%
  group_by(Type) %>%
  top_n(5, log_odds_weighted) %>%
  ungroup() %>%
  facet_bar(y = bigram, x = log_odds_weighted, by = Type, nrow = 3)
```
![](graph.png){width="100%"}
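
The bar graph ranks bigrams by the `log_odds_weighted` column that `bind_log_odds()` adds. As rough intuition only, the sketch below computes a plain, unweighted log odds ratio by hand on made-up counts. This is a simplification and not the weighted estimate that the function reports; `toy_counts` and its values are hypothetical.

```{r toy-log-odds}
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical counts, not taken from the survey responses
toy_counts <- tribble(
  ~Type,   ~bigram,             ~n,
  "Time1", "computer science",   4,
  "Time1", "critical thinking",  1,
  "Time2", "computer science",   1,
  "Time2", "critical thinking",  4
)

toy_wide <- toy_counts %>%
  group_by(Type) %>%
  # Odds of a bigram within one time point: its count against all other counts
  mutate(odds = n / (sum(n) - n)) %>%
  ungroup() %>%
  select(-n) %>%
  pivot_wider(names_from = Type, values_from = odds) %>%
  # Simplified (unweighted) log odds ratio of Time2 against Time1
  mutate(log_odds_ratio = log(Time2 / Time1))

toy_wide
```

In this sketch, a positive value marks a bigram that is more typical of Time 2 and a negative value marks one that is more typical of Time 1.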

## VISUALIZATION (Part 2)
To look at how words related to one another in participants’ definitions of CT, I created word networks for Time 1 and Time 2. Each network shows the bigrams that appear more than once after stop words are removed. As shown in the graphs below, the two networks differ noticeably.

Time 1 Results:
```{r word-net-T1}
CT_def_T1 <- bigrams_filtered %>% filter(Type == "Time1")

bigram_counts_T1 <- CT_def_T1 %>% 
          count(word1, word2, sort = TRUE)
bigram_graph_T1 <- bigram_counts_T1 %>% 
          graph_from_data_frame()

bigram_graph_filtered_T1 <- bigram_counts_T1 %>%
          filter(n > 1) %>%
          graph_from_data_frame()

set.seed(588)
a <- grid::arrow(type = "open", length = unit(.2, "inches"))
ggraph(bigram_graph_filtered_T1, layout = "fr") +
          geom_edge_link(aes(edge_alpha = n), show.legend = FALSE,
                         arrow = a, end_cap = circle(.07, 'inches')) +
          geom_node_point(color = "red", size = 3) +
          geom_node_text(aes(label = name), vjust = 1, hjust = 1) +
          theme_void()
```

Time 2 Results:
```{r word-net-T2}
CT_def_T2 <- bigrams_filtered %>% filter(Type == "Time2")

bigram_counts_T2 <- CT_def_T2 %>% 
          count(word1, word2, sort = TRUE)
bigram_graph_T2 <- bigram_counts_T2 %>% 
          graph_from_data_frame()

bigram_graph_filtered_T2 <- bigram_counts_T2 %>%
          filter(n > 1) %>%
          graph_from_data_frame()

set.seed(589)
a <- grid::arrow(type = "open", length = unit(.2, "inches"))
ggraph(bigram_graph_filtered_T2, layout = "fr") +
          geom_edge_link(aes(edge_alpha = n), show.legend = FALSE,
                         arrow = a, end_cap = circle(.07, 'inches')) +
          geom_node_point(color = "blue", size = 3) +
          geom_node_text(aes(label = name), vjust = 1, hjust = 1) +
          theme_void()
```
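
For readers unfamiliar with how a table of bigram counts becomes a network, the sketch below runs `graph_from_data_frame()` on a made-up edge list. The `toy_edges` tibble is hypothetical. The first two columns are read as the start and end of each edge, and the remaining column is stored as an edge attribute.

```{r toy-network}
library(igraph)
library(tibble)

# Hypothetical bigram counts, not taken from the survey responses
toy_edges <- tibble(
  word1 = c("critical", "computational", "solving"),
  word2 = c("thinking", "thinking", "skills"),
  n     = c(3, 5, 2)
)

# Each distinct word becomes a node; each bigram becomes a directed edge
toy_graph <- graph_from_data_frame(toy_edges)

vcount(toy_graph)  # number of distinct words (nodes)
ecount(toy_graph)  # number of bigrams (edges)
E(toy_graph)$n     # counts carried along as an edge attribute
```

The `n` attribute is what `aes(edge_alpha = n)` uses in the plots above, so more frequent bigrams are drawn with darker arrows.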

## CONCLUSION
I often conduct qualitative analyses of open-ended questions, which require considerable time and resources to establish a credible coding scheme and results. In this analysis, I used n-grams and data visualizations to examine differences in how teachers defined CT before and after they participated in a CT-related intervention. The results show that participants were better able to use CT terms and vocabulary than they were before the intervention. However, more research is needed to determine whether this understanding lasts over the long term and whether participants can apply it in their teaching practice.