---
title: "Exploring K-8 Teachers’ Computational Thinking Understandings Before
  and After a CT-focused Intervention"
output:
  html_document:
    toc: yes
    toc_depth: 3
    toc_float: yes
    code_folding: hide
    code_download: yes
editor_options:
  markdown:
    wrap: 72
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

## INTRODUCTION

My independent analysis for this week examines how a sample of K-8 teachers interpreted computational thinking (“CT”) before and after participating in a CT-focused intervention. This work is part of a larger research project that helps teachers develop an understanding of CT knowledge and practices. Participants completed a pre-survey before they began learning about CT and a post-survey at the end of the training. This analysis looks at one open-ended question asked in both surveys, which asks participants to define CT.

## RESEARCH QUESTION

This analysis is guided by the following question: How does participating teachers’ understanding of the definition of CT differ between the beginning and the end of their participation in a CT-focused intervention?


## DATA

To answer the research question, I selected a sample of 51 teachers and used their pre- and post-survey responses to the question “What is Computational Thinking?” The data wrangling and cleaning process is not presented here; instead, this analysis uses n-grams and data visualization to compare participants’ responses before (“Time1”) and after (“Time2”) the training.

```{r load-packages, message=FALSE}
library(dplyr)
library(tidytext)
library(tidyverse)
library(tidyr)
library(ggplot2)
library(igraph)
library(ggraph)
```

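```{r read-csv}
# One row per response: 102 rows (51 teachers x 2 time points) and
# 3 columns: ID (dbl), Type (chr), definition (chr)
CT_def <- read_csv("definition.csv")
```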
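
Because the comparison depends on each teacher having both a Time1 and a Time2 response, a quick pairing check can be useful before tokenizing. The sketch below runs such a check on a small made-up table; `toy_def` and its contents are hypothetical stand-ins that only mirror the `ID`, `Type`, and `definition` columns of `definition.csv`.

```{r toy-pairing-check}
library(dplyr)

# Hypothetical stand-in for CT_def (made-up responses, not survey data)
toy_def <- data.frame(
  ID = c(1, 1, 2, 2, 3),
  Type = c("Time1", "Time2", "Time1", "Time2", "Time1"),
  definition = c("using computers", "breaking problems into steps",
                 "not sure", "a way of solving problems", "coding"),
  stringsAsFactors = FALSE
)

# Count the distinct time points per teacher and keep the IDs
# that do not have both a Time1 and a Time2 response
incomplete_ids <- toy_def %>%
  group_by(ID) %>%
  summarise(n_times = n_distinct(Type), .groups = "drop") %>%
  filter(n_times < 2) %>%
  pull(ID)

incomplete_ids
```

In this toy table only teacher 3 lacks a Time2 response, so only that ID is returned. The same pipeline could be pointed at `CT_def`, assuming its `Type` column contains only the two labels used in this analysis.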

## ANALYSIS

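In this section, I present the steps for analyzing the text data using n-grams. First, each response is split into bigrams (overlapping pairs of consecutive words).

```{r tokenize}
# Split each definition into bigrams; text is lowercased by default
CT_bigrams <- CT_def %>%
          unnest_tokens(bigram, definition, token = "ngrams", n = 2)
```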
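
To make the tokenization step concrete, here is a minimal sketch on a single made-up response. The object names `toy_response` and `toy_bigrams` and the sentence itself are illustrative assumptions, not part of the survey data.

```{r toy-bigrams}
library(dplyr)
library(tidytext)

# One hypothetical five-word response
toy_response <- data.frame(
  ID = 1,
  definition = "Computational thinking is problem solving",
  stringsAsFactors = FALSE
)

# The same call used on CT_def: n = 2 yields overlapping word pairs
toy_bigrams <- toy_response %>%
  unnest_tokens(bigram, definition, token = "ngrams", n = 2)

toy_bigrams$bigram
```

A five-word response yields four overlapping bigrams, each starting one word later than the previous one, and all of them are lowercased.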

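```{r count-bigrams}
# Most frequent raw bigrams, stop words included. In the rendered output the
# top five of 1,227 distinct bigrams were: computational thinking (62),
# thinking is (30), i think (29), thinking means (19), a way (18)
CT_bigrams %>% count(bigram, sort = TRUE)
```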

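```{r remove stop words}
# Split each bigram into its two words so each can be checked separately
bigrams_separated <- CT_bigrams %>% 
          separate(bigram, c("word1","word2"), sep = " ")
# Drop any bigram in which either word is a stop word
bigrams_filtered <- bigrams_separated %>% 
          filter(!word1 %in% stop_words$word) %>% 
          filter(!word2 %in% stop_words$word)
# Count the remaining word pairs across both time points
bigram_counts <- bigrams_filtered %>% 
          count(word1, word2, sort = TRUE)
# Re-join the two words into a single bigram column for plotting
bigrams_united <- bigrams_filtered %>%
          unite(bigram, word1, word2, sep = " ")
```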

## VISUALIZATION (Part 1)
In this section, I look at the five bigrams with the highest weighted log odds at Time 1 and at Time 2, that is, the bigrams most characteristic of each time point. As shown in the graph below, at Time 1 teachers did not offer very specific definitions of CT, and their word combinations suggest that they were not very familiar with CT terms. In contrast, the Time 2 bar graph shows more varied word combinations, with participants using keywords such as “critical thinking” and “solving” much more frequently.

```{r visual_T1 and T2}
# Helper: horizontal bar chart of `x` by `y`, one facet per level of `by`
facet_bar <- function(df, y, x, by, nrow = 2, ncol = 2, scales = "free") {
          mapping <- aes(y = reorder_within({{ y }}, {{ x }}, {{ by }}), 
                         x = {{ x }}, 
                         fill = {{ by }})
          
          facet <- facet_wrap(vars({{ by }}), 
                              nrow = nrow, 
                              ncol = ncol,
                              scales = scales) 
          
          ggplot(df, mapping = mapping) + 
                    geom_col(show.legend = FALSE) + 
                    scale_y_reordered() + 
                    facet + 
                    ylab("")
}

# Five bigrams with the highest weighted log odds at each time point.
# bind_log_odds() comes from the tidylo package, which is not attached above,
# so it is called with its namespace prefix.
graph <- bigrams_united %>% 
          count(Type, bigram, sort = TRUE) %>% 
          tidylo::bind_log_odds(set = Type, feature = bigram, n = n) %>% 
          group_by(Type) %>% 
          slice_max(log_odds_weighted, n = 5) %>% 
          ungroup() %>%
          facet_bar(y = bigram, x = log_odds_weighted, by = Type, nrow = 3)
```
![](graph.png){width="100%"}
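
For readers unfamiliar with log odds, the idea behind the ranking can be sketched by hand. The example below is a deliberately simplified stand-in: it computes an add-one-smoothed log ratio of each bigram's share at Time2 versus Time1, which is not the weighted formula that tidylo uses. The table `toy_counts` and its counts are made up for illustration.

```{r toy-log-ratio}
library(dplyr)
library(tidyr)

# Hypothetical bigram counts per time point (not results from the survey)
toy_counts <- data.frame(
  Type = c("Time1", "Time1", "Time2", "Time2"),
  bigram = c("computer science", "critical thinking",
             "computer science", "critical thinking"),
  n = c(8, 2, 3, 9),
  stringsAsFactors = FALSE
)

toy_log_ratio <- toy_counts %>%
  # One row per bigram, one count column per time point
  pivot_wider(names_from = Type, values_from = n, values_fill = 0) %>%
  mutate(
    # Smoothed share of each bigram within its time point
    share_t1 = (Time1 + 1) / (sum(Time1) + 1),
    share_t2 = (Time2 + 1) / (sum(Time2) + 1),
    # Positive values: relatively more typical of Time2
    log_ratio_t2_vs_t1 = log(share_t2 / share_t1)
  )

toy_log_ratio
```

A positive value marks a bigram that is relatively more typical of Time2, and a negative value one that is more typical of Time1. The weighted version used in the chart additionally accounts for how much evidence each count provides, so its values will differ from this sketch.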

## VISUALIZATION (Part 2)
To see how the words in participants’ definitions of CT relate to one another, I created word networks for Time 1 and Time 2. Each network keeps only the bigrams that occur more than once after stop-word removal, and each arrow points from the first word of a bigram to the second. As shown in the graphs below, the two networks differ noticeably.

Time 1 Results:
```{r word net_T1}
# Keep only the stop-word-filtered bigrams from the pre-survey
CT_def_T1 <- bigrams_filtered %>% filter(Type == "Time1")

# Count each word pair; the counts become the edge weights
bigram_counts_T1 <- CT_def_T1 %>% 
          count(word1, word2, sort = TRUE)
# Full network with every word pair (not plotted below)
bigram_graph_T1 <- bigram_counts_T1 %>% 
          graph_from_data_frame()

# Plotted network: only word pairs that occur more than once
bigram_graph_filtered_T1 <- bigram_counts_T1 %>%
          filter(n > 1) %>%
          graph_from_data_frame()

# Fix the seed so the force-directed ("fr") layout is reproducible
set.seed(588)
a <- grid::arrow(type = "open", length = unit(.2, "inches"))
ggraph(bigram_graph_filtered_T1, layout = "fr") +
          geom_edge_link(aes(edge_alpha = n), show.legend = FALSE,
                         arrow = a, end_cap = circle(.07, 'inches')) +
          geom_node_point(color = "red", size = 3) +
          geom_node_text(aes(label = name), vjust = 1, hjust = 1) +
          theme_void()
```

Time 2 Results:
```{r word net_T2}
# Keep only the stop-word-filtered bigrams from the post-survey
CT_def_T2 <- bigrams_filtered %>% filter(Type == "Time2")

# Count each word pair; the counts become the edge weights
bigram_counts_T2 <- CT_def_T2 %>% 
          count(word1, word2, sort = TRUE)
# Full network with every word pair (not plotted below)
bigram_graph_T2 <- bigram_counts_T2 %>% 
          graph_from_data_frame()

# Plotted network: only word pairs that occur more than once
bigram_graph_filtered_T2 <- bigram_counts_T2 %>%
          filter(n > 1) %>%
          graph_from_data_frame()

# Fix the seed so the force-directed ("fr") layout is reproducible
set.seed(589)
a <- grid::arrow(type = "open", length = unit(.2, "inches"))
ggraph(bigram_graph_filtered_T2, layout = "fr") +
          geom_edge_link(aes(edge_alpha = n), show.legend = FALSE,
                         arrow = a, end_cap = circle(.07, 'inches')) +
          geom_node_point(color = "blue", size = 3) +
          geom_node_text(aes(label = name), vjust = 1, hjust = 1) +
          theme_void()
```
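
One way to make the differences between the two networks explicit would be to list the word pairs that appear in only one of them. The sketch below does this with `anti_join()` on two made-up edge tables; `toy_edges_t1` and `toy_edges_t2` are hypothetical and only mimic the `word1`, `word2`, `n` layout of `bigram_counts_T1` and `bigram_counts_T2`.

```{r toy-network-diff}
library(dplyr)

# Hypothetical edge tables (made-up word pairs and counts)
toy_edges_t1 <- data.frame(
  word1 = c("computer", "math"),
  word2 = c("science", "skills"),
  n = c(4, 2),
  stringsAsFactors = FALSE
)
toy_edges_t2 <- data.frame(
  word1 = c("computer", "critical"),
  word2 = c("science", "thinking"),
  n = c(3, 5),
  stringsAsFactors = FALSE
)

# Word pairs present at Time2 but absent at Time1
new_at_t2 <- anti_join(toy_edges_t2, toy_edges_t1, by = c("word1", "word2"))

# Word pairs present at Time1 but absent at Time2
dropped_at_t2 <- anti_join(toy_edges_t1, toy_edges_t2, by = c("word1", "word2"))

new_at_t2
dropped_at_t2
```

Applied to the real count tables, after the same `filter(n > 1)` used for the plots, this would give a short list of word pairs that entered or left participants' definitions, which could complement a visual comparison of the two graphs.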

## CONCLUSION

I often conduct qualitative analysis on open-ended questions, which often requires considerable time and resources to establish a credible coding scheme and results. In this analysis, I used n-grams and data visualizations to examine differences in how teachers defined CT before and after they participated in a CT-related intervention. The results suggest that participants used more CT-specific terms and vocabulary after the intervention than before it. However, more research is needed to determine whether this understanding lasts over the long term and whether participants are able to apply it in their teaching practice.