1 Introduction

This is a quick attempt to build a directed, network-based model of the relationships between modal verbs in double modal (DM) constructions, based on Twitter data and using Markov models. The basic idea is to visualise relationships between pairs of modals according to which modals tend to combine.

2 Packages

library(markovchain)

## Package:  markovchain
## Version:  0.8.5-2
## Date:     2020-09-07
## BugReport: https://github.com/spedygiorgio/markovchain/issues

library(igraph)

## 
## Attaching package: 'igraph'

## The following objects are masked from 'package:stats':
## 
##     decompose, spectrum

## The following object is masked from 'package:base':
## 
##     union

library(papeR)

## Loading required package: car

## Loading required package: carData

## Loading required package: xtable

## Registered S3 method overwritten by 'papeR':
##   method    from
##   Anova.lme car

## 
## Attaching package: 'papeR'

## The following object is masked from 'package:utils':
## 
##     toLatex

3 Data

The data is just a frequency matrix, with individual modal verbs as rows and columns, where each cell gives the frequency of the DM formed by combining the row modal in first position with the column modal in second position.

DM <- as.data.frame(read.table("DM_FREQ.csv", sep=",", header=TRUE, row.names=1))
DM

4 Network Visualisation

So now we want to make a network graph that shows the relationships between the 11 modals. I think we probably want magnitudes reflected here, so we don’t want to row-normalise or make a transition matrix, for example, especially given how uneven the distribution is.

DM <- as.matrix(DM)
IG <- graph_from_adjacency_matrix(DM, weighted=TRUE, mode = "directed")
IG

## IGRAPH 27cac52 DNW- 11 76 -- 
## + attr: name (v/c), weight (e/n)
## + edges from 27cac52 (vertex names):
##  [1] can  ->could    can  ->may      can  ->might    can  ->must    
##  [5] can  ->shall    can  ->should   can  ->will     can  ->would   
##  [9] could->can      could->may      could->might    could->should  
## [13] could->will     could->would    may  ->can      may  ->could   
## [17] may  ->might    may  ->must     may  ->ought_to may  ->shall   
## [21] may  ->should   may  ->used_to  may  ->will     may  ->would   
## [25] might->can      might->could    might->may      might->must    
## [29] might->ought_to might->shall    might->should   might->used_to 
## + ... omitted several edges

plot(IG)
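
To see what row normalisation would hide, here is a minimal sketch that turns counts into a transition matrix with base R's prop.table(). The matrix M is only an illustrative four-modal subset of the trimmed counts printed in this section (cells below 20 are zero), not the full DM_FREQ.csv data, and the names M and P are made up for this example.

```r
# Illustrative subset of the DM counts (rows = first modal, columns = second modal).
# Assumption: only four modals are kept, so row totals differ from the full data.
modals <- c("can", "could", "might", "would")
M <- matrix(c(   0, 135,  0,   0,
                60,   0, 20,  41,
              1733, 932,  0, 243,
                 0, 171, 29,   0),
            nrow = 4, byrow = TRUE, dimnames = list(modals, modals))

# Row-normalise: each row becomes P(second modal | first modal) and sums to 1.
P <- prop.table(M, margin = 1)
round(P, 3)

# Raw counts: might->can is far more frequent than can->could ...
M["might", "can"]
M["can", "could"]

# ... but after normalisation can->could gets the larger value,
# because 'can' has only one outgoing link in this subset.
P["might", "can"]
P["can", "could"]
```

In this subset the rare can->could link ends up with a higher transition probability than the very frequent might->can link, which is the loss of magnitude we want to avoid in the plot.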

Lots of infrequent links here so we can trim and re-plot…

DM.trim <- DM
DM.trim[DM.trim < 20] <- 0
DM.trim

##           can could may might must ought_to shall should used_to will would
## can         0   135  21     0    0        0     0     20       0  111     0
## could      60     0   0    20    0        0     0      0       0    0    41
## may       273    53   0     0    0        0     0      0       0   22     0
## might    1733   932   0     0    0       32     0    146       0   52   243
## must      164     0   0     0    0        0     0      0       0    0     0
## ought_to    0     0   0     0    0        0     0      0       0    0     0
## shall       0     0   0     0    0        0     0      0       0    0     0
## should      0    33   0     0    0        0     0      0       0    0   177
## used_to     0   144   0     0    0        0     0      0       0    0    20
## will      111    20  26    59    0        0    21     23       0    0    80
## would       0   171   0    29    0        0     0     81      40   52     0

IG <- graph_from_adjacency_matrix(DM.trim, weighted=TRUE, mode = "directed")
IG

## IGRAPH 35a2ebc DNW- 11 33 -- 
## + attr: name (v/c), weight (e/n)
## + edges from 35a2ebc (vertex names):
##  [1] can    ->could    can    ->may      can    ->should   can    ->will    
##  [5] could  ->can      could  ->might    could  ->would    may    ->can     
##  [9] may    ->could    may    ->will     might  ->can      might  ->could   
## [13] might  ->ought_to might  ->should   might  ->will     might  ->would   
## [17] must   ->can      should ->could    should ->would    used_to->could   
## [21] used_to->would    will   ->can      will   ->could    will   ->may     
## [25] will   ->might    will   ->shall    will   ->should   will   ->would   
## [29] would  ->could    would  ->might    would  ->should   would  ->used_to 
## + ... omitted several edges

plot(IG)
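
The cutoff of 20 is fairly arbitrary. One possible way to compare cutoffs before plotting is to count how many links survive and how many modals are left with no links at all. This is only a sketch: M is again an illustrative four-modal subset of the trimmed counts printed above, and the cutoff values and the names cutoffs, n_edges, n_isolated and cutoff_tab are assumptions made for the example.

```r
# Illustrative subset of the DM counts (rows = first modal, columns = second modal).
modals <- c("can", "could", "might", "would")
M <- matrix(c(   0, 135,  0,   0,
                60,   0, 20,  41,
              1733, 932,  0, 243,
                 0, 171, 29,   0),
            nrow = 4, byrow = TRUE, dimnames = list(modals, modals))

# Candidate cutoffs (arbitrary values for illustration).
cutoffs <- c(20, 50, 200, 1000)

# Number of directed links that survive each cutoff.
n_edges <- sapply(cutoffs, function(k) sum(M >= k))

# Number of modals left with no incoming and no outgoing link.
n_isolated <- sapply(cutoffs, function(k) {
  A <- M >= k
  sum(rowSums(A) == 0 & colSums(A) == 0)
})

cutoff_tab <- data.frame(cutoff = cutoffs, edges = n_edges, isolated = n_isolated)
cutoff_tab
```

A cutoff at which the isolated column first becomes non-zero is where the plot starts to pick up disconnected nodes.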

If we trim too much, though, so that some modals have no links, then the graph kinda breaks down…

DM.trim[DM.trim < 50] <- 0
DM.trim

##           can could may might must ought_to shall should used_to will would
## can         0   135   0     0    0        0     0      0       0  111     0
## could      60     0   0     0    0        0     0      0       0    0     0
## may       273    53   0     0    0        0     0      0       0    0     0
## might    1733   932   0     0    0        0     0    146       0   52   243
## must      164     0   0     0    0        0     0      0       0    0     0
## ought_to    0     0   0     0    0        0     0      0       0    0     0
## shall       0     0   0     0    0        0     0      0       0    0     0
## should      0     0   0     0    0        0     0      0       0    0   177
## used_to     0   144   0     0    0        0     0      0       0    0     0
## will      111     0   0    59    0        0     0      0       0    0    80
## would       0   171   0     0    0        0     0     81       0   52     0

IG <- graph_from_adjacency_matrix(DM.trim, weighted=TRUE, mode = "directed")
IG

## IGRAPH e3b1a6a DNW- 11 19 -- 
## + attr: name (v/c), weight (e/n)
## + edges from e3b1a6a (vertex names):
##  [1] can    ->could  can    ->will   could  ->can    may    ->can   
##  [5] may    ->could  might  ->can    might  ->could  might  ->should
##  [9] might  ->will   might  ->would  must   ->can    should ->would 
## [13] used_to->could  will   ->can    will   ->might  will   ->would 
## [17] would  ->could  would  ->should would  ->will

plot(IG)

Not totally sure how to easily remove variables with all zeroes in rows AND columns, so just remove manually and rerun…

# Use %in% rather than != here: comparing against a length-2 vector recycles it,
# which triggers a warning and only drops the right rows by coincidence.
isolated <- c("shall", "ought_to")
keep <- !(rownames(DM.trim) %in% isolated)
DM.trim <- DM.trim[keep, keep]

IG <- graph_from_adjacency_matrix(DM.trim, weighted=TRUE, mode = "directed")
IG

## IGRAPH 022044b DNW- 9 19 -- 
## + attr: name (v/c), weight (e/n)
## + edges from 022044b (vertex names):
##  [1] can    ->could  can    ->will   could  ->can    may    ->can   
##  [5] may    ->could  might  ->can    might  ->could  might  ->should
##  [9] might  ->will   might  ->would  must   ->can    should ->would 
## [13] used_to->could  will   ->can    will   ->might  will   ->would 
## [17] would  ->could  would  ->should would  ->will

plot(IG)
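
The manual removal above could probably also be done programmatically: a modal is isolated when both its row and its column sum to zero. A minimal base R sketch, assuming non-negative counts; M is an illustrative five-modal subset of the 50-trimmed table (with shall as the isolated modal), and connected and M.conn are names made up for the example.

```r
# Illustrative subset of the trimmed counts; 'shall' has no links in either direction.
modals <- c("can", "could", "might", "shall", "would")
M <- matrix(c(   0, 135, 0, 0,   0,
                60,   0, 0, 0,   0,
              1733, 932, 0, 0, 243,
                 0,   0, 0, 0,   0,
                 0, 171, 0, 0,   0),
            nrow = 5, byrow = TRUE, dimnames = list(modals, modals))

# A modal is connected if it has at least one outgoing (row) or incoming (column) link.
# Assumption: all counts are >= 0, so a zero sum means every cell is zero.
connected <- rowSums(M) > 0 | colSums(M) > 0

# Keep the same modals in rows and columns so the matrix stays square.
M.conn <- M[connected, connected, drop = FALSE]
M.conn
```

On an existing igraph object, something like igraph::delete_vertices(IG, which(igraph::degree(IG) == 0)) should give the same result without rebuilding the matrix.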

Looking pretty good now, but we can clean up…

par(mar=c(0,0,0,0))
plot.igraph(IG, 
              vertex.size = 24,
              vertex.color="greenyellow",
              vertex.shape="circle",
              vertex.frame.color="greenyellow",
              vertex.frame.lwd=1,
              vertex.label.color="black",
              vertex.label.family="sans",
              vertex.label.cex=.5,
              edge.color="black",
              edge.width=.8,
              edge.arrow.size=.5)

We could also figure out how to scale the arrows and the nodes based on magnitude.
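
One way this might be done is to map edge weights to edge.width and each modal's total frequency to vertex.size. The sketch below is not tuned for the real data: M is an illustrative four-modal subset of the trimmed counts, the square-root scaling and the constants are arbitrary choices, and node_strength, node_size and edge_width are names made up for the example. The plotting part only runs if igraph is installed.

```r
# Illustrative subset of the DM counts (rows = first modal, columns = second modal).
modals <- c("can", "could", "might", "would")
M <- matrix(c(   0, 135,  0,   0,
                60,   0, 20,  41,
              1733, 932,  0, 243,
                 0, 171, 29,   0),
            nrow = 4, byrow = TRUE, dimnames = list(modals, modals))

# Node strength: total frequency of the DMs a modal takes part in, in either position.
node_strength <- rowSums(M) + colSums(M)

# Square-root scaling so the most frequent modal does not swamp everything else.
# The constants 10 and 20 are arbitrary choices for this sketch.
node_size <- 10 + 20 * sqrt(node_strength / max(node_strength))

if (requireNamespace("igraph", quietly = TRUE)) {
  g <- igraph::graph_from_adjacency_matrix(M, weighted = TRUE, mode = "directed")

  # Edge weights come back in the graph's own edge order, so they line up with the edges.
  w <- igraph::E(g)$weight
  edge_width <- 0.5 + 4 * sqrt(w / max(w))

  plot(g,
       vertex.size = unname(node_size),
       vertex.label.cex = 0.7,
       edge.width = edge_width,
       edge.arrow.size = 0.5)
}
```

A log scale would be another option if the square root still leaves the might links too dominant.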

And we could also use chord diagrams. They look cool, but they are harder to read, especially the directions.

library(circlize)

## ========================================
## circlize version 0.4.10
## CRAN page: https://cran.r-project.org/package=circlize
## Github page: https://github.com/jokergoo/circlize
## Documentation: https://jokergoo.github.io/circlize_book/book/
## 
## If you use it in published research, please cite:
## Gu, Z. circlize implements and enhances circular visualization
##   in R. Bioinformatics 2014.
## 
## This message can be suppressed by:
##   suppressPackageStartupMessages(library(circlize))
## ========================================

## 
## Attaching package: 'circlize'

## The following object is masked from 'package:igraph':
## 
##     degree

chordDiagram(DM)

chordDiagram(DM.trim)
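
The direction problem might be eased with the directional arguments of chordDiagram(), which draw links from row sectors to column sectors. This is a hedged sketch, not a tested recipe for the full data: M is an illustrative four-modal subset of the trimmed counts, and the call only runs if circlize is installed.

```r
# Illustrative subset of the DM counts (rows = first modal, columns = second modal).
modals <- c("can", "could", "might", "would")
M <- matrix(c(   0, 135,  0,   0,
                60,   0, 20,  41,
              1733, 932,  0, 243,
                 0, 171, 29,   0),
            nrow = 4, byrow = TRUE, dimnames = list(modals, modals))

if (requireNamespace("circlize", quietly = TRUE)) {
  # directional = 1: links run from rows (first modal) to columns (second modal).
  # direction.type adds a height offset at the link ends plus arrows to mark direction.
  circlize::chordDiagram(M,
                         directional = 1,
                         direction.type = c("diffHeight", "arrows"),
                         link.arr.type = "big.arrow")

  # Reset the circular layout so later plots start clean.
  circlize::circos.clear()
}
```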

DM Network Analysis Attempt

Jack Grieve

27/10/2020
