Networks Unlimited

Generate an unlimited number of networks based on a distance threshold - questions, math and examples

Hsun-Yi Hsieh

Generating networks for data analysis

A network is composed of nodes and edges. Given information about its nodes and edges, we can compute various properties of a network.

However, there are times when we do not know whether there is an edge between two nodes. One option is to generate as many candidate networks as we can and test which one makes the most sense for the case at hand.

This sounds like a daunting task. With a computer and some simple math, however, it is quite doable.

The purpose of this Shiny APP

I created this Shiny APP so that users can generate as many networks as they like, based on the 'distance' between two nodes.

Hopefully this APP will be accompanied by an R package, ulnet, which stands for 'unlimited networks'. Or, at least, I can add the functions to an existing R package.

Shiny APP example

The Shiny APP uses the spatial data of an ant species as an example. Each node on the plot is an ant nest with its own coordinates on the plot. Each pair of nodes is separated by a specific distance.

Link to the old version of the Shiny APP

The user can use the slider on the left to set the 'critical distance', measured in meters, and generate a network. Two nests belong to the same compartment if the distance between them is less than this 'critical distance'. Nests in the same compartment are shown in the same color on the plot.
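The thresholding rule can be sketched in base R, under the assumption that a 'compartment' is a connected component of the thresholded network. The nest coordinates, the function names `threshold_network` and `compartments`, and the 5 m to 20 m range are hypothetical, chosen only for illustration; this is not the APP's actual code.

```r
# Hypothetical nest coordinates in meters (not the ant data from the slides)
nests <- data.frame(
  id = paste0("n", 1:6),
  x  = c(0, 3, 6, 30, 33, 60),
  y  = c(0, 4, 0, 0, 4, 60)
)

# Adjacency matrix: two nests are linked when their Euclidean distance
# is less than the critical distance.
threshold_network <- function(coords, critical_distance) {
  d <- as.matrix(dist(coords[, c("x", "y")]))
  adj <- d < critical_distance
  diag(adj) <- FALSE
  dimnames(adj) <- list(coords$id, coords$id)
  adj
}

# Label compartments (assumed to be connected components) with a
# breadth-first search over the adjacency matrix.
compartments <- function(adj) {
  n <- nrow(adj)
  label <- rep(NA_integer_, n)
  current <- 0L
  for (start in seq_len(n)) {
    if (!is.na(label[start])) next
    current <- current + 1L
    label[start] <- current
    queue <- start
    while (length(queue) > 0) {
      node <- queue[1]
      queue <- queue[-1]
      neighbours <- which(adj[node, ] & is.na(label))
      label[neighbours] <- current
      queue <- c(queue, neighbours)
    }
  }
  setNames(label, rownames(adj))
}

# One network per candidate distance, from 5 m to 20 m in 5 m steps
distances <- seq(5, 20, by = 5)
networks <- lapply(distances, function(d) threshold_network(nests, d))

# Compartment membership at a critical distance of 10 m
membership <- compartments(networks[[2]])
```

With these made-up coordinates, a 10 m critical distance gives three compartments: n1 to n3, n4 and n5, and the isolated n6. At 5 m there are no edges at all, because the closest nests are exactly 5 m apart and the rule is a strict 'less than'.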

Five Features of the updated Shiny APP

(1) User data input in the Shiny APP
(2) User set-up of the distance range, interval and threshold
(3) Exclusion of nodes located within the threshold distance of the plot margins
(4) A spatial network with node information
(5) Zooming (in and out) accompanied by node information
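Feature (3) could be implemented roughly as follows. This is a minimal sketch that assumes a rectangular plot with known bounds; `exclude_margin_nodes`, `xlim`, `ylim` and the coordinates are hypothetical names and values, not the APP's actual code.

```r
# Drop nodes that lie within `threshold` meters of the plot margins.
# `xlim` and `ylim` are the assumed bounds of a rectangular plot.
exclude_margin_nodes <- function(coords, threshold, xlim, ylim) {
  inside <- coords$x >= xlim[1] + threshold &
            coords$x <= xlim[2] - threshold &
            coords$y >= ylim[1] + threshold &
            coords$y <= ylim[2] - threshold
  coords[inside, , drop = FALSE]
}

# Hypothetical 100 m x 100 m plot with four nests
nests <- data.frame(id = c("a", "b", "c", "d"),
                    x  = c(2, 50, 97, 40),
                    y  = c(50, 50, 50, 8))

kept <- exclude_margin_nodes(nests, threshold = 10,
                             xlim = c(0, 100), ylim = c(0, 100))
```

Here only nest "b" is kept: "a" and "c" are too close to the left and right margins, and "d" is too close to the bottom margin.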

Open-source

(1) Source code will be available on GitHub
(2) Hopefully I will make an R package to hold the functions

Math

The application of the fundamental matrix of the Markov chain.

\[ S = I + T + T^2 + \cdots \]
\[ S - I = T + T^2 + \cdots = TS \]
\[ (I - T)S = I \]
\[ S = (I - T)^{-1} \]
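As a quick numerical check of this identity, the sketch below uses a made-up 3 x 3 matrix T whose rows sum to less than 1, so that the series converges (the derivation relies on this). The matrix values and variable names are illustrative only.

```r
# Hypothetical matrix T; every row sums to less than 1,
# so the series I + T + T^2 + ... converges.
T_mat <- matrix(c(0.0, 0.5, 0.2,
                  0.3, 0.0, 0.4,
                  0.1, 0.2, 0.0),
                nrow = 3, byrow = TRUE)
I_mat <- diag(3)

# Closed form: S = (I - T)^{-1}
S <- solve(I_mat - T_mat)

# Partial sum of the series, for comparison
S_series <- I_mat
term <- I_mat
for (k in 1:200) {
  term <- term %*% T_mat          # term is now T^k
  S_series <- S_series + term
}

# Both differences should be numerically zero
series_gap   <- max(abs(S - S_series))
identity_gap <- max(abs((S - I_mat) - T_mat %*% S))  # checks S - I = TS
```

If T had an eigenvalue of modulus 1 or more, the series would diverge and the closed form would no longer equal the sum.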

More detailed math will be provided.

The application to ecological questions

Nodes that are more heavily trafficked by natural enemies would be less stable. The ant-scale mutualism would be more likely to be driven to extinction at highly trafficked nodes.

A less trafficked node, in contrast, would be more likely to become an ant-scale mutualism patch.

Likewise, a patch close to other nodes would be more likely to go extinct. A node that is strongly connected to other nodes would also be more likely to go extinct.

All of the above contribute to the observations of detectable ant-scale mutualism patches in the field.

(However, think about this: an ant nest that is weakly connected to other ant nests would have weaker 'collective' protection. So would the relationship be a hump-shaped curve? Should we predict a greater probability of observing mutualism patches at 'intermediately' connected nodes?)

Visualize the network of ant nests, summer 2006, D = 20 m

Zoom in the plot

[Figures: zoomed-in views of the network plot]

How does the ant nest network affect the ant-scale mutualism?

## [1] 229   6

## [1] 114   6

Degree centrality
The number of ties that a node has.

How does the ant nest network affect the ant-scale mutualism?

Betweenness centrality
Betweenness centrality quantifies the number of times a node acts as a bridge along the shortest path between two other nodes.

How does the ant nest network affect the ant-scale mutualism?

Closeness centrality

In connected graphs there is a natural distance metric between all pairs of nodes, defined by the length of their shortest paths. The farness of a node x is defined as the sum of its distances from all other nodes, and its closeness was defined by Bavelas as the reciprocal of the farness.
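The three centrality measures defined in these slides can be computed by hand on a small graph. The sketch below uses base R only and a hypothetical five-node undirected, connected network (edges A-B, B-C, C-D, C-E); the helper `bfs` is an illustrative name, and betweenness is counted once per unordered pair of nodes. Packages such as igraph offer ready-made versions of these measures.

```r
# Hypothetical undirected network: A-B, B-C, C-D, C-E
edges <- matrix(c("A", "B",  "B", "C",  "C", "D",  "C", "E"),
                ncol = 2, byrow = TRUE)
nodes <- sort(unique(as.vector(edges)))
n <- length(nodes)
adj <- matrix(0, n, n, dimnames = list(nodes, nodes))
adj[edges] <- 1
adj[edges[, 2:1]] <- 1            # make the matrix symmetric

# Degree centrality: number of ties per node
degree <- rowSums(adj)

# Breadth-first search from node s: shortest-path lengths (d) and
# number of shortest paths (sigma) to every other node.
bfs <- function(adj, s) {
  d <- rep(Inf, nrow(adj))
  sigma <- rep(0, nrow(adj))
  d[s] <- 0
  sigma[s] <- 1
  queue <- s
  while (length(queue) > 0) {
    v <- queue[1]
    queue <- queue[-1]
    for (w in which(adj[v, ] > 0)) {
      if (is.infinite(d[w])) {
        d[w] <- d[v] + 1
        queue <- c(queue, w)
      }
      if (d[w] == d[v] + 1) sigma[w] <- sigma[w] + sigma[v]
    }
  }
  list(d = d, sigma = sigma)
}

all_bfs <- lapply(seq_len(n), function(s) bfs(adj, s))
D     <- t(sapply(all_bfs, function(b) b$d))      # D[s, v]: distance s to v
Sigma <- t(sapply(all_bfs, function(b) b$sigma))  # Sigma[s, v]: path counts

# Closeness centrality: reciprocal of farness (sum of distances)
closeness <- setNames(1 / rowSums(D), nodes)

# Betweenness centrality: share of shortest s-u paths that pass through v,
# summed over all unordered pairs (s, u) not involving v.
betweenness <- setNames(numeric(n), nodes)
for (v in seq_len(n)) {
  for (s in seq_len(n - 1)) {
    for (u in (s + 1):n) {
      if (s == v || u == v) next
      if (D[s, v] + D[v, u] == D[s, u]) {
        betweenness[v] <- betweenness[v] +
          Sigma[s, v] * Sigma[v, u] / Sigma[s, u]
      }
    }
  }
}
```

In this toy network node C has the highest degree (3), betweenness (5) and closeness (1/5), while the leaves A, D and E have zero betweenness.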

Logit regression - automatic selection of the best models and automatic model predictions

In summer 2006, there were 343 trees with ant nests. Among them, 114 had ant-scale mutualism.

Randomly sample 114 of the 229 'no mutualism' records and combine them with the 114 'with mutualism' records.

Select models based on three variables: betweenness centrality, closeness centrality and degree centrality.

Obtain the prediction error of the best model. Repeat the simulation 1000 times.

The same procedure is repeated for networks generated with a series of 'threshold distances' to find the spatial scale that best explains the occurrence of ant-scale mutualism.
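The resampling procedure might look like the following sketch. Several details are assumptions, since the slides do not specify them: the best model is taken to be the one with the lowest AIC among all subsets of the three predictors, and the prediction error is taken to be the share of records misclassified at a 0.5 cutoff on the balanced sample itself. The data are synthetic random numbers with the same class sizes as the 2006 data, and `one_run` is a hypothetical helper name.

```r
set.seed(1)

# Synthetic stand-in for the nest data: 343 trees, 114 with mutualism.
# The centrality values are random numbers, not real measurements.
n_yes <- 114
n_no  <- 229
trees <- data.frame(
  mutualism   = rep(c(1, 0), times = c(n_yes, n_no)),
  betweenness = rexp(n_yes + n_no),
  closeness   = runif(n_yes + n_no),
  degree      = rpois(n_yes + n_no, lambda = 3)
)

# All non-empty subsets of the three predictors (7 candidate models)
predictors <- c("betweenness", "closeness", "degree")
subsets <- unlist(
  lapply(1:3, function(k) combn(predictors, k, simplify = FALSE)),
  recursive = FALSE
)

one_run <- function(data) {
  # Balance the classes: all 'with mutualism' records plus an equally
  # sized random sample of the 'no mutualism' records.
  yes <- data[data$mutualism == 1, ]
  no  <- data[data$mutualism == 0, ]
  balanced <- rbind(yes, no[sample(nrow(no), nrow(yes)), ])

  # Fit every candidate model and keep the one with the lowest AIC
  # (assumed selection criterion).
  fits <- lapply(subsets, function(vars) {
    glm(reformulate(vars, response = "mutualism"),
        family = binomial, data = balanced)
  })
  best <- fits[[which.min(sapply(fits, AIC))]]

  # Prediction error: share of records misclassified at a 0.5 cutoff
  # (assumed definition, evaluated in-sample).
  predicted <- as.numeric(fitted(best) > 0.5)
  mean(predicted != balanced$mutualism)
}

# 50 runs keep the example fast; the slides use 1000
errors <- replicate(50, one_run(trees))
mean_error <- mean(errors)
```

Because the synthetic predictors carry no signal, the mean error here should sit near 50%. To compare spatial scales, the same loop would be rerun with centralities computed from the network at each threshold distance.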

Preliminary results

D5 --> Very bad, none of the predictors works.
D10 --> prediction error = 41.64%
D12 --> prediction error = 38.98%
D13 --> prediction error = 37.13%
D14 --> prediction error = 40.33%
D15 --> prediction error = 40.6%
D17 --> prediction error = 42%

Note: D stands for 'critical distance'; the number after it is the distance in meters (e.g. 5 m, 10 m, 12 m).

How does the ant-scale mutualism affect the build-up of predator colonies?

## 
## Call:
## glm(formula = predator_incidence ~ c, family = binomial)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.0852  -0.7188  -0.6829  -0.6656   1.7978  
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.115e+01  7.053e+00  -1.581    0.114
## c            1.145e+06  8.070e+05   1.418    0.156
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 124.81  on 113  degrees of freedom
## Residual deviance: 122.88  on 112  degrees of freedom
## AIC: 126.88
## 
## Number of Fisher Scoring iterations: 4

## 
## Call:
## glm(formula = predator_incidence ~ d, family = binomial)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.8711  -0.7467  -0.7177  -0.6895   1.7625  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -1.31553    0.33496  -3.927 8.59e-05 ***
## d            0.04517    0.07624   0.592    0.554    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 124.81  on 113  degrees of freedom
## Residual deviance: 124.47  on 112  degrees of freedom
## AIC: 128.47
## 
## Number of Fisher Scoring iterations: 4

References

Ulanowicz, R. E. & Puccia, C. J. (1990). Mixed trophic impacts in ecosystems. Coenoses 5(1): 7-16.

Tyner, S. & Hofmann, H. (2015). geomnet: Network Visualization in the 'ggplot2' Framework. R package version 0.0.1. http://CRAN.R-project.org/package=geomnet