HW3

# get the list of names and convert to character
names <- trophnet$X.1
names <- as.character(names)

# convert data to adjacency matrix
trophnetmat <- as.matrix(trophnet)
mat <- trophnetmat[1:106, 3:108]

# create a statnet object from the adjaceny matrix, set the node names
net <- as.network(mat, matrix.type = "adjacency")
network.vertex.names(net) <- names

# create a stochastic block model from the network, set the output data
# structure p<-sample(1:106,95)
sbm <- mixer(as.matrix.network.adjacency(net), qmin = 2, qmax = 15)

## Mixer: the adjacency matrix has been transformed in a directed edge list

sbm.output <- getModel(sbm)

# get array of likelihood of each node's membership in the 11 discovered
# classes
taus <- sbm.output$Taus
paste("Number of Communities::", dim(taus)[1])

## [1] "Number of Communities:: 11"

# create a vector that has the highest class membership probability for each
# node
membership <- 1:106
for (i in 1:106) {
    membership[i] = which.max(taus[, i])
}

Q.1: Comparison to paper:

Instead of 14, we are finding 11 communties to be the best representation of the network. Also, the robustness of number of clusters has not been reproduced given the number of communities is falling mean 8.2 when the network is subsampled to 70%. This has been repeated 100 times.

This shows that the probabilistic algorithm is not robust against perturbations due to species removal. We believe the topology captured by the communities is not still not well defined and cab be pertubed by subsampling. #Following are the p-values for the association test between class assignment and Metadata.

# import the libraries for this part
library(ergm)
library(readxl)

# read in the data
metadata<-read_excel("chilean_metadata.xls")

# set vertex attributes based on the metadata
net%v%"Spec"<-metadata$`Spec`
net%v%"BodyMass"<-metadata$BodyMass
net%v%"sessile/mobile"<-metadata$`sessile/mobile`
net%v%"Cluster"<-metadata$`Cluster`
net%v%"Shore Height 1 conservative"<-metadata$`Shore Height 1 conservative`
net%v%"ShoreHt_C_Ordinal"<-metadata$`ShoreHt_C_Ordinal`
net%v%"ShoreHt_C_Breadth"<-metadata$`ShoreHt_C_Breadth`
net%v%"Shore Height 2 restrictive"<-metadata$`Shore Height 2 restrictive`
net%v%"ShoreHt_R_Ordinal"<-metadata$`ShoreHt_R_Ordinal`
net%v%"ShortHt_R_Breadth"<-metadata$`ShortHt_R_Breadth`
net%v%"Phyllum"<-metadata$`Phyllum`
net%v%"subphyllum"<-metadata$`subphyllum`
net%v%"trophic"<-metadata$`trophic`


net%v%"Class Assignment"<-membership

Here we add the class membership of each node from part 1 as an attribute. Probably this will be an impactful (though artificial) factor in fitting an ergm. Since the classes are constructed to capture tendencies for connectivity based on membership.’)

Fit a model including potential homophilly based on all possible attributes

Note: including “Spec” in this model causes criteria to diverge, so we omit it. No two species have the same entry for “Spec” since it is a unique identifier. No there is no possible homophilly relationship based on this attribute.

## [1] "Significance of association with tropic 1.7036083061751e-17"
## [1] "Significance of association with Bodymass 2.87366538396223e-05"
## [1] "Significance of association with CLuster 3.17970035037161e-07"
## [1] "Significance of association with Mobility 1.25731507366451e-16"
## [1] "Significance of association with Shore Height 3.83242312801362e-06"
## [1] "Significance of association with SHore_Ord 0.208308503378737"
## [1] "Significance of association with Shortht_Bread 0.000754588452163381"
## [1] "Significance of association with SHore_ht_Rest 0.0754257967968651"
## [1] "Significance of association with Phyllum 4.17174360551444e-18"
## [1] "Significance of association with Subphyllum 3.52347820779515e-06"
## membership
##  1  2  3  4  5  6  7  8  9 10 11 
##  4 10  4  6  9  9  8  4 25 22  5

Q.1 ..Continue

Based on the such strong Cluster association, Our communities are following the pattern of function groups defined in by kefi et al, 2016.

Also, the phyllum/subphyllum association tells us the hierarchy in taxonomy information shows a coherence in our community assignment. This suggests a part of interaction niche. We have reproduced this result approximately form teh paper adn cam the same conclusion which the paper came and hence quoting “the species composition of the functional groups is coherent with broad taxonomic classifications, considered as a coarse proxy for phylogenetic relatedness”.

Also, our groups have very strong association with trophic level, mobility, and shore height which suggest these groups could be explained by these simple traits.

Again, based on our project of FLorida food web and this network, we can clearly see how bodymass assocaition with the community tells us how topology of network via community is distinguishing the primary, secondary, etc consumers which can be seen via their distinct bodymass difference.

Lastly, a critical comment made in this paper, about non-trophic interactions, which can contribute to stability and robustness of the ecological system(network) seems like an interesting lead to follow for our project, given we are workign with food webs.

Full Model::

fullmodel <- formula(net ~ edges 
                  + gwesp(1,fixed=TRUE)
                  + nodematch("BodyMass")
                  + nodematch("Cluster")
                  + nodematch("Phyllum")
                  + nodematch("subphyllum")
                  + nodematch("sessile/mobile")
                  + nodematch("Shore Height 1 conservative")
                  + nodematch("ShoreHt_C_Breadth")
                  + nodematch("ShoreHt_R_Ordinal")
                  + nodematch("Shore Height 2 restrictive")
                  + nodematch("ShortHt_R_Breadth")
                  + nodematch("ShoreHt_C_Ordinal")
                  + nodematch("trophic")
                  + nodematch("Class Assignment")
                  )
fullmodel.fit<-ergm(fullmodel, verbose = F)
# see the statistics, AIC: 6811, BIC: 6920
#summary.ergm(fullmodel.fit)
#plot goodness of fit statistics
gof.ergm <- gof(fullmodel.fit)
par(mfrow=c(2, 3))
plot(gof.ergm, main=NA)

Summary of model fit

AIC: 7123 BIC: 7233 Monte Carlo MLE Results: Estimate Std. Error MCMC % p-value
edges -0.99773 0.07228 0 < 1e-04 gwesp.fixed.1 -0.26724 0.02794 0 < 1e-04 nodematch.BodyMass -1.23470 0.58828 0 0.03585 *
nodematch.Cluster -0.03505 0.26323 0 0.89407
nodematch.Phyllum -1.02549 0.18928 0 < 1e-04 nodematch.subphyllum 0.41303 0.26493 0 0.11903
nodematch.sessile/mobile -0.55785 0.08604 0 < 1e-04 nodematch.Shore Height 1 conservative -0.41403 0.76119 0 0.58651
nodematch.ShoreHt_C_Breadth 0.15713 0.10106 0 0.12003
nodematch.ShoreHt_R_Ordinal 0.25828 0.23408 0 0.26989
nodematch.Shore Height 2 restrictive -0.40615 0.24558 0 0.09818 .
nodematch.ShortHt_R_Breadth 0.33755 0.08427 0 < 1e-04 * nodematch.ShoreHt_C_Ordinal -0.11275 0.74428 0 0.87959
nodematch.trophic -0.79386 0.24194 0 0.00104 nodematch.Class Assignment -0.06197 0.23916 0 0.79554

Following model includes only those attributes with p<0.05

reducedmodel <- formula(net ~ edges 
                     + gwesp(1,fixed=TRUE)
                     + nodematch("Phyllum")
                     + nodematch("subphyllum")
                     + nodematch("sessile/mobile")
                     + nodematch("trophic")
                     + nodematch("Class Assignment")
)
reducedmodel.fit<-ergm(reducedmodel)
# see the statistics, AIC: 6811, BIC:6921
#summary.ergm(reducedmodel.fit)
#plot goodness of fit statistics
gof.ergm <- gof(reducedmodel.fit)
par(mfrow=c(2, 3))
plot(gof.ergm, main=NA)

Summary of model fit

AIC: 7172 BIC: 7223 Monte Carlo MLE Results: Estimate Std. Error MCMC % p-value
edges -0.89084 0.04470 0 < 1e-04 gwesp.fixed.1 -0.28115 0.02853 0 < 1e-04 nodematch.Phyllum -0.99398 0.17985 0 < 1e-04 nodematch.subphyllum 0.38414 0.25714 0 0.135243
nodematch.sessile/mobile -0.61508 0.08869 0 < 1e-04 nodematch.trophic -0.81948 0.23419 0 0.000469 *** nodematch.Class Assignment -0.15093 0.22008 0 0.492872

The following model includes only those attributes with a p value rated “***"

reducedermodel <- formula(net ~ edges 
                        + gwesp(1,fixed=TRUE)
                        + nodematch("Phyllum")
                        + nodematch("sessile/mobile")
                        + nodematch("Class Assignment")
)
reducedermodel.fit<-ergm(reducedermodel, verbose = FALSE)
# see the statistics, AIC: 6846, BIC:6883
#summary.ergm(reducedermodel.fit)
#plot goodness of fit statistics
gof.ergm <- gof(reducedermodel.fit)
par(mfrow=c(2, 3))
plot(gof.ergm, main=NA)

Summary of model fit

Monte Carlo MLE Results: Estimate Std. Error MCMC % p-value
edges -0.89490 0.04381 0 <1e-04 gwesp.fixed.1 -0.28313 0.02742 0 <1e-04 nodematch.Phyllum -1.04078 0.14882 0 <1e-04 nodematch.sessile/mobile -0.62454 0.08690 0 <1e-04 nodematch.Class Assignment -0.32101 0.22327 0 0.151
AIC: 7181 BIC: 7217 (Smaller is better.)

Following model does not include the “Class Assignment” attribute, so we can Test the hypopthesis that excluding it produces a poorer fit.

noclassmodel <- formula(net ~ edges 
                          + gwesp(1,fixed=TRUE)
                          + nodematch("Phyllum")
                          + nodematch("sessile/mobile")
)
noclassmodel.fit<-ergm(noclassmodel, verbose=FALSE)
# see the statistics, AIC: 7181, BIC:7210
# the membership attribute significantly improves the model fit as expected
#summary.ergm(noclassmodel.fit)
#plot goodness of fit statistics
gof.ergm <- gof(noclassmodel.fit)
par(mfrow=c(2, 3))
plot(gof.ergm, main=NA)

Summary of model fit

Monte Carlo MLE Results: Estimate Std. Error MCMC % p-value
edges -0.89271 0.04591 0 <1e-04 gwesp.fixed.1 -0.28381 0.02844 0 <1e-04 nodematch.Phyllum -1.13427 0.13828 0 <1e-04 nodematch.sessile/mobile -0.64308 0.08500 0 <1e-04

AIC: 7182 BIC: 7212 (Smaller is better.)

Q.2 In summary, reducing the number of covariates generally decreases the BIC and AIC for the ERGM model.The most important covariates are found to be Phyllum, sessile/mobile, and Class Assignment (for which we have added to the existing raw data from the stochastic block model). The model improves when adding this information as shown by BIC and AIC decrease across models with decreasing complexity, while keeping the highly significant attributesand Class Assignment as a main effect.

___________________________END_____________________________