Hierarchical Distance Sampling

This is an R Markdown document, demonstrating how to run a hierarchical distance sampling model using the function gdistsamp in the unmarked package in R. The data used for analyses in this document come from my PhD study at the University of Manitoba, which I used to model how different prairie animals and plants were affected by local-scale vegetation of transmission lines along an urbanization gradient, and how those animals and plants’ numbers were affected by surrounding land use. I was interested at the time in figuring out how and where sections of transmission line between roads might be managed intentionally as alternative habitats for prairie wildlife like butterflies, beetles, birds and plants. I was interested in this question for a couple of reasons. First, I am interested in managing urban landscapes wherever possible to support more biodiversity. Second, as native grasslands like tall-grass prairie are so heavily reduced, critically endangered, and underprotected in areas outside of cities in Manitoba, underused urban grassy spaces along transmission lines and other roadsides might be one of the best available conservation options in Manitoba for restoring patches of tall-grass prairie. So I used wildlife surveys along 48 transmission line sections across 3 years to understand how and where such spaces might be managed for prairie birds and other wildlife.

Urban transmission lines are usually mowed and sprayed frequently to keep them neat and relatively free of weeds while rural transmission lines may be infrequently mowed and sprayed, or used as haylands, or left completely unmanaged in forested or pastured or cropland landscapes except to keep trees from growing into overhead lines. I hypothesized that frequent mowing and spraying would reduce habitat suitability along urban transmission lines for many (but not necessarily all) species. I also hypothesized that large amounts of surrounding urban land (and for prairie birds, wooded lands as well) might reduce settlement by many prairie species, which could negate efforts to improve urban transmission lines as habitat for many species. Grassland amount might also matter, with larger habitats supporting more species: however, I thought that variation in how grasslands were managed might matter more than grassland amount, because management would affect grassland suitability for species. Research from my PhD studies have been published in a couple of papers so far: Urban rights-of-way as reservoirs for tall-grass prairie plants and butterflies (http://dx.doi.org/10.1007/s00267-015-0631-9); Urban rights-of-way as extensive butterfly habitats: A case study from Winnipeg, Canada (http://dx.doi.org/10.1016/j.landurbplan.2016.05.026). I have at least a few more papers to go.

Why might you ask would I be interested in running hierarchical distance sampling models? Because for many species during wildlife surveys, detection probability during surveys < 1 and because of varying detection probability among sites. For animals like birds, which are usually detected aurally rather than visually during site visits, detection probability may decline with increasing distance of birds to the observer during a point count or a transect. Effective detection radius could vary depending on habitat differences or environmental noise affecting propagation of bird sounds. For example, among sites along an urbanization gradient, birds might be less likely to be detected at sites with more urban traffic, causing observers to conclude that urban sites have lower numbers of a species, even if true numbers are the same as at rural sites with less traffic noise.

Detection probability can also vary with behavior of a species that makes it likely to be detected. In the case of birds that are detected mostly aurally rather than visually, this behavior can be singing or calling rate, which may vary with time of day or time of season. We say that the availability of a bird for detection varies with time of season and time of day. Observer experience also influences whether or not an observer detects a species, and we can say that availability of a bird for detection is greater when a more experienced observer is conducting the survey.

Finally, detection probability should be higher if a species is more abundant at a site. More individuals could mean more conspecific interactions or greater encounter rates between observers and individual organisms. Site features that enhance abundance of a species could enhance detection of that species.

There are already numerous recently developed techniques for accounting for different kinds of detection probability, but hierarchical distance sampling can account for all of the different kinds of detection influences in a single analysis. Distance sampling methods are necessary for estimating densities of organisms at sites, and can model how effective detection radius varies in different kinds of vegetation or different kinds of noisy environments, but cannot model availability for detection. Removal sampling or time-to-first-detection methods can model availability for detection but cannot be used to predict abundance or density at sites. Mixture models can be used to predict relative abundance at sites and probability that a species is perceived by observers if it is present at a site, but do not distinguish availability for detection and detection probability with distance. Mixture models cannot be used to estimate densities at sites. But hierarchical distance sampling methods combine distance sampling, removal sampling, and mixture modelling in one analysis. I can theoretically account for differences in effective detection radius among noisy urban sites and less noisy rural sites and observer, seasonal, and time-of-morning effects on availability for detection, and predict abundance and density of organisms at sites.

library(unmarked)
SAVS.GDISTSAMP<-read.csv(file="C:/Users/Lionel/Documents/MITACS ACCELERATE FELLOWSHIP/RMarkdownAndKnitrPractice/PhDexample_gdistsamp2007.csv", header=TRUE)  

The code above starts by loading the unmarked package into RStudio along with its dependencies, and reads into R a csv file located in the same working directory as this RMarkdown document.For simplicity’s sake the CSV file has already been formatted so that it contains all data necessary for hierarchical distance sampling, but the necessary data could be formatted from separate files read into R instead.

The first column in the CSV file contains the site names, which are not necessary for this analysis. The next 12 columns contain abundance data for Savannah Sparrows: 4 distance categories from the first visit in 2007 (0-5 m, 6-10 m, 11-15 m, 15-20 m), then the same 4 distance categories from the second visit, then the same 4 distance categories from the third visit. Note that one site (Garven F) was not visited in round 2, so values for Savannah Sparrow counts from that site in round 2 are “NA”. These values are stored in y.

In the next line, breaks stores the distances bracketing the 4 distance categories per visit, enabling us to calculate densities and abundance later on, as does the next line with transect.length (each site is 500 m long and we are counting birds out to 20 m from either side of the transect line). R corresponds to the number of sites and T the number of visits.

roads in column 14 of the CSV file distinguishes between sites characterized by high traffic volumes on nearby roads. We will use this variable as a factor that potentially lowers effective detection radius at urban or rural sites with major roads nearby. It corresponds roughly to the percentage of land within 100 m of sites consisting of roads being <11% (roads=1) or >11 % (roads=2).

Columns 15-18 contain site variables associated with vegetation management within sites: mowing frequency (x/year), mowing frequency centered (prior to modelling it as a quadratic function), the quadratic term for mowing frequency, and whether or not a site was hayed in 2007 (yes=1, no=0).

Columns 19-22 are land use variables within 100 m of each site, the percentages of land within 100 m consisting of those land uses: roads100 (major roads only), urban100 (major roads, built up lands including houses, larger buildings, parking lots, and smaller streets), grassland100 (mowed, unmowed, pastured or hayed), and wooded100 (forests+shrublands).

Columns 23-27 represent vegetation structural metrics within study sites that are potentially affected by mowing regime and which represent potential nesting habitat for grassland birds: BAREGROUND (bare ground cover), litcov (litter cover), htlow (vegetation height-density, the height up to which >50% of a density pole’s intervals are concealed from view 4 m away), woodycov (woody plant stem cover), and grasscov (live grass stem cover).

Columns 14-27 are the site covariates. There is 1 column per variable. These will be stored in covs.

# Organize distance data and covariates into an unmarkedFrame object
y<-SAVS.GDISTSAMP[,2:13] #columns 2-13 contain observations of SAVS numbers in 4 distance intervals from 3 survey periods in 2007
breaks <- seq(0, 20, by=5)
transect.length<-500 #length of transects used for calculating densities
R<-35 #transects
T <- 3 # number of survey periods

roads<-SAVS.GDISTSAMP[,14]
roads.f<-as.factor(roads)
roads.df<-data.frame(roads.f)

#site level covariates
covs<-data.frame(roads=roads.df, mowed=SAVS.GDISTSAMP[,15], mowed.c=SAVS.GDISTSAMP[,16], mowsquare=SAVS.GDISTSAMP[,17], hayed=SAVS.GDISTSAMP[,18], roads100=SAVS.GDISTSAMP[,19], urban100=SAVS.GDISTSAMP[,20], grass100=SAVS.GDISTSAMP[,21], wooded100=SAVS.GDISTSAMP[,22],  bareground=SAVS.GDISTSAMP[,23], litcov=SAVS.GDISTSAMP[,24], htlow=SAVS.GDISTSAMP[,25], woodycov=SAVS.GDISTSAMP[,26], grasscov=SAVS.GDISTSAMP[,27])

Columns 28-30 contain the Julian dates of the 3 visits. Note 3 visits, 3 columns for one variable, stored as a matrix in Julian.m.

Columns 31-33 contain the time period in the morning when the 3 visits occurred. Note 3 visits, 3 columns for one variable, with each column converted to a factor, with all 3 factors stored in the data frame Time.df.

Columns 34-36 contain the observer conducting the survey during each of the 3 visits. Note 3 visits, 3 columns for one variable, with each column converted to a factor, with all 3 factors stored in the data frame Obs.df.

Julian date, time of survey, and observer are observation covariates (confusingly called yearly site covariates for some reason in gdistsamp). They can vary with each visit, affecting the availability of Savannah Sparrows for detection. The first column for each variable here corresponds to the first 4 abundance columns for Savannah Sparrows (distance intervals from first visit). Second column for each variable here corresponds to the next 4 abundance columns for Savannah Sparrows, and so on…Remember the site that was not visited in round 2 (Garven F)? It had “NA” values for Savannah Sparrow counts and it has “NA” for Julian date, time of survey, and observer in the second column value for each of those variables at that site. It is important that the “NA”s match up or there will be an error when creating the Unmarked GDS frame.

Julian.m<-as.matrix(SAVS.GDISTSAMP[,28:30])   #detectability may vary with Julian date (converted to matrix)
Timeclass.31<-SAVS.GDISTSAMP[,31]#time of day (1=5-6AM, 2=6-7AM, 3=7-8AM, 4=8-9AM, 5=9-10AM)
Timeclass.f31<-as.factor(Timeclass.31)
Timeclass.32<-SAVS.GDISTSAMP[,32]#time of day (1=5-6AM, 2=6-7AM, 3=7-8AM, 4=8-9AM, 5=9-10AM)
Timeclass.f32<-as.factor(Timeclass.32)
Timeclass.33<-SAVS.GDISTSAMP[,33]#time of day (1=5-6AM, 2=6-7AM, 3=7-8AM, 4=8-9AM, 5=9-10AM)
Timeclass.f33<-as.factor(Timeclass.33)
Time.df<-data.frame(Timeclass.f31,Timeclass.f32,Timeclass.f33)
Obs.34<-SAVS.GDISTSAMP[,34]#time of day (1=5-6AM, 2=6-7AM, 3=7-8AM, 4=8-9AM, 5=9-10AM)
Obs.f34<-as.factor(Obs.34)
Obs.35<-SAVS.GDISTSAMP[,35]#time of day (1=5-6AM, 2=6-7AM, 3=7-8AM, 4=8-9AM, 5=9-10AM)
Obs.f35<-as.factor(Obs.35)
Obs.36<-SAVS.GDISTSAMP[,36]#time of day (1=5-6AM, 2=6-7AM, 3=7-8AM, 4=8-9AM, 5=9-10AM)
Obs.f36<-as.factor(Obs.36)
Obs.df<-data.frame(Obs.f34,Obs.f35,Obs.f36)

# Organize data into an Unmarked Data frame for Generalized Distance Sampling
umf <- unmarkedFrameGDS(y = y, siteCovs=covs, yearlySiteCovs=list(Julian=Julian.m, Time=Time.df, Obs=Obs.df), survey="line", unitsIn="m", dist.breaks=breaks, tlength=rep(transect.length, R), numPrimary=T)  #tlength=rep(transect.length, R)
#summary(umf)

Once the bird counts, site covariates, and the confusingly named Yearly site covariates are combined into the composite unmarked frame designed for hierarchical distance sampling, we can begin to run some models. We’ll start by seeing how Savannah Sparrow counts vary with distance from the observer. Do counts decline with distance (e.g. halfnormal distance function) or don’t they (uniform distance function), perhaps because the transmission line study sites are not too wide and we aren’t concerned with birds that are more than 20 m away. If counts do decline with distance, is effective detection radius lower at sites with potentially more traffic?

d1 <- gdistsamp(~1, ~1, ~1, umf, keyfun="halfnorm", output="density", unitsOut="ha", K=50)
d1b <- gdistsamp(~1, ~1, ~roads, umf, keyfun="halfnorm", output="density", unitsOut="ha", K=50)
d2 <- gdistsamp(~1, ~1, ~1, umf, keyfun="uniform", output="density", unitsOut="ha", K=50)

models1 <- fitList('Halfnorm Key Function Null' = d1, 'Halfnorm Key Function Urban' = d1b, 'Uniform Key Function Null' = d2) 
ms1 <- modSel(models1) 
## Warning in sqrt(diag(vcov(x, altNames = TRUE))): NaNs produced
ms1 #prints model results
##                             nPars    AIC delta AICwt cumltvWt
## Uniform Key Function Null       2 490.86  0.00  0.67     0.67
## Halfnorm Key Function Null      3 492.86  2.00  0.24     0.91
## Halfnorm Key Function Urban     4 494.86  4.00  0.09     1.00

Model results from the distance functions suggest that a uniform distance function best describes how counts of Savannah Sparrows vary with distance intervals, in that the uniform model has the lowest AIC. In other words, detection probability does not decline meaningfully within the distance ranges sampled, and detection probability is not meaningfully different at sites near high traffic roads.

The next step in modelling is to see how availability of Savannah Sparrows for detection (phi) varies with time of season (26 May-30 June), time of morning (0500-1000 hrs), and observer experience (LL=Lionel Leston, JT=Jennifer Tran). We will model availability for detection assuming a uniform distance function (in which the third tilde (~) in the formula expression equals 1). We will evaluate these models against a null model of availability (the uniform distance function without any availability covariates).

#Factors affecting phi (availability for detection, e.g. singing rate)
phi.Julian <- gdistsamp(~1, ~Julian, ~1, umf, keyfun="uniform",output="density", unitsOut="ha", K=50)
phi.Time <- gdistsamp(~1, ~Time, ~1, umf, keyfun="uniform",output="density", unitsOut="ha", K=50)
phi.JulianTime <- gdistsamp(~1, ~Julian+Time, ~1, umf, keyfun="uniform", output="density", unitsOut="ha", K=50)
phi.Observer <- gdistsamp(~1, ~Obs, ~1, umf, keyfun="uniform",output="density", unitsOut="ha", K=50)
phi.JulianObserver <- gdistsamp(~1, ~Julian+Obs, ~1, umf, keyfun="uniform",output="density", unitsOut="ha", K=50)
phi.TimeObserver <- gdistsamp(~1, ~Time+Obs, ~1, umf, keyfun="uniform",output="density", unitsOut="ha", K=50)
phi.JulianTimeObserver <- gdistsamp(~1, ~Julian+Time+Obs, ~1, umf, keyfun="uniform",output="density", unitsOut="ha", K=50)


models2 <- fitList('Uniform Key Function Null' = d2, 'Julian+Time+Observer'=phi.JulianTimeObserver, 'Time+Observer'=phi.TimeObserver, 'Julian+Observer'=phi.JulianObserver, 'Observer'=phi.Observer, 'Julian' = phi.Julian, 'Time' = phi.Time, 'Julian+Time' = phi.JulianTime) 
ms2 <- modSel(models2) 
ms2 #prints results
##                           nPars    AIC delta  AICwt cumltvWt
## Time+Observer                 7 481.61  0.00 0.5037     0.50
## Julian+Time+Observer          8 482.98  1.37 0.2539     0.76
## Time                          6 484.94  3.33 0.0953     0.85
## Julian+Observer               4 486.24  4.64 0.0496     0.90
## Julian+Time                   7 486.35  4.74 0.0470     0.95
## Observer                      3 486.69  5.08 0.0397     0.99
## Julian                        3 490.50  8.89 0.0059     1.00
## Uniform Key Function Null     2 490.86  9.25 0.0049     1.00

It appears that there are time-of-morning and observer effects on whether or not Savannah Sparrows are detected. The second-best model also suggests time-of-season effects, but that model has a higher AIC value and is less parsimonious. Let’s look at the results in model phi.TimeObserver.

phi.TimeObserver
## 
## Call:
## gdistsamp(lambdaformula = ~1, phiformula = ~Time + Obs, pformula = ~1, 
##     data = umf, keyfun = "uniform", output = "density", unitsOut = "ha", 
##     K = 50)
## 
## Abundance:
##  Estimate    SE    z P(>|z|)
##         1 0.372 2.69 0.00716
## 
## Availability:
##             Estimate    SE      z  P(>|z|)
## (Intercept)   -1.983 0.540 -3.675 0.000238
## Time2         -0.565 0.405 -1.395 0.163087
## Time3          0.270 0.361  0.746 0.455628
## Time4          0.331 0.367  0.902 0.367247
## Time5          0.960 0.415  2.315 0.020624
## ObsLL          0.554 0.245  2.258 0.023923
## 
## AIC: 481.6078

Savannah Sparrows were more likely to be detected between 0900-1000 hrs than around 0500-0600 hrs. A bit surprising, also that Savannah Sparrow detection did not otherwise appear to decline later in the morning, what with rush hour traffic or with birds usually most actively singing during the dawn chorus. The observer effect is not too surprising: there were two observers at sites in 2007 and I was the more experienced observer.

Now we will look at how abundance is affected by site covariates. We will start with land use models, pick the best of those, then add local vegetation management covariates to see if we can get a better model. We will also try adding local vegetation structural covariates to the best land use model to see if features associated with potential nesting habitat improve prediction of Savannah Sparrows. All models will assume a uniform distance function (third ~ equals 1) and use Time and Observer as availability covariates (second ~ equals Time+Obs).

LandUse <- gdistsamp(~urban100+grass100+wooded100, ~Time+Obs, ~1, umf, keyfun="uniform",output="density", unitsOut="ha", K=50)
Urban <- gdistsamp(~urban100, ~Time+Obs, ~1, umf, keyfun="uniform",output="density", unitsOut="ha", K=50)
Wooded <- gdistsamp(~wooded100, ~Time+Obs, ~1, umf, keyfun="uniform",output="density", unitsOut="ha", K=50)
Grassland <- gdistsamp(~grass100, ~Time+Obs, ~1, umf, keyfun="uniform",output="density", unitsOut="ha", K=50)
models3 <- fitList('Time+Obs' =phi.TimeObserver, 'Land Use' = LandUse, 'Urban' = Urban, 'Wooded'=Wooded, 'Grassland'=Grassland) 
ms3 <- modSel(models3) 
ms3 #prints results
##           nPars    AIC delta   AICwt cumltvWt
## Wooded        8 443.50  0.00 7.7e-01     0.77
## Land Use     10 445.86  2.36 2.3e-01     1.00
## Grassland     8 477.61 34.11 3.0e-08     1.00
## Urban         8 480.97 37.47 5.6e-09     1.00
## Time+Obs      7 481.61 38.11 4.1e-09     1.00
Wooded #prints results: already known to be most parsimonious model
## 
## Call:
## gdistsamp(lambdaformula = ~wooded100, phiformula = ~Time + Obs, 
##     pformula = ~1, data = umf, keyfun = "uniform", output = "density", 
##     unitsOut = "ha", K = 50)
## 
## Abundance:
##             Estimate      SE     z  P(>|z|)
## (Intercept)   2.0069 0.59918  3.35 8.10e-04
## wooded100    -0.0424 0.00853 -4.96 6.95e-07
## 
## Availability:
##             Estimate    SE       z P(>|z|)
## (Intercept) -2.34715 0.715 -3.2842 0.00102
## Time2       -0.54717 0.395 -1.3855 0.16590
## Time3        0.16349 0.352  0.4640 0.64267
## Time4        0.00816 0.349  0.0234 0.98135
## Time5        0.77039 0.389  1.9824 0.04743
## ObsLL        0.63346 0.235  2.6973 0.00699
## 
## AIC: 443.4992

A model with just the amount of wooded land within 100 m and not amounts of grassland habitat or urban land within 100 m was the most parsimonious landscape model, suggesting that Savannah Sparrow densities decline in more wooded landscapes but not meaningfully in more urban landscapes. Savannah Sparrows do not increase in density as grassland increases within 100 m.

VegMan <- gdistsamp(~wooded100+mowed.c+mowsquare+hayed, ~Time+Obs, ~1, umf, keyfun="uniform",output="density", unitsOut="ha", K=50)
Mowed <- gdistsamp(~wooded100+mowed.c, ~Time+Obs, ~1, umf, keyfun="uniform",output="density", unitsOut="ha", K=50)
Hayed <- gdistsamp(~wooded100+hayed, ~Time+Obs, ~1, umf, keyfun="uniform",output="density", unitsOut="ha", K=50)

models4 <- fitList('Mowed' =Mowed, 'Hayed' = Hayed, 'Wooded'=Wooded, 'VegMan'=VegMan) 
ms4 <- modSel(models4) 
ms4
##        nPars    AIC delta AICwt cumltvWt
## Wooded     8 443.50  0.00 0.497     0.50
## Mowed      9 444.99  1.49 0.236     0.73
## Hayed      9 445.33  1.83 0.199     0.93
## VegMan    11 447.46  3.96 0.068     1.00

Adding vegetation management covariates did not improve explanation of Savannah Sparrow densities. Let’s try vegetation structural variables instead.

Woodycover <- gdistsamp(~wooded100+woodycov, ~Time+Obs, ~1, umf, keyfun="uniform",output="density", unitsOut="ha", K=50)
Grasscover <- gdistsamp(~wooded100+grasscov, ~Time+Obs, ~1, umf, keyfun="uniform",output="density", unitsOut="ha", K=50)
Litter <- gdistsamp(~wooded100+litcov, ~Time+Obs, ~1, umf, keyfun="uniform",output="density", unitsOut="ha", K=50)
Bareground <- gdistsamp(~wooded100+bareground, ~Time+Obs, ~1, umf, keyfun="uniform",output="density", unitsOut="ha", K=50)
VegDensity <- gdistsamp(~wooded100+htlow, ~Time+Obs, ~1, umf, keyfun="uniform",output="density", unitsOut="ha", K=50)
VegStruc <- gdistsamp(~wooded100+woodycov+grasscov+htlow+bareground+litcov, ~Time+Obs, ~1, umf, keyfun="uniform",output="density", unitsOut="ha", K=50)
models5 <- fitList('Wooded'=Wooded, 'Woodycover'=Woodycover, 'VegDensity'=VegDensity, 'Grasscover'=Grasscover, 'Bareground'=Bareground, 'Litter'=Litter, 'VegStruc'=VegStruc) 
ms5 <- modSel(models5) 
ms5
##            nPars    AIC delta AICwt cumltvWt
## Litter         9 443.34  0.00 0.232     0.23
## Wooded         8 443.50  0.16 0.214     0.45
## Bareground     9 443.55  0.21 0.209     0.66
## Grasscover     9 444.55  1.21 0.127     0.78
## Woodycover     9 445.23  1.89 0.090     0.87
## VegDensity     9 445.30  1.96 0.087     0.96
## VegStruc      13 446.84  3.50 0.040     1.00

Adding vegetation structural variables did not improve explanation of Savannah Sparrow densities. That will be it for model selection for now. Before we go any further with graphing predictions with the final best model (Wooded), we should test the model’s goodness-of-fit, which is different from the relative fit of AIC. Since our response variable consists of counts binned into intervals, we will use a Freeman-Tukey chi-square test.

library(AICcmodavg)

fitstats <- function(fm) {
  observed <- getY(fm@data)
  expected <- fitted(fm)
  resids <- residuals(fm)
  sse <- sum(resids^2, na.rm = TRUE)#added na.rm=TRUE due to missing obs at one site in 2007
  chisq <- sum((observed - expected)^2 / expected)
  freeTuke <- sum((sqrt(observed) - sqrt(expected))^2)
  out <- c(SSE=sse, Chisq=chisq, freemanTukey=freeTuke)
  return(out)
}

parboot(Wooded, statistic=fitstats, nsim=500)

Note that the parametric bootstrap for getting the Freeman-Tukey statistic has not been evaluated in this Knitted document. The reason for that is I have been unable to get the bootstrap to work when there are missing observations from sites, as there were at site Garven F in 2007 in this dataset. What I have done in the past is rerun the analysis without Garven F in it, made sure that the final best model was the same, then run the bootstrap. I kept Garven F in the dataset to show how to create an unmarked GDS Frame and run hierarchical distance sampling models with missing data. The models will work with missing data: it’s just the bootstrap that will not.

Below I have attached goodness-of-fit results for the Wooded model from a run of 500 bootstraps. The P value associated with the Freeman-Tukey statistic > 0.10, suggesting that the model for predicting Savannah Sparrow densities is adequate.

## 
## Call: parboot(object = Wooded, statistic = fitstats, nsim = 500)
## 
## Parametric Bootstrap Statistics:
##                  t0 mean(t0 - t_B) StdDev(t0 - t_B) Pr(t_B > t0)
## SSE           269.8          134.3            16.51        0.000
## Chisq        1549.3          688.4            52.89        0.000
## freemanTukey   90.6           -2.6             6.04        0.663
## 
## t_B quantiles:
##               0% 2.5% 25% 50% 75% 97.5% 100%
## SSE           85  107 124 135 147   171  181
## Chisq        709  763 823 863 893   972 1062
## freemanTukey  74   82  89  93  97   104  109
## 
## t0 = Original statistic compuated from data
## t_B = Vector of bootstrap samples