Presence of oats and rye and spatial attributes

I noticed during the initial data entry that wheat and barley were very common and often present in the same sample. Oats and rye (secondary cereal crops) were far less common. I decided to construct a query that would examine the spatial attributes of sites where either or both of these cereals were present.

The data loading process is masked here, but is identical to the chunk visible here.

library(sqldf)
library(ggplot2)

Querying the data

I built two separate queries and then merged the results into a single table of sites containing secondary cereal crops. I first selected all samples in which rye was found, reduced them to unique sites, and added a “secale” column with every value set to “TRUE”. I repeated the process for oats and joined the two tables to create a single table of sites with one column indicating the presence of rye and one indicating the presence of oats. I then added a new “presence” column and set its values with conditional statements (“avena_only” for sites with only oats, “secale_only” for sites with only rye, and “avena_and_secale” for sites with both).

#select samples containing rye (Secale); GROUP BY keeps one row per sample
sample_secale <- sqldf("SELECT * FROM sampling LEFT JOIN finds ON (sampling.sample_id=finds.sample_id) WHERE genus ='Secale' GROUP BY sampling.sample_id ORDER BY sampling.sample_id")
head(sample_secale, n=15)
##   sample_id   her_no her_cit sample context feature_type volume
## 1      S068 MLI25067       1      6    3077        ditch    9.0
## 2      S088 MLI56354       1      1     115        ditch   12.0
## 3      S095 MLI13829      12      2       7        ditch    1.2
##   bot_finds_id sample_id  Order  Family  genus species plant_part
## 1         B322      S068 Poales Poaceae Secale cereale      grain
## 2         B360      S088 Poales Poaceae Secale cereale      grain
## 3         B382      S095 Poales Poaceae Secale cereale      grain
##   preservation count   notes
## 1            c    28      cf
## 2            c     1 unquant
## 3            c     4      cf
secale_sites <- subset(sample_secale, select=c("her_no", "her_cit"))
secale_sites <- unique(secale_sites)
secale_sites$secale <- "TRUE"

#select samples containing oats (Avena); GROUP BY keeps one row per sample
sample_avena <- sqldf("SELECT * FROM sampling LEFT JOIN finds ON (sampling.sample_id=finds.sample_id) WHERE genus ='Avena' GROUP BY sampling.sample_id ORDER BY sampling.sample_id")
avena_sites <- subset(sample_avena, select=c("her_no", "her_cit"))
avena_sites <- unique(avena_sites)
avena_sites$avena <- "TRUE"

#merge the data frames into a single data frame for secondary cereals
#all=TRUE keeps sites from both tables; all.x=TRUE would drop sites with only rye
sites_minor_cereals <- merge(avena_sites, secale_sites, all=TRUE)
sites_minor_cereals$presence <- NA

#conditional statements determine the values of the new column
sites_minor_cereals[(!is.na(sites_minor_cereals$avena)) & (is.na(sites_minor_cereals$secale)), "presence"] <- "avena_only"
sites_minor_cereals[(is.na(sites_minor_cereals$avena)) & (!is.na(sites_minor_cereals$secale)), "presence"] <- "secale_only"
sites_minor_cereals[(!is.na(sites_minor_cereals$avena)) & (!is.na(sites_minor_cereals$secale)), "presence"] <- "avena_and_secale"
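The three-way classification can be sketched on toy data. This is a minimal illustration, not the analysis itself: the site keys `A1` to `A3` and the `avena_toy`, `secale_toy`, and `toy` data frames are hypothetical, and it assumes a full outer join (`all = TRUE`) so that a site with only rye survives the merge.

```r
# Toy site tables with hypothetical keys (not the real data)
avena_toy <- data.frame(her_no = c("A1", "A2"), her_cit = c(1, 1),
                        avena = "TRUE", stringsAsFactors = FALSE)
secale_toy <- data.frame(her_no = c("A2", "A3"), her_cit = c(1, 1),
                         secale = "TRUE", stringsAsFactors = FALSE)

# Full outer join on the shared key columns (her_no, her_cit)
toy <- merge(avena_toy, secale_toy, all = TRUE)

# Classify each site in one pass; NA marks a cereal that was not found
toy$presence <- ifelse(is.na(toy$secale), "avena_only",
                ifelse(is.na(toy$avena), "secale_only", "avena_and_secale"))
print(toy)
```

A nested `ifelse()` assigns every row exactly one label, so no row can be left as `NA` as long as each site appears in at least one of the two input tables.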

Connecting the secondary cereals to other data

I was interested in seeing whether the sites where oats and rye were found shared any spatial attributes, so the first step was to link my secondary cereals table to the spatial data table by site key. I then loaded a CSV generated previously for the purpose of mapping all Saxon and medieval phases of occupation at a given site. The original steps are available on GitHub.

sites_minor_cereals_sp <- merge(sites_sp, sites_minor_cereals)
sites_phase <- read.csv("~/Grad Year 3/Advanced Data Structures/output/sites_phase.csv")
sites_phase_new <- subset(sites_phase, select=c("her_no", "her_cit", "lat", "lon", "new"))
sites_phase_cereals <- merge(sites_minor_cereals_sp, sites_phase_new, by= c("her_no", "her_cit"))
head(sites_phase_cereals)
##     her_no her_cit dem slope temppoly                            terrain
## 1 MLI13829      12   1     1        5 Alluvial plains and river terraces
## 2 MLI13829      21   1     1        5 Alluvial plains and river terraces
## 3 MLI25066       1   1     2        4 Alluvial plains and river terraces
## 4 MLI25067       1   1     1        4 Alluvial plains and river terraces
## 5 MLI33412       5   1     1        4 Alluvial plains and river terraces
## 6 MLI35401       1   4     2        4              Clay or marl lowlands
##   nucleation avena secale         presence      lat       lon          new
## 1          2  TRUE   TRUE avena_and_secale 53.02927  0.099714     late/med
## 2          2  TRUE   <NA>       avena_only 53.02972  0.095113     late/med
## 3          2  TRUE   <NA>       avena_only 52.80927 -0.088062         late
## 4          2  TRUE   TRUE avena_and_secale 52.80947 -0.083170 mid/late/med
## 5          3  TRUE   <NA>       avena_only 52.70286 -0.333381     late/med
## 6          2  TRUE   <NA>       avena_only 53.00354 -0.601405        early
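Because `merge()` performs an inner join by default, any site missing from one of the tables is dropped without warning. A quick check along the following lines can confirm that nothing was lost. It is a sketch on hypothetical toy data: the `left` and `right` data frames, their keys, and the `site_key()` helper are illustrative names, not part of the original analysis.

```r
# Toy tables with hypothetical site keys (not the real data)
left <- data.frame(her_no = c("A1", "A2", "A3"), her_cit = c(1, 1, 2),
                   presence = c("avena_only", "secale_only", "avena_only"),
                   stringsAsFactors = FALSE)
right <- data.frame(her_no = c("A1", "A3"), her_cit = c(1, 2),
                    new = c("early", "late"), stringsAsFactors = FALSE)

# Default merge is an inner join: only keys present in both tables remain
merged <- merge(left, right, by = c("her_no", "her_cit"))

# Hypothetical helper: collapse the two key columns into one string per site
site_key <- function(df) paste(df$her_no, df$her_cit, sep = "/")

# Sites present before the merge but missing afterwards
dropped <- setdiff(site_key(left), site_key(merged))
print(dropped)
```

If `dropped` is not empty, passing `all.x = TRUE` to `merge()` would keep those sites, with `NA` in the columns that come from the second table.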

Plotting spatial attributes of secondary cereal crop sites

I plotted a histogram and bar charts to see whether there were any similarities in the spatial data among sites with secondary cereal crops. The most interesting observations related to the phase of the sites and the terrain. I noticed that a higher percentage of sites were located on the alluvial plain than on other terrain types; this land would have been very fertile for crops. I also noticed a temporal difference between oats and rye: although oats are present at Early Saxon sites, rye is entirely absent until the Middle Saxon period. The sample size is admittedly very small, but it would be interesting to run this analysis again after collecting data about more sites containing oats and rye.

ggplot(sites_phase_cereals, aes(x=dem, color=presence, fill=presence)) + geom_histogram(position="dodge", binwidth=1) + xlim(0,5) + labs(x="Elevation")

ggplot(sites_phase_cereals, aes(x=terrain, color=presence, fill=presence)) + geom_bar(stat="count", position="dodge") + labs(x="Terrain")

ggplot(sites_phase_cereals, aes(x=new, color=presence, fill=presence)) + geom_bar(stat="count", position="dodge") + labs(x="Phase")
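With so few sites, the counts behind the bar charts are easier to read as a cross-tabulation. The sketch below uses only the six rows printed by `head(sites_phase_cereals)` above, so it illustrates the technique rather than reproducing the full result; the `toy`, `counts`, and `shares` names are introduced here for illustration.

```r
# The six rows shown by head(sites_phase_cereals); not the full table
toy <- data.frame(
  presence = c("avena_and_secale", "avena_only", "avena_only",
               "avena_and_secale", "avena_only", "avena_only"),
  new = c("late/med", "late/med", "late", "mid/late/med", "late/med", "early"),
  stringsAsFactors = FALSE
)

# Number of sites for each combination of phase and cereal presence
counts <- table(phase = toy$new, presence = toy$presence)
print(counts)

# Share of each presence class falling in each phase (columns sum to 1)
shares <- prop.table(counts, margin = 2)
print(round(shares, 2))
```

On the real data the same two calls would be applied to `sites_phase_cereals$new` and `sites_phase_cereals$presence`.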

#write.csv(sites_phase_cereals, file="~/Grad Year 3/Advanced Data Structures/output/phase_cereals.csv")