Ernest Witte Site Burials Analysis

This example walks through an exploratory data analysis (EDA) of the Ernest Witte burials data, followed by the creation of a presentable plot of the percent of burials with grave goods by individual age, sex, and time period. dplyr data munging transforms the original data into a format that is useful for this plot.

This data set records sex, age, burial group, location, burial orientation, direction facing, and the presence of grave goods for burials from the Ernest Witte site, a Late Archaic cemetery in Texas (Hall 1981).

Note about ggplot2 version

This relies on the development version of ggplot2 for the subtitle and caption text. To install this version, first install the devtools package with install.packages("devtools"), then install ggplot2 from GitHub with devtools::install_github("hadley/ggplot2"). If you skip this and use the CRAN version, everything should work except the subtitle and caption, and you will get a warning.
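A quick check of the installed version can save a confusing warning later. The sketch below assumes that subtitle and caption support arrived in ggplot2 2.2.0, the release that followed the development version mentioned here; the helper name has_subtitle_support is hypothetical.

```r
# Hypothetical helper: TRUE when the installed ggplot2 is new enough for
# labs(subtitle = , caption = ). Assumes support landed in version 2.2.0.
has_subtitle_support <- function(installed = utils::packageVersion("ggplot2")) {
  installed >= package_version("2.2.0")
}

# Only query the version when ggplot2 is actually installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  if (has_subtitle_support()) {
    message("ggplot2 supports subtitles and captions.")
  } else {
    message("Older ggplot2 found: subtitle and caption will be dropped.")
  }
}
```

If the check reports an older version, updating ggplot2 from CRAN is likely to be simpler today than installing from GitHub.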

Load packages

library("archdata")
library("dplyr")

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
library("tidyr")
library("ggplot2") # dev version
library("scales")

Assign to ew object

data(EWBurials)
ew <- EWBurials

Explore data structure

Common use of str(), summary(), and head()

str(ew)
'data.frame':   49 obs. of  8 variables:
 $ Group    : Factor w/ 2 levels "1","2": 2 2 2 2 2 1 2 2 2 1 ...
 $ North    : num  97 100 102 101 102 ...
 $ West     : num  90.3 90.6 91.6 90.5 90.5 ...
 $ Age      : Factor w/ 6 levels "Child","Adolescent",..: 3 3 6 3 6 3 6 3 3 5 ...
 $ Sex      : Factor w/ 2 levels "Female","Male": 2 2 2 2 2 2 2 2 2 1 ...
 $ Direction:Classes 'circular', 'numeric'  atomic [1:49] 42 28 350 335 3 142 328 351 357 144 ...
  .. ..- attr(*, "circularp")=List of 6
  .. .. ..$ type    : chr "angles"
  .. .. ..$ units   : chr "degrees"
  .. .. ..$ template: chr "geographics"
  .. .. ..$ modulo  : chr "asis"
  .. .. ..$ zero    : num 1.57
  .. .. ..$ rotation: chr "clock"
 $ Looking  :Classes 'circular', 'numeric'  atomic [1:49] 283 272 219 60 86 21 244 214 91 53 ...
  .. ..- attr(*, "circularp")=List of 6
  .. .. ..$ type    : chr "angles"
  .. .. ..$ units   : chr "degrees"
  .. .. ..$ template: chr "geographics"
  .. .. ..$ modulo  : chr "asis"
  .. .. ..$ zero    : num 1.57
  .. .. ..$ rotation: chr "clock"
 $ Goods    : Factor w/ 2 levels "Absent","Present": 2 2 2 1 2 1 1 2 1 1 ...
summary(ew)
 Group      North             West                  Age         Sex    
 1:12   Min.   : 83.44   Min.   : 86.35   Child       : 2   Female:24  
 2:37   1st Qu.:100.03   1st Qu.: 90.53   Adolescent  : 3   Male  :25  
        Median :102.83   Median : 93.34   Young Adult :19              
        Mean   :101.42   Mean   : 94.92   Adult       : 3              
        3rd Qu.:104.92   3rd Qu.: 97.37   Middle Adult:10              
        Max.   :115.80   Max.   :109.34   Old Adult   :12              
   Direction        Looking          Goods   
 Min.   :  1.0   Min.   :  8.0   Absent :23  
 1st Qu.: 28.0   1st Qu.: 86.0   Present:26  
 Median : 54.0   Median :180.0               
 Mean   :108.9   Mean   :175.4               
 3rd Qu.:144.0   3rd Qu.:252.0               
 Max.   :357.0   Max.   :356.0               
head(ew)
     Group  North  West         Age  Sex Direction Looking   Goods
011      2  96.96 90.32 Young Adult Male        42     283 Present
014      2 100.20 90.61 Young Adult Male        28     272 Present
015      2 101.74 91.62   Old Adult Male       350     219 Present
016a     2 101.00 90.47 Young Adult Male       335      60  Absent
018      2 101.65 90.46   Old Adult Male         3      86 Present
020      1  95.17 90.53 Young Adult Male       142      21  Absent
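
Because most columns are factors, a cross-tabulation is a natural next step after str() and summary(). The following is a minimal sketch, not part of the original analysis, that uses base R's table() and prop.table() to show how grave goods are distributed across the two burial groups.

```r
library("archdata")
data(EWBurials)

# Counts of burials with and without grave goods in each burial group
goods_by_group <- table(Group = EWBurials$Group, Goods = EWBurials$Goods)
goods_by_group

# Row proportions: share of each group's burials with goods absent or present
round(prop.table(goods_by_group, margin = 1), 2)
```

The same pattern works for any pair of factor columns, for example Sex against Goods.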

Data processing

The purpose of this data processing is to go from the wide data format to a long data format that answers the question: what percent of burials have grave goods, by sex, for each age class and burial period?

The process below follows these steps:

* group_by groups the data by Age, Sex, and Group (period)
* mutate recodes Goods from Present/Absent to 1/0
* summarise sums Goods to count the burials with grave goods
* n() counts all burials in each group
* the percent of burials with grave goods is the first count divided by the second
* ungroup removes the grouping
* finally, complete creates all potential combinations of Age, Sex, and Group and fills in those without a percent with a 0.

The final step using complete makes plotting more consistent.

ew2 <- group_by(ew, Age, Sex, Group) %>%
  mutate(Goods = ifelse(Goods == "Present", 1, 0)) %>%
  summarise(sum_goods = sum(Goods),
            cnt = n(),
            pcnt = sum_goods/cnt) %>%
  ungroup() %>%
  tidyr::complete(Age, Sex, Group, fill = list(pcnt = 0))
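
To see what each step contributes, the same pipeline can be run on a small made-up data frame. The toy values below are invented for illustration and are not drawn from the Ernest Witte data; the column names mirror those used above, with Group left out to keep the result short.

```r
library("dplyr")
library("tidyr")

# Invented toy data: three burials, with no Male / Child combination
toy <- data.frame(
  Age   = factor(c("Child", "Adult", "Adult"), levels = c("Child", "Adult")),
  Sex   = factor(c("Female", "Female", "Male"), levels = c("Female", "Male")),
  Goods = factor(c("Present", "Absent", "Present"),
                 levels = c("Absent", "Present"))
)

toy2 <- toy %>%
  group_by(Age, Sex) %>%
  mutate(Goods = ifelse(Goods == "Present", 1, 0)) %>%  # recode to 1/0
  summarise(sum_goods = sum(Goods),   # burials with goods
            cnt = n(),                # all burials in the group
            pcnt = sum_goods / cnt) %>%
  ungroup() %>%
  complete(Age, Sex, fill = list(pcnt = 0))  # add the missing combination

toy2
```

The row added by complete() has a pcnt of 0 but NA for sum_goods and cnt, because only pcnt is named in fill. A zero in such a row therefore means that no burials were recorded for that combination, not that burials were found without goods.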

Preliminary plot

Plot the data from the dplyr sequence as a sanity check.

ggplot(ew2, aes(x = Age, y = pcnt, group = Sex, fill = Sex)) + 
  geom_bar(stat = "identity", position = "dodge")

Build ggplot

Once the data are in the format needed to answer the question, we can build a ggplot to display the results.

The following plot uses the long data format created above. It uses facet_grid() to split the plot into two broad time periods, scale_fill_manual() to set custom fill colors, a number of arguments to theme() to adjust plot elements, and finally the legend.position argument within theme() to move the legend to a spot within the plot.

In fewer than 100 lines of code, we took data from its original format, transformed it to address a specific question, and then made a publication-ready plot.

group_names <- c(`1` = " 2000 - 1200 BCE", `2` = " CE 200 - 500")

ggplot(ew2, aes(x = Age, y = pcnt, group = Sex, fill = Sex)) + 
  geom_bar(stat = "identity", position = "dodge") +
  geom_hline(yintercept = 0, color = "gray70") +
  theme_bw() +
  scale_fill_manual(values = c("darkgoldenrod2", "slategrey")) +
  scale_y_continuous(labels = scales::percent) +
  labs(title="Percent of Grave Goods by Age, Sex, and Temporal Group",
       subtitle="Ernest Witte site, Austin County, Texas",
       caption="Data: Hall,G.D. (1981)",
       y = "Grave Good Presence") +
  facet_grid(Group~., labeller = as_labeller(group_names)) +
  theme(
    strip.background = element_rect(colour = "white", fill = "white"),
    strip.text.y = element_text(colour = "black", size = 10, face = "bold", 
                                family = "Trebuchet MS"),
    panel.border = element_rect(color = "gray90"),
    axis.text.x = element_text(angle = 0, size = 10, family = "Trebuchet MS"),
    axis.text.y = element_text(size = 9, family = "Trebuchet MS"),
    axis.title.y = element_text(size = 11, family = "Trebuchet MS"),
    axis.title.x = element_blank(),
    plot.caption = element_text(size = 10, hjust = 0, margin=margin(t=10), 
                                family = "Trebuchet MS"),
    plot.title=element_text(family="TrebuchetMS-Bold"),
    plot.subtitle=element_text(family="TrebuchetMS-Italic"),
    panel.grid.major.x = element_blank(), 
    panel.grid.minor.x = element_blank(),
    panel.grid.minor.y = element_blank(),
    legend.position = c(0.1, 0.85),
    panel.spacing = unit(1, "lines")
  )
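
Recent ggplot2 releases changed how a legend inside the panel is specified. As a hedged sketch, assuming ggplot2 3.5.0 or later (where a numeric legend.position is deprecated), the same placement can be written with legend.position.inside. The toy data frame is invented for illustration.

```r
library("ggplot2")

# Invented toy values, only to give the legend something to describe
toy <- data.frame(
  Age  = rep(c("Child", "Adult"), each = 2),
  Sex  = rep(c("Female", "Male"), times = 2),
  pcnt = c(0.5, 0.25, 0.75, 1)
)

p <- ggplot(toy, aes(x = Age, y = pcnt, fill = Sex)) +
  geom_col(position = "dodge") +   # equivalent to geom_bar(stat = "identity")
  scale_y_continuous(labels = scales::percent) +
  theme_bw() +
  theme(
    legend.position = "inside",             # assumes ggplot2 >= 3.5.0
    legend.position.inside = c(0.1, 0.85)   # relative x, y between 0 and 1
  )

p
```

On older ggplot2 versions the numeric form used in the plot above remains the way to do this.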

Data set information

Details

The Ernest Witte site in Austin County, Texas contains four burial groups from different time periods. Group 1 includes 60 interments that occurred between about 2000 and 1200 BCE. Group 2 is the largest, with 148 interments, which were interred between about CE 200 and 500. Groups 3 and 4 include only 10 and 13 interments and date to CE 500 to 1500, but they are not included in this data set, which was taken from Appendix II (Hall 1981). Two of the variables, Direction and Looking, are circular data and require the circular package. Hall (2010) provides a summary of the site and its significance.
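
The reason these two variables need special handling is that angles wrap around at 360 degrees, so an ordinary mean can point in the wrong direction. The sketch below uses base R trigonometry rather than the circular package, and the function name circular_mean_deg is hypothetical; the six angles are the Direction values shown by head(ew) above.

```r
# Hypothetical helper: mean direction of angles given in degrees,
# computed from the averaged sine and cosine components
circular_mean_deg <- function(deg) {
  rad <- deg * pi / 180
  mean_rad <- atan2(mean(sin(rad)), mean(cos(rad)))
  (mean_rad * 180 / pi) %% 360   # map back to the range [0, 360)
}

direction <- c(42, 28, 350, 335, 3, 142)   # from head(ew)

mean(direction)               # arithmetic mean: 150, pointing south-southeast
circular_mean_deg(direction)  # circular mean: about 18, north-northeast
```

Five of the six burials point roughly north, which the circular mean reflects and the arithmetic mean does not. The circular package offers its own mean method for objects of class circular, which is the more complete tool for the full data set.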

Source

Hall, G. D. 1981. Allen’s Creek: A Study in the Cultural Prehistory of the Lower Brazos River Valley. The University of Texas at Austin. Texas Archaeological Survey. Texas. Research Report No. 61.

References

Hall, G. D. 2010. Ernest Witte site. Handbook of Texas Online. http://www.tshaonline.org/handbook/online/articles/bbe05. Texas State Historical Association.