2026-04-12

TheaterWorks Data

This data comes from a local community theater. For each patron, it records cumulative donations over three seasons (2022-2025) and the cumulative number of performances attended.

As the Patron Services Manager, I wanted to find out how the number of performances a patron attends relates to their cumulative donations. In particular, I would like to know whether attending more performances is associated with a larger donation amount, and by how much.

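I’ve excluded cumulative donations of $500 or more, as those donors tend to behave differently from standard patrons. They would fall in the VIP donor or even Sponsorship category, so their data would skew the results. The cleaning code also drops patrons with no recorded donation or a donation of $1 or less.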

Number of Attended Performances per Patron

Cumulative Donation by Patron

Equation of a Linear Regression

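The basic equation of a linear regression is: \(Y_i=\beta_0+\beta_1\cdot X_i + \epsilon_i\), where \(Y_i\) is the response for observation \(i\), \(X_i\) is the predictor, \(\beta_0\) is the intercept, \(\beta_1\) is the slope, and \(\epsilon_i\) is the error term.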
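As a short worked note, the coefficients are typically estimated by ordinary least squares, which is the method behind R's lm() and geom_smooth(method="lm"). The hatted symbols and the sample means \(\bar{X}\) and \(\bar{Y}\) are notation introduced here for illustration and do not appear elsewhere in this document:

\[
\hat{\beta}_1=\frac{\sum_{i}(X_i-\bar{X})(Y_i-\bar{Y})}{\sum_{i}(X_i-\bar{X})^2},
\qquad
\hat{\beta}_0=\bar{Y}-\hat{\beta}_1\cdot\bar{X}
\]

The fitted line \(\hat{Y}_i=\hat{\beta}_0+\hat{\beta}_1\cdot X_i\) therefore always passes through the point \((\bar{X},\bar{Y})\).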

Code for Data Cleaning and Linear Model

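library(dplyr)    # filter()
library(ggplot2)  # ggplot(), geom_point(), geom_smooth()

# Coerce donations to numeric; entries that cannot be parsed become NA
DonProd$Cumulativedonation = as.numeric(DonProd$Cumulativedonation)
# Keep patrons with a recorded donation above $1 and below $500
CuDon = filter(DonProd, !is.na(Cumulativedonation) &
              Cumulativedonation > 1 &
              Cumulativedonation < 500)
# Scatterplot of donations against attendance with a least-squares line
ggplot(CuDon, aes(x=ProdCount, y=Cumulativedonation))+
  geom_point()+geom_smooth(method="lm")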
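The plot draws the regression line but does not print its coefficients. A minimal sketch of how they could be obtained with lm() follows. It assumes a data frame shaped like CuDon, with columns ProdCount and Cumulativedonation. Because the TheaterWorks data is not reproduced here, the sketch runs on simulated values (the data frame sim and its generating numbers are made up for illustration), so its output will not match the real fit.

```r
# Simulated stand-in for CuDon; NOT the real TheaterWorks data.
set.seed(42)
n <- 200
sim <- data.frame(ProdCount = sample(1:12, n, replace = TRUE))
# Assumed generating values, chosen only to resemble a donation/attendance trend.
sim$Cumulativedonation <- 29 + 7 * sim$ProdCount + rnorm(n, sd = 20)

# Fit donation as a linear function of performances attended.
fit <- lm(Cumulativedonation ~ ProdCount, data = sim)

coef(fit)               # estimated intercept and slope
confint(fit)            # 95% confidence intervals for both coefficients
summary(fit)$r.squared  # share of donation variance explained by attendance
```

With the real data, replacing sim by CuDon in the lm() call should give the intercept and slope used in the fitted equation.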

Linear Model for Donor vs Attendance

Equation of TheaterWorks Data

The fitted equation for donations vs. attendance is: \(DonationAmount=29.09+7.18\cdot AttendedPerfs\)
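To make the equation concrete, the sketch below plugs a few attendance counts into it. The function name predict_donation is hypothetical and used only for illustration; the two coefficients are the ones reported above.

```r
# Coefficients taken from the fitted equation above.
b0 <- 29.09  # intercept: predicted donation at zero attended performances
b1 <- 7.18   # slope: predicted change per additional performance

# Hypothetical helper: predicted cumulative donation for a given attendance count.
predict_donation <- function(attended_perfs) b0 + b1 * attended_perfs

predict_donation(c(1, 3, 5))  # 36.27 50.63 64.99
```

These are predictions along the fitted line, not guarantees for any individual patron, and they apply only within the filtered range of donations between $1 and $500.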

Conclusion

From our linear model, the intercept of about $29 is the predicted cumulative donation at zero attended performances, so a patron who attends 1 show is predicted to donate around $36 ($29.09 + $7.18). For every additional performance attended, the predicted donation rises by about $7. The relationship is correlational, so it is possible that those who donate go on to attend more performances, rather than that those who attend more performances donate more.