Are there natural ways to divide our plots into three groups?

The estimated density of plot areas suggests there may not be a natural division between small, medium, and large plots:

library(ggplot2)
library(readxl)

# Read the draft spreadsheet; plot areas are in column 2, rows 3 to 91
sheet <- read_xlsx("~/Downloads/220114 - Garden Plot Data - Draft.xlsx")
plots <- sheet[3:91, 2]
colnames(plots) <- "area"
plots$area <- as.double(plots$area)

# Kernel density estimate, with a rug mark for each individual plot
ggplot(plots, aes(area)) +
  geom_density() +
  geom_rug() +
  labs(title="Density function of plot area", x="Area", y="Density")

Mark and Matthew propose that a small plot is 81 ft^2 or smaller, and a large plot is 130 ft^2 or larger.

One desirable property of a breakpoint between size classes is that there are not many plots with sizes near the breakpoint; otherwise, there may be disagreement about whether a garden plot falls on the “small” or “medium” side.
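
As a rough sketch of how this property could be checked, the snippet below counts the plots whose area lies within a tolerance of each candidate breakpoint. The helper `count_near`, the 5 ft^2 tolerance, and the example areas are all assumptions made for illustration; they are not part of the original analysis, and `plots$area` would replace `example_area` in practice.

```r
# Hypothetical helper: number of plots within `tol` ft^2 of a breakpoint.
count_near <- function(area, breakpoint, tol = 5) {
  sum(abs(area - breakpoint) <= tol, na.rm = TRUE)
}

# Made-up areas for illustration only; use plots$area for the real data.
example_area <- c(40, 55, 66, 68, 79, 80, 82, 83, 100, 128, 131, 160)

# One count per candidate breakpoint (67, 81, and 130 ft^2).
near_counts <- sapply(c(67, 81, 130), count_near, area = example_area, tol = 5)
print(near_counts)
```

A lower count means fewer plots whose class could be disputed after a small measurement difference.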

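The horizontal segments in an empirical CDF represent intervals of area in which no plots fall. In the chart below, the gray vertical lines mark the proposed breakpoints at 81 ft^2 and 130 ft^2, and the blue line marks an alternative lower breakpoint at 67 ft^2: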

ggplot(plots, aes(area)) +
  stat_ecdf() +
  labs(title="Empirical cumulative distribution function", x="Plot area", y="Fraction of plots ≤ x") +
  # Gray lines: proposed breakpoints; blue line: alternative lower breakpoint
  geom_vline(xintercept=c(81, 130), alpha=0.3) +
  geom_vline(xintercept=67, color="blue", alpha=0.3) +
  scale_x_continuous(breaks=c(50, 100, 150, 200, 81, 130, 67), minor_breaks=c(25, 75, 125, 175))
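
The flat stretches of the ECDF can also be found numerically by sorting the distinct areas and looking at the differences between neighbours. The following is a minimal sketch; `widest_gaps` is a hypothetical helper and `example_area` is made-up data, neither taken from the original analysis.

```r
# Hypothetical helper: the n widest gaps between consecutive distinct areas.
widest_gaps <- function(area, n = 3) {
  a <- sort(unique(area[!is.na(area)]))
  gaps <- data.frame(from = head(a, -1), to = tail(a, -1))
  gaps$width <- gaps$to - gaps$from
  head(gaps[order(gaps$width, decreasing = TRUE), ], n)
}

# Made-up areas for illustration only; use plots$area for the real data.
example_area <- c(40, 55, 66, 68, 79, 80, 82, 83, 100, 128, 131, 160)

top_gaps <- widest_gaps(example_area, n = 2)
print(top_gaps)
```

A breakpoint placed inside one of the wider gaps leaves the fewest plots close to the boundary.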

If we use 81 ft^2 and 130 ft^2 as our breakpoints, we get the following number of plots in each class:

table(cut(plots$area, c(0, 81, 130, Inf), c("S", "M", "L")))
## 
##  S  M  L 
## 49 19 21
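
One detail worth checking: by default `cut` uses right-closed intervals, so the classes above are (0, 81], (81, 130], and (130, Inf], and a plot of exactly 130 ft^2 is counted as M rather than L. The sketch below shows the boundary behaviour and one possible alternative that matches "81 or smaller" and "130 or larger" literally. The `classify` helper is hypothetical, and whether it changes the counts depends on whether any plot measures exactly 130 ft^2, which the table above does not reveal.

```r
boundary <- c(81, 130)

# Default right = TRUE: intervals are (0, 81], (81, 130], (130, Inf]
default_class <- as.character(cut(boundary, c(0, 81, 130, Inf), c("S", "M", "L")))
print(default_class)

# Hypothetical alternative: small is <= 81, large is >= 130, medium otherwise
classify <- function(area) {
  label <- ifelse(area <= 81, "S", ifelse(area >= 130, "L", "M"))
  factor(label, levels = c("S", "M", "L"))
}
literal_class <- as.character(classify(boundary))
print(literal_class)
```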

Some fees that work here are:

S: $45, M: $50, L: $65

for a total of $4520.

If we instead use 67 ft^2 and 130 ft^2 as our breakpoints, we get the following number of plots in each class:

table(cut(plots$area, c(0, 67, 130, Inf), c("S", "M", "L")))
## 
##  S  M  L 
## 38 30 21

Some fees that work here are:

S: $45, M: $50, L: $60

for a total of $4470.
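
The two totals can be reproduced from the class counts and fees listed above. This is just the arithmetic written out; the variable names are chosen here for illustration.

```r
# Class counts and fees for breakpoints 81 / 130 ft^2
counts_81 <- c(S = 49, M = 19, L = 21)
fees_81   <- c(S = 45, M = 50, L = 65)

# Class counts and fees for breakpoints 67 / 130 ft^2
counts_67 <- c(S = 38, M = 30, L = 21)
fees_67   <- c(S = 45, M = 50, L = 60)

# Total revenue is the sum over classes of (number of plots) x (fee)
total_81 <- sum(counts_81 * fees_81)  # 4520
total_67 <- sum(counts_67 * fees_67)  # 4470
print(c(total_81, total_67))
```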