This example uses the Housing dataset from the Ecdat package, which
gives sale prices and other variables for 546 homes in Windsor, Canada,
in 1987.
The plot is a scatterplot of sale price against lot size for these
homes, with loess curves fitted separately by whether or not the home is
in a preferred neighborhood.
library(ggplot2)
library(Ecdat)
ggplot(Housing, aes(x = lotsize, y = price, color = prefarea)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "loess") +
  labs(title = "Sales Prices vs. Lot Size of Homes in Windsor, Canada in 1987",
       x = "Lot Size (Square Feet)",
       y = "Price",
       color = "Preferred Neighborhood?") +
  scale_y_continuous(breaks = seq(25000, 250000, by = 25000),
                     limits = c(25000, 200000)) +
  scale_x_continuous(breaks = seq(1500, 16500, by = 3000)) +
  theme(axis.text = element_text(size = 14, face = "bold"),
        axis.title = element_text(size = 15, face = "bold"),
        plot.title = element_text(size = 20),
        axis.ticks.length = unit(0.25, "cm"),
        axis.ticks = element_line(color = "firebrick"),
        legend.key.size = unit(1.3, "cm"),
        legend.text = element_text(size = 16),
        legend.title = element_text(size = 16))
## `geom_smooth()` using formula = 'y ~ x'

Analysis
The scatterplot of price against lot size for homes in Windsor, Canada
shows that the dataset contains more homes outside a preferred
neighborhood of the city than inside one. For neither group does the
relationship between lot size (in square feet) and price appear to be
linear. Homes in preferred neighborhoods generally sell for more than
homes that are not in preferred neighborhoods: the blue (preferred)
points sit at higher prices than the red (not preferred) points.
Accordingly, the loess curve for preferred neighborhoods lies above the
loess curve for homes that are not in preferred neighborhoods.
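
The visual impressions above can be checked numerically. The following
is a minimal sketch, not part of the original analysis: it assumes that
`prefarea` is a factor with levels "no" and "yes", and the choices of the
median as the price summary and of the Pearson correlation as the measure
of linear association are illustrative assumptions.

```r
library(Ecdat)  # provides the Housing data frame

# Number of homes in each group (prefarea: "no" / "yes")
group_counts <- table(Housing$prefarea)
print(group_counts)

# Median sale price by group, a robust summary of "generally higher prices"
median_price <- aggregate(price ~ prefarea, data = Housing, FUN = median)
print(median_price)

# Pearson correlation between lot size and price within each group.
# This captures linear association only, so a modest value is consistent
# with a relationship that is present but not linear.
cor_by_group <- sapply(split(Housing, Housing$prefarea),
                       function(d) cor(d$lotsize, d$price))
print(cor_by_group)
```

A noticeably larger count for "no" and a higher median for "yes" would
support the two group-level claims made from the plot.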
There is a cluster of homes with small lots and low prices. The overall
trend in both groups is for price to increase with lot size, as shown by
the generally rising loess curves. The points themselves, however, vary
widely within both groups, and many of the highest-priced homes have lot
sizes in the middle of the range. This rise in price for homes with
“average” lot sizes (those toward the center of the distribution) is
especially evident for homes in preferred neighborhoods, where the blue
loess curve shifts upward at lot sizes of about 6,000 to 7,500 square
feet and then trends slightly downward. There are also several homes
outside preferred neighborhoods with lots larger than 10,500 square feet
that sold for less than 100,000. Lot size may not play a large role in
determining price, or homes with an average amount of land may have
other qualities that tend to increase their value. Either way, being in
a preferred neighborhood does appear to be associated with a higher
selling price for these homes in Windsor.
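
Two of the observations above lend themselves to a quick check. The
sketch below is illustrative and not part of the original analysis. It
counts the large-lot, lower-priced homes outside preferred neighborhoods,
then fits an additive linear model as a rough first approximation of the
neighborhood effect after adjusting for lot size. The linear form is an
assumption made for simplicity, and the plot suggests it is only
approximate.

```r
library(Ecdat)  # provides the Housing data frame

# Homes outside a preferred neighborhood with lots over 10,500 square feet
# that sold for less than 100,000
large_cheap <- subset(Housing,
                      prefarea == "no" & lotsize > 10500 & price < 100000)
n_large_cheap <- nrow(large_cheap)
print(n_large_cheap)

# Additive linear model: price explained by lot size and neighborhood.
# The coefficient named "prefareayes" estimates the price difference for
# preferred neighborhoods at a given lot size.
fit <- lm(price ~ lotsize + prefarea, data = Housing)
print(coef(summary(fit)))
```

A positive `prefareayes` estimate would agree with the association seen
in the plot, but it does not establish that neighborhood status causes
the higher prices, since other home characteristics are left out of this
model.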