inventory vs. listings. Describe any pattern you see.
ggplot(data = txh) +
  geom_point(mapping = aes(x = listings, y = inventory))
It appears that there is a positive relationship between the number of listings and the amount of inventory, but there is also some kind of clustering of points.
sales. Describe any additional pattern you see.
ggplot(data = txh) +
  geom_point(mapping = aes(x = listings, y = inventory, size = sales))
There appear to be more sales when there are more listings.
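One way to back up this visual impression is a quick correlation check. The sketch below assumes that txh is the txhousing data set from ggplot2 restricted to the four cities discussed here; that definition is not shown in this section, so treat it as an assumption. The name listings_sales_cor is only illustrative.

```r
library(ggplot2)  # provides the txhousing data set

# Assumption: txh is txhousing limited to the four cities in these exercises.
txh <- subset(txhousing,
              city %in% c("Austin", "Dallas", "Houston", "San Antonio"))

# Pearson correlation between listings and sales, skipping rows with NAs.
listings_sales_cor <- cor(txh$listings, txh$sales, use = "complete.obs")
listings_sales_cor
```

A value well above zero would agree with the plot. Because the four cities differ so much in size, much of any correlation may reflect differences between cities rather than changes within a single city.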
city. Describe any additional pattern you see.
ggplot(data = txh) +
  geom_point(mapping = aes(x = listings, y = inventory, size = sales, color = city))
The four cities are now clearly visible. While the inventory is similar across all four cities, Dallas and Houston tend to have many more listings.
ggplot(data = txh) +
  geom_point(mapping = aes(x = listings, y = inventory, size = sales, color = city)) +
  geom_smooth(mapping = aes(x = listings, y = inventory, color = city))
median sale price in the four cities. Describe any pattern you see.
ggplot(data = txh) +
  geom_boxplot(mapping = aes(y = median, x = city))
The median sale price in Austin tends to be higher than in the other three cities, followed by Dallas, Houston, and San Antonio, in that order.
volume in each of the four cities. Describe any pattern you see.
ggplot(data = txh) +
  geom_histogram(mapping = aes(x = volume)) +
  facet_wrap(~ city)
While the median sale price tends to be larger in Austin, the total volume (total value of sales) is clearly higher in Dallas and Houston, likely due to more total sales.
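The guess that higher volume comes from more sales can be checked with a per-city summary. This is a minimal sketch, assuming txh is the txhousing data set from ggplot2 filtered to these four cities (the filtering step is not shown in this section). The names city_totals, total_sales, total_volume, and typical_median_price are hypothetical.

```r
library(dplyr)
library(ggplot2)  # provides the txhousing data set

# Assumption: txh is txhousing limited to the four cities in these exercises.
txh <- txhousing %>%
  filter(city %in% c("Austin", "Dallas", "Houston", "San Antonio"))

# Total sales count, total dollar volume, and a typical median price per city.
city_totals <- txh %>%
  group_by(city) %>%
  summarize(
    total_sales = sum(sales, na.rm = TRUE),
    total_volume = sum(volume, na.rm = TRUE),
    typical_median_price = median(median, na.rm = TRUE)
  ) %>%
  arrange(desc(total_volume))

city_totals
```

If the cities with the largest total_volume also have the largest total_sales but not the largest typical_median_price, that supports the explanation above.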
inventory vs. date with points colored by city. Clearly, it would be better to have lines connecting the sequential inventory values rather than using points. Which geom_ would you use to do this? Recreate the graph with lines instead of points.
ggplot(data = txh) +
  geom_point(mapping = aes(x = date, y = inventory, color = city))
geom_line() can be used to connect the points with lines.
ggplot(data = txh) +
  geom_line(mapping = aes(x = date, y = inventory, color = city))