Introduction

This notebook looks at egg production across states and compares it to the price per dozen. The goal is to re-create a scatter plot from Chapter 1 to see whether there is any relationship between how many eggs a state produces and how much they cost.

Load data

library(readxl)  # provides read_excel()
egg_data <- read_excel("C:/Users/desir/OneDrive/Documents/MBAD/egg_production_clean.xlsx")

Preview data

library(dplyr)  # provides glimpse()
glimpse(egg_data)
Rows: 50
Columns: 5
$ state                       <chr> "AL", "AK", "AZ", "AR", "CA", "CO", "CT",…
$ eggs_produced_1990_millions <dbl> 2206.0, 0.7, 73.0, 3620.0, 7472.0, 788.0,…
$ eggs_produced_1991_millions <dbl> 2186.0, 0.7, 74.0, 3737.0, 7444.0, 873.0,…
$ price_per_dozen_1990_cents  <dbl> 92.7, 151.0, 61.0, 86.3, 63.4, 77.8, 106.…
$ price_per_dozen_1991_cents  <dbl> 91.4, 149.0, 56.0, 91.8, 58.4, 73.0, 104.…

The dataset contains egg production and pricing data for 50 states. It includes the number of eggs produced, measured in millions, and the price per dozen eggs, measured in cents, for both 1990 and 1991. Each row represents a different state. The plots below use only the 1990 columns, so the analysis is cross-sectional: it compares how production and price vary across states at the same point in time.
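
The claim that each row is a different state can be checked directly. The sketch below is only an illustration: `egg_preview` is a hypothetical stand-in built from the first six rows visible in the preview above, and with the real file the same checks would be run on `egg_data`.

```r
# Hypothetical stand-in: the first six rows shown in the glimpse() preview.
# With the real file loaded, replace egg_preview with egg_data.
egg_preview <- data.frame(
  state = c("AL", "AK", "AZ", "AR", "CA", "CO"),
  eggs_produced_1990_millions = c(2206.0, 0.7, 73.0, 3620.0, 7472.0, 788.0),
  price_per_dozen_1990_cents  = c(92.7, 151.0, 61.0, 86.3, 63.4, 77.8)
)

# anyDuplicated() returns 0 when no state code appears more than once
one_row_per_state <- anyDuplicated(egg_preview$state) == 0

# Count missing values in each column before plotting
missing_per_column <- colSums(is.na(egg_preview))

one_row_per_state
missing_per_column
```

If `one_row_per_state` is `FALSE` or any count in `missing_per_column` is above zero on the full data, the scatter plot would silently repeat or drop states.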

Visuals

plot(egg_data$eggs_produced_1990_millions,
     egg_data$price_per_dozen_1990_cents,
     main = "Egg Production vs Price",
     xlab = "Egg Production (millions)",
     ylab = "Price per Dozen (cents)",
     pch = 19)

The scatter plot shows the relationship between egg production and price per dozen across different states. Each point on the graph represents a single state, with its position based on how many eggs it produces and the price of eggs in that state.

The pattern shows that states with higher egg production tend to have lower prices. This suggests an inverse relationship between production and price. However, the points are somewhat spread out, which means the relationship is not perfect and other factors may also be influencing price.
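
One way to put a number on the inverse relationship is a correlation coefficient. This is a minimal sketch, assuming the same 1990 columns as the plot; `egg_preview` is a hypothetical stand-in that holds only the six rows visible in the preview, so the values it gives will differ from those for all 50 states.

```r
# Hypothetical stand-in: the first six rows shown in the glimpse() preview.
# With the real file loaded, replace egg_preview with egg_data.
egg_preview <- data.frame(
  state = c("AL", "AK", "AZ", "AR", "CA", "CO"),
  eggs_produced_1990_millions = c(2206.0, 0.7, 73.0, 3620.0, 7472.0, 788.0),
  price_per_dozen_1990_cents  = c(92.7, 151.0, 61.0, 86.3, 63.4, 77.8)
)

# Pearson correlation: the sign gives the direction of the linear association,
# the magnitude (0 to 1) gives its strength
r_pearson <- cor(egg_preview$eggs_produced_1990_millions,
                 egg_preview$price_per_dozen_1990_cents)

# Spearman rank correlation: based on ranks, so a few very large
# producers have less influence on the result
r_spearman <- cor(egg_preview$eggs_produced_1990_millions,
                  egg_preview$price_per_dozen_1990_cents,
                  method = "spearman")

r_pearson
r_spearman
```

A negative value from both measures is consistent with an inverse relationship, and a magnitude well below 1 matches the visible spread in the points.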

plot(egg_data$eggs_produced_1990_millions,
     egg_data$price_per_dozen_1990_cents,
     main = "Egg Production vs Price",
     xlab = "Egg Production (millions)",
     ylab = "Price per Dozen (cents)",
     pch = 19)

# Linear regression line
model <- lm(price_per_dozen_1990_cents ~ eggs_produced_1990_millions, data = egg_data)
abline(model, col = "blue", lwd = 2)

# Flexible curve
lines(lowess(egg_data$eggs_produced_1990_millions,
             egg_data$price_per_dozen_1990_cents),
      col = "red", lwd = 2)

The blue line is the least-squares linear fit of price on production, while the red line is a LOWESS curve that follows the data more closely and adjusts to local changes in the pattern. The flexible curve makes it clear that the relationship is not perfectly straight and highlights where the data deviate from a simple linear trend. The scatter that remains around both lines suggests that factors other than production are likely influencing price across states.
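
How much of the price variation the straight line accounts for can be read from the fitted model. The sketch below assumes the same formula as the `lm()` call above; `egg_preview` and `fit` are hypothetical names, and the six preview rows stand in for the full data, so the numbers are illustrative only.

```r
# Hypothetical stand-in: the first six rows shown in the glimpse() preview.
# With the real file loaded, replace egg_preview with egg_data.
egg_preview <- data.frame(
  state = c("AL", "AK", "AZ", "AR", "CA", "CO"),
  eggs_produced_1990_millions = c(2206.0, 0.7, 73.0, 3620.0, 7472.0, 788.0),
  price_per_dozen_1990_cents  = c(92.7, 151.0, 61.0, 86.3, 63.4, 77.8)
)

# Same model form as the blue line: price regressed on production
fit <- lm(price_per_dozen_1990_cents ~ eggs_produced_1990_millions,
          data = egg_preview)

# Slope: change in price (cents) per additional million eggs produced
slope <- coef(fit)[["eggs_produced_1990_millions"]]

# R-squared: share of price variation explained by the straight line
r_squared <- summary(fit)$r.squared

# State with the largest residual, i.e. the point farthest from the line
farthest_state <- egg_preview$state[which.max(abs(resid(fit)))]

slope
r_squared
farthest_state
```

A low R-squared on the full data would support the reading that production alone explains only part of the variation in price.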
