Synopsis

This document analyzes storm event data to identify which event types have the greatest impact on population health and which cause the greatest economic damage. The analysis uses the repdata_data_StormData.csv dataset, restricted to events with fatalities or injuries for the health question and to events with property or crop damage for the economic question. Data transformations standardize event type labels and compute combined impacts (fatalities plus injuries; property plus crop damage). Results cover the top 10 event types for each question and are presented as a box plot (health) and a scatter plot (economic damage).

Data Processing

The data were loaded into R from the repdata_data_StormData.csv file located at “C:/Cleaning Data/Reproducable Research/Module4”. For the health analysis, the dataset was filtered to rows with at least one fatality or injury; for the economic analysis, to rows with nonzero property or crop damage. Event types were standardized with toupper and trimws, then consolidated with case_when so that variant spellings map to a single label (e.g., “AVALANCE” becomes “AVALANCHE”). For economic data, damage amounts were scaled by multipliers derived from the PROPDMGEXP and CROPDMGEXP codes (K = 1,000; M = 1,000,000; B = 1,000,000,000; blank, missing, or “?” = 1). Any other code receives a multiplier of 0, so those amounts contribute nothing to the damage totals.
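
The multiplier rule described above can be illustrated on a few made-up rows. This is a minimal base R sketch, not the code used in the report: the helper name exp_multiplier and the toy values are hypothetical, and the lookup-vector approach is just one way to express the same mapping.

```r
# Hypothetical helper: map an exponent code to its numeric multiplier.
exp_multiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  mult <- unname(lookup[toupper(code)])          # NA where the code is not K, M, or B
  mult[is.na(mult)] <- 0                         # unrecognized codes contribute nothing
  mult[is.na(code) | code %in% c("", "?")] <- 1  # blank, missing, or "?" keeps the raw value
  mult
}

# Toy rows (made up for illustration, not taken from the dataset)
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1.2, 500, 3),
  PROPDMGEXP = c("K", "m", "B", "", "H")
)
toy$Total_Property <- toy$PROPDMG * exp_multiplier(toy$PROPDMGEXP)
print(toy)
```

Here 25 with code K becomes 25,000, the lowercase "m" is treated as M, the blank code leaves 500 unchanged, and the unrecognized code "H" zeroes out its row.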

Load and Process Health Data

library(dplyr)  # provides %>%, filter, mutate, case_when, group_by, summarise
data <- read.csv("C:/Cleaning Data/Reproducable Research/Module4/repdata_data_StormData.csv")
new_data <- data %>%
  filter(FATALITIES > 0 | INJURIES > 0) %>%
  mutate(Combined_Impact = FATALITIES + INJURIES,
         EVTYPE = toupper(trimws(EVTYPE)),
         EVTYPE = case_when(
           EVTYPE %in% c("AVALANCE", "AVALANCHE") ~ "AVALANCHE",
           EVTYPE %in% c("COASTAL FLOOD", "COASTAL FLOODING", "COASTAL FLOODING/EROSION") ~ "COASTAL FLOOD",
           TRUE ~ EVTYPE
         ))
health_data <- new_data %>%
  group_by(EVTYPE) %>%
  summarise(
    Total_Fatalities = sum(FATALITIES, na.rm = TRUE),
    Total_Injuries = sum(INJURIES, na.rm = TRUE),
    Combined_Total = Total_Fatalities + Total_Injuries,
    .groups = "drop"
  ) %>%
  arrange(desc(Combined_Total)) %>%
  slice_head(n = 10)
plot_data_health <- new_data %>%
  filter(EVTYPE %in% health_data$EVTYPE)
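
To see why the standardization step matters, the sketch below applies the same toupper, trimws, and case_when rules to a small made-up data frame. The toy rows are hypothetical; only the recoding rules are taken from the code above.

```r
library(dplyr)

# Toy rows with inconsistent spelling, case, and whitespace (made up for illustration)
toy <- data.frame(
  EVTYPE     = c(" Avalance", "AVALANCHE", "Coastal Flooding", "COASTAL FLOOD", "tornado "),
  FATALITIES = c(1, 2, 0, 1, 3),
  INJURIES   = c(0, 1, 2, 0, 10)
)

clean <- toy %>%
  mutate(
    EVTYPE = toupper(trimws(EVTYPE)),
    EVTYPE = case_when(
      EVTYPE %in% c("AVALANCE", "AVALANCHE") ~ "AVALANCHE",
      EVTYPE %in% c("COASTAL FLOOD", "COASTAL FLOODING", "COASTAL FLOODING/EROSION") ~ "COASTAL FLOOD",
      TRUE ~ EVTYPE
    )
  ) %>%
  group_by(EVTYPE) %>%
  summarise(Combined_Total = sum(FATALITIES + INJURIES), .groups = "drop") %>%
  arrange(desc(Combined_Total))

print(clean)
```

Without the recoding, the five raw labels would be ranked as five separate event types; after it, they collapse to three groups whose totals are summed before ranking.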

Load and Process Economic Data

data <- read.csv("C:/Cleaning Data/Reproducable Research/Module4/repdata_data_StormData.csv")
print("Data loaded, checking structure:")
## [1] "Data loaded, checking structure:"
print(head(data))
##   STATE__           BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE  EVTYPE
## 1       1  4/18/1950 0:00:00     0130       CST     97     MOBILE    AL TORNADO
## 2       1  4/18/1950 0:00:00     0145       CST      3    BALDWIN    AL TORNADO
## 3       1  2/20/1951 0:00:00     1600       CST     57    FAYETTE    AL TORNADO
## 4       1   6/8/1951 0:00:00     0900       CST     89    MADISON    AL TORNADO
## 5       1 11/15/1951 0:00:00     1500       CST     43    CULLMAN    AL TORNADO
## 6       1 11/15/1951 0:00:00     2000       CST     77 LAUDERDALE    AL TORNADO
##   BGN_RANGE BGN_AZI BGN_LOCATI END_DATE END_TIME COUNTY_END COUNTYENDN
## 1         0                                               0         NA
## 2         0                                               0         NA
## 3         0                                               0         NA
## 4         0                                               0         NA
## 5         0                                               0         NA
## 6         0                                               0         NA
##   END_RANGE END_AZI END_LOCATI LENGTH WIDTH F MAG FATALITIES INJURIES PROPDMG
## 1         0                      14.0   100 3   0          0       15    25.0
## 2         0                       2.0   150 2   0          0        0     2.5
## 3         0                       0.1   123 2   0          0        2    25.0
## 4         0                       0.0   100 2   0          0        2     2.5
## 5         0                       0.0   150 2   0          0        2     2.5
## 6         0                       1.5   177 2   0          0        6     2.5
##   PROPDMGEXP CROPDMG CROPDMGEXP WFO STATEOFFIC ZONENAMES LATITUDE LONGITUDE
## 1          K       0                                         3040      8812
## 2          K       0                                         3042      8755
## 3          K       0                                         3340      8742
## 4          K       0                                         3458      8626
## 5          K       0                                         3412      8642
## 6          K       0                                         3450      8748
##   LATITUDE_E LONGITUDE_ REMARKS REFNUM
## 1       3051       8806              1
## 2          0          0              2
## 3          0          0              3
## 4          0          0              4
## 5          0          0              5
## 6          0          0              6
dmg_data <- data %>%
  filter(PROPDMG > 0 | CROPDMG > 0) %>%
  mutate(
    EVTYPE = toupper(trimws(EVTYPE)),
    EVTYPE = case_when(
      EVTYPE %in% c("AVALANCE", "AVALANCHE") ~ "AVALANCHE",
      EVTYPE %in% c("COASTAL FLOOD", "COASTAL FLOODING", "COASTAL FLOODING/EROSION") ~ "COASTAL FLOOD",
      TRUE ~ EVTYPE
    ),
    Prop_Multiplier = case_when(
      toupper(PROPDMGEXP) == "K" ~ 1000,
      toupper(PROPDMGEXP) == "M" ~ 1000000,
      toupper(PROPDMGEXP) == "B" ~ 1000000000,
      toupper(PROPDMGEXP) == "" | is.na(PROPDMGEXP) | PROPDMGEXP == "?" ~ 1,
      TRUE ~ 0
    ),
    Crop_Multiplier = case_when(
      toupper(CROPDMGEXP) == "K" ~ 1000,
      toupper(CROPDMGEXP) == "M" ~ 1000000,
      toupper(CROPDMGEXP) == "B" ~ 1000000000,
      toupper(CROPDMGEXP) == "" | is.na(CROPDMGEXP) | CROPDMGEXP == "?" ~ 1,
      TRUE ~ 0
    ),
    Total_Property = PROPDMG * Prop_Multiplier,
    Total_Crop = CROPDMG * Crop_Multiplier,
    Total_Damage = Total_Property + Total_Crop
  ) %>%
  filter(!is.na(Total_Damage))  # Added to handle potential NA values
damage_data <- dmg_data %>%
  group_by(EVTYPE) %>%
  summarise(
    Total_Damage = sum(Total_Damage, na.rm = TRUE),
    Event_Count = n(),
    .groups = "drop"
  ) %>%
  arrange(desc(Total_Damage)) %>%
  slice_head(n = 10)
plot_data_economic <- dmg_data %>%
  filter(EVTYPE %in% damage_data$EVTYPE)
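
Because unrecognized exponent codes are given a multiplier of 0, it can be worth checking how many rows that rule affects. The sketch below shows one possible audit on made-up rows; the column name recognized and the toy values are hypothetical, and the same count could be run on the full dataset in place of the toy data frame.

```r
library(dplyr)

# Toy rows (made up): two recognized codes and three that fall through to multiplier 0
toy <- data.frame(
  PROPDMG    = c(25, 10, 5, 7, 3),
  PROPDMGEXP = c("K", "H", "5", "M", "+")
)

audit <- toy %>%
  filter(PROPDMG > 0) %>%
  mutate(
    recognized = is.na(PROPDMGEXP) | toupper(PROPDMGEXP) %in% c("K", "M", "B", "", "?")
  ) %>%
  count(recognized, PROPDMGEXP, name = "rows")

print(audit)
```

Rows with recognized equal to FALSE are the ones whose damage amounts are dropped from the totals.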

Results

Health Impact

Caption: Box plot showing the distribution of combined fatalities and injuries across the top 10 event types, with event types ordered by total impact.
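
The plotting code is not shown in this report. A minimal sketch of how a box plot matching the caption might be drawn, assuming ggplot2 and a data frame shaped like plot_data_health (columns EVTYPE and Combined_Impact). The toy values, the log scale, and the horizontal layout are assumptions, not the author's choices.

```r
library(ggplot2)

# Toy stand-in for plot_data_health (made-up values)
toy_health <- data.frame(
  EVTYPE          = c("TORNADO", "TORNADO", "TORNADO", "FLOOD", "FLOOD", "HEAT"),
  Combined_Impact = c(15, 2, 40, 3, 1, 7)
)

# Order event types by their total impact, then draw one box per type
p_health <- ggplot(
  toy_health,
  aes(x = reorder(EVTYPE, Combined_Impact, FUN = sum), y = Combined_Impact)
) +
  geom_boxplot() +
  scale_y_log10() +   # per-event counts are heavily skewed; assumes all values are > 0
  coord_flip() +
  labs(x = "Event type", y = "Fatalities + injuries per event (log scale)")

print(p_health)
```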

Economic Consequences

Caption: Scatter plot of combined property and crop damage for the top 10 event types, with transparency indicating damage magnitude.
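
The plotting code for this figure is likewise not shown. A minimal sketch of one way to produce such a scatter plot, assuming ggplot2 and a data frame shaped like plot_data_economic (columns EVTYPE and Total_Damage). Mapping alpha to Total_Damage is an interpretation of "transparency indicating damage magnitude"; the toy values and the log scale are also assumptions.

```r
library(ggplot2)

# Toy stand-in for plot_data_economic (made-up values)
toy_damage <- data.frame(
  EVTYPE       = c("FLOOD", "FLOOD", "HURRICANE", "HURRICANE", "HAIL", "HAIL"),
  Total_Damage = c(5e6, 2e9, 1e9, 8e9, 2.5e4, 1e6)
)

# One point per event; more opaque points mark larger damage amounts
p_damage <- ggplot(
  toy_damage,
  aes(x = reorder(EVTYPE, Total_Damage, FUN = sum), y = Total_Damage, alpha = Total_Damage)
) +
  geom_point(size = 3) +
  scale_y_log10() +   # damage spans several orders of magnitude; assumes all values are > 0
  coord_flip() +
  labs(x = "Event type", y = "Property + crop damage per event (log scale)",
       alpha = "Damage")

print(p_damage)
```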

Key Findings

  • Health: Tornadoes and floods show the highest combined fatalities and injuries.
  • Economy: Hurricanes and floods cause the greatest combined property and crop damage.
  • All results are reproducible using the provided R code and the raw .csv.bz2 file, which was decompressed to repdata_data_StormData.csv before loading.
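
For reproducibility, the decompression step can be skipped: read.csv opens a path through file(), which detects bzip2 compression when reading. The sketch below demonstrates this on a tiny temporary file with made-up rows; the file and its contents are hypothetical stand-ins for the raw archive.

```r
# Write a tiny bzip2-compressed CSV to a temporary file (made-up rows)
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "wt")
writeLines(c("EVTYPE,FATALITIES,INJURIES", "TORNADO,0,15", "FLOOD,1,0"), con)
close(con)

# read.csv reads the compressed file directly; no manual decompression is needed
storms <- read.csv(tmp)
print(storms)

unlink(tmp)  # remove the temporary file
```

Reading the compressed archive in the same way would let the analysis start from the raw file as distributed.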

Conclusion

The analysis shows that tornadoes and floods have the greatest combined impact on population health, while hurricanes and floods cause the greatest economic damage. The work appears to be original and submitted by the student.