Exploration of severe weather in the USA in relation to Health and Economics

Synopsis

The weather data come from the U.S. National Oceanic and Atmospheric Administration’s (NOAA) archives and cover the period from the 1950s to date. However, the range of weather events recorded was limited prior to 1996, so the following assessment covers only 1996 onward. The majority of property damage (the largest economic impact) results from flooding, avalanche, smoke, hail, and high-wind events (both tornadoes and thunderstorm winds), while by far the largest damage to agriculture comes from drought. The majority of fatalities were caused by excessive heat and tornadoes.

Data Set-up

As mentioned, the data was filtered to 1996 onwards. The “EVTYPE” (event type) field was matched against the recognized list of event types, and each event's true damage value was calculated from its reported amount and exponent code so that events could be compared.

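# Packages used below: dplyr (%>%, select, filter), stringdist (amatch), ggplot2 and gridExtra (plots), knitr (kable)
library(dplyr); library(stringdist); library(ggplot2); library(gridExtra); library(knitr)

# Loading and processing data:
file <- "repdata_data_StormData.csv.bz2" # Compressed (bzip2) file
# system.time(R.utils::bunzip2(file, remove = FALSE)) # Optional: decompress the file and time the step

file2 <- "repdata_data_StormData.csv" # Decompressed file name
raw_data <- read.csv(file2, header = TRUE, sep = ",") # Load data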

file_eventname <- "ExtremeEvent_list.txt" # Event list from NOAA
eventname <- read.table(file_eventname, sep = ",") # Loading list

# Subsetting the data:
raw_data$Date2 <- as.Date(as.character(raw_data$BGN_DATE), "%m/%d/%Y %H:%M:%S") # new column of parsed dates
data1996 <- raw_data %>%
  select(STATE, EVTYPE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP, Date2) %>%
  filter(as.numeric(format(Date2, "%Y")) >= 1996) # keep required columns and events from 1996 onward

data1996_prop <- filter(data1996, PROPDMG > 0) # keep rows with property damage
data1996_crop <- filter(data1996, CROPDMG > 0) # keep rows with crop damage
data1996_people <- filter(data1996, FATALITIES > 0 | INJURIES > 0) # keep rows with fatalities or injuries
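The date parsing and year filter can be checked on a few rows before running them on the full file. The sketch below uses made-up BGN_DATE strings (the `toy` data frame and its values are illustrative, not taken from the NOAA file) and base R only. It converts the year to a number before comparing, which avoids relying on a string-to-number comparison.

```r
# Hypothetical BGN_DATE strings in the month/day/year layout the report parses.
toy <- data.frame(
  BGN_DATE = c("4/18/1950 0:00:00", "12/31/1995 0:00:00",
               "1/1/1996 0:00:00", "1/1/2006 0:00:00"),
  stringsAsFactors = FALSE
)

# Parse the date part; the time component is read but dropped by as.Date.
toy$Date2 <- as.Date(toy$BGN_DATE, format = "%m/%d/%Y %H:%M:%S")

# Extract the year as a number so the comparison is numeric, not textual.
toy$Year <- as.numeric(format(toy$Date2, "%Y"))

# Keep events from 1996 onward.
kept <- toy[toy$Year >= 1996, ]
print(kept)
```

Only the 1996 and 2006 rows should remain in `kept`.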

# Convert a damage exponent code into a multiplier: B/M/K/H (either case) mean
# 10^9/10^6/10^3/10^2 and a digit d means 10^d.
# Assumption: a blank or unrecognised code is treated as a multiplier of 1,
# so those rows keep their reported amount instead of becoming NA.
exp_multiplier <- function(code) {
  code <- toupper(as.character(code))
  digit <- suppressWarnings(as.numeric(code)) # NA for non-numeric codes
  mult <- ifelse(is.na(digit), 1, 10^digit)
  mult[code %in% "B"] <- 10^9
  mult[code %in% "M"] <- 10^6
  mult[code %in% "K"] <- 10^3
  mult[code %in% "H"] <- 10^2
  mult
}

data1996_prop$Total <- data1996_prop$PROPDMG * exp_multiplier(data1996_prop$PROPDMGEXP) # total property damage
data1996_crop$Total <- data1996_crop$CROPDMG * exp_multiplier(data1996_crop$CROPDMGEXP) # total crop damage

maxProp <- data1996_prop[which.max(data1996_prop$Total),]
maxCrop <- data1996_crop[which.max(data1996_crop$Total),]
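As a worked check of the exponent arithmetic and the `which.max` lookup, the sketch below decodes three toy records with a named lookup vector. The first row reuses the amount and exponent of the flood record quoted in the Conclusions (115, "B"); the other two rows and the names `toy` and `power` are hypothetical.

```r
# Toy records: row 1 uses the values quoted in the Conclusions, rows 2-3 are made up.
toy <- data.frame(
  EVTYPE     = c("FLOOD", "HAIL", "TORNADO"),
  PROPDMG    = c(115, 2.5, 40),
  PROPDMGEXP = c("B", "K", "m"),
  stringsAsFactors = FALSE
)

# Named lookup vector: exponent code -> power of ten.
power <- c(H = 2, K = 3, M = 6, B = 9)

# Upper-case the code, look up its power, and scale the reported amount.
toy$Total <- toy$PROPDMG * 10^unname(power[toupper(toy$PROPDMGEXP)])

# Record with the largest total damage.
largest <- toy[which.max(toy$Total), ]
print(toy)
print(largest)
```

Here 115 with code "B" becomes 115 × 10^9 = 1.15 × 10^11, so the flood row is the one `which.max` selects.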

# Match each raw EVTYPE to the closest official event name (optimal string alignment distance).
# Note: maxDist = 25 is very permissive, so almost every label is assigned some official name.
MatchProp <- amatch(data1996_prop$EVTYPE, eventname$V1, method = "osa", maxDist = 25)
data1996_prop$NewName <- as.character(eventname$V1)[MatchProp] # reassign names to storms

MatchCrop <- amatch(data1996_crop$EVTYPE, eventname$V1, method = "osa", maxDist = 25)
data1996_crop$NewName <- as.character(eventname$V1)[MatchCrop] # reassign names to storms

Matchpeople <- amatch(data1996_people$EVTYPE, eventname$V1, method = "osa", maxDist = 25)
data1996_people$NewName <- as.character(eventname$V1)[Matchpeople] # reassign names to storms
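The effect of the distance threshold on the name matching can be illustrated without the full data set. The sketch below is not the `amatch` call used in the report: it swaps in base R's `adist` (generalized Levenshtein distance, close to but not identical to the "osa" method) so that it runs without extra packages. The labels, the short event list, and the helper `closest_event` are all hypothetical.

```r
# Hypothetical raw labels and a made-up excerpt of an official event list.
raw_labels <- c("FLASH FLOODING", "EXCESSIVE HEAT", "OTHER")
official   <- c("Thunderstorm Wind", "Flash Flood", "Excessive Heat",
                "Dense Smoke", "Avalanche")

# Return the closest official name for each label, or NA when even the
# best candidate is further away than max_dist edits.
closest_event <- function(labels, choices, max_dist) {
  d <- adist(toupper(labels), toupper(choices)) # rows: labels, columns: choices
  best <- apply(d, 1, which.min)                # index of the nearest choice
  best_dist <- d[cbind(seq_along(labels), best)]
  ifelse(best_dist <= max_dist, choices[best], NA_character_)
}

loose  <- closest_event(raw_labels, official, max_dist = 25)
strict <- closest_event(raw_labels, official, max_dist = 4)

print(data.frame(raw_labels, loose, strict))
```

With the loose threshold every label receives some official name, including the unrelated label "OTHER"; with the stricter threshold that label is left as NA and can be reviewed by hand. Which threshold suits the real data is a judgment call that this sketch does not settle.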

Results

The charts below show that while flooding affects both property and agriculture, drought has by far the largest impact on the agricultural sector.

# Filter on top offenders:
data_prop_sort <- data1996_prop %>% group_by(NewName) %>%
  summarise(total = sum(Total))
data_crop_sort <- data1996_crop %>% group_by(NewName) %>%
  summarise(total = sum(Total))
bigOff_prop <- data_prop_sort[order(-data_prop_sort$total),] # order property damage, largest first
bigOff_crop <- data_crop_sort[order(-data_crop_sort$total),] # order agricultural damage, largest first
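The group-and-rank step can be tried on a handful of rows. The sketch below uses base R's `aggregate` in place of `group_by`/`summarise`, with made-up totals. Note that `order()` sorts ascending by default, so the key is negated to put the largest totals first.

```r
# Hypothetical per-record damage totals in dollars.
toy <- data.frame(
  NewName = c("Flood", "Hail", "Flood", "Drought", "Hail"),
  Total   = c(5e6, 2e5, 1e6, 3e6, 1e5),
  stringsAsFactors = FALSE
)

# Sum the damage per event type.
totals <- aggregate(Total ~ NewName, data = toy, FUN = sum)

# Rank event types from largest to smallest total.
ranked <- totals[order(-totals$Total), ]
print(ranked)
```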


# Bar graphs of economic loss:
p1 <- ggplot(aes(x = NewName, y = Total), data = data1996_prop) +
  geom_bar(position = "stack", stat = "identity") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1), legend.position = "none") +
  ggtitle("Financial property damage per environmental catastrophe") + labs(x = "Event Type", y = "Total damage (USD)")

p2 <- ggplot(aes(x = NewName, y = Total), data = data1996_crop) +
  geom_bar(position = "stack", stat = "identity") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1), legend.position = "bottom") +
  ggtitle("Financial agricultural damage per environmental catastrophe") + labs(x = "Event Type", y = "Total damage (USD)")

grid.arrange(p1, p2, nrow = 2)

The majority of fatalities result from excessive heat, with tornadoes a close second, while the vast majority of injuries result from tornadoes. This ties in with the heavy property damage caused by the same high-wind events.

# Fatalities: 
data_people_sort <- data1996_people %>% group_by(NewName) %>% summarise(Fatalities = sum(FATALITIES))
bigOff_people <- data_people_sort[order(-data_people_sort$Fatalities),] #order fatalities 

# Injuries: 
data_people_sort2 <- data1996_people %>% group_by(NewName) %>% summarise(Injury = sum(INJURIES))
MinOff_people <- data_people_sort2[order(-data_people_sort2$Injury),] 


# Bar graph on People loss:
c1 <- ggplot(aes(x= NewName, y = Fatalities), data = bigOff_people) + 
  geom_bar(position = "stack", stat = "identity") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) + 
  ggtitle("Fatalities per environmental catastrophe") + labs(x = "Event Type")

c2 <- ggplot(aes(x= NewName, y = Injury), data = MinOff_people) + 
  geom_bar(position = "stack", stat = "identity") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1), legend.position = "bottom") + 
  ggtitle("Injuries per environmental catastrophe") + labs(x = "Event Type")

grid.arrange(c1,c2,nrow = 2)

The table below lists the fatalities and injuries per state, sorted by fatalities in ascending order. Texas records the most fatalities and injuries in absolute terms; the counts are not adjusted for population.

data_people_sum <- data1996_people %>% group_by(STATE) %>%summarise(Fatality = sum(FATALITIES), Injury = sum(INJURIES))

data_people_sum <- data_people_sum[order(data_people_sum$Fatality),]

kable(data_people_sum,caption = "Fatalities and Injuries by State")
Fatalities and Injuries by State
STATE Fatality Injury
GM 1 0
LS 1 0
PH 1 0
LM 4 2
PZ 5 3
RI 6 25
VI 7 2
AM 10 30
AN 12 23
DC 13 369
VT 19 41
ME 22 130
DE 24 255
NH 24 139
HI 33 81
MA 34 687
CT 35 172
SD 36 473
AS 41 164
ND 41 265
ID 42 173
NE 42 350
MT 52 150
WY 52 309
IA 61 984
NM 61 168
AK 62 104
WV 67 124
MN 72 513
OR 72 201
GU 81 416
NV 89 205
MI 110 1195
WI 110 806
PR 111 50
VA 114 902
KY 117 850
WA 119 258
UT 130 979
SC 131 559
IN 133 835
KS 140 845
MD 141 1293
LA 144 812
CO 147 662
NJ 147 936
OH 158 895
GA 160 1666
MS 160 1217
AZ 175 635
OK 219 2375
AR 228 1656
NC 263 1378
NY 268 908
TN 327 2385
AL 449 3707
PA 492 1450
CA 498 2769
MO 533 5960
FL 544 2884
IL 586 1328
TX 756 9222
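Several STATE codes in the table (GM, LS and PZ, for example) are not among the 50 state abbreviations. A quick way to flag them is to compare against `state.abb`, a constant that ships with base R. The sketch below copies a few rows from the table above. Reading the flagged codes as territory, district, or marine-zone identifiers is an assumption, not something the source confirms.

```r
# A few rows copied from the table above (STATE and Fatality columns).
toy <- data.frame(
  STATE    = c("GM", "LS", "PZ", "RI", "DC", "PR", "IL", "TX"),
  Fatality = c(1, 1, 5, 6, 13, 111, 586, 756),
  stringsAsFactors = FALSE
)

# state.abb holds the 50 U.S. state abbreviations; DC and territories are not in it.
toy$IsState <- toy$STATE %in% state.abb

# Codes that do not correspond to one of the 50 states.
non_states <- toy$STATE[!toy$IsState]
print(non_states)
```

Whether such rows should be kept, relabelled, or excluded from a per-state comparison depends on the question being asked.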

Conclusions

It is therefore possible to conclude that the most severe weather events in relation to property are flooding, hail and high wind, while the greatest impact on agriculture comes from drought. Most individual events are not massively impactful, but an occasional event results in very large damage. One such record is a California flood (fields: STATE, EVTYPE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP, Date2, Total): CA, FLOOD, 0, 0, 115, B, 32.5, M, 2006-01-01, 1.15 × 10^11. The most significant impacts on human health come from excessive heat and tornadoes, with tornadoes causing the most injuries.