Severe weather events have had a large impact on society in recent years. Floods caused the greatest economic damage, more than 150 billion US dollars in combined property and crop damage. Tornadoes caused the most deaths and injuries, with almost 97,000 injuries and fatalities combined.
Performed steps:
Load the data from csv file
Remove some unused variables to save memory
Remove original data to save memory
Calculate the damage
Using levels(), it was found that the exponent columns contain many values that are not explained in the documentation. These values were ignored, that is, treated as a multiplier of 1. Only K, M and B (in either case) were interpreted, as 10^3, 10^6 and 10^9 respectively.
Property damage & crop damage were summed up to get the total damage
Calculate total fatalities and injuries
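The exponent inspection described in the steps above can be sketched on a toy vector. The codes below are made up for illustration and are not taken from the storm data; `exp_codes`, `documented` and `undocumented` are hypothetical names.

```r
# Toy exponent codes (illustrative only, not the real data)
exp_codes <- factor(c("K", "M", "", "B", "k", "5", "?", "K"))

# levels() lists every distinct code present in the column
levels(exp_codes)

# table() shows how often each code occurs
table(exp_codes)

# Codes other than the documented K, M and B (compared case-insensitively)
documented <- c("K", "M", "B")
undocumented <- setdiff(toupper(levels(exp_codes)), documented)
undocumented
```

In this toy vector the empty string, "5" and "?" are the codes left over, which is the kind of value the analysis treats as a multiplier of 1.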
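# Load the data. stringsAsFactors = TRUE keeps text columns as factors, which
# the levels() calls below rely on (it is no longer the default in R >= 4.0)
storm_data_raw <- read.csv("~/Downloads/Data/repdata-data-StormData.csv",
                           stringsAsFactors = TRUE)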
# Remove unnecessary columns
good_columns <- c("EVTYPE",                  # Event type
                  "FATALITIES", "INJURIES",  # Fatalities & injuries
                  "PROPDMG", "PROPDMGEXP",   # Property damage & its exponent
                  "CROPDMG", "CROPDMGEXP")   # Crop damage & its exponent
storm_data <- storm_data_raw[, good_columns]
summary(storm_data)
##                EVTYPE         FATALITIES          INJURIES
##  HAIL             :288661   Min.   :  0.0000   Min.   :   0.0000
##  TSTM WIND        :219940   1st Qu.:  0.0000   1st Qu.:   0.0000
##  THUNDERSTORM WIND: 82563   Median :  0.0000   Median :   0.0000
##  TORNADO          : 60652   Mean   :  0.0168   Mean   :   0.1557
##  FLASH FLOOD      : 54277   3rd Qu.:  0.0000   3rd Qu.:   0.0000
##  FLOOD            : 25326   Max.   :583.0000   Max.   :1700.0000
##  (Other)          :170878
##     PROPDMG          PROPDMGEXP        CROPDMG          CROPDMGEXP
##  Min.   :   0.00          :465934   Min.   :  0.000          :618413
##  1st Qu.:   0.00   K      :424665   1st Qu.:  0.000   K      :281832
##  Median :   0.00   M      : 11330   Median :  0.000   M      :  1994
##  Mean   :  12.06   0      :   216   Mean   :  1.527   k      :    21
##  3rd Qu.:   0.50   B      :    40   3rd Qu.:  0.000   0      :    19
##  Max.   :5000.00   5      :    28   Max.   :990.000   B      :     9
##                    (Other):    84                     (Other):     9
# Remove original data to save space
remove(storm_data_raw)
# Calculate the total damage
# The replacement vectors follow the sorted order of the existing levels;
# only B, K/k and M/m get a multiplier other than 1
levels(storm_data$PROPDMGEXP) <- c(
    "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1",
    "1000000000", "1", "1", "1000", "1000000", "1000000")
levels(storm_data$CROPDMGEXP) <- c(
    "1", "1", "1", "1", "1000000000", "1000",
    "1000", "1000000", "1000000")
# as.numeric avoids any risk of integer overflow for large multipliers
storm_data$PROPDMG <- storm_data$PROPDMG *
    as.numeric(as.character(storm_data$PROPDMGEXP))
storm_data$CROPDMG <- storm_data$CROPDMG *
    as.numeric(as.character(storm_data$CROPDMGEXP))
storm_data$DAMAGE <- storm_data$PROPDMG + storm_data$CROPDMG
# Calculate total injuries & fatalities
storm_data$HEALTH <- storm_data$INJURIES + storm_data$FATALITIES
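The positional `levels()` assignment above depends on the exact order of the factor levels. As an alternative sketch, not the method used in this analysis, a named lookup vector can map each code to its multiplier explicitly. The helper name `exp_to_multiplier` and the toy values are hypothetical; codes other than K, M and B fall back to 1, mirroring the choice made in the analysis.

```r
# Map exponent codes to numeric multipliers via a named lookup vector.
# Hypothetical helper, shown on toy data only.
exp_to_multiplier <- function(codes) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  # Upper-case the codes so that "k" and "m" match "K" and "M"
  mult <- unname(lookup[toupper(as.character(codes))])
  # Any code not in the lookup (including "") becomes a multiplier of 1
  mult[is.na(mult)] <- 1
  mult
}

# Toy damage values and exponent codes (illustrative only)
toy_dmg <- c(2.5, 10, 3, 7)
toy_exp <- c("M", "k", "", "B")
toy_total_dmg <- toy_dmg * exp_to_multiplier(toy_exp)
toy_total_dmg
```

Because the lookup is keyed by name, the result does not change if the set or order of codes in the data changes.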
Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?
total <- sort(
    tapply(storm_data$HEALTH, storm_data$EVTYPE, sum),
    decreasing = TRUE)
barplot(head(total, 3),
        main = "Most harmful events",
        xlab = "Event type",
        ylab = "Total fatalities and injuries")
max(total)
## [1] 96979
The figure shows that tornadoes caused the most injuries and fatalities (96,979 combined), significantly more than any other type of event.
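To make the aggregation concrete, here is a minimal sketch of the same `tapply()` and `sort()` pattern on a toy data frame. The event counts are invented for illustration, and `toy` and `toy_total` are hypothetical names.

```r
# Toy data: one row per event, with combined injuries and fatalities
toy <- data.frame(
  EVTYPE = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  HEALTH = c(10, 2, 5, 4, 1)
)

# Sum HEALTH within each event type, then rank from largest to smallest
toy_total <- sort(tapply(toy$HEALTH, toy$EVTYPE, sum), decreasing = TRUE)
toy_total

# The name of the first element is the most harmful event type
names(toy_total)[1]
```

The sorted vector keeps the event types as names, which is why `head(total, 3)` can be passed straight to `barplot()` in the analysis.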
total <- sort(
    tapply(storm_data$DAMAGE, storm_data$EVTYPE, sum),
    decreasing = TRUE)
barplot(head(total, 3),
        main = "Most damaging events",
        xlab = "Event type",
        ylab = "Total damage")
max(total)
## [1] 150319678257
The figure shows that floods caused the greatest damage (around 150 billion US dollars in combined property and crop damage), much more than any other type of event.
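As a small worked check of the figure quoted above, the raw total can be converted to billions of dollars. The variable names are introduced here for illustration; the value is the `max(total)` output shown above.

```r
# Total flood damage in US dollars, copied from the max(total) output
flood_damage <- 150319678257

# Express the total in billions, rounded to one decimal place
flood_billions <- round(flood_damage / 1e9, 1)
flood_billions

# Print the full figure with thousands separators for readability
flood_pretty <- formatC(flood_damage, format = "f", digits = 0, big.mark = ",")
flood_pretty
```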