Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.
This document provides an analysis of the effects of storms and other severe weather conditions on population health and property damage in affected areas. It uses data from the U.S. National Oceanic and Atmospheric Administration's (NOAA) storm database. Exploratory data analysis is performed to determine the total number of casualties and the total property damage for each type of event.
For this analysis, the data is obtained from https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2. To reproduce the analysis, this file must be downloaded into the R working directory.
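When the report is re-run often, it may be convenient to skip the download if the file is already present. The following is a minimal sketch of that idea; `fetch_once` is a hypothetical helper name introduced here for illustration and is not part of the original analysis.

```r
# Hypothetical helper: download `url` to `dest` only if `dest` does not exist yet.
fetch_once <- function(url, dest) {
  if (!file.exists(dest)) {
    # mode = "wb" keeps the compressed file byte-for-byte intact on all platforms
    download.file(url, dest, mode = "wb")
  }
  invisible(dest)
}

# Usage (needs network access on the first run only):
# fetch_once("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",
#            "stormData.csv.bz2")
```

The guard only checks that the file exists; it does not verify that an earlier download completed successfully.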
After the data is downloaded, it is read into R. Some of its characteristics are then examined:
fileUrl <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
download.file(fileUrl,"stormData.csv.bz2",method="curl")
data <- read.csv("stormData.csv.bz2", header=TRUE)
dim(data)
## [1] 902297 37
names(data)
## [1] "STATE__" "BGN_DATE" "BGN_TIME" "TIME_ZONE" "COUNTY"
## [6] "COUNTYNAME" "STATE" "EVTYPE" "BGN_RANGE" "BGN_AZI"
## [11] "BGN_LOCATI" "END_DATE" "END_TIME" "COUNTY_END" "COUNTYENDN"
## [16] "END_RANGE" "END_AZI" "END_LOCATI" "LENGTH" "WIDTH"
## [21] "F" "MAG" "FATALITIES" "INJURIES" "PROPDMG"
## [26] "PROPDMGEXP" "CROPDMG" "CROPDMGEXP" "WFO" "STATEOFFIC"
## [31] "ZONENAMES" "LATITUDE" "LONGITUDE" "LATITUDE_E" "LONGITUDE_"
## [36] "REMARKS" "REFNUM"
levels(data$PROPDMGEXP)
## [1] "" "-" "?" "+" "0" "1" "2" "3" "4" "5" "6" "7" "8" "B" "h" "H" "K"
## [18] "m" "M"
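Before filtering on the exponent column, it can help to count how often each code occurs, so that the share of rows discarded by a filter is known. The sketch below uses a small made-up vector `exp_codes` in place of `data$PROPDMGEXP`; the values are illustrative only.

```r
# Made-up exponent codes standing in for data$PROPDMGEXP (illustration only).
exp_codes <- c("K", "K", "", "M", "5", "k", "?", "B", "")

# Frequency of each code after folding case, so "k" and "K" are counted together.
counts <- table(toupper(exp_codes))
counts

# Logical flag for codes that are letter multipliers, and the share of rows kept.
valid <- toupper(exp_codes) %in% c("H", "K", "M", "B")
mean(valid)
```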
From the column names, we can conclude that the columns of interest are EVTYPE, FATALITIES, INJURIES, PROPDMG and PROPDMGEXP.
We can now construct a new data frame containing only the columns we are interested in. We also filter the PROPDMGEXP column to keep only rows with a letter multiplier (H, K, M or B, in either case); rows with numeric or other codes are dropped. We then replace the values in column PROPDMG with the total amount of damage in dollars. Finally, we keep only the rows in which fatalities, injuries and property damage are all greater than 0, so events with fatalities but no recorded property damage, for example, are excluded:
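# keep EVTYPE, FATALITIES, INJURIES, PROPDMG and PROPDMGEXP (columns 8 and 23 to 26)
cleaned <- data[,c("EVTYPE","FATALITIES","INJURIES","PROPDMG","PROPDMGEXP")]
# keep only rows whose PROPDMGEXP is a letter multiplier (H, K, M or B, either case)
cleaned <- cleaned[cleaned$PROPDMGEXP %in% c("k","K","m","M","b","B","h","H"),]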
# set total damage in PROPDMG column for each PROPDMGEXP value
cleaned$PROPDMG[toupper(cleaned$PROPDMGEXP) == "H"] <- 100 * cleaned$PROPDMG[toupper(cleaned$PROPDMGEXP) == "H"]
cleaned$PROPDMG[toupper(cleaned$PROPDMGEXP) == "K"] <- 1000 * cleaned$PROPDMG[toupper(cleaned$PROPDMGEXP) == "K"]
cleaned$PROPDMG[toupper(cleaned$PROPDMGEXP) == "M"] <- 1000000 * cleaned$PROPDMG[toupper(cleaned$PROPDMGEXP) == "M"]
cleaned$PROPDMG[toupper(cleaned$PROPDMGEXP) == "B"] <- 1000000000 * cleaned$PROPDMG[toupper(cleaned$PROPDMGEXP) == "B"]
# keep only rows with at least one fatality, at least one injury and non-zero property damage
cleaned <- cleaned[cleaned$FATALITIES > 0 & cleaned$INJURIES > 0 & cleaned$PROPDMG > 0,]
dim(cleaned)
## [1] 1748 5
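The four per-letter assignments above can also be expressed with a single named lookup vector. The following is a sketch on made-up rows, not the code used to produce the results in this document; `toy`, `multiplier` and `damage_usd` are names introduced for illustration.

```r
# Made-up rows standing in for the storm data (illustration only).
toy <- data.frame(
  PROPDMG    = c(2.5, 10, 1, 3),
  PROPDMGEXP = c("K", "m", "B", "h"),
  stringsAsFactors = FALSE
)

# Named lookup vector: exponent letter -> dollar multiplier.
multiplier <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

# Index the lookup by the upper-cased exponent; unname() drops the letter labels.
toy$damage_usd <- toy$PROPDMG * unname(multiplier[toupper(toy$PROPDMGEXP)])
toy$damage_usd
```

An exponent that is missing from the lookup yields `NA`, which makes unexpected codes visible instead of silently leaving the value unscaled.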
Now, we want to find the sum of fatalities, injuries and property damage for each type of event:
eventTypes <- cleaned[!duplicated(cleaned$EVTYPE),1]
finalData <- data.frame("evtypes" = eventTypes,"fatals"=c(0),"injured"= c(0),"damage"=c(0))
i <- 1
for (event in eventTypes) {
  # rows belonging to the current event type
  r <- cleaned[cleaned[["EVTYPE"]] == event, ]
  fatalities <- sum(r$FATALITIES, na.rm = TRUE)
  injuries <- sum(r$INJURIES, na.rm = TRUE)
  dmg <- sum(r$PROPDMG, na.rm = TRUE)
  # store the totals in row i of the result
  finalData[i, 2] <- fatalities
  finalData[i, 3] <- injuries
  finalData[i, 4] <- dmg
  i <- i + 1
}
finalData
## evtypes fatals injured damage
## 1 TORNADO 4750 54378 2.548e+10
## 2 DENSE FOG 13 150 2.300e+06
## 3 LIGHTNING 19 39 3.470e+05
## 4 THUNDERSTORM WINDS 20 41 7.895e+07
## 5 HIGH SEAS 1 1 5.000e+02
## 6 WINTER STORM 49 491 1.013e+08
## 7 WILD FIRES 3 150 6.190e+08
## 8 WINTER STORM HIGH WINDS 1 15 6.000e+07
## 9 WINTER STORMS 10 17 5.000e+05
## 10 HIGH WINDS 9 39 7.061e+07
## 11 HIGH SURF 1 1 1.000e+05
## 12 FLASH FLOOD 122 620 8.669e+08
## 13 TROPICAL STORM GORDON 8 43 5.000e+05
## 14 FLOOD 92 2670 1.650e+09
## 15 BLIZZARD 38 704 2.544e+08
## 16 THUNDERSNOW 1 1 5.000e+04
## 17 WIND 4 5 1.050e+05
## 18 HEAT 22 320 1.250e+05
## 19 WATERSPOUT/TORNADO 3 39 5.000e+07
## 20 EXCESSIVE HEAT 54 49 2.000e+05
## 21 FLOOD/FLASH FLOOD 2 3 9.100e+06
## 22 SNOW 3 7 5.500e+06
## 23 HEAT WAVE 13 52 5.000e+05
## 24 ICE STORM 24 1699 2.712e+08
## 25 HEAT WAVE DROUGHT 4 15 2.000e+05
## 26 EXTREME COLD 7 135 1.025e+07
## 27 HEAVY SNOW 25 149 1.417e+08
## 28 FOG 30 245 6.652e+06
## 29 HIGH WINDS/SNOW 3 6 5.000e+04
## 30 GLAZE 7 8 6.600e+05
## 31 THUNDERSTORM WIND 36 104 1.091e+08
## 32 HIGH WIND AND SEAS 3 20 5.000e+04
## 33 FREEZING RAIN 2 5 5.000e+05
## 34 HIGH WIND 67 238 1.016e+09
## 35 ICE 1 4 5.000e+04
## 36 TROPICAL STORM 9 270 6.295e+08
## 37 Marine Accident 1 2 5.000e+04
## 38 HEAVY RAIN 32 29 4.079e+06
## 39 TSTM WIND 65 306 2.213e+08
## 40 HURRICANE 7 16 9.384e+08
## 41 URBAN/SML STREAM FLD 7 11 1.750e+05
## 42 ICY ROADS 4 5 1.020e+05
## 43 blowing snow 1 1 1.500e+04
## 44 FREEZING DRIZZLE 2 13 3.500e+04
## 45 FROST 1 3 1.500e+04
## 46 LANDSLIDES 1 1 5.000e+03
## 47 AVALANCHE 5 6 6.090e+05
## 48 WILD/FOREST FIRE 2 11 2.550e+05
## 49 STORM SURGE 2 2 1.000e+06
## 50 TSTM WIND/HAIL 2 2 1.100e+04
## 51 LIGHT SNOW 1 2 5.000e+04
## 52 HEAVY SURF 1 1 2.500e+03
## 53 HURRICANE/TYPHOON 32 1219 1.172e+10
## 54 DUST STORM 2 29 9.600e+04
## 55 WILDFIRE 54 259 2.248e+09
## 56 LANDSLIDE 21 27 6.775e+06
## 57 STRONG WIND 16 21 3.770e+05
## 58 WINTER WEATHER/MIX 12 32 1.647e+06
## 59 WINTER WEATHER 6 163 3.135e+06
## 60 HAIL 1 2 5.000e+03
## 61 MARINE STRONG WIND 2 3 4.500e+04
## 62 TSUNAMI 32 129 8.100e+07
## 63 MARINE THUNDERSTORM WIND 2 1 4.000e+03
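The same per-event totals could also be computed without an explicit loop, for example with base R's `aggregate()`. The sketch below runs on a small made-up data frame `toy` whose column names mirror the cleaned data; the numbers are illustrative and are not taken from the storm database.

```r
# Made-up rows with the same column names as the cleaned data (illustration only).
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "FLOOD", "HAIL"),
  FATALITIES = c(2, 1, 3, 4, 1),
  INJURIES   = c(10, 5, 20, 1, 2),
  PROPDMG    = c(1e6, 5e5, 2e6, 1e5, 5e3)
)

# Sum the three measures within each event type in one call.
totals <- aggregate(cbind(FATALITIES, INJURIES, PROPDMG) ~ EVTYPE,
                    data = toy, FUN = sum)
totals
```

Note that `aggregate()` returns the groups sorted by `EVTYPE`, whereas the loop keeps event types in order of first appearance.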
From this data, we can see that tornados cause by far the most injuries and fatalities. Ordering the final data by number of fatalities shows that floods and strong winds follow, significantly behind tornados:
finalData[order(finalData$fatals,decreasing=TRUE),]
## evtypes fatals injured damage
## 1 TORNADO 4750 54378 2.548e+10
## 12 FLASH FLOOD 122 620 8.669e+08
## 14 FLOOD 92 2670 1.650e+09
## 34 HIGH WIND 67 238 1.016e+09
## 39 TSTM WIND 65 306 2.213e+08
## 20 EXCESSIVE HEAT 54 49 2.000e+05
## 55 WILDFIRE 54 259 2.248e+09
## 6 WINTER STORM 49 491 1.013e+08
## 15 BLIZZARD 38 704 2.544e+08
## 31 THUNDERSTORM WIND 36 104 1.091e+08
## 38 HEAVY RAIN 32 29 4.079e+06
## 53 HURRICANE/TYPHOON 32 1219 1.172e+10
## 62 TSUNAMI 32 129 8.100e+07
## 28 FOG 30 245 6.652e+06
## 27 HEAVY SNOW 25 149 1.417e+08
## 24 ICE STORM 24 1699 2.712e+08
## 18 HEAT 22 320 1.250e+05
## 56 LANDSLIDE 21 27 6.775e+06
## 4 THUNDERSTORM WINDS 20 41 7.895e+07
## 3 LIGHTNING 19 39 3.470e+05
## 57 STRONG WIND 16 21 3.770e+05
## 2 DENSE FOG 13 150 2.300e+06
## 23 HEAT WAVE 13 52 5.000e+05
## 58 WINTER WEATHER/MIX 12 32 1.647e+06
## 9 WINTER STORMS 10 17 5.000e+05
## 10 HIGH WINDS 9 39 7.061e+07
## 36 TROPICAL STORM 9 270 6.295e+08
## 13 TROPICAL STORM GORDON 8 43 5.000e+05
## 26 EXTREME COLD 7 135 1.025e+07
## 30 GLAZE 7 8 6.600e+05
## 40 HURRICANE 7 16 9.384e+08
## 41 URBAN/SML STREAM FLD 7 11 1.750e+05
## 59 WINTER WEATHER 6 163 3.135e+06
## 47 AVALANCHE 5 6 6.090e+05
## 17 WIND 4 5 1.050e+05
## 25 HEAT WAVE DROUGHT 4 15 2.000e+05
## 42 ICY ROADS 4 5 1.020e+05
## 7 WILD FIRES 3 150 6.190e+08
## 19 WATERSPOUT/TORNADO 3 39 5.000e+07
## 22 SNOW 3 7 5.500e+06
## 29 HIGH WINDS/SNOW 3 6 5.000e+04
## 32 HIGH WIND AND SEAS 3 20 5.000e+04
## 21 FLOOD/FLASH FLOOD 2 3 9.100e+06
## 33 FREEZING RAIN 2 5 5.000e+05
## 44 FREEZING DRIZZLE 2 13 3.500e+04
## 48 WILD/FOREST FIRE 2 11 2.550e+05
## 49 STORM SURGE 2 2 1.000e+06
## 50 TSTM WIND/HAIL 2 2 1.100e+04
## 54 DUST STORM 2 29 9.600e+04
## 61 MARINE STRONG WIND 2 3 4.500e+04
## 63 MARINE THUNDERSTORM WIND 2 1 4.000e+03
## 5 HIGH SEAS 1 1 5.000e+02
## 8 WINTER STORM HIGH WINDS 1 15 6.000e+07
## 11 HIGH SURF 1 1 1.000e+05
## 16 THUNDERSNOW 1 1 5.000e+04
## 35 ICE 1 4 5.000e+04
## 37 Marine Accident 1 2 5.000e+04
## 43 blowing snow 1 1 1.500e+04
## 45 FROST 1 3 1.500e+04
## 46 LANDSLIDES 1 1 5.000e+03
## 51 LIGHT SNOW 1 2 5.000e+04
## 52 HEAVY SURF 1 1 2.500e+03
## 60 HAIL 1 2 5.000e+03
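Several labels in the table appear to describe the same kind of event (for example TSTM WIND, THUNDERSTORM WIND and THUNDERSTORM WINDS, or WILDFIRE, WILD FIRES and WILD/FOREST FIRE). One possible refinement, not applied in this analysis, is to normalize the labels before summing. The sketch below shows the idea; the specific rewrite rules are assumptions chosen for illustration.

```r
# A few labels copied from the table above.
labels <- c("TSTM WIND", "THUNDERSTORM WINDS", "THUNDERSTORM WIND",
            "WILD FIRES", "WILDFIRE", "WILD/FOREST FIRE", "blowing snow")

# Fold case and strip surrounding whitespace.
norm <- toupper(trimws(labels))

# Assumed rules: expand the TSTM abbreviation and drop the plural form.
norm <- sub("^TSTM", "THUNDERSTORM", norm)
norm <- sub("^THUNDERSTORM WINDS$", "THUNDERSTORM WIND", norm)

# Assumed rule: treat every label starting with WILD and containing FIRE as WILDFIRE.
norm[grepl("^WILD.*FIRE", norm)] <- "WILDFIRE"

unique(norm)
```

Merging labels this way would change the per-event totals, so any such mapping should be checked against the database documentation before it is used.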
The following plots compare fatalities and injuries across the different types of events:
par(mfrow=c(1,2))
barplot(finalData$fatals,names.arg=finalData$evtypes, main="Total number of fatalities per event",xlab="Event type",ylab="No. of fatalities")
barplot(finalData$injured,names.arg=finalData$evtypes, main="Total number of injured per event", xlab="Event type",ylab="No. of injuries")
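With 63 event types on one axis, most labels cannot be drawn legibly. A possible variant is to plot only the highest-ranked event types. The sketch below uses a handful of fatality totals copied from the table above; the data frame name `fatal` and the cutoff of five are assumptions made for illustration.

```r
# Fatality totals for a few event types, copied from the table above.
fatal <- data.frame(
  evtypes = c("HAIL", "FLOOD", "TORNADO", "TSTM WIND", "FLASH FLOOD", "HIGH WIND"),
  fatals  = c(1, 92, 4750, 65, 122, 67),
  stringsAsFactors = FALSE
)

# Sort in decreasing order of fatalities and keep the first five rows.
top <- head(fatal[order(fatal$fatals, decreasing = TRUE), ], 5)

# las = 2 rotates the axis labels; cex.names shrinks them so they fit.
barplot(top$fatals, names.arg = top$evtypes, las = 2, cex.names = 0.7,
        main = "Top 5 event types by fatalities", ylab = "No. of fatalities")
```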
As for property damage, tornados are still the main culprit, at about 2.5e10 dollars in the filtered data. They are followed by hurricanes/typhoons at about 1.2e10 dollars and wildfires at about 2.2e9 dollars:
finalData[order(finalData$damage,decreasing=TRUE),]
## evtypes fatals injured damage
## 1 TORNADO 4750 54378 2.548e+10
## 53 HURRICANE/TYPHOON 32 1219 1.172e+10
## 55 WILDFIRE 54 259 2.248e+09
## 14 FLOOD 92 2670 1.650e+09
## 34 HIGH WIND 67 238 1.016e+09
## 40 HURRICANE 7 16 9.384e+08
## 12 FLASH FLOOD 122 620 8.669e+08
## 36 TROPICAL STORM 9 270 6.295e+08
## 7 WILD FIRES 3 150 6.190e+08
## 24 ICE STORM 24 1699 2.712e+08
## 15 BLIZZARD 38 704 2.544e+08
## 39 TSTM WIND 65 306 2.213e+08
## 27 HEAVY SNOW 25 149 1.417e+08
## 31 THUNDERSTORM WIND 36 104 1.091e+08
## 6 WINTER STORM 49 491 1.013e+08
## 62 TSUNAMI 32 129 8.100e+07
## 4 THUNDERSTORM WINDS 20 41 7.895e+07
## 10 HIGH WINDS 9 39 7.061e+07
## 8 WINTER STORM HIGH WINDS 1 15 6.000e+07
## 19 WATERSPOUT/TORNADO 3 39 5.000e+07
## 26 EXTREME COLD 7 135 1.025e+07
## 21 FLOOD/FLASH FLOOD 2 3 9.100e+06
## 56 LANDSLIDE 21 27 6.775e+06
## 28 FOG 30 245 6.652e+06
## 22 SNOW 3 7 5.500e+06
## 38 HEAVY RAIN 32 29 4.079e+06
## 59 WINTER WEATHER 6 163 3.135e+06
## 2 DENSE FOG 13 150 2.300e+06
## 58 WINTER WEATHER/MIX 12 32 1.647e+06
## 49 STORM SURGE 2 2 1.000e+06
## 30 GLAZE 7 8 6.600e+05
## 47 AVALANCHE 5 6 6.090e+05
## 9 WINTER STORMS 10 17 5.000e+05
## 13 TROPICAL STORM GORDON 8 43 5.000e+05
## 23 HEAT WAVE 13 52 5.000e+05
## 33 FREEZING RAIN 2 5 5.000e+05
## 57 STRONG WIND 16 21 3.770e+05
## 3 LIGHTNING 19 39 3.470e+05
## 48 WILD/FOREST FIRE 2 11 2.550e+05
## 20 EXCESSIVE HEAT 54 49 2.000e+05
## 25 HEAT WAVE DROUGHT 4 15 2.000e+05
## 41 URBAN/SML STREAM FLD 7 11 1.750e+05
## 18 HEAT 22 320 1.250e+05
## 17 WIND 4 5 1.050e+05
## 42 ICY ROADS 4 5 1.020e+05
## 11 HIGH SURF 1 1 1.000e+05
## 54 DUST STORM 2 29 9.600e+04
## 16 THUNDERSNOW 1 1 5.000e+04
## 29 HIGH WINDS/SNOW 3 6 5.000e+04
## 32 HIGH WIND AND SEAS 3 20 5.000e+04
## 35 ICE 1 4 5.000e+04
## 37 Marine Accident 1 2 5.000e+04
## 51 LIGHT SNOW 1 2 5.000e+04
## 61 MARINE STRONG WIND 2 3 4.500e+04
## 44 FREEZING DRIZZLE 2 13 3.500e+04
## 43 blowing snow 1 1 1.500e+04
## 45 FROST 1 3 1.500e+04
## 50 TSTM WIND/HAIL 2 2 1.100e+04
## 46 LANDSLIDES 1 1 5.000e+03
## 60 HAIL 1 2 5.000e+03
## 63 MARINE THUNDERSTORM WIND 2 1 4.000e+03
## 52 HEAVY SURF 1 1 2.500e+03
## 5 HIGH SEAS 1 1 5.000e+02
barplot(finalData$damage,names.arg=finalData$evtypes,main="Total property damage per event type",ylab="Total damage (in $)",xlab="Event type")
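Because the damage totals span many orders of magnitude, bars for most event types are invisible on a linear axis. One option is a logarithmic y axis, which `barplot()` supports through its `log` argument. The sketch below uses four totals copied from the table above; the vector name `dmg` is introduced for illustration.

```r
# Damage totals (in dollars) for four event types, copied from the table above.
dmg <- c(TORNADO = 2.548e10, "HURRICANE/TYPHOON" = 1.172e10,
         WILDFIRE = 2.248e9, "HIGH SEAS" = 5e2)

# Order of magnitude of each total, to show how wide the range is.
magnitude <- floor(log10(dmg))
magnitude

# log = "y" draws the bars on a logarithmic scale; all values must be positive.
barplot(dmg, log = "y", las = 2, cex.names = 0.7,
        main = "Total property damage per event type (log scale)",
        ylab = "Total damage (in $)")
```

On a log scale, bar heights no longer compare linearly, so the axis label or caption should state the scale explicitly.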
As can be noted from the data analysis, the weather events with the highest impact on both population health and the economy are tornados. They produce the highest number of fatalities and injuries, and also cause the greatest property damage, thus impacting the economic strength of the affected area.
Other significant events that cause a high number of fatalities and injuries are floods and strong winds. These events are far behind tornados in their impact, but must still be regarded as important.
As for impact on the economy, tornados are followed by hurricanes and wildfires. These don't pose as much of a threat to the population, but cause significant economic damage to affected areas.