Analysis of the storm data yielded the following results:
1. Tornadoes have the greatest impact on population health, measured as the sum of fatalities and injuries.
2. Floods have the greatest economic consequences, measured as the sum of property damage and crop damage.
For this report we use the Storm Data dataset, which is made available by the National Climatic Data Center (NCDC). NCDC receives Storm Data from the National Weather Service, which in turn receives its information from a variety of sources. You can read about the data here.
The independent variable is EVTYPE, the type of storm. To assess the impact on population health, the dependent variables considered were FATALITIES and INJURIES. To keep the computation simple, the two were combined with a 1:1 weighting into a single dependent variable, Human.Impact.
The gross total of Human.Impact per storm type was used to determine which type of storm caused the most human casualties.
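The 1:1 weighting and the per-type gross total can be illustrated on a few rows. This is a minimal sketch in base R; the data frame `toy` and its values are made up for illustration and are not taken from the Storm Data file.

```r
# Toy rows (made-up values), not taken from the Storm Data file.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "TORNADO", "FLOOD", "HEAT"),
  FATALITIES = c(2, 0, 1, 3),
  INJURIES   = c(10, 5, 0, 4)
)

# 1:1 weighting: one fatality counts the same as one injury.
toy$Human.Impact <- toy$FATALITIES + toy$INJURIES

# Gross total per storm type, largest first.
totals <- aggregate(Human.Impact ~ EVTYPE, data = toy, FUN = sum)
totals <- totals[order(-totals$Human.Impact), ]
totals
```

Summing the row-wise totals per group gives the same result as adding the per-group sums of the two columns separately.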
The independent variable is again EVTYPE, the type of storm. To assess the economic impact, the dependent variables considered were damage to property and damage to crops. Both are available as dollar amounts, so to keep the computation simple they were added together into a single dependent variable, Economic.Impact.
The gross total of Economic.Impact per storm type was used to determine which type of storm had the highest economic impact.
#download and load the raw data into the environment
URL <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
if (!file.exists("repdata%2Fdata%2FStormData.csv.bz2")){
download.file(url = URL,
destfile = "repdata%2Fdata%2FStormData.csv.bz2", method = "auto")}
storms <- read.csv("repdata%2Fdata%2FStormData.csv.bz2")
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(ggplot2))
storms <- as_tibble(storms) # tbl_df() is deprecated in current dplyr
health <- storms %>%
group_by(EVTYPE) %>%
summarise(Human.Impact = sum(FATALITIES) + sum(INJURIES)) %>%
arrange(desc(Human.Impact)) %>%
top_n(10)
FALSE Selecting by Human.Impact
health #display the top 10 types of storms with maximum Human.Impact
FALSE # A tibble: 10 x 2
FALSE EVTYPE Human.Impact
FALSE <fctr> <dbl>
FALSE 1 TORNADO 96979
FALSE 2 EXCESSIVE HEAT 8428
FALSE 3 TSTM WIND 7461
FALSE 4 FLOOD 7259
FALSE 5 LIGHTNING 6046
FALSE 6 HEAT 3037
FALSE 7 FLASH FLOOD 2755
FALSE 8 ICE STORM 2064
FALSE 9 THUNDERSTORM WIND 1621
FALSE 10 WINTER STORM 1527
#plot figure
ggplot(health, aes(EVTYPE, Human.Impact)) + geom_bar(stat="identity", aes(fill=EVTYPE)) +
labs(title = "Human Casualties by Storm Type" , x= "Type Of Storm", y= "Total Human Impact") +
scale_fill_discrete(name="Storm Type") + coord_flip()
Figure 1: Impact of storms on human population
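ggplot2 draws a discrete axis in factor-level order, so the bars in Figure 1 follow the order of the EVTYPE levels rather than the size of the impact. One possible way to sort them is `reorder()`. The sketch below uses only the top three rows of the table above; the name `health_top` is illustrative, and the plotting step is skipped if ggplot2 is not installed.

```r
# Top three rows copied from the printed table above.
health_top <- data.frame(
  EVTYPE       = c("TORNADO", "EXCESSIVE HEAT", "TSTM WIND"),
  Human.Impact = c(96979, 8428, 7461)
)

# reorder() sorts the factor levels by increasing Human.Impact,
# so after coord_flip() the largest bar ends up at the top.
health_top$EVTYPE <- reorder(health_top$EVTYPE, health_top$Human.Impact)
levels(health_top$EVTYPE)

# Build the plot only when ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(health_top, ggplot2::aes(EVTYPE, Human.Impact)) +
    ggplot2::geom_bar(stat = "identity") +
    ggplot2::coord_flip()
}
```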
Property damage is determined by combining two variables, PROPDMG and PROPDMGEXP. PROPDMG gives the numeric value and PROPDMGEXP gives the order of magnitude of the damage. For example, if PROPDMG is 23 and PROPDMGEXP is k, the total property damage in dollars is \(23 \times 10^3 = 23{,}000\).
The PROPDMGEXP values can be interpreted as follows:
| Symbol | Exponent \(x\) in \(10^x\) |
|---|---|
| "" or ? or - or 0 | 0 |
| + or 1 | 1 |
| 2 or h or H | 2 |
| 3 or k or K | 3 |
| 4 | 4 |
| 5 | 5 |
| m or M or 6 | 6 |
| 7 | 7 |
| 8 | 8 |
| b or B or 9 | 9 |
We will use the above table to arrive at the final value of both variables, property damage and crop damage. Ultimately, we will sum the two variables to arrive at overall economic damage.
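# Property damage
# Note: the gsub() calls run in sequence, and each one sees the output of the
# previous call. Codes rewritten to "1" by the first call are matched again by
# the second call and end up as "10".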
storms$PROPDMGEXP <- gsub("^$|-|\\?|0", 1E0, storms$PROPDMGEXP)
storms$PROPDMGEXP <- gsub("[1|\\+]", 1E1, storms$PROPDMGEXP)
storms$PROPDMGEXP <- gsub("[2|hH]", 1E2, storms$PROPDMGEXP)
storms$PROPDMGEXP <- gsub("[3|kK]", 1E3, storms$PROPDMGEXP)
storms$PROPDMGEXP <- gsub("[4]", 1E4, storms$PROPDMGEXP)
storms$PROPDMGEXP <- gsub("[5]", 1E5, storms$PROPDMGEXP)
storms$PROPDMGEXP <- gsub("[mM]|6", 1E6, storms$PROPDMGEXP)
storms$PROPDMGEXP <- gsub("[7]", 1E7, storms$PROPDMGEXP)
storms$PROPDMGEXP <- gsub("[8]", 1E8, storms$PROPDMGEXP)
storms$PROPDMGEXP <- gsub("[bB]", 1E9, storms$PROPDMGEXP)
# Crop damage
storms$CROPDMGEXP <- gsub("^$|-|\\?|0", 1E0, storms$CROPDMGEXP)
storms$CROPDMGEXP <- gsub("[1|\\+]", 1E1, storms$CROPDMGEXP)
storms$CROPDMGEXP <- gsub("[2|hH]", 1E2, storms$CROPDMGEXP)
storms$CROPDMGEXP <- gsub("[3|kK]", 1E3, storms$CROPDMGEXP)
storms$CROPDMGEXP <- gsub("[4]", 1E4, storms$CROPDMGEXP)
storms$CROPDMGEXP <- gsub("[5]", 1E5, storms$CROPDMGEXP)
storms$CROPDMGEXP <- gsub("[mM]|6", 1E6, storms$CROPDMGEXP)
storms$CROPDMGEXP <- gsub("[7]", 1E7, storms$CROPDMGEXP)
storms$CROPDMGEXP <- gsub("[8]", 1E8, storms$CROPDMGEXP)
storms$CROPDMGEXP <- gsub("[bB]", 1E9, storms$CROPDMGEXP)
#create economicdmg variable
storms$economicdmg <- storms$CROPDMG * as.numeric(storms$CROPDMGEXP) +
storms$PROPDMG * as.numeric(storms$PROPDMGEXP)
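Because each `gsub()` call above works on the output of the previous one, a code that has already been translated can be matched a second time. The sketch below first reproduces that effect on a toy vector, then shows one possible alternative: a named lookup vector that translates each code exactly once, following the table above. The names `exp_codes`, `dmg` and `exp_lookup` and the toy values are illustrative assumptions, not part of the report's code. The totals printed in this report come from the chained version, so a lookup-based run may give slightly different figures.

```r
# Toy exponent codes and damage values (made up for illustration).
exp_codes <- c("", "K", "M", "B", "5")
dmg       <- c(2, 23, 1.5, 1, 4)

# Chained substitution: "" becomes "1", which the next call turns into "10".
chained <- gsub("^$", 1E0, exp_codes)
chained <- gsub("[1+]", 1E1, chained)
chained[1]

# Lookup alternative: symbol -> exponent x in 10^x, as in the table above.
exp_lookup <- c(
  "-" = 0, "?" = 0, "0" = 0, "+" = 1, "1" = 1,
  "2" = 2, "h" = 2, "H" = 2, "3" = 3, "k" = 3, "K" = 3,
  "4" = 4, "5" = 5, "m" = 6, "M" = 6, "6" = 6,
  "7" = 7, "8" = 8, "b" = 9, "B" = 9
)
exponent <- unname(exp_lookup[exp_codes])

# An empty string cannot be used as a vector name, so handle it separately.
exponent[exp_codes == ""] <- 0

# Dollar amount = value * 10^exponent.
dollars <- dmg * 10^exponent
dollars
```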
Now we can calculate the gross total of economic damage done by each type of storm and select the top 10.
economic <- storms %>%
group_by(EVTYPE) %>%
summarise(Economic.Impact = sum(economicdmg, na.rm = TRUE)) %>%
arrange(desc(Economic.Impact)) %>%
top_n(10)
FALSE Selecting by Economic.Impact
economic
FALSE # A tibble: 10 x 2
FALSE EVTYPE Economic.Impact
FALSE <fctr> <dbl>
FALSE 1 FLOOD 150319678320
FALSE 2 HURRICANE/TYPHOON 71913712800
FALSE 3 TORNADO 57362337155
FALSE 4 STORM SURGE 43323541000
FALSE 5 HAIL 18761224827
FALSE 6 FLASH FLOOD 18243995295
FALSE 7 DROUGHT 15018672000
FALSE 8 HURRICANE 14610229010
FALSE 9 RIVER FLOOD 10148404500
FALSE 10 ICE STORM 8967041810
ggplot(economic, aes(EVTYPE, Economic.Impact)) + geom_bar(stat="identity", aes(fill=EVTYPE)) +
labs(title = "Economic Damage by Storm Type" , x= "Type Of Storm", y= "Total Economic Impact in USD") + scale_fill_discrete(name="Storm Type") + coord_flip()
Figure 2: Impact of storms on property and crops
Human Population Impact: Figure 1 and the table above it show that the type of storm with the highest impact on population health is the tornado. According to Storm Data, tornadoes caused 96,979 fatalities and injuries combined.
Economic Impact: Figure 2 and the table above it show that the type of storm with the highest economic damage is the flood. According to Storm Data, floods caused $150,319,678,320 in combined property and crop damage.
END OF REPORT