The data can be downloaded from the link https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2

About NOAA

The National Oceanic and Atmospheric Administration (NOAA) is an American scientific agency within the United States Department of Commerce that focuses on the conditions of the oceans, major waterways, and the atmosphere. NOAA warns of dangerous weather, charts seas, guides the use and protection of ocean and coastal resources, and conducts research to provide understanding and improve stewardship of the environment. NOAA was officially formed in 1970 and in 2017 had over 11,000 civilian employees. Its research and operations are further supported by 321 uniformed service members who make up the NOAA Commissioned Corps. Since October 2017, NOAA has been headed by Timothy Gallaudet, as acting Under Secretary of Commerce for Oceans and Atmosphere and NOAA interim administrator.

About Data

Some information appearing in Storm Data may be provided by or gathered from sources outside the National Weather Service (NWS), such as the media, law enforcement and/or other government agencies, private companies, individuals, etc. An effort is made to use the best available information, but because of time and resource constraints, information from these sources may be unverified by the NWS. Accordingly, the NWS does not guarantee the accuracy or validity of the information. Further, when information appearing in Storm Data originated from a source outside the NWS (frequently credit is provided), Storm Data users requiring additional information should contact that source directly.

1.1 Downloading the data

#data<-download.file(url="https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",destfile = "data")

1.2 Loading the packages.

library(plyr)
library(ggplot2)
library(gridExtra)
library(grid)

1.3 Reading the data

stormDataRed<-read.csv("repdata%2Fdata%2FStormData.csv")
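
As a side note, read.csv() can read a bzip2-compressed file directly, so the archive does not have to be unpacked by hand first. The following is only a minimal sketch: it uses a small temporary file with made-up rows in place of the real data set, and the object names tmp and demo are illustrative assumptions.

```r
# Write a tiny, made-up CSV into a bzip2-compressed temporary file
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "wt")
writeLines(c("EVTYPE,FATALITIES", "TORNADO,3", "FLOOD,1"), con)
close(con)

# read.csv() opens the path in read mode, which detects the compression
demo <- read.csv(tmp, stringsAsFactors = FALSE)
print(demo)
```

The same call pattern should apply to the downloaded StormData.csv.bz2 file, although reading the full data set takes noticeably longer.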

1.4 Looking at the data

dim(stormDataRed)
## [1] 902297     37

The analysis focuses only on the health and economic consequences of severe weather events, so we keep only the relevant columns: EVTYPE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, CROPDMG, and CROPDMGEXP.

1.5 Subsetting the columns.

stormData <- stormDataRed[,c("EVTYPE", "FATALITIES", "INJURIES", "PROPDMG","PROPDMGEXP","CROPDMG","CROPDMGEXP")]

2. Analysing the data

2.1 Population

Fatalities and injuries are summarized by event type.

harm2health <- ddply(stormDataRed, .(EVTYPE), summarize,fatalities = sum(FATALITIES),injuries = sum(INJURIES))
fatal <- harm2health[order(harm2health$fatalities, decreasing = T), ]
injury <- harm2health[order(harm2health$injuries, decreasing = T), ]
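
For readers without plyr, the same group-and-sum step can be sketched in base R with aggregate(). This is an illustrative alternative on made-up toy data, not the code used for the results in this report; the names toy, totals, and ranked are assumptions.

```r
# Toy data with made-up values; column names mirror the storm data
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT"),
  FATALITIES = c(2, 1, 3, 4),
  INJURIES   = c(10, 0, 5, 1),
  stringsAsFactors = FALSE
)

# Sum both columns per event type (base R counterpart of the ddply call)
totals <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Rank event types by fatalities, highest first
ranked <- totals[order(totals$FATALITIES, decreasing = TRUE), ]
print(ranked)
```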

2.2 Economic

The exponents of the damage values are stored in a separate column, encoded as letters (h = hundred, k = thousand, m = million, b = billion), which makes calculating the financial damage slightly tricky. As a first step, we implement a function that converts the letter code of the exponent into a usable number.

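# Convert an exponent code (letter, digit, or symbol) into a power of ten.
# Expects a single character value; as.numeric() on a factor would return
# the level code rather than the digit.
getExp <- function(e) {
    if (e %in% c("h", "H"))                # hundred
        return(2)
    else if (e %in% c("k", "K"))           # thousand
        return(3)
    else if (e %in% c("m", "M"))           # million
        return(6)
    else if (e %in% c("b", "B"))           # billion
        return(9)
    else if (!is.na(as.numeric(e)))        # a digit is used as the exponent itself
        return(as.numeric(e))
    else if (e %in% c("", "-", "?", "+"))  # blank or symbol: no exponent
        return(0)
    else {
        stop("Invalid value.")
    }
}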
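
The letter-to-exponent mapping can also be expressed as a vectorized lookup, which avoids calling a function once per row. The sketch below is a hypothetical alternative, not the function used for the results in this report; expLookup and getExpVec are illustrative names, and the input values are made up.

```r
# Hypothetical vectorized alternative to getExp(); names are illustrative
expLookup <- c(h = 2, k = 3, m = 6, b = 9)

getExpVec <- function(e) {
  e <- tolower(as.character(e))            # factors -> lower-case strings
  out <- unname(expLookup[e])              # letters -> exponent, else NA
  isDigit <- grepl("^[0-9]$", e)
  out[isDigit] <- as.numeric(e[isDigit])   # single digits are used as-is
  out[e %in% c("", "-", "?", "+")] <- 0    # blanks and symbols -> 0
  out
}

# Made-up example: damage = value * 10^exponent
exps <- getExpVec(c("K", "m", "", "5", "B"))
damage <- c(25, 1.5, 10, 2, 0.1) * 10^exps
print(exps)
print(damage)
```

Unlike getExp(), this sketch leaves unrecognized codes as NA instead of stopping with an error, so such rows would need to be checked separately.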

Values for property damage and crop damage are then calculated.

propExp <- sapply(stormDataRed$PROPDMGEXP, FUN=getExp)
stormDataRed$propDamage <- stormDataRed$PROPDMG * (10 ** propExp)
cropExp <- sapply(stormDataRed$CROPDMGEXP, FUN=getExp)
stormDataRed$cropDamage <- stormDataRed$CROPDMG * (10 ** cropExp)

Financial damage to crops and property is then summarized by event type.

econDamage <- ddply(stormDataRed, .(EVTYPE), summarize,propDamage = sum(propDamage), cropDamage = sum(cropDamage))

Events that caused no financial damage are removed.

econDamage <- econDamage[(econDamage$propDamage > 0 | econDamage$cropDamage > 0), ]

The data is then sorted by property damage and by crop damage.

propDmgSorted <- econDamage[order(econDamage$propDamage, decreasing = T), ]
cropDmgSorted <- econDamage[order(econDamage$cropDamage, decreasing = T), ]

3. The Final Results.

3.1 Population health

The top 5 weather events affecting the population's health (injuries and deaths) are shown below.

head(injury[, c("EVTYPE", "injuries")],5)
##             EVTYPE injuries
## 834        TORNADO    91346
## 856      TSTM WIND     6957
## 170          FLOOD     6789
## 130 EXCESSIVE HEAT     6525
## 464      LIGHTNING     5230
head(fatal[, c("EVTYPE", "fatalities")],5)
##             EVTYPE fatalities
## 834        TORNADO       5633
## 130 EXCESSIVE HEAT       1903
## 153    FLASH FLOOD        978
## 275           HEAT        937
## 464      LIGHTNING        816

3.2 Economic

Lists of the top 5 weather events causing financial damage to property and to crops are shown below.

head(propDmgSorted[, c("EVTYPE", "propDamage")], 5)
##                 EVTYPE   propDamage
## 153        FLASH FLOOD 6.820237e+13
## 786 THUNDERSTORM WINDS 2.086532e+13
## 834            TORNADO 1.078951e+12
## 244               HAIL 3.157558e+11
## 464          LIGHTNING 1.729433e+11
head(cropDmgSorted[, c("EVTYPE", "cropDamage")], 5)
##          EVTYPE  cropDamage
## 95      DROUGHT 13972566000
## 170       FLOOD  5661968450
## 590 RIVER FLOOD  5029459000
## 427   ICE STORM  5022113500
## 244        HAIL  3025974480

4. Visualizations

4.1 Populations

p1 <- ggplot(data=head(injury,10), aes(x=reorder(EVTYPE, injuries), y=injuries)) +
   geom_bar(fill="#999999",stat="identity")  + coord_flip() + 
    ylab("Total number of injuries") + xlab("Event type") +
    ggtitle("Health impact of weather events in the US - Top 10") +
    theme(legend.position="none")

p2 <- ggplot(data=head(fatal,10), aes(x=reorder(EVTYPE, fatalities), y=fatalities)) +
    geom_bar(fill="#E69F00",stat="identity") + coord_flip() +
    ylab("Total number of fatalities") + xlab("Event type") +
    theme(legend.position="none")

grid.arrange(p1, p2, nrow =2)

4.2 Economic

p1 <- ggplot(data=head(propDmgSorted,10), aes(x=reorder(EVTYPE, propDamage), y=log10(propDamage), fill=propDamage )) +
    geom_bar(fill="#999999", stat="identity",col="blue") + coord_flip() +
    xlab("Event type") + ylab("Property damage in dollars (log10)") +
    ggtitle("Economic impact of weather events in the US - Top 10") +
    theme(plot.title = element_text(hjust = 0))

p2 <- ggplot(data=head(cropDmgSorted,10), aes(x=reorder(EVTYPE, cropDamage), y=cropDamage, fill=cropDamage)) +
    geom_bar(fill="#E69F00", stat="identity",col="blue") + coord_flip() + 
    xlab("Event type") + ylab("Crop damage in dollars") + 
    theme(legend.position="none")

grid.arrange(p1, p2, ncol=1, nrow =2)