1. Synopsis

Meteorological events occur constantly in nature when certain climatic conditions converge and favor their development; for example, strong winds combined with snowfall can produce a snowstorm. However, such phenomena can be aggravated by climate variability, producing far more severe events with serious consequences for the life, health and economic conditions of the human population.

Meteorological services constantly generate large amounts of data on climate-related variables such as wind speed, precipitation and temperature, which makes it possible to anticipate weather events to some degree. Together with databases that record the effects of climate events at different levels, this information is a good starting point for risk management. It is therefore important to understand what the data say about these events in order to develop strategies to mitigate and control their effects. Not all phenomena have the same level of impact, so identifying those that occur most often and cause the greatest losses allows economic resources to be focused on them.

2. Data Processing

2.1. Getting the data

Once the database has been downloaded and decompressed into a folder of our choice, we can use the following instruction to read the .csv file into the R environment:

data.raw <- read.csv("D:/Coursera/Reproducible Research/Semana4/Proyecto/repdata_data_StormData.csv")
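As a hedged aside, the manual decompression step may be optional: `read.csv` opens its input through a file connection, and R's file connections detect bzip2 compression when reading. The sketch below shows this on a small temporary file rather than on the storm database itself; the data frame `toy`, its values and the temporary path are illustrative assumptions, not part of the original analysis.

```r
# Toy data standing in for the storm database (hypothetical values)
toy <- data.frame(EVTYPE = c("TORNADO", "FLOOD"), INJURIES = c(3, 1))

# Write the toy data as a bzip2-compressed CSV in a temporary location
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv reads the compressed file without a separate decompression step
toy.read <- read.csv(tmp, stringsAsFactors = FALSE)
dim(toy.read)
```

Reading from a path relative to the project folder, instead of an absolute path on one machine, would also make the report easier to reproduce elsewhere.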

2.2. Summary of the data

To briefly describe the data set, we obtain the number of observations and columns and the names of the variables:

dim(data.raw)
## [1] 902297     37
names(data.raw)
##  [1] "STATE__"    "BGN_DATE"   "BGN_TIME"   "TIME_ZONE"  "COUNTY"    
##  [6] "COUNTYNAME" "STATE"      "EVTYPE"     "BGN_RANGE"  "BGN_AZI"   
## [11] "BGN_LOCATI" "END_DATE"   "END_TIME"   "COUNTY_END" "COUNTYENDN"
## [16] "END_RANGE"  "END_AZI"    "END_LOCATI" "LENGTH"     "WIDTH"     
## [21] "F"          "MAG"        "FATALITIES" "INJURIES"   "PROPDMG"   
## [26] "PROPDMGEXP" "CROPDMG"    "CROPDMGEXP" "WFO"        "STATEOFFIC"
## [31] "ZONENAMES"  "LATITUDE"   "LONGITUDE"  "LATITUDE_E" "LONGITUDE_"
## [36] "REMARKS"    "REFNUM"

2.3. Types of events that are most harmful with respect to population health

To answer the question "Which types of events are most harmful with respect to population health?", we focus on the column INJURIES, since it is directly related to population health. We therefore compute the number of people injured by each type of event described in the column EVTYPE and select the top 10 events:

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
group.EVTYPE <- group_by(data.raw, EVTYPE)
group.EVTYPE <- summarise(group.EVTYPE, total=sum(INJURIES))
top10.EVTYPE.injuries <- group.EVTYPE %>% top_n(10)
## Selecting by total
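The grouping step can be cross-checked without dplyr. The sketch below repeats the same idea in base R on a toy data frame (hypothetical values, with the same column names as `data.raw`): sum INJURIES per EVTYPE, sort in decreasing order, and keep the first 10 rows. One difference worth noting is that `top_n` does not sort its result and keeps ties, whereas this version returns the rows ordered from most to fewest injuries and never more than 10 of them.

```r
# Toy data standing in for data.raw (hypothetical values)
toy <- data.frame(
  EVTYPE   = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  INJURIES = c(10, 2, 5, 7, 1)
)

# Sum injuries per event type
totals <- aggregate(INJURIES ~ EVTYPE, data = toy, FUN = sum)
names(totals)[2] <- "total"

# Sort from most to fewest injuries and keep at most the first 10 rows
top.toy <- head(totals[order(totals$total, decreasing = TRUE), ], 10)
top.toy
```

Applied to the full data set, the same three steps should select the same event types as the dplyr code above whenever there are no ties at the tenth position.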

2.4. Types of events that have the greatest economic consequences

To answer the question "Which types of events have the greatest economic consequences?", we use two columns, PROPDMG and CROPDMG, which record property damage and crop damage, respectively. We create a new field, Eco.Value, holding the sum of both indicators and, as in the previous section, aggregate the result by type of event:

library(dplyr)
data.raw$Eco.Value <- data.raw$PROPDMG + data.raw$CROPDMG
group.EVTYPE <- group_by(data.raw, EVTYPE)
group.EVTYPE <- summarise(group.EVTYPE, total=sum(Eco.Value))
top10.EVTYPE.economics  <- group.EVTYPE %>% top_n(10)
## Selecting by total
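The sum above adds the raw PROPDMG and CROPDMG figures. The data set also contains the columns PROPDMGEXP and CROPDMGEXP (listed in section 2.2), which the National Weather Service storm data documentation describes as magnitude codes: "K" for thousands, "M" for millions and "B" for billions. A possible refinement, sketched below on toy rows with hypothetical values, is to scale each amount by its code before summing. The helper `exp.to.multiplier` is a hypothetical name, and treating lowercase letters like uppercase ones and every other code (blank, digits, symbols) as a multiplier of 1 is an assumption of this sketch, not something stated in the analysis.

```r
# Toy rows with the same column names as data.raw (hypothetical values)
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "HAIL"),
  PROPDMG    = c(25, 1.5, 10),
  PROPDMGEXP = c("K", "M", ""),
  CROPDMG    = c(0, 2, 5),
  CROPDMGEXP = c("", "K", "k")
)

# Map a magnitude code to a multiplier.
# Assumption: codes other than K, M, B (in either case) count as 1.
exp.to.multiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  mult <- lookup[toupper(as.character(code))]
  mult[is.na(mult)] <- 1
  unname(mult)
}

# Damage in dollars: amount times multiplier, property plus crops
toy$Eco.Value <- toy$PROPDMG * exp.to.multiplier(toy$PROPDMGEXP) +
  toy$CROPDMG * exp.to.multiplier(toy$CROPDMGEXP)
toy[, c("EVTYPE", "Eco.Value")]
```

Because a single event recorded in millions or billions can outweigh many events recorded in thousands, a ranking computed this way may differ from the one based on the unscaled sum.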

3. Results

library(ggplot2)
ggplot(data=top10.EVTYPE.injuries, aes(x=EVTYPE, y=total))+
        geom_bar(stat="identity", color="black", fill="steelblue") +
        coord_flip() +
        xlab("Types of events") +
        ylab("Total value") +
        theme_minimal() +
        ggtitle("Plot1: Total injuries by types of events")

Plot1 shows that tornadoes cause by far the greatest harm to public health, measured by total injuries.
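When EVTYPE is a character column, ggplot2 places the bars in alphabetical order, which makes a ranking harder to read. One common option, sketched here on toy totals (hypothetical values), is to turn EVTYPE into a factor ordered by `total` with `reorder()` before plotting. The plotting part is guarded with `requireNamespace()` so that the sketch still runs where ggplot2 is not installed, and `geom_col()` is used as the equivalent of `geom_bar(stat="identity")`.

```r
# Toy totals standing in for top10.EVTYPE.injuries (hypothetical values)
toy <- data.frame(EVTYPE = c("FLOOD", "TORNADO", "HEAT"),
                  total  = c(3, 15, 7))

# reorder() returns a factor whose levels are sorted by increasing total
toy$EVTYPE <- reorder(toy$EVTYPE, toy$total)
levels(toy$EVTYPE)

# With coord_flip(), the last level is drawn at the top of the chart
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(data = toy, aes(x = EVTYPE, y = total)) +
    geom_col(color = "black", fill = "steelblue") +
    coord_flip() +
    xlab("Types of events") +
    ylab("Total value") +
    theme_minimal()
  print(p)
}
```

The same `reorder()` call could be applied to either top-10 table before it is passed to `ggplot()`.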

library(ggplot2)
ggplot(data=top10.EVTYPE.economics, aes(x=EVTYPE, y=total))+
        geom_bar(stat="identity", color="black", fill="steelblue") +
        coord_flip() +
        xlab("Types of events") +
        ylab("Total value") +
        theme_minimal() +
        ggtitle("Plot2: Economic consequences by types of events")

Plot2 shows that tornadoes also cause the greatest economic losses. Flash floods rank second in economic impact and TSTM (thunderstorm) winds third.