Introduction

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

Synopsis

This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities and injuries.

Data

The storm dataset was downloaded from the Coursera website on 7 October 2020 and unzipped before being imported into R.

The downloaded archive was extracted with WinZip. The extracted file is a CSV, so it can be imported into R easily.
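
As a possible alternative to extracting by hand, base R can read a gzip-compressed CSV directly with read.csv(), which keeps the step inside the script. The sketch below is only an illustration: the toy table, its values, and the temporary file name are assumptions, not the actual storm data.

```r
# Toy stand-in for the storm data (made-up values, for illustration only)
toy <- data.frame(EVTYPE = c("TORNADO", "HAIL"), FATALITIES = c(2, 0))

# Write the table as a gzip-compressed CSV in a temporary folder
path <- file.path(tempdir(), "toy_storm.csv.gz")
con <- gzfile(path, "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() detects the compression and reads the file without manual extraction
toy_back <- read.csv(path)
toy_back
```

If the course archive is in a different format (for example a .zip file), it still has to be extracted first, as described above.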

Loading the required packages before analysis

options(tinytex.verbose = TRUE)

library(knitr)
library(tidyverse)
library(janitor)

Reading CSV dataset into R

I renamed the given dataset file to “Storm_data.csv” and imported it into an object called data, so that the dataset can be referenced easily in later steps.

data <- read.csv("Storm_data.csv")

head(data, 10)
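
Before going further, it can help to confirm that the columns used later are actually present. The following is a minimal sketch on a toy data frame (the toy values and the names required and missing_cols are assumptions for illustration). On the real data, names(data) would take the place of names(toy).

```r
# Columns that the later steps rely on
required <- c("EVTYPE", "FATALITIES", "INJURIES", "CROPDMG")

# Toy table that deliberately lacks CROPDMG (made-up values)
toy <- data.frame(EVTYPE = "HAIL", FATALITIES = 0, INJURIES = 1)

# setdiff() returns the required names that are not column names of the table
missing_cols <- setdiff(required, names(toy))
missing_cols
```

An empty result means every required column was found.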

Top events causing fatalities

Selecting only the required columns and finding the 10 event types that caused the most fatalities in the US

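# Columns needed for the analysis
event <- c("EVTYPE", "FATALITIES", "INJURIES", "CROPDMG")

# Total fatalities per event type, keeping the 10 largest totals
fatal <- data %>% 
  select(all_of(event)) %>% 
  group_by(EVTYPE) %>% 
  summarise(fatalities = sum(FATALITIES)) %>% 
  arrange(desc(fatalities)) %>% 
  head(10)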

fatal
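
To make the group, sum, sort and keep-top-N steps easier to follow, here is a small sketch of the same idea in base R on a toy table. The toy values are made up, and aggregate() is swapped in for the dplyr verbs only so that the example runs without extra packages.

```r
# Toy event table (made-up values, not taken from the storm data)
toy <- data.frame(
  EVTYPE     = c("TORNADO", "HAIL", "TORNADO", "FLOOD", "HAIL"),
  FATALITIES = c(3, 0, 2, 1, 1)
)

# Sum fatalities within each event type
totals <- aggregate(FATALITIES ~ EVTYPE, data = toy, FUN = sum)

# Sort from the largest total to the smallest
totals <- totals[order(-totals$FATALITIES), ]

# Keep only the top rows (2 here, 10 in the analysis)
top2 <- head(totals, 2)
top2
```

TORNADO comes first here because its two rows add up to 5 fatalities.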

Visualizing the above table using a barplot

fatal %>% 
  clean_names() %>% 
  ggplot(aes(reorder(evtype, fatalities), fatalities)) +
  geom_col(aes(fill = fatalities), col = "black") +
  scale_fill_gradient(low = "yellow", high = "red") +
  theme_minimal() +
  coord_flip() +
  geom_text(aes(label = fatalities, y = fatalities + 200), col = "brown",
            fontface = "bold") +
  theme(legend.position = "none",
        plot.title = element_text(colour = "brown", size = 15),
        axis.title.x = element_text(colour = "black", face = "bold"),
        axis.text.y = element_text(face = "bold"),
        axis.title.y = element_blank()) +
  ggtitle("Top 10 events which are causing fatalities in the US") +
  ylab("Number of fatalities")

Top events causing injuries

# Total injuries per event type, keeping the 10 largest totals
injury <- data %>% 
  select(all_of(event)) %>% 
  group_by(EVTYPE) %>% 
  summarise(injuries = sum(INJURIES)) %>% 
  arrange(desc(injuries)) %>% 
  head(10)

injury

Top 10 events causing injuries in the US

injury %>% 
  clean_names() %>% 
  ggplot(aes(reorder(evtype, injuries), injuries)) +
  geom_col(aes(fill = injuries), col = "black") +
  scale_fill_gradient(low = "yellow", high = "red") +
  theme_minimal() +
  coord_flip() +
  geom_text(aes(label = injuries, y = injuries + 200), col = "brown",
            fontface = "bold") +
  theme(legend.position = "none",
        plot.title = element_text(colour = "brown", size = 20),
        axis.title.x = element_text(colour = "black", face = "bold"),
        axis.text.y = element_text(face = "bold"),
        axis.title.y = element_blank()) +
  ggtitle("Top 10 events which are causing injuries in the US") +
  ylab("Number of injuries")

Top events causing crop damage

# Total CROPDMG per event type, keeping the 10 largest totals
crop <- data %>% 
  select(all_of(event)) %>% 
  group_by(EVTYPE) %>% 
  summarise(cropd = sum(CROPDMG)) %>% 
  arrange(desc(cropd)) %>% 
  head(10)

crop
crop %>% 
  clean_names() %>% 
  ggplot(aes(reorder(evtype, cropd), cropd)) +
  geom_col(aes(fill = cropd), col = "black") +
  scale_fill_gradient(low = "yellow", high = "red") +
  theme_minimal() +
  coord_flip() +
  geom_text(aes(label = cropd, y = cropd + 200), col = "brown",
            fontface = "bold") +
  theme(legend.position = "none",
        plot.title = element_text(colour = "brown", size = 20),
        axis.title.x = element_text(colour = "black", face = "bold"),
        axis.text.y = element_text(face = "bold"),
        axis.title.y = element_blank()) +
  ggtitle("Top 10 events which are causing crop damage in the US") +
  ylab("Crop damage (sum of CROPDMG)")
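
The three bar charts in this report differ only in the table, the value column and the labels, so the shared plotting code could be wrapped in one function. The sketch below is one way to do that. The helper name plot_top_events and its arguments are hypothetical, the toy table is made up, and the styling is trimmed compared with the plots above.

```r
library(ggplot2)

# Hypothetical helper: horizontal bar chart of a summary table that has
# one label column and one numeric value column (both given by name).
plot_top_events <- function(df, label_col, value_col, title, y_label) {
  ggplot(df, aes(x = reorder(.data[[label_col]], .data[[value_col]]),
                 y = .data[[value_col]])) +
    geom_col(aes(fill = .data[[value_col]]), col = "black") +
    scale_fill_gradient(low = "yellow", high = "red") +
    coord_flip() +
    theme_minimal() +
    theme(legend.position = "none", axis.title.y = element_blank()) +
    ggtitle(title) +
    ylab(y_label)
}

# Toy summary table (made-up values) to show the call
toy <- data.frame(evtype = c("TORNADO", "HAIL"), fatalities = c(5, 1))
p <- plot_top_events(toy, "evtype", "fatalities",
                     "Top events (toy data)", "Number of fatalities")
```

Printing p draws the chart. With the real tables, the same call would take the cleaned summary table and the matching column names.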

Results

  1. The weather event most harmful to population health is Tornado, which caused 5633 fatalities and 91346 injuries, followed by TSTM wind, which caused 6957 injuries in the US.

  2. The event most harmful to the US economy, measured here by crop damage, is Hail, with a summed CROPDMG value of 579596.3 (not scaled by the CROPDMGEXP multiplier), followed by flash flood with 179200.5.
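
The first result reads fatalities and injuries together, so it may be convenient to put both totals in one table. The following is a minimal sketch in base R on made-up values, with aggregate() chosen only to keep the example free of extra packages. It is not the code that produced the numbers above.

```r
# Toy event table (made-up values, not taken from the storm data)
toy <- data.frame(
  EVTYPE     = c("TORNADO", "TSTM WIND", "TORNADO", "HAIL"),
  FATALITIES = c(2, 0, 1, 0),
  INJURIES   = c(10, 4, 5, 1)
)

# Sum both health measures per event type in a single step
health <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Rank by fatalities first, using injuries to break ties
health <- health[order(-health$FATALITIES, -health$INJURIES), ]
health
```

Ranking on fatalities first is a design choice for this sketch. Another ordering, such as injuries first, would suit a different question.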

Analysing weather events and the destruction they cause is very important for a country that wants to protect its people and its economy. By doing this assignment I learnt not only how to create and publish an R Markdown file (which can be reproduced and shared easily with friends) but also the basics of data analysis.