This project explores the U.S. National Oceanic and Atmospheric Administration's (NOAA) storm database. The database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage. The analysis of fatalities and injuries identifies tornadoes as the weather event most harmful to the population. At the same time, the greatest combined property and crop damage was caused by thunderstorm winds.
options(tinytex.verbose = TRUE)
setwd("C:/Users/rikig/OneDrive/Рабочий стол/project R 5.2")
getwd()
[1] "C:/Users/rikig/OneDrive/Рабочий стол/project R 5.2"
StormData <- "repdata%2Fdata%2FStormData.csv.bz2"
# Download the archive only if it is not already in the working directory
if (!file.exists(StormData)) {
  fileUrl <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
  # Save to the same relative path that file.exists() checks;
  # mode = "wb" keeps the compressed file intact on Windows
  download.file(fileUrl, destfile = StormData, mode = "wb")
}
StormDataTable <- read.csv(StormData)
library('dplyr')
head(StormDataTable)  # preview the loaded data frame rather than the file-name string
# Keep the event type and the two population-harm columns (columns 8, 23, 24)
PopulationDamage <- StormDataTable %>% select(EVTYPE, FATALITIES, INJURIES)
# Total fatalities plus injuries per event type, sorted from largest to smallest
PopulationDamage <- PopulationDamage %>%
  group_by(EVTYPE) %>%
  summarise(Impact_on_Population = sum(FATALITIES, INJURIES)) %>%
  arrange(desc(Impact_on_Population))
# Keep the five most harmful event types
PopulationDamage <- PopulationDamage[1:5, ]
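To make the grouped summary above easier to follow, here is a minimal sketch on a few made-up rows. The `toy` data frame and its values are invented for illustration and are not taken from the NOAA database. Base R's `aggregate()` is used instead of dplyr so the snippet runs without extra packages.

```r
# Made-up rows for illustration only; not taken from the NOAA database
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HAIL"),
  FATALITIES = c(2, 1, 0, 0),
  INJURIES   = c(10, 0, 5, 3)
)

# Fatalities plus injuries per row, then summed within each event type
toy$Impact_on_Population <- toy$FATALITIES + toy$INJURIES
toy_totals <- aggregate(Impact_on_Population ~ EVTYPE, data = toy, FUN = sum)

# Sort from most to least harmful and show the top two
toy_totals <- toy_totals[order(-toy_totals$Impact_on_Population), ]
head(toy_totals, 2)
```

In this toy data TORNADO totals 17 cases and HAIL totals 3, so those two rows come out on top.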
library(ggplot2)
# Bar chart of the five event types with the most fatalities and injuries
ggplot(PopulationDamage, aes(x = reorder(EVTYPE, -Impact_on_Population), y = Impact_on_Population)) +
  geom_bar(stat = "identity", color = "darkolivegreen3", fill = "darkolivegreen3") +
  xlab("Weather event") +
  ylab("Population harm, number of cases")
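# Keep the event type plus the damage amounts and their exponent codes (columns 8, 25-28)
PropertyDamage <- StormDataTable %>% select(EVTYPE, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP)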
options(scipen = 999)  # print large totals without scientific notation
# Convert damage to thousands of dollars: blank code -> 0, 'K' -> as is, 'M' -> x 1,000.
# Every other code falls through to the last branch and is scaled as billions (x 1,000,000).
PropertyDamage$PROPTHOUSANDS <- ifelse(PropertyDamage$PROPDMGEXP == '', 0,
  ifelse(PropertyDamage$PROPDMGEXP == 'K', PropertyDamage$PROPDMG,
    ifelse(PropertyDamage$PROPDMGEXP == 'M',
      PropertyDamage$PROPDMG * 1000, PropertyDamage$PROPDMG * 1000000)))
PropertyDamage$CROPTHOUSANDS <- ifelse(PropertyDamage$CROPDMGEXP == '', 0,
  ifelse(PropertyDamage$CROPDMGEXP == 'K', PropertyDamage$CROPDMG,
    ifelse(PropertyDamage$CROPDMGEXP == 'M',
      PropertyDamage$CROPDMG * 1000, PropertyDamage$CROPDMG * 1000000)))
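The nested `ifelse()` calls above send every code other than blank, 'K', and 'M' to the billions branch. As a hedged alternative, the conversion could be written as an explicit lookup table. This is only a sketch: the names `exp_to_thousands` and `damage_in_thousands` are hypothetical, the meaning assigned to each letter (H, K, M, B) is an assumption, and so is the choice to count unrecognised codes as zero. A stricter mapping like this may produce different totals from the code above.

```r
# Multipliers expressed in thousands of dollars, matching the $K unit used here.
# Assumption: H = hundreds, K = thousands, M = millions, B = billions.
exp_to_thousands <- c(H = 0.1, K = 1, M = 1000, B = 1e6)

# Hypothetical helper: convert a damage amount and its exponent code to $K.
damage_in_thousands <- function(amount, exp_code) {
  # as.character() also handles factor columns; toupper() accepts 'k', 'm', 'h'
  multiplier <- exp_to_thousands[toupper(as.character(exp_code))]
  # Codes outside the table (blank, digits, symbols) have no match and give NA;
  # counting them as zero is an assumption, not a rule from the data set.
  multiplier[is.na(multiplier)] <- 0
  unname(amount * multiplier)
}

# Values in $K: 25, 2500, 1000000, 0, 0
damage_in_thousands(c(25, 2.5, 1, 10, 3), c("K", "m", "B", "", "?"))
```

Under these assumptions the helper could be applied to both column pairs, for example `damage_in_thousands(PropertyDamage$PROPDMG, PropertyDamage$PROPDMGEXP)`.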
# Keep only the event type and the two converted damage columns
PropertyDamage <- PropertyDamage %>% select(EVTYPE, PROPTHOUSANDS, CROPTHOUSANDS)
# Total property plus crop damage per event type, sorted from largest to smallest
PropertyDamage <- PropertyDamage %>%
  group_by(EVTYPE) %>%
  summarise(Impact_on_Property = sum(PROPTHOUSANDS, CROPTHOUSANDS)) %>%
  arrange(desc(Impact_on_Property))
# Keep the five most damaging event types
PropertyDamage <- PropertyDamage[1:5, ]
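# Bar chart of the five event types with the largest combined property and crop damage
ggplot(PropertyDamage, aes(x = reorder(EVTYPE, -Impact_on_Property), y = Impact_on_Property)) +
  geom_bar(stat = "identity", color = "salmon", fill = "salmon") +
  xlab("Weather event") +
  ylab("Property and crop damage, $K")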
The analysis of fatalities and injuries shows that the five events most harmful to the population are tornado, excessive heat, thunderstorm wind, flood, and lightning, with the tornado's impact the largest of the five. For property, the greatest damage was caused by thunderstorm winds, hail, tornado, flash flood, and lightning, and thunderstorm winds caused the most. All of these event types therefore deserve attention, bearing in mind that some are harmful to the population and others to property. The two most harmful events, tornadoes and thunderstorm winds, deserve special attention.