Some weather events have a huge impact on people and property. Identifying which types of weather events cause the most harm to public health and the most damage to property helps in planning appropriate prevention activities.
This analysis of the data collected by the US National Weather Service summarizes the top ten weather events with the highest impact on public health and property. The health impact is measured by the number of injuries and the number of fatalities caused by the weather. The damage done by severe weather events is grouped into property damage and crop damage.
Once the most harmful event types are known, the appropriate resources and activities can be put in place to minimize the impact and consequences of a weather event.
Storm Data, used in this analysis, is an official publication of the National Oceanic and Atmospheric Administration. The data was collected by the US National Weather Service and provided as a compressed CSV file at the following link: https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2. The file was downloaded on 20 May 2014 and uncompressed using WinRAR.
R was used to analyze the data and to summarize the weather events that are most harmful to health and cause the most damage.
Setting the working directory
setwd("D:\\cursuri\\ReproducibleResearch\\PeerAssessment2")
Loading the data in a data frame, sd.
sd <- read.csv("repdata-data-StormData.csv")
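As a side note to the loading step, the manual decompression can probably be skipped, because `read.csv()` opens a path through `file()`, which detects bzip2 compression when reading. The sketch below demonstrates this on a small invented data set written to a temporary file; the toy values and the `destfile` name in the comments are assumptions for illustration, not part of the original analysis.

```r
# Toy stand-in for the Storm Data file: the column names match the real data,
# the values are invented for illustration only.
toy <- data.frame(
  EVTYPE = c("TORNADO", "HAIL"),
  FATALITIES = c(3, 0),
  stringsAsFactors = FALSE
)

# Write the toy data to a bzip2-compressed CSV in a temporary location.
bz_path <- tempfile(fileext = ".csv.bz2")
con <- bzfile(bz_path, open = "wt")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() reads the compressed file directly; no separate unpacking step.
toy_read <- read.csv(bz_path, stringsAsFactors = FALSE)

# For the real data (network access needed), something along these lines
# should work; "StormData.csv.bz2" is a hypothetical local file name:
# download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",
#               destfile = "StormData.csv.bz2", mode = "wb")
# sd <- read.csv("StormData.csv.bz2")
```

Reading the compressed file directly keeps the whole pipeline inside R, which makes the analysis easier to reproduce on another machine.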
Data transformation
Change all event types to upper case
sd$EVTYPE <- toupper(sd$EVTYPE)
Remove the periods from the event type names
sd$EVTYPE <- gsub("\\.", "", sd$EVTYPE)
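The effect of these two transformations can be seen on a small invented vector of labels. The sketch also shows two optional extras that are not part of the original analysis: trimming whitespace, and a hypothetical rule that folds thunderstorm wind spellings into one label, since the result tables list TSTM WIND, THUNDERSTORM WIND and THUNDERSTORM WINDS as separate events.

```r
# Invented sample labels, chosen to resemble variants seen in event-type data.
raw <- c("Tstm Wind", "TSTM WIND.", " Thunderstorm Winds", "Hail")

clean <- toupper(raw)            # unify case, as in the analysis
clean <- gsub("\\.", "", clean)  # drop periods, as in the analysis
clean <- trimws(clean)           # assumption: also trim stray whitespace

# Hypothetical extra step, NOT in the original analysis:
# fold common thunderstorm-wind spellings into a single label.
clean <- sub("^(TSTM|THUNDERSTORM) WINDS?$", "THUNDERSTORM WIND", clean)

unique(clean)
```

Whether such merging is appropriate depends on how the event categories are defined, so any rule like this should be checked against the data documentation before its totals are relied on.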
There are four summaries, based on the number of fatalities, the number of injuries, and the amounts of property damage and crop damage. Weather events with a total of zero are dropped from each summary.
Summary of weather events based on the number of fatalities
Fatalities <- aggregate(FATALITIES ~ EVTYPE, data = sd, FUN = sum, na.rm = TRUE)
Fatalities <- subset(Fatalities, Fatalities$FATALITIES > 0)
Find the top 10 weather events based on fatalities
Fatalities10 <- Fatalities[order(Fatalities$FATALITIES, decreasing = TRUE), ]
Fatalities10 <- Fatalities10[1:10, ]
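The same aggregate, filter, sort and truncate pattern is applied to each of the four measures, so it could be wrapped in a helper. The sketch below is one possible way to do that; the function name `top_events` and the toy data are assumptions made for illustration.

```r
# Hypothetical helper: total one numeric column per event type, drop zero
# totals, and return the n largest.
top_events <- function(data, column, n = 10) {
  totals <- aggregate(data[[column]], by = list(EVTYPE = data$EVTYPE),
                      FUN = sum, na.rm = TRUE)
  names(totals)[2] <- column                    # aggregate() names it "x"
  totals <- totals[totals[[column]] > 0, ]      # drop events with zero total
  totals <- totals[order(totals[[column]], decreasing = TRUE), ]
  head(totals, n)                               # safe if fewer than n rows
}

# Invented toy data with the same column names as the storm data.
toy <- data.frame(
  EVTYPE = c("TORNADO", "HAIL", "TORNADO", "FLOOD", "DROUGHT"),
  FATALITIES = c(5, 0, 2, 3, NA)
)

top2 <- top_events(toy, "FATALITIES", n = 2)
top2
```

Using `head(totals, n)` instead of `totals[1:n, ]` avoids rows of `NA` when fewer than `n` event types remain after filtering.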
Summary of weather events based on the number of injuries
Injuries <- aggregate(INJURIES ~ EVTYPE, data = sd, FUN = sum, na.rm = TRUE)
Injuries <- subset(Injuries, Injuries$INJURIES > 0)
Find the top 10 weather events based on injuries
Injuries10 <- Injuries[order(Injuries$INJURIES, decreasing = TRUE), ]
Injuries10 <- Injuries10[1:10, ]
Merge the top 10 fatalities and injuries by weather event type
HealthEvents10 <- merge(Fatalities10, Injuries10, by = "EVTYPE", all = TRUE)
HealthEvents10[is.na(HealthEvents10)] <- 0
HealthEvents10$EVTYPE <- as.factor(HealthEvents10$EVTYPE)
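The merge above is a full outer join: an event that appears in only one of the two top 10 lists is kept, and its missing value is then set to 0. A small sketch with invented numbers illustrates the consequence, namely that a 0 in the merged table means "not in the top 10 for that measure" rather than a true total of zero.

```r
# Invented toy top lists; only TORNADO appears in both.
fat <- data.frame(EVTYPE = c("TORNADO", "AVALANCHE"),
                  FATALITIES = c(10, 4), stringsAsFactors = FALSE)
inj <- data.frame(EVTYPE = c("TORNADO", "HAIL"),
                  INJURIES = c(50, 7), stringsAsFactors = FALSE)

# all = TRUE keeps rows from both sides; unmatched cells become NA.
both <- merge(fat, inj, by = "EVTYPE", all = TRUE)

# Replace NA with 0, as in the analysis. These zeros are placeholders.
both[is.na(both)] <- 0
both
```

If the true totals for the missing cells are wanted, one option would be to merge against the full summaries instead of the truncated top 10 lists.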
Summary of weather events based on property damage
PropertyDamage <- aggregate(PROPDMG ~ EVTYPE, data = sd, FUN = sum, na.rm = TRUE)
PropertyDamage <- subset(PropertyDamage, PropertyDamage$PROPDMG > 0)
Find the top 10 weather events based on property damage
PropertyDamage10 <- PropertyDamage[order(PropertyDamage$PROPDMG, decreasing = TRUE), ]
PropertyDamage10 <- PropertyDamage10[1:10, ]
Summary of weather events based on crop damage
CropDamage <- aggregate(CROPDMG ~ EVTYPE, data = sd, FUN = sum, na.rm = TRUE)
CropDamage <- subset(CropDamage, CropDamage$CROPDMG > 0)
Find the top 10 weather events based on crop damage
CropDamage10 <- CropDamage[order(CropDamage$CROPDMG, decreasing = TRUE), ]
CropDamage10 <- CropDamage10[1:10, ]
Merge the top 10 property and crop damage by weather event type
Damage10 <- merge(PropertyDamage10, CropDamage10, by = "EVTYPE", all = TRUE)
Damage10[is.na(Damage10)] <- 0
Damage10$EVTYPE <- as.factor(Damage10$EVTYPE)
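The sums above use the raw `PROPDMG` and `CROPDMG` values. In the Storm Data file these amounts are accompanied by magnitude columns (`PROPDMGEXP`, `CROPDMGEXP`), and the analysis above does not apply them. The sketch below shows one possible conversion to dollars on invented values, assuming the codes K, M and B stand for thousands, millions and billions; treating every other code as a multiplier of 1 is a simplifying assumption, and the column name `PROPDMG_USD` is hypothetical.

```r
# Invented toy rows using the damage column names from the storm data.
toy <- data.frame(
  PROPDMG = c(25, 2.5, 1.2, 10),
  PROPDMGEXP = c("K", "M", "B", ""),
  stringsAsFactors = FALSE
)

# Assumed meaning of the magnitude codes.
multiplier <- c(K = 1e3, M = 1e6, B = 1e9)

# Look up each code; unknown or empty codes yield NA and fall back to 1
# (a simplifying assumption for this sketch).
mult <- unname(multiplier[toupper(toy$PROPDMGEXP)])
mult[is.na(mult)] <- 1

toy$PROPDMG_USD <- toy$PROPDMG * mult
toy
```

Applying the multipliers before aggregating could change the ranking of event types, because a few very large events may outweigh many small ones.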
The weather event with the highest number of injuries is the tornado, with a total of 91,346 injuries (9.1346 × 10^4).
Here is a summary of the top 10 weather events by number of fatalities and injuries. The data is ordered by fatalities and then by injuries.
HealthEvents10[order(HealthEvents10$FATALITIES, HealthEvents10$INJURIES, decreasing = TRUE), ]
##               EVTYPE FATALITIES INJURIES
## 12           TORNADO       5633    91346
## 2     EXCESSIVE HEAT       1903     6525
## 3        FLASH FLOOD        978     1777
## 6               HEAT        937     2100
## 9          LIGHTNING        817     5230
## 13         TSTM WIND        504     6957
## 4              FLOOD        470     6789
## 10       RIP CURRENT        368        0
## 7          HIGH WIND        248        0
## 1          AVALANCHE        224        0
## 8          ICE STORM          0     1975
## 11 THUNDERSTORM WIND          0     1488
## 5               HAIL          0     1361
A bar plot of the data presented above is drawn to make the comparison easier
library(lattice)
barchart(EVTYPE ~ FATALITIES + INJURIES, HealthEvents10, main = "Health events",
xlab = "Health events", auto.key = list(space = "top", columns = 2), scales = list(x = list(log = TRUE)))
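One detail of the plot call deserves care: the merged table contains zeros, and `log10(0)` is `-Inf`, so those bars have no finite length on a logarithmic axis. A possible workaround, sketched here on invented rows, is to turn the zeros into `NA` before plotting so that they are simply omitted; the handling shown is an assumption about what is wanted, not the original author's method.

```r
library(lattice)

# Invented rows shaped like the merged health table.
toy <- data.frame(
  EVTYPE = factor(c("TORNADO", "HAIL", "AVALANCHE")),
  FATALITIES = c(5633, 0, 224),
  INJURIES = c(91346, 1361, 0)
)

log10(toy$FATALITIES)  # the zero becomes -Inf on a log scale

# Replace zeros with NA in the numeric columns only.
counts <- c("FATALITIES", "INJURIES")
toy[counts] <- lapply(toy[counts], function(x) replace(x, x == 0, NA))

# Build the chart object; print(p) would draw it on the active device.
p <- barchart(EVTYPE ~ FATALITIES + INJURIES, data = toy,
              auto.key = list(space = "top", columns = 2),
              scales = list(x = list(log = 10)))
```

An alternative would be to keep the linear scale and plot fatalities and injuries in separate panels, which avoids the zero problem altogether.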
Based on the data analysed, the weather event with the greatest impact on property is the tornado, with a total recorded property damage amount of 3,212,258 (3.2123 × 10^6).
Damage10[order(Damage10$PROPDMG, Damage10$CROPDMG, decreasing = TRUE), ]
##                EVTYPE PROPDMG CROPDMG
## 10            TORNADO 3212258  100019
## 2         FLASH FLOOD 1420125  179200
## 11          TSTM WIND 1335996  109203
## 3               FLOOD  899938  168038
## 8   THUNDERSTORM WIND  876844   66794
## 4                HAIL  688693  579596
## 7           LIGHTNING  603352       0
## 9  THUNDERSTORM WINDS  446298   18685
## 6           HIGH WIND  324732   17283
## 12       WINTER STORM  132721       0
## 1             DROUGHT       0   33899
## 5          HEAVY RAIN       0   11123
library(lattice)
barchart(EVTYPE ~ PROPDMG + CROPDMG, Damage10, main = "Weather events damages",
xlab = "Damage amount (dollars)", auto.key = list(space = "top", columns = 2,
text = c("Property damage", "Crop damage")), scales = list(x = list(log = TRUE)))