In the following analysis we download the data, clean and transform it, and present basic graphical analyses of casualties and of damage to property and crops. The data were downloaded from https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2. Additional documentation is available at https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2Fpd01016005curr.pdf and frequently asked questions are answered at https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2FNCDC%20Storm%20Events-FAQ%20Page.pdf
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
library(readr)
library(skimr)
library(lubridate)
##
## Attaching package: 'lubridate'
## The following objects are masked from 'package:base':
##
## date, intersect, setdiff, union
library(stringr)
library(tidyr)
setwd("/home/shinobi/Downloads/ReproducibleScience/Week4")
The file is downloaded only if it does not already exist in the directory set by the setwd command. If you run these commands yourself, set your own directory for the downloaded data.
# Download the archive only if it is not already in the working directory
if (!file.exists("repdata_data_StormData.csv.bz2")) {
  download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",
                "repdata_data_StormData.csv.bz2", mode = "wb")
  # Record the download date so that later runs can report it
  write.csv(Sys.Date(), "date_downloaded.csv")
}
# read.csv decompresses .bz2 files transparently
test <- read.csv("repdata_data_StormData.csv.bz2", stringsAsFactors = FALSE)
data_load <- as.Date(read.csv("date_downloaded.csv")$x)
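The download-once logic can also be wrapped in a small function. The following is a minimal sketch, not the code used for this report: the helper name `fetch_once`, the `.date` stamp file and the example URL are assumptions made for illustration only.

```r
# Hypothetical helper: download `url` to `dest` only when `dest` is missing,
# and keep the download date in a small text file next to it.
fetch_once <- function(url, dest, stamp = paste0(dest, ".date")) {
  if (!file.exists(dest)) {
    # mode = "wb" keeps binary archives such as .bz2 intact on Windows
    download.file(url, dest, mode = "wb")
    writeLines(format(Sys.Date()), stamp)
  }
  # Return the recorded date, or NA when no stamp file exists
  if (file.exists(stamp)) as.Date(readLines(stamp, n = 1)) else as.Date(NA)
}

# Offline usage example: the file already exists, so nothing is downloaded
dest <- tempfile(fileext = ".csv")
writeLines("a,b", dest)
downloaded_on <- fetch_once("https://example.invalid/data.csv", dest)
```

Because the stamp is only written at download time, a file obtained by other means yields a missing date rather than a misleading one.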
The data were downloaded on 2022-01-28 and are used in this analysis.
The skimr package is used to get a quick overview of variable values and missingness. From the output below we can see that most of the missing values, reported as empty strings, occur in character variables that are used for descriptive purposes.
test %>% skim()
## Warning in grepl("^\\s+$", x): input string 192565 is invalid in this locale
## Warning in grepl("^\\s+$", x): input string 194345 is invalid in this locale
## Warning in grepl("^\\s+$", x): input string 199735 is invalid in this locale
## Warning in grepl("^\\s+$", x): input string 199745 is invalid in this locale
## Warning in grepl("^\\s+$", x): input string 200467 is invalid in this locale
| Name | Piped data |
| Number of rows | 902297 |
| Number of columns | 37 |
| _______________________ | |
| Column type frequency: | |
| character | 18 |
| logical | 1 |
| numeric | 18 |
| ________________________ | |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| BGN_DATE | 0 | 1 | 16 | 18 | 0 | 16335 | 0 |
| BGN_TIME | 0 | 1 | 3 | 11 | 0 | 3608 | 0 |
| TIME_ZONE | 0 | 1 | 3 | 3 | 0 | 22 | 0 |
| COUNTYNAME | 0 | 1 | 0 | 200 | 1589 | 29601 | 0 |
| STATE | 0 | 1 | 2 | 2 | 0 | 72 | 0 |
| EVTYPE | 0 | 1 | 1 | 30 | 0 | 985 | 0 |
| BGN_AZI | 0 | 1 | 0 | 3 | 547332 | 35 | 0 |
| BGN_LOCATI | 0 | 1 | 0 | 21 | 287743 | 54429 | 0 |
| END_DATE | 0 | 1 | 0 | 18 | 243411 | 6663 | 0 |
| END_TIME | 0 | 1 | 0 | 12 | 238978 | 3647 | 0 |
| END_AZI | 0 | 1 | 0 | 3 | 724837 | 24 | 0 |
| END_LOCATI | 0 | 1 | 0 | 21 | 499225 | 34506 | 0 |
| PROPDMGEXP | 0 | 1 | 0 | 1 | 465934 | 19 | 0 |
| CROPDMGEXP | 0 | 1 | 0 | 1 | 618413 | 9 | 0 |
| WFO | 0 | 1 | 0 | 3 | 142069 | 542 | 0 |
| STATEOFFIC | 0 | 1 | 0 | 45 | 248769 | 250 | 0 |
| ZONENAMES | 0 | 1 | 0 | 7226 | 594029 | 25112 | 205988 |
| REMARKS | 0 | 1 | 0 | 41278 | 287433 | 436781 | 24658 |
Variable type: logical
| skim_variable | n_missing | complete_rate | mean | count |
|---|---|---|---|---|
| COUNTYENDN | 902297 | 0 | NaN | : |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| STATE__ | 0 | 1.00 | 31.20 | 16.57 | 1 | 19 | 30 | 45.0 | 95 | ▆▇▇▁▁ |
| COUNTY | 0 | 1.00 | 100.64 | 107.28 | 0 | 31 | 75 | 131.0 | 873 | ▇▁▁▁▁ |
| BGN_RANGE | 0 | 1.00 | 1.48 | 5.48 | 0 | 0 | 0 | 1.0 | 3749 | ▇▁▁▁▁ |
| COUNTY_END | 0 | 1.00 | 0.00 | 0.00 | 0 | 0 | 0 | 0.0 | 0 | ▁▁▇▁▁ |
| END_RANGE | 0 | 1.00 | 0.99 | 3.37 | 0 | 0 | 0 | 0.0 | 925 | ▇▁▁▁▁ |
| LENGTH | 0 | 1.00 | 0.23 | 4.62 | 0 | 0 | 0 | 0.0 | 2315 | ▇▁▁▁▁ |
| WIDTH | 0 | 1.00 | 7.50 | 61.57 | 0 | 0 | 0 | 0.0 | 4400 | ▇▁▁▁▁ |
| F | 843563 | 0.07 | 0.91 | 1.00 | 0 | 0 | 1 | 1.0 | 5 | ▇▂▁▁▁ |
| MAG | 0 | 1.00 | 46.90 | 61.91 | 0 | 0 | 50 | 75.0 | 22000 | ▇▁▁▁▁ |
| FATALITIES | 0 | 1.00 | 0.02 | 0.77 | 0 | 0 | 0 | 0.0 | 583 | ▇▁▁▁▁ |
| INJURIES | 0 | 1.00 | 0.16 | 5.43 | 0 | 0 | 0 | 0.0 | 1700 | ▇▁▁▁▁ |
| PROPDMG | 0 | 1.00 | 12.06 | 59.48 | 0 | 0 | 0 | 0.5 | 5000 | ▇▁▁▁▁ |
| CROPDMG | 0 | 1.00 | 1.53 | 22.17 | 0 | 0 | 0 | 0.0 | 990 | ▇▁▁▁▁ |
| LATITUDE | 47 | 1.00 | 2874.94 | 1657.65 | 0 | 2802 | 3540 | 4019.0 | 9706 | ▅▇▆▁▁ |
| LONGITUDE | 0 | 1.00 | 6939.54 | 3958.06 | -14451 | 7247 | 8707 | 9605.0 | 17124 | ▁▁▂▇▁ |
| LATITUDE_E | 40 | 1.00 | 1451.61 | 1858.73 | 0 | 0 | 0 | 3549.0 | 9706 | ▇▃▂▁▁ |
| LONGITUDE_ | 0 | 1.00 | 3509.14 | 4475.68 | -14455 | 0 | 0 | 8735.0 | 106220 | ▇▁▁▁▁ |
| REFNUM | 0 | 1.00 | 451149.00 | 260470.85 | 1 | 225575 | 451149 | 676723.0 | 902297 | ▇▇▇▇▇ |
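In the skim output above, blanks in character columns are counted under `empty` while `n_missing` stays at 0, because an empty string is not `NA`. The distinction can be reproduced with base R. This is a minimal sketch on a made-up three-row data frame; the values are illustrative and not taken from the dataset.

```r
# Toy data (hypothetical values) with blanks in a character column
# and NA in a numeric column
toy <- data.frame(
  EVTYPE     = c("HAIL", "TORNADO", "FLOOD"),
  PROPDMGEXP = c("K", "", ""),
  F          = c(NA, 1, NA),
  stringsAsFactors = FALSE
)

# NA values per column (what skim reports as n_missing)
n_missing <- vapply(toy, function(col) sum(is.na(col)), integer(1))

# Empty strings per character column (what skim reports as empty)
n_empty <- vapply(
  toy,
  function(col) if (is.character(col)) sum(!is.na(col) & col == "") else 0L,
  integer(1)
)
```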
To analyze the data by year, we create a new variable year from the text variable BGN_DATE.
test$year <- year(mdy_hms(test$BGN_DATE))
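As a cross-check on the lubridate call, the year can also be extracted with base R alone. This is a minimal sketch, assuming the BGN_DATE strings follow a month/day/year pattern followed by a time; the sample strings and the helper name `year_from_bgn_date` are illustrative.

```r
# Hypothetical helper: extract the year from "m/d/Y H:M:S" strings with base R
year_from_bgn_date <- function(x) {
  parsed <- as.Date(x, format = "%m/%d/%Y %H:%M:%S")
  as.integer(format(parsed, "%Y"))
}

# Made-up sample values in the assumed format
sample_dates <- c("4/18/1950 0:00:00", "11/28/2011 0:00:00")
years <- year_from_bgn_date(sample_dates)
```

Strings that do not match the assumed format come back as `NA`, which makes parsing failures easy to count.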
The second part of data cleaning creates variables that hold the total damage to property and crops. Each total combines a numeric value (PROPDMG or CROPDMG) with its exponent variable (PROPDMGEXP or CROPDMGEXP), which encodes the dollar magnitude: H for hundreds, K for thousands, M for millions and B for billions.
# Convert damage figures to USD using the exponent codes (H, K, M, B);
# any other code is treated as zero damage
test <- test %>%
  mutate(total_property_dmg = case_when(
    toupper(PROPDMGEXP) == "H" ~ PROPDMG * 100,
    toupper(PROPDMGEXP) == "K" ~ PROPDMG * 1000,
    toupper(PROPDMGEXP) == "M" ~ PROPDMG * 1000000,
    toupper(PROPDMGEXP) == "B" ~ PROPDMG * 1000000000,
    TRUE ~ 0
  ))
# Crop damage is scaled from CROPDMG by its own exponent code
test <- test %>%
  mutate(total_crop_dmg = case_when(
    toupper(CROPDMGEXP) == "H" ~ CROPDMG * 100,
    toupper(CROPDMGEXP) == "K" ~ CROPDMG * 1000,
    toupper(CROPDMGEXP) == "M" ~ CROPDMG * 1000000,
    toupper(CROPDMGEXP) == "B" ~ CROPDMG * 1000000000,
    TRUE ~ 0
  ))
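The exponent conversion can also be expressed as a lookup table, which keeps the multipliers in one place. The following is a sketch only, using base R; the names `exp_multiplier` and `damage_usd` and the sample inputs are hypothetical. As in the case_when version, any code other than H, K, M or B is treated as zero.

```r
# Multipliers for the exponent codes handled in the analysis
exp_multiplier <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

# Hypothetical helper: scale a damage value by its exponent code
damage_usd <- function(value, exp_code) {
  mult <- unname(exp_multiplier[toupper(exp_code)])
  mult[is.na(mult)] <- 0  # blank or unrecognised codes count as zero
  value * mult
}

# Made-up inputs covering upper case, lower case, blank and unknown codes
usd <- damage_usd(c(25, 2.5, 1, 3, 7), c("K", "m", "B", "", "+"))
```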
The next cleaning step concerns the EVTYPE variable, which records the type of event that caused the damage. Displaying its distinct values shows many categories, with numerous typos and overlapping labels. These values are consolidated into a new variable with the case_when function.
total_categories <- length(unique(toupper(test$EVTYPE)))
test <- test %>%
mutate(eventype = case_when(
str_detect(EVTYPE, "ABNORMAL") ~ "ABNORMALWEATHER",
str_detect(EVTYPE, "TSTM WIND") ~ "TSTM WIND",
str_detect(EVTYPE, "ASTRONOMICAL") ~ "ASTRONOMICAL",
str_detect(EVTYPE, "AVALANC") ~ "AVALANCHE",
str_detect(EVTYPE, "BEACH EROSI") ~ "BEACH EROSION",
str_detect(EVTYPE, "BITTER WIND") ~ "BITTER WIND",
str_detect(EVTYPE, "BLIZZARD") ~ "BLIZZARD",
str_detect(EVTYPE, "BLOW-OUT TIDE") ~ "BLOW-OUT TIDE",
str_detect(EVTYPE, "BLOWING SNOW") ~ "BLOWING SNOW",
str_detect(EVTYPE, "BRUSH FIRE") ~ "BRUSH FIRE",
str_detect(EVTYPE, "COASTAL") ~ "COASTAL",
str_detect(EVTYPE, "COLD") ~ "COLD",
str_detect(EVTYPE, "CSTL FLOODING") ~ "COASTAL",
str_detect(EVTYPE, "DOWNBURST") ~ "DOWNBURST",
str_detect(EVTYPE, "DROUGHT") ~ "DROUGHT",
str_detect(EVTYPE, "DRY") ~ "DRY WEATHER",
str_detect(EVTYPE, "DUST") ~ "DUST",
str_detect(EVTYPE, "EARLY") ~ "EARLY WEATHER ISSUE",
str_detect(EVTYPE, "EXCESSIVE") ~ "EXCESSIVE WEATHER",
str_detect(EVTYPE, "EXTREME") ~ "EXTREME WEATHER",
str_detect(EVTYPE, "FLASH FLOOD") ~ "FLASH FLOOD",
str_detect(EVTYPE, "FLOOD") ~ "FLOOD",
str_detect(EVTYPE, "FOG") ~ "FOG",
str_detect(EVTYPE, "FREEZE") ~ "FREEZE",
str_detect(EVTYPE, "FROST") ~ "FROST",
str_detect(EVTYPE, "FUNNEL") ~ "FUNNEL",
str_detect(EVTYPE, "GLAZE") ~ "GLAZE",
str_detect(EVTYPE, "GRADIENT") ~ "GRADIENT",
str_detect(EVTYPE, "GUSTY THUNDERSTORM") ~ "GUSTY THUNDERSTORM",
str_detect(EVTYPE, "GUSTY WIND") ~ "GUSTY WIND",
str_detect(EVTYPE, "HAIL") ~ "HAIL",
str_detect(EVTYPE, "HEAT") ~ "HEAT",
str_detect(EVTYPE, "HEAVY RAIN") ~ "HEAVY RAIN",
str_detect(EVTYPE, "HEAVY SNOW") ~ "HEAVY SNOW",
str_detect(EVTYPE, "HEAVY SURF") ~ "HEAVY SURF",
str_detect(EVTYPE, "HIGH WIND") ~ "HIGH WIND",
str_detect(EVTYPE, "HOT") ~ "HOT WEATHER",
str_detect(EVTYPE, "HURRICANE") ~ "HURRICANE",
str_detect(EVTYPE, "HYPERTHERMIA") ~ "HYPERTHERMIA",
str_detect(EVTYPE, "ICE") ~ "ICE",
str_detect(EVTYPE, "LAKE") ~ "LAKE",
str_detect(EVTYPE, "LANDSLIDE") ~ "LANDSLIDE",
str_detect(EVTYPE, "LATE SEASON") ~ "LATE SEASON",
str_detect(EVTYPE, "LIGHT SNOW") ~ "LIGHT SNOW",
str_detect(EVTYPE, "MARINE") ~ "MARINE",
str_detect(EVTYPE, "MICROBURST") ~ "MICROBURST",
str_detect(EVTYPE, "MUD") ~ "MUD",
str_detect(EVTYPE, "RAIN") ~ "RAIN",
str_detect(EVTYPE, "RECORD COLD") ~ "RECORD COLD",
str_detect(EVTYPE, "RECORD COOL") ~ "RECORD COLD",
str_detect(EVTYPE, "RECORD DRY") ~ "RECORD DRY",
str_detect(EVTYPE, "RECORD HEAT") ~ "RECORD HEAT",
str_detect(EVTYPE, "RECORD SNOW") ~ "RECORD SNOW",
str_detect(EVTYPE, "RECORD TEMPERATURE") ~ "RECORD TEMPERATURE",
str_detect(EVTYPE, "RECORD WARM") ~ "RECORD WARM",
str_detect(EVTYPE, "RIP CURRENT") ~ "RIP CURRENT",
str_detect(EVTYPE, "RIVER FLOOD") ~ "RIVER FLOOD",
str_detect(EVTYPE, "SLEET") ~ "SLEET",
str_detect(EVTYPE, "SNOW") ~ "SNOW",
str_detect(EVTYPE, "STORM SURGE") ~ "STORM SURGE",
str_detect(EVTYPE, "STREET FLOOD") ~ "STREET FLOOD",
str_detect(EVTYPE, "STRONG WIND") ~ "STRONG WIND",
str_detect(EVTYPE, "THUNDERSTORM") ~ "THUNDERSTORM",
str_detect(EVTYPE, "THUDERSTORM") ~ "THUNDERSTORM",
str_detect(EVTYPE, "THUNDEERSTORM") ~ "THUNDERSTORM",
str_detect(EVTYPE, "THUNERSTORM") ~ "THUNDERSTORM",
str_detect(EVTYPE, "TIDAL FLOOD") ~ "TIDAL FLOOD",
str_detect(EVTYPE, "TORNADO") ~ "TORNADO",
str_detect(EVTYPE, "TORNDAO") ~ "TORNADO",
str_detect(EVTYPE, "TROPICAL STORM") ~ "TROPICAL STORM",
str_detect(EVTYPE, "TSTM") ~ "TSTM",
str_detect(EVTYPE, "UNSEASONABLE COLD") ~ "UNSEASONABLE COLD",
str_detect(EVTYPE, "UNSEASONABLY COLD") ~ "UNSEASONABLE COLD",
str_detect(EVTYPE, "UNSEASONABLY COOL") ~ "UNSEASONABLE COLD",
str_detect(EVTYPE, "UNSEASONABLY WARM") ~ "UNSEASONABLY WARM",
str_detect(EVTYPE, "URBAN AND SMALL") ~ "URBAN AND SMALL",
str_detect(EVTYPE, "URBAN FLOOD") ~ "URBAN FLOOD",
str_detect(EVTYPE, "URBAN/SMALL") ~ "URBAN FLOOD",
str_detect(EVTYPE, "URBAN/SML") ~ "URBAN FLOOD",
str_detect(EVTYPE, "VOLCANIC") ~ "VOLCANIC",
str_detect(EVTYPE, "WATER") ~ "WATER",
str_detect(EVTYPE, "WET MICOBURST") ~ "WET MICOBURST",
str_detect(EVTYPE, "WILD/FOREST FIRE") ~ "WILD/FOREST FIRE",
str_detect(EVTYPE, "WILDFIRE") ~ "WILDFIRE",
str_detect(EVTYPE, "WIND") ~ "WIND",
str_detect(EVTYPE, "WINTER STORM") ~ "WINTER STORM",
str_detect(EVTYPE, "WINTER WEATHER") ~ "WINTER WEATHER",
str_detect(EVTYPE, "WND") ~ "WIND",
str_detect(EVTYPE, "COOL") ~ "COOL WEATHER",
str_detect(EVTYPE, "DAM") ~ "DAM ISSUE",
str_detect(EVTYPE, "FLASH FLOOODING") ~ "FLASH FLOOD",
str_detect(EVTYPE, "GUSTNADO") ~ "GUSTNADO",
str_detect(EVTYPE, "HYPOTHERMIA") ~ "HYPOTHERMIA",
str_detect(EVTYPE, "LIGHTNING") ~ "LIGHTNING",
str_detect(EVTYPE, "LIGHTIN") ~ "LIGHTNING",
str_detect(EVTYPE, "MIXED PRECIP") ~ "MIXED PRECIPITATION",
str_detect(EVTYPE, "SMALL STREAM") ~ "SMALL STREAM",
str_detect(EVTYPE, "WILD FIRES") ~ "WILDFIRE",
str_detect(EVTYPE, "WINTER") ~ "WINTER ISSUES",
str_detect(EVTYPE, "WINTRY MIX") ~ "WINTER ISSUES",
str_detect(EVTYPE, "SML STREAM FLD") ~ "SMALL STREAM",
str_detect(EVTYPE, "LOW TEMPERATURE RECORD") ~ "LOW TEMPERATURE",
str_detect(EVTYPE, "LIGNTNING") ~ "LIGHTNING",
str_detect(EVTYPE, "HIGH SURF") ~ "HIGH SURF",
str_detect(EVTYPE, "FREEZING") ~ "FREEZING DRIZZLE",
str_detect(EVTYPE, "HEAVY PRECIPITATION") ~ "HEAVY PRECIPITATION",
str_detect(EVTYPE, "HEAVY PRECIPATATION") ~ "HEAVY PRECIPITATION",
str_detect(EVTYPE, "HEAVY SHOWER") ~ "HEAVY SHOWER",
str_detect(EVTYPE, "LOW TEMPERATURE") ~ "LOW TEMPERATURE",
str_detect(EVTYPE, "RECORD HIGH TEMPERATURE") ~ "RECORD HIGH TEMPERATURE",
TRUE ~ toupper(EVTYPE)
))
total_reduced <- length(unique(toupper(test$eventype)))
The number of categories was 898 before the reduction and 298 after it.
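The order of the rules matters, because case_when uses the first condition that matches. A specific pattern such as "RIVER FLOOD" therefore never fires if the more general "FLOOD" appears before it. The sketch below illustrates this first-match behaviour with base R on a handful of made-up labels; the rule table and the `classify` function are hypothetical and far smaller than the list used in the analysis.

```r
# Ordered rules: pattern (name) -> category (value); specific patterns first
rules <- c(
  "FLASH FLOOD" = "FLASH FLOOD",
  "FLOOD"       = "FLOOD",
  "TSTM WIND"   = "TSTM WIND",
  "WIND"        = "WIND"
)

# Hypothetical classifier: the first matching rule wins,
# unmatched labels are kept in upper case
classify <- function(x) {
  x <- toupper(x)
  out <- x
  matched <- rep(FALSE, length(x))
  for (i in seq_along(rules)) {
    hit <- !matched & grepl(names(rules)[i], x, fixed = TRUE)
    out[hit] <- rules[[i]]
    matched <- matched | hit
  }
  out
}

labels <- classify(c("Flash Flooding", "RIVER FLOOD", "tstm wind/hail", "FOG"))
```

Upper-casing the input before matching also catches mixed-case labels, which would otherwise fall through to the default branch.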
The final preprocessing step creates the aggregate datasets used in the graphical analysis and extracts, for each year in the period, the event type responsible for the most damage and the most casualties.
agg_data <- test %>%
group_by(year) %>%
summarise(TotalPropertyDamage = sum(total_property_dmg),
TotalCropDamage = sum(total_crop_dmg),
TotalDamage = sum(total_property_dmg) + sum(total_crop_dmg),
MeanPropertyDamage = mean(total_property_dmg),
MeanCropDamage = mean(total_crop_dmg),
MeanlDamage = mean(total_property_dmg) + mean(total_crop_dmg))
agg_data_gather <- agg_data %>%
gather(key = "Category", value="Damage", 2:7)
agg_data_fatalities <- test %>%
group_by(year) %>%
summarise(TotalFatalities = sum(FATALITIES),
TotalCasulties = sum(INJURIES))
agg_data_eventype <- test %>%
group_by(year, eventype) %>%
summarise(TotalPropertyDamage = sum(total_property_dmg),
TotalCropDamage = sum(total_crop_dmg),
TotalDamage = sum(total_property_dmg) + sum(total_crop_dmg),
TotalFatalities = sum(FATALITIES),
TotalCasulties = sum(INJURIES))
## `summarise()` has grouped output by 'year'. You can override using the `.groups` argument.
top_total_dmg <- top_n(agg_data_eventype, 1, wt=TotalDamage)
top_property_dmg <- top_n(agg_data_eventype, 1, wt=TotalPropertyDamage)
top_crop_dmg <- top_n(subset(agg_data_eventype, TotalCropDamage > 0), 1, wt=TotalCropDamage)
top_total_fatalities <- top_n(agg_data_eventype, 1, wt=TotalFatalities)
top_total_casulties <- top_n(agg_data_eventype, 1, wt=TotalCasulties)
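The "top event per year" selection can be illustrated on a toy table. This is a base R sketch with made-up numbers, not the dplyr code used above; like top_n, it keeps every row that ties for the maximum within a year.

```r
# Toy aggregate (hypothetical values): damage per year and event type
toy <- data.frame(
  year        = c(1950, 1950, 1951, 1951),
  eventype    = c("TORNADO", "HAIL", "FLOOD", "TORNADO"),
  TotalDamage = c(10, 3, 8, 2),
  stringsAsFactors = FALSE
)

# Within each year, keep the row(s) with the largest TotalDamage
top_per_year <- do.call(
  rbind,
  lapply(split(toy, toy$year), function(d) d[d$TotalDamage == max(d$TotalDamage), ])
)

# Count how often each event type was the yearly maximum
top_counts <- sort(table(top_per_year$eventype), decreasing = TRUE)
```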
The first chart shows the time series of property and crop damage on a linear scale.
agg_data_gather %>%
filter(Category %in% c("TotalPropertyDamage", "TotalCropDamage", "TotalDamage")) %>%
ggplot(aes(year, Damage, color = Category))+
geom_line() +
labs(title="Total value of damages in analyzed time period", x ="Year", y = "Value of damages in USD")
In the first chart we can see that data on crop damage are only available after 1992. The second notable feature is three outliers in the damage values, which could be typos and deserve attention before proceeding with the analysis. The three data points were identified and inspected, and no error was found in the data.
Because the amounts differ by orders of magnitude, we switch from the nominal scale to a logarithmic scale.
agg_data_gather %>%
filter(Category %in% c("TotalPropertyDamage", "TotalCropDamage", "TotalDamage")) %>%
ggplot(aes(year, log(Damage), color = Category))+
geom_line() +
labs(title="Total value of damages in analyzed time period", x ="Year", y = "Value of damages in USD in log scale")
On the logarithmic scale we can see an increasing trend in weather-related damage.
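One detail of the log transform is worth keeping in mind: in R, `log(0)` evaluates to `-Inf`, so a year with zero recorded damage has no finite value on the log scale. The sketch below shows one possible way to mark such values as missing instead; the helper name `safe_log` and the sample vector are assumptions for illustration.

```r
# Hypothetical helper: natural log for positive values, NA otherwise
safe_log <- function(x) ifelse(x > 0, log(x), NA_real_)

damage <- c(0, 1, exp(2), 1e9)   # made-up damage totals
raw_log <- log(damage)           # first element is -Inf
clean_log <- safe_log(damage)    # first element is NA
```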
The following chart displays the mean value of damages.
agg_data_gather %>%
filter(Category %in% c("MeanPropertyDamage", "MeanCropDamage", "MeanlDamage")) %>%
ggplot(aes(year, (Damage), color = Category))+
geom_line() +
labs(title="Mean value of damages in analyzed time period", x ="Year", y = "Mean value of damages in USD")
The following chart shows total fatalities as a time series, with a linear trend line added.
ggplot(agg_data_fatalities, aes(year, TotalFatalities))+
geom_line() +
geom_smooth(method = "lm", se = FALSE) +
labs(title="Total number of fatalities in analyzed time period", x ="Year", y = "Number of fatalities")
## `geom_smooth()` using formula 'y ~ x'
From this chart we can see that total fatalities are increasing over time.
The next chart shows casualties (the INJURIES variable) as a time series.
ggplot(agg_data_fatalities, aes(year, TotalCasulties))+
geom_line() +
geom_smooth(method = "lm", se = FALSE) +
labs(title="Total number of casualties in analyzed time period", x ="Year", y = "Number of casualties")
## `geom_smooth()` using formula 'y ~ x'
This time series also shows a slow linear increase in the number of casualties, but there are also large deviations within the series.
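The trend lines in the two charts above come from a linear model of the yearly total against the year. The slope of such a line can be read directly from `lm`. This is a minimal sketch on made-up numbers, so the resulting slope says nothing about the actual storm data.

```r
# Toy yearly totals (hypothetical values) that rise by 2 per year
toy <- data.frame(year = 2000:2004, fatalities = c(10, 12, 14, 16, 18))

# Fit the same kind of model that geom_smooth(method = "lm") draws
fit <- lm(fatalities ~ year, data = toy)

# Estimated change in fatalities per additional year
slope <- unname(coef(fit)["year"])
```

A positive slope corresponds to an upward trend line; the large year-to-year deviations seen in the charts would show up as large residuals around it.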
The following five summary tables show how many times each event type was the main cause of damage or casualties in a year.
Top event type by total damage for each year. The second table counts the number of years in which each event type caused the most total damage.
top_total_dmg
## # A tibble: 62 × 7
## # Groups: year [62]
## year eventype TotalPropertyDama… TotalCropDamage TotalDamage TotalFatalities
## <dbl> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 1950 TORNADO 34481650 0 34481650 70
## 2 1951 TORNADO 65505990 0 65505990 34
## 3 1952 TORNADO 94102240 0 94102240 230
## 4 1953 TORNADO 596104700 0 596104700 519
## 5 1954 TORNADO 85805320 0 85805320 36
## 6 1955 TORNADO 82660630 0 82660630 129
## 7 1956 TORNADO 116912350 0 116912350 83
## 8 1957 TORNADO 224388890 0 224388890 193
## 9 1958 TORNADO 128994610 0 128994610 67
## 10 1959 TORNADO 87453040 0 87453040 58
## # … with 52 more rows, and 1 more variable: TotalCasulties <dbl>
top_total_dmg %>% group_by(eventype) %>% summarize(count_events = n()) %>% arrange(desc(count_events))
## # A tibble: 7 × 2
## eventype count_events
## <chr> <int>
## 1 TORNADO 43
## 2 FLOOD 6
## 3 HAIL 4
## 4 HURRICANE 4
## 5 FLASH FLOOD 3
## 6 ICE 1
## 7 TROPICAL STORM 1
Top event type by property damage for each year. The second table counts the number of years in which each event type caused the most property damage.
top_property_dmg
## # A tibble: 62 × 7
## # Groups: year [62]
## year eventype TotalPropertyDama… TotalCropDamage TotalDamage TotalFatalities
## <dbl> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 1950 TORNADO 34481650 0 34481650 70
## 2 1951 TORNADO 65505990 0 65505990 34
## 3 1952 TORNADO 94102240 0 94102240 230
## 4 1953 TORNADO 596104700 0 596104700 519
## 5 1954 TORNADO 85805320 0 85805320 36
## 6 1955 TORNADO 82660630 0 82660630 129
## 7 1956 TORNADO 116912350 0 116912350 83
## 8 1957 TORNADO 224388890 0 224388890 193
## 9 1958 TORNADO 128994610 0 128994610 67
## 10 1959 TORNADO 87453040 0 87453040 58
## # … with 52 more rows, and 1 more variable: TotalCasulties <dbl>
top_property_dmg %>% group_by(eventype) %>% summarize(count_events = n()) %>% arrange(desc(count_events))
## # A tibble: 9 × 2
## eventype count_events
## <chr> <int>
## 1 TORNADO 46
## 2 HURRICANE 6
## 3 FLOOD 3
## 4 HAIL 2
## 5 FLASH FLOOD 1
## 6 STORM SURGE 1
## 7 TROPICAL STORM 1
## 8 WILD/FOREST FIRE 1
## 9 WILDFIRE 1
Top event type by crop damage for each year. The second table counts the number of years in which each event type caused the most crop damage.
top_crop_dmg
## # A tibble: 19 × 7
## # Groups: year [19]
## year eventype TotalPropertyDam… TotalCropDamage TotalDamage TotalFatalities
## <dbl> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 1993 FLOOD 5662096000 5793355500 1.15e10 25
## 2 1994 ICE 195977000 500050600000 5.00e11 4
## 3 1995 FLOOD 129676800 1891146500 2.02e 9 13
## 4 1996 TSTM WIND 407745950 4100000750 4.51e 9 22
## 5 1997 HAIL 202142820 2593275000 2.80e 9 0
## 6 1998 HAIL 1348376050 2630030610 3.98e 9 1
## 7 1999 HAIL 442952140 3199833700 3.64e 9 0
## 8 2000 HAIL 445986960 3399273350 3.85e 9 2
## 9 2001 TSTM WIND 318767660 942487060 1.26e 9 16
## 10 2002 HAIL 325268840 2796917500 3.12e 9 0
## 11 2003 HAIL 505952470 1847341500 2.35e 9 0
## 12 2004 FLOOD 844435500 1643958500 2.49e 9 24
## 13 2005 HURRICANE 49786635000 6916793310 5.67e10 34
## 14 2006 HAIL 1300570000 3582635100 4.88e 9 0
## 15 2007 HAIL 375726250 1742857750 2.12e 9 0
## 16 2008 FLOOD 2037495650 5976133070 8.01e 9 22
## 17 2009 HAIL 1440000550 8691341950 1.01e10 0
## 18 2010 FLASH FL… 839040900 2219598960 3.06e 9 67
## 19 2011 FLOOD 7725177450 4595789970 1.23e10 58
## # … with 1 more variable: TotalCasulties <dbl>
top_crop_dmg %>% group_by(eventype) %>% summarize(count_events = n()) %>% arrange(desc(count_events))
## # A tibble: 6 × 2
## eventype count_events
## <chr> <int>
## 1 HAIL 9
## 2 FLOOD 5
## 3 TSTM WIND 2
## 4 FLASH FLOOD 1
## 5 HURRICANE 1
## 6 ICE 1
Number of years in which each event type caused the most fatalities.
top_total_fatalities %>% group_by(eventype) %>% summarize(count_events = n()) %>% arrange(desc(count_events))
## # A tibble: 8 × 2
## eventype count_events
## <chr> <int>
## 1 TORNADO 45
## 2 EXCESSIVE WEATHER 8
## 3 FLASH FLOOD 4
## 4 HEAT 1
## 5 HIGH WIND 1
## 6 LIGHTNING 1
## 7 RIP CURRENT 1
## 8 TSTM WIND 1
Number of years in which each event type caused the most casualties.
top_total_casulties %>% group_by(eventype) %>% summarize(count_events = n()) %>% arrange(desc(count_events))
## # A tibble: 5 × 2
## eventype count_events
## <chr> <int>
## 1 TORNADO 58
## 2 EXCESSIVE WEATHER 1
## 3 FLOOD 1
## 4 HURRICANE 1
## 5 ICE 1
As the presented data show, the number of fatalities and casualties has risen in recent years compared with earlier periods. The same holds for the dollar value of damage caused by weather. If newer data were analyzed, this trend would probably be even more pronounced.