As of August 28, 2014, superseding the version of August 24. Always use the most recent version.
This study looks at storm data from the National Hurricane Center. The data tracks tropical cyclones through the Atlantic Ocean, Caribbean Sea, and Gulf of Mexico from 1995 to 2005. It includes metadata about each storm: name, year, month, date, hour, latitude, longitude, type, air pressure, maximum wind speed, and day of the hurricane season. This recipe examines the day of the hurricane season and wind speed and their effect on the air pressure of the storms.
remove(list=ls())
install.packages("nasaweather", repos='http://cran.us.r-project.org')
## package 'nasaweather' successfully unpacked and MD5 sums checked
##
## The downloaded binary packages are in
## C:\Users\Caroline\AppData\Local\Temp\Rtmp4Q3EQH\downloaded_packages
require(nasaweather)
## Loading required package: nasaweather
#library("nasaweather", lib.loc="~/R/win-library/3.0/")
x<-storms
attach(storms)
## The following object is masked from package:datasets:
##
## pressure
head(storms)
## name year month day hour lat long pressure wind type
## 1 Allison 1995 6 3 0 17.4 -84.3 1005 30 Tropical Depression
## 2 Allison 1995 6 3 6 18.3 -84.9 1004 30 Tropical Depression
## 3 Allison 1995 6 3 12 19.3 -85.7 1003 35 Tropical Storm
## 4 Allison 1995 6 3 18 20.6 -85.8 1001 40 Tropical Storm
## 5 Allison 1995 6 4 0 22.0 -86.0 997 50 Tropical Storm
## 6 Allison 1995 6 4 6 23.3 -86.3 995 60 Tropical Storm
## seasday
## 1 3
## 2 3
## 3 3
## 4 3
## 5 4
## 6 4
str(storms)
## Classes 'tbl_df', 'tbl' and 'data.frame': 2747 obs. of 11 variables:
## $ name : chr "Allison" "Allison" "Allison" "Allison" ...
## $ year : int 1995 1995 1995 1995 1995 1995 1995 1995 1995 1995 ...
## $ month : int 6 6 6 6 6 6 6 6 6 6 ...
## $ day : int 3 3 3 3 4 4 4 4 5 5 ...
## $ hour : int 0 6 12 18 0 6 12 18 0 6 ...
## $ lat : num 17.4 18.3 19.3 20.6 22 23.3 24.7 26.2 27.6 28.5 ...
## $ long : num -84.3 -84.9 -85.7 -85.8 -86 -86.3 -86.2 -86.2 -86.1 -85.6 ...
## $ pressure: int 1005 1004 1003 1001 997 995 987 988 988 990 ...
## $ wind : int 30 30 35 40 50 60 65 65 65 60 ...
## $ type : chr "Tropical Depression" "Tropical Depression" "Tropical Storm" "Tropical Storm" ...
## $ seasday : int 3 3 3 3 4 4 4 4 5 5 ...
The factor that was used in this analysis was Storm Type. The levels analyzed were Tropical Storms and Hurricanes. The other levels in this factor were Extratropical and Tropical Depression.
head(x)
## name year month day hour lat long pressure wind type
## 1 Allison 1995 6 3 0 17.4 -84.3 1005 30 Tropical Depression
## 2 Allison 1995 6 3 6 18.3 -84.9 1004 30 Tropical Depression
## 3 Allison 1995 6 3 12 19.3 -85.7 1003 35 Tropical Storm
## 4 Allison 1995 6 3 18 20.6 -85.8 1001 40 Tropical Storm
## 5 Allison 1995 6 4 0 22.0 -86.0 997 50 Tropical Storm
## 6 Allison 1995 6 4 6 23.3 -86.3 995 60 Tropical Storm
## seasday
## 1 3
## 2 3
## 3 3
## 4 3
## 5 4
## 6 4
tail(x)
## name year month day hour lat long pressure wind type
## 2742 Nadine 2000 10 21 6 33.3 -53.5 1000 50 Tropical Storm
## 2743 Nadine 2000 10 21 12 34.1 -52.3 1000 50 Tropical Storm
## 2744 Nadine 2000 10 21 18 34.8 -51.3 1000 45 Tropical Storm
## 2745 Nadine 2000 10 22 0 35.7 -50.5 1004 40 Extratropical
## 2746 Nadine 2000 10 22 6 37.0 -49.0 1005 40 Extratropical
## 2747 Nadine 2000 10 22 12 39.0 -47.0 1005 35 Extratropical
## seasday
## 2742 143
## 2743 143
## 2744 143
## 2745 144
## 2746 144
## 2747 144
summary(x)
## name year month day
## Length:2747 Min. :1995 Min. : 6.0 Min. : 1
## Class :character 1st Qu.:1995 1st Qu.: 8.0 1st Qu.: 9
## Mode :character Median :1997 Median : 9.0 Median :18
## Mean :1997 Mean : 8.8 Mean :17
## 3rd Qu.:1999 3rd Qu.:10.0 3rd Qu.:25
## Max. :2000 Max. :12.0 Max. :31
## hour lat long pressure
## Min. : 0.00 Min. : 8.3 Min. :-107.3 Min. : 905
## 1st Qu.: 3.50 1st Qu.:17.2 1st Qu.: -77.6 1st Qu.: 980
## Median :12.00 Median :25.0 Median : -60.9 Median : 995
## Mean : 9.06 Mean :26.7 Mean : -60.9 Mean : 990
## 3rd Qu.:18.00 3rd Qu.:33.9 3rd Qu.: -45.8 3rd Qu.:1004
## Max. :18.00 Max. :70.7 Max. : 1.0 Max. :1019
## wind type seasday
## Min. : 15.0 Length:2747 Min. : 3
## 1st Qu.: 35.0 Class :character 1st Qu.: 84
## Median : 50.0 Mode :character Median :103
## Mean : 54.7 Mean :103
## 3rd Qu.: 70.0 3rd Qu.:125
## Max. :155.0 Max. :185
The continuous variables in this dataset are longitude, latitude, air pressure, and wind speed.
The response variables in this dataset are air pressure and wind speed.
The ‘storms’ dataset describes tropical cyclones tracked through the Atlantic Ocean, Caribbean Sea, and Gulf of Mexico from 1995 to 2005 (the summary above shows years 1995 through 2000 in the loaded data). Each record includes the storm’s name, year, month, date, hour, latitude, longitude, type, air pressure, maximum wind speed, and day of the hurricane season. The type factor has four levels: Extratropical, Tropical Depression, Hurricane, and Tropical Storm.
This data originated from the National Hurricane Center’s archive of Tropical Cyclone Reports and was hand-scraped from the track tables of individual reports. We can assume that this data was collected using proper randomization techniques.
This data is publicly available for anyone to use and analyze. In this analysis, I test whether there is a difference in air pressure at the center of the storm between two types of storms: Tropical Storms and Hurricanes. A two-sample t-test will be performed to determine whether the means differ.
The null hypothesis that will be tested is:
The mean air pressure for Tropical Storms is equal to the mean air pressure for Hurricanes.
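The hypothesis can also be written in symbols. The notation below is introduced here for illustration and does not appear in the original: $\mu_{TS}$ and $\mu_{H}$ stand for the mean air pressure of Tropical Storms and Hurricanes. The text states only the null hypothesis, so the two-sided alternative is an assumption, as is the choice of the Welch (unequal variance) form of the two-sample statistic, where $\bar{x}$, $s^2$, and $n$ are the sample mean, sample variance, and sample size of each group.

```latex
\[
H_0:\ \mu_{TS} = \mu_{H}
\qquad \text{versus} \qquad
H_1:\ \mu_{TS} \neq \mu_{H}
\]
\[
t = \frac{\bar{x}_{TS} - \bar{x}_{H}}
         {\sqrt{\dfrac{s_{TS}^{2}}{n_{TS}} + \dfrac{s_{H}^{2}}{n_{H}}}}
\]
```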
The rationale for collecting the data was for the National Hurricane Center to gather information on the tropical cyclones that traveled through the Atlantic Ocean, Caribbean Sea, and Gulf of Mexico from 1995 to 2005.
The data was collected as a record, with no experimental intent.
No, there were no replicates or repeated measures.
The original dataset was organized without experimental groups, with measurements recorded for a fixed set of variables. For this analysis, the data from two types of tropical cyclones were used: Hurricanes and Tropical Storms.
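Restricting the data to the two storm types could look like the sketch below. The helper name `keep_two_types` is hypothetical and not part of the original analysis; it assumes a data frame with a `type` column holding the level names shown in the output above.

```r
# Hypothetical helper: keep only the two storm types compared in this analysis.
# Assumes a data frame with a character or factor column named `type`.
keep_two_types <- function(df, levels = c("Tropical Storm", "Hurricane")) {
  out <- df[df$type %in% levels, , drop = FALSE]
  # Store `type` as a factor with exactly the two retained levels
  out$type <- factor(out$type, levels = levels)
  out
}

# Usage with the nasaweather data, only if the package is installed
if (requireNamespace("nasaweather", quietly = TRUE)) {
  two_types <- keep_two_types(nasaweather::storms)
  print(table(two_types$type))
}
```

Converting `type` to a two-level factor keeps later plots and tests from carrying the unused Extratropical and Tropical Depression levels.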
#Here we will just look at the boxplots.
boxplot(pressure~wind, data = storms)
boxplot(pressure~type, data = storms)
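A two-sample t-test is the test suited to the above hypothesis. The models below are fitted with aov(), which reports ANOVA F-tests for wind, type, and their interaction.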
model = aov(pressure~wind,data = storms)
anova(model)
## Analysis of Variance Table
##
## Response: pressure
## Df Sum Sq Mean Sq F value Pr(>F)
## wind 1 822018 822018 16388 <2e-16 ***
## Residuals 2745 137685 50
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
model = aov(pressure~type,data = storms)
anova(model)
## Analysis of Variance Table
##
## Response: pressure
## Df Sum Sq Mean Sq F value Pr(>F)
## type 3 538001 179334 1166 <2e-16 ***
## Residuals 2743 421702 154
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
model = aov(pressure~wind*type,data = storms)
anova(model)
## Analysis of Variance Table
##
## Response: pressure
## Df Sum Sq Mean Sq F value Pr(>F)
## wind 1 822018 822018 22097.9 <2e-16 ***
## type 3 30604 10201 274.2 <2e-16 ***
## wind:type 3 5193 1731 46.5 <2e-16 ***
## Residuals 2739 101888 37
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
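The tests shown above are ANOVA F-tests rather than a t-test. The F-test for type (p < 2e-16) rejects equality of mean air pressure across all four storm types. This is consistent with rejecting the null hypothesis that the mean air pressure of Tropical Storms is equal to the mean air pressure of Hurricanes, but it is not a direct test of that pair.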
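The output above comes from aov(). A direct two-sample comparison of Tropical Storms and Hurricanes could be run as in the sketch below. The helper name `compare_pressure` is hypothetical, and the Welch form (the default of `t.test()`) is an assumption, since the text does not say which variant of the t-test was intended.

```r
# Hypothetical helper: Welch two-sample t-test of pressure between two storm types.
# Assumes a data frame with columns `type` and `pressure`.
compare_pressure <- function(df, a = "Tropical Storm", b = "Hurricane") {
  pa <- df$pressure[df$type == a]
  pb <- df$pressure[df$type == b]
  # t.test() defaults to var.equal = FALSE, i.e. the Welch test
  t.test(pa, pb)
}

# Usage with the nasaweather data, only if the package is installed
if (requireNamespace("nasaweather", quietly = TRUE)) {
  print(compare_pressure(nasaweather::storms))
}
```

The returned object carries the t statistic, degrees of freedom, p-value, and the two group means, so the means can be reported alongside the test result.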
A Q-Q plot is used to check the normality of the model residuals. In the plot produced below, the normal Q-Q plot shows a roughly linear relationship between the sample quantiles of the residuals and their theoretical quantiles.
qqnorm(residuals(model))
qqline(residuals(model))
The Shapiro-Wilk test takes normality as its null hypothesis. Both p-values were reported as less than 0.1. A small p-value is evidence against normality, so this result does not by itself support the conclusion that the population is normal.
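No Shapiro-Wilk output is shown in this document. A minimal sketch of how the test could be run on model residuals follows. The helper name `check_residual_normality` is hypothetical, and applying the test to the residuals of the interaction model is an assumption about what was tested.

```r
# Hypothetical helper: Shapiro-Wilk test on the residuals of a fitted model.
# shapiro.test() accepts between 3 and 5000 non-missing values.
check_residual_normality <- function(fit) {
  shapiro.test(residuals(fit))
}

# Usage with the nasaweather data, only if the package is installed
if (requireNamespace("nasaweather", quietly = TRUE)) {
  fit <- aov(pressure ~ wind * type, data = nasaweather::storms)
  print(check_residual_normality(fit))
}
```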
plot(fitted(model), residuals(model))
# interaction.plot(x.factor, trace.factor, response): pressure is the response in the model above
interaction.plot(storms$wind, storms$type, storms$pressure)
No literature was used.
The data originated from the National Hurricane Center’s archive of Tropical Cyclone Reports (http://www.nhc.noaa.gov/). This dataset was hand-scraped from best track tables in the individual tropical cyclone reports (PDF, HTML and Microsoft Word) by Jon Hobbs and is publicly available at: https://github.com/hadley/nasaweather.
The Tropical Cyclone Reports had a variety of storm type designations and there appeared to be no consistent naming convention for cyclones that were not hurricanes, tropical depressions, or tropical storms. Many of these designations have been combined into the “Extratropical” category in this dataset.