This problem was retrieved from the LinkedIn Learning course Code Clinic: R.

Import Data

The data will be imported from the lynda.com GitHub data repository. It contains weather measurements, including barometric pressure, from Lake Pend Oreille in northern Idaho. We have almost 20 megabytes of data from the years 2012 through 2015.

First, we define a function that imports a single data file, so that each yearly file can be read the same way.

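# Packages used throughout: pipes and data verbs, date handling, plotting
library(dplyr)
library(lubridate)
library(ggplot2)

# Read one yearly file with a header row; keep text columns as character
ImportFiles <- function(dataPath) {
  read.table(dataPath,
             header = TRUE,
             stringsAsFactors = FALSE)
}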
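
To make the effect of these arguments concrete, the following sketch writes two rows to a temporary file and reads them back with the same `read.table` call. The two rows reuse values from the first lines of the imported data; the temporary file and the names `sample_lines`, `sample_path` and `sample_data` are illustrative assumptions, not part of the original analysis.

```r
# Hypothetical sample: a header plus two rows with the same columns as the yearly files
sample_lines <- c(
  "date time Air_Temp Barometric_Press Dew_Point Relative_Humidity Wind_Dir Wind_Gust Wind_Speed",
  "2012_01_01 00:02:14 34.3 30.5 26.9 74.2 346.4 11 3.6",
  "2012_01_01 00:08:29 34.1 30.5 26.5 73.6 349.0 12 8.0"
)
sample_path <- tempfile(fileext = ".txt")
writeLines(sample_lines, sample_path)

# Same arguments as in ImportFiles(): the first line becomes the column names,
# and text columns stay character instead of being converted to factors
sample_data <- read.table(sample_path,
                          header = TRUE,
                          stringsAsFactors = FALSE)

str(sample_data)
unlink(sample_path)
```

The `date` and `time` columns come back as character vectors, which is why they are pasted together and parsed into a proper date-time later on.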

Additionally, a progress bar is set up to keep track of the files imported.

my_progressBar <- txtProgressBar(min = 2012, max = 2015, style = 3)

for (dataYear in 2012:2015) {
  # Build the URL of the file for this year
  dataPath <- paste0('https://raw.githubusercontent.com/lyndadotcom/LPO_weatherdata/master/Environmental_Data_Deep_Moor_',
                     dataYear,
                     '.txt')
  if (exists('weather_data')) {
    # Append this year's rows to the data imported so far
    yearly_data <- ImportFiles(dataPath)
    weather_data <- rbind(weather_data, yearly_data)
  } else {
    # First file: start the combined data frame
    weather_data <- ImportFiles(dataPath)
  }
  setTxtProgressBar(my_progressBar, value = dataYear)
}

Finally, we take a quick look at the data we just imported.

head(weather_data) %>%
  knitr::kable(format = 'markdown')
|date       |time     | Air_Temp| Barometric_Press| Dew_Point| Relative_Humidity| Wind_Dir| Wind_Gust| Wind_Speed|
|:----------|:--------|--------:|----------------:|---------:|-----------------:|--------:|---------:|----------:|
|2012_01_01 |00:02:14 |     34.3|             30.5|      26.9|              74.2|    346.4|        11|        3.6|
|2012_01_01 |00:08:29 |     34.1|             30.5|      26.5|              73.6|    349.0|        12|        8.0|
|2012_01_01 |00:14:45 |     33.9|             30.6|      26.8|              75.0|    217.8|        12|        9.2|
|2012_01_01 |00:21:00 |     33.8|             30.6|      27.3|              76.6|    280.8|        17|       14.0|
|2012_01_01 |00:27:16 |     33.9|             30.6|      27.4|              77.0|     80.6|        17|        9.2|
|2012_01_01 |00:33:31 |     33.8|             30.6|      27.0|              76.0|     11.0|        17|       12.2|

Modeling Barometric Pressure between two dates

First, we set the start and end dates we want to analyze and build the date interval with lubridate.

startDate <- '2013-01-02 12:00:00'
endDate <- '2013-01-04 12:00:00'
dateInterval <- interval(ymd_hms(startDate), ymd_hms(endDate))

Then we create a new column that combines the date and the time, and keep only the rows that fall within the interval.

barometric_data <- weather_data %>%
  mutate(big_date = ymd_hms(paste(date, time))) %>%
  filter(big_date %within% dateInterval) %>%
  select(Barometric_Press, big_date)
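
The same combine-then-filter logic can be checked on a few made-up rows. The sketch below uses base R rather than lubridate so that it runs without extra packages; the toy values, the names `toy_data`, `window_start`, `window_end` and `toy_filtered`, and the choice of UTC are assumptions for illustration. The comparison includes both endpoints, which mirrors how `%within%` treats an interval.

```r
# Toy rows in the same date/time layout as weather_data (values are made up)
toy_data <- data.frame(
  date = c("2013_01_02", "2013_01_02", "2013_01_03", "2013_01_04"),
  time = c("11:59:00", "12:00:00", "08:30:00", "12:00:01"),
  Barometric_Press = c(28.6, 28.6, 28.4, 28.3),
  stringsAsFactors = FALSE
)

# Combine date and time into one timestamp (UTC assumed, the default of ymd_hms())
toy_data$big_date <- as.POSIXct(paste(toy_data$date, toy_data$time),
                                format = "%Y_%m_%d %H:%M:%S", tz = "UTC")

window_start <- as.POSIXct("2013-01-02 12:00:00", tz = "UTC")
window_end   <- as.POSIXct("2013-01-04 12:00:00", tz = "UTC")

# Keep rows inside the window, endpoints included
toy_filtered <- toy_data[toy_data$big_date >= window_start &
                           toy_data$big_date <= window_end,
                         c("Barometric_Press", "big_date")]
print(toy_filtered)
```

Only the second and third rows survive: the first is one minute before the window opens and the last is one second after it closes.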

barometric_data %>%
  head() %>%
  knitr::kable(format = 'html', align = 'c') %>%
  kableExtra::kable_styling()
| Barometric_Press |      big_date       |
|:----------------:|:-------------------:|
|       28.4       | 2013-01-02 14:36:04 |
|       28.4       | 2013-01-02 14:42:12 |
|       28.4       | 2013-01-02 14:48:21 |
|       28.4       | 2013-01-02 14:54:29 |
|       28.4       | 2013-01-02 15:00:37 |
|       28.4       | 2013-01-02 15:50:45 |

Get the linear relationship in the data

barometric_data %>%
  lm(Barometric_Press ~ big_date, data = .)
## 
## Call:
## lm(formula = Barometric_Press ~ big_date, data = .)
## 
## Coefficients:
## (Intercept)     big_date  
##   2.641e+03   -1.925e-06
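
The slope looks tiny because `lm` treats a date-time predictor as a number of seconds, so the coefficient is a change in pressure per second. A minimal sketch on made-up data shows how to convert it to a per-day rate; the hourly readings, the assumed drop of 0.1663 pressure units per day, and the names `toy_times`, `toy_pressure` and `toy_fit` are illustrative and not taken from the dataset.

```r
# Made-up hourly readings over the same two-day window (illustrative only)
toy_times <- seq(as.POSIXct("2013-01-02 12:00:00", tz = "UTC"),
                 as.POSIXct("2013-01-04 12:00:00", tz = "UTC"),
                 by = "hour")
days_elapsed <- as.numeric(difftime(toy_times, toy_times[1], units = "days"))
toy_pressure <- data.frame(
  big_date = toy_times,
  Barometric_Press = 28.6 - 0.1663 * days_elapsed  # assumed linear drop
)

toy_fit <- lm(Barometric_Press ~ big_date, data = toy_pressure)

# The coefficient is per second; multiply by 86400 seconds for a daily rate
slope_per_second <- coef(toy_fit)[["big_date"]]
slope_per_day <- slope_per_second * 86400
print(slope_per_day)
```

Applied to the coefficient reported above, -1.925e-06 * 86400 is roughly -0.166 pressure units per day.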

Graphing results

barometric_data %>%
  ggplot(aes(big_date, Barometric_Press)) +
  geom_point() +
  geom_smooth(method = 'lm') +
  ggtitle(paste('Barometric Pressure from', startDate, 'to', endDate)) +
  ylab('Barometric Pressure') +
  xlab('Date and Time') +
  theme_minimal()

Because the fitted slope over these two days is negative, the barometric pressure is falling, which generally suggests that the weather will tend to be rainy.