Synopsis

This document analyzes the daily temperatures for every month of the years 1995 through 2016.
1. Temperatures in January and February vary widely from year to year.
2. The years 1996 and 2014 saw the most extreme low temperatures, while 1997 and 2012 saw the most extreme highs.

Packages Required & Data Collection

library(ggplot2)
library(knitr)
# Read the daily temperature file and name its four columns
Weather <- read.table("http://academic.udayton.edu/kissock/http/Weather/gsod95-current/OHCINCIN.txt",
                      col.names = c("Month", "Day", "Year", "Temperature"))

Source Code

The data contain the daily temperature for Cincinnati, Ohio, from 1995 to 2016. The text file is downloaded from the URL passed to read.table above. The columns are Month, Day, Year, and Temperature (in degrees Fahrenheit).
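Because the date is split across three columns, it can help to combine them into a single Date column for sorting or time-series plots. The following is a minimal sketch, not part of the original analysis; the `Sample` data frame and its values are hypothetical stand-ins for `Weather`.

```r
# Hypothetical three-row sample with the same columns as Weather
Sample <- data.frame(Month = c(1, 1, 2),
                     Day = c(1, 2, 29),
                     Year = c(1995, 1995, 1996),
                     Temperature = c(41.1, 22.2, 30.0))

# ISOdate() builds a date-time from year, month, and day;
# as.Date() keeps only the calendar date
Sample$Date <- as.Date(ISOdate(Sample$Year, Sample$Month, Sample$Day))

# Invalid combinations (for example February 30) come back as NA
print(Sample)
```

The same two lines should apply to `Weather` unchanged, since only the Month, Day, and Year columns are used.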

Data Description

colnames(Weather)
## [1] "Month"       "Day"         "Year"        "Temperature"
nrow(Weather)
## [1] 7963
str(Weather)
## 'data.frame':    7963 obs. of  4 variables:
##  $ Month      : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ Day        : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Year       : int  1995 1995 1995 1995 1995 1995 1995 1995 1995 1995 ...
##  $ Temperature: num  41.1 22.2 22.8 14.9 9.5 23.8 31.1 26.9 31.3 31.5 ...
# -99 marks a missing reading; recode it as NA
Weather$Temperature[Weather$Temperature == -99] <- NA
sum(is.na(Weather$Temperature))
## [1] 14
summary(Weather)
##      Month             Day             Year       Temperature   
##  Min.   : 1.000   Min.   : 1.00   Min.   :1995   Min.   :-2.20  
##  1st Qu.: 4.000   1st Qu.: 8.00   1st Qu.:2000   1st Qu.:40.20  
##  Median : 6.000   Median :16.00   Median :2005   Median :57.10  
##  Mean   : 6.479   Mean   :15.72   Mean   :2005   Mean   :54.73  
##  3rd Qu.: 9.000   3rd Qu.:23.00   3rd Qu.:2011   3rd Qu.:70.70  
##  Max.   :12.000   Max.   :31.00   Max.   :2016   Max.   :89.20  
##                                                  NA's   :14
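To see where the missing readings fall, the NA values can be counted per year. This is a small sketch under the assumption that -99 has already been recoded to NA; `Sample` and `MissingByYear` are hypothetical names and the values are made up for illustration.

```r
# Hypothetical sample; -99 marks a missing reading, as in the source file
Sample <- data.frame(Year = c(1995, 1995, 1996, 1996, 1996),
                     Temperature = c(41.1, -99, 30.0, -99, -99))
Sample$Temperature[Sample$Temperature == -99] <- NA

# Count the missing readings in each year
MissingByYear <- tapply(is.na(Sample$Temperature), Sample$Year, sum)
print(MissingByYear)
```

Replacing `Sample` with `Weather` would give the per-year breakdown of the 14 missing values reported above.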

Data Visualization

Monthly Analysis

ggplot(data = Weather) +
  geom_point(mapping = aes(x = Day, y = Temperature )) +
  facet_grid(. ~ Month) + 
  ylab("Temperature in Fahrenheit")  +
  labs(title = "Monthly trend and Variation over years")
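The spread visible in each panel can also be put into numbers, for example as the standard deviation of temperature within each month. The sketch below is one possible way to do that and is not part of the original analysis; `Sample`, `SpreadByMonth`, and the values are hypothetical.

```r
# Hypothetical sample: January readings spread widely, July readings do not
Sample <- data.frame(Month = c(1, 1, 1, 7, 7, 7),
                     Temperature = c(10, 30, 50, 74, 75, 76))

# Standard deviation of temperature within each month;
# the formula interface of aggregate() drops rows with NA
SpreadByMonth <- aggregate(Temperature ~ Month, data = Sample, FUN = sd)
names(SpreadByMonth)[2] <- "SD"

# Months with the largest spread first
print(SpreadByMonth[order(-SpreadByMonth$SD), ])
```

Run against `Weather`, a table like this would show whether January and February really have the largest spread.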

Yearly Analysis

boxplot(Temperature~Year,
        data=Weather,
        main="Temperature Yearly Plot", 
        xlab="Years", ylab="Temperature in Fahrenheit")
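The extreme years suggested by the boxplot can be checked by computing the lowest and highest daily temperature in each year. This is a minimal sketch with hypothetical data; `Sample`, `YearlyRange`, `ColdestYear`, and `HottestYear` are illustrative names, not part of the original analysis.

```r
# Hypothetical sample with made-up temperatures
Sample <- data.frame(Year = c(1996, 1996, 1997, 1997),
                     Temperature = c(5, 70, 15, 95))

# Lowest and highest daily temperature recorded in each year
YearlyMin <- tapply(Sample$Temperature, Sample$Year, min, na.rm = TRUE)
YearlyMax <- tapply(Sample$Temperature, Sample$Year, max, na.rm = TRUE)
YearlyRange <- data.frame(Year = as.integer(names(YearlyMin)),
                          Min = as.vector(YearlyMin),
                          Max = as.vector(YearlyMax))

# Years holding the coldest and the hottest daily reading
ColdestYear <- YearlyRange$Year[which.min(YearlyRange$Min)]
HottestYear <- YearlyRange$Year[which.max(YearlyRange$Max)]
print(YearlyRange)
```

Sorting `YearlyRange` by `Min` or `Max` would list the runner-up years as well, which `which.min` and `which.max` alone do not show.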

Averages

# Mean temperature for each Year/Month pair; the formula interface drops NA rows
MeanMonthlyTemperature <- aggregate(Temperature ~ Year + Month,
                                    data = Weather,
                                    FUN = mean)
Years <- sort(unique(MeanMonthlyTemperature$Year))
colors <- rainbow(length(Years))
# Empty plot sized to the full data range so that no year's line is clipped
plot(NA,
     xlim = c(1, 12),
     ylim = range(MeanMonthlyTemperature$Temperature),
     xlab = "Month",
     ylab = "Average Temperature in Fahrenheit",
     main = "Average monthly temperature by year")
# One line per year, plotted against the month number
for (i in seq_along(Years)) {
  OneYear <- MeanMonthlyTemperature[MeanMonthlyTemperature$Year == Years[i], ]
  lines(OneYear$Month, OneYear$Temperature, col = colors[i])
}

#legend("bottom", legend = Years, col = colors, lty = 1, ncol = 6, cex = 0.6)