Synopsis

The purpose of this markdown file is to explore the average daily temperature recorded in Cincinnati, Ohio. The data cover 1995-01-01 through 2016-10-19.

Dataset Source: http://academic.udayton.edu/kissock/http/Weather/gsod95-current/OHCINCIN.txt

Packages Required

library(tidyverse)  # Loads the core tidyverse packages, including ggplot2 and dplyr

Source Code

VARIABLES:

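1. MONTH: The first column contains the month, from 1 to 12.

2. DAY: The second column contains the day of the month.

3. YEAR: The third column contains the year, from 1995 to the most recent year in the file (2016 in this extract).

4. TEMP: The fourth column contains the average temperature for the day, in degrees Fahrenheit.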

For more information about the dataset, see the dataset source linked above.

Loading the dataset from the source

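# URL of the raw text file; it has no header row
wtr <- "http://academic.udayton.edu/kissock/http/Weather/gsod95-current/OHCINCIN.txt"
Ohio <- read.table(wtr, header = FALSE)
# Name the four whitespace-separated columns
names(Ohio) <- c("MONTH", "DAY", "YEAR", "TEMP")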
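
Because the date is split across three integer columns, it can be convenient to combine them into a single Date column for sorting or time-series plots. The following is a minimal sketch in base R, assuming a small hand-made data frame named toy (hypothetical, not read from the source file) with the same column names as above; the DATE column is likewise an addition for illustration.

```r
# Hypothetical three-row stand-in with the same columns as the loaded data
toy <- data.frame(
  MONTH = c(1L, 1L, 10L),
  DAY   = c(1L, 2L, 19L),
  YEAR  = c(1995L, 1995L, 2016L),
  TEMP  = c(41.1, 22.2, 60.0)  # illustrative values only
)

# ISOdate() builds a date-time from year, month, day; as.Date() keeps the date part
toy$DATE <- as.Date(ISOdate(toy$YEAR, toy$MONTH, toy$DAY))

print(toy)
```

The same line should work on the full data frame by replacing toy with its name, provided every MONTH, DAY, YEAR combination is a valid calendar date.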

Data Description

No. of rows: 7963

No. of variables: 4

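Missing values: Missing values are coded as -99 in the original data. They are recoded to NA, and the rows that contain them are dropped to create Ohio_new.

Ohio[Ohio == -99] <- NA       # recode the -99 sentinel as NA
sum(is.na(Ohio))              # count the missing values
## [1] 14
Ohio_new <- na.omit(Ohio)     # drop every row that contains an NA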
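
The effect of the recoding step is easier to see on a few rows. The following is a minimal sketch, assuming a hypothetical five-row data frame named toy in which two TEMP values have been set to -99 by hand; it restricts the recoding to the TEMP column, which is an assumption that the sentinel only ever appears there.

```r
# Hypothetical five-row example; the two -99 entries are placed by hand
toy <- data.frame(
  MONTH = rep(1L, 5),
  DAY   = 1:5,
  YEAR  = rep(1995L, 5),
  TEMP  = c(41.1, -99, 22.8, -99, 9.5)
)

# Recode the sentinel in the TEMP column only, using a numeric comparison
toy$TEMP[toy$TEMP == -99] <- NA

n_missing <- sum(is.na(toy$TEMP))  # number of missing temperatures
toy_clean <- na.omit(toy)          # drops each row that contains an NA

print(n_missing)
print(toy_clean)
```

Note that na.omit() removes whole rows, so the dates of the missing readings disappear from the cleaned data as well.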

Structure of Data

str(Ohio)
## 'data.frame':    7963 obs. of  4 variables:
##  $ MONTH: int  1 1 1 1 1 1 1 1 1 1 ...
##  $ DAY  : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ YEAR : int  1995 1995 1995 1995 1995 1995 1995 1995 1995 1995 ...
##  $ TEMP : num  41.1 22.2 22.8 14.9 9.5 23.8 31.1 26.9 31.3 31.5 ...

Summary of Original Data

summary(Ohio)
##      MONTH             DAY             YEAR           TEMP      
##  Min.   : 1.000   Min.   : 1.00   Min.   :1995   Min.   :-2.20  
##  1st Qu.: 4.000   1st Qu.: 8.00   1st Qu.:2000   1st Qu.:40.20  
##  Median : 6.000   Median :16.00   Median :2005   Median :57.10  
##  Mean   : 6.479   Mean   :15.72   Mean   :2005   Mean   :54.73  
##  3rd Qu.: 9.000   3rd Qu.:23.00   3rd Qu.:2011   3rd Qu.:70.70  
##  Max.   :12.000   Max.   :31.00   Max.   :2016   Max.   :89.20  
##                                                  NA's   :14

Summary of New Data

summary(Ohio_new)
##      MONTH             DAY             YEAR           TEMP      
##  Min.   : 1.000   Min.   : 1.00   Min.   :1995   Min.   :-2.20  
##  1st Qu.: 4.000   1st Qu.: 8.00   1st Qu.:2000   1st Qu.:40.20  
##  Median : 6.000   Median :16.00   Median :2005   Median :57.10  
##  Mean   : 6.477   Mean   :15.71   Mean   :2005   Mean   :54.73  
##  3rd Qu.: 9.000   3rd Qu.:23.00   3rd Qu.:2011   3rd Qu.:70.70  
##  Max.   :12.000   Max.   :31.00   Max.   :2016   Max.   :89.20
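
summary() describes the series as a whole. A per-month breakdown can be obtained with aggregate() from base R. The following is a minimal sketch on a hypothetical four-row data frame named toy with made-up temperatures; the result name monthly is also an assumption for illustration.

```r
# Hypothetical data: two days in January and two in February, made-up temperatures
toy <- data.frame(
  MONTH = c(1L, 1L, 2L, 2L),
  DAY   = c(1L, 2L, 1L, 2L),
  YEAR  = rep(1995L, 4),
  TEMP  = c(30, 40, 50, 60)
)

# Mean TEMP for each MONTH; the formula interface skips rows with NA by default
monthly <- aggregate(TEMP ~ MONTH, data = toy, FUN = mean)

print(monthly)
```

Swapping the formula to TEMP ~ YEAR would give yearly means in the same way.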

Data Visualization

VIZ 1: GGPLOT This plot shows the temperature by month. The color follows a sequential color scheme that encodes the year.

ggplot(data=Ohio_new)+
  geom_point(mapping= aes(x=MONTH, y= TEMP, color=YEAR))

VIZ 2: FACET WRAP This draws a separate panel for each month. Each panel shows that month's temperatures across all years.

ggplot(data = Ohio_new) + 
  geom_point(mapping = aes(x =YEAR, y = TEMP)) + 
  facet_wrap(~ MONTH, nrow = 4)

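VIZ 3: BAR-GRAPH This is a traditional bar graph. It counts the days recorded at each temperature value, with the bars filled by month.

ggplot(data = Ohio_new) + 
  # MONTH is converted to a factor so that fill and dodge are applied per month
  geom_bar(mapping = aes(x = TEMP, fill = factor(MONTH)), position = "dodge")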