Synopsis

This R Markdown file is part of the week 3 exercise. It uses R code chunks to load the data from a URL into a data frame. The columns are identified by checking their unique values. nrow gives the number of records in the whole data set, and summary reports the mean, median, quartiles, and range of each column. Three visualizations follow: a histogram of temperature, a bar plot of month versus temperature, and a plot of year against temperature.

Packages

library(ggplot2)
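The ggplot2 library is loaded for plotting, although the graphs below are drawn with the base R functions hist, barplot, and plot.
??ggplot2 searches the help system for more details about ggplot2.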

Source Code

  urlink <- "http://academic.udayton.edu/kissock/http/Weather/gsod95-current/OHCINCIN.txt"
  datacincin <- read.table(urlink)
  head(datacincin)
##   V1 V2   V3   V4
## 1  1  1 1995 41.1
## 2  1  2 1995 22.2
## 3  1  3 1995 22.8
## 4  1  4 1995 14.9
## 5  1  5 1995  9.5
## 6  1  6 1995 23.8
colnames(datacincin)<-c("Month","Day","Year","Temperature")
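
As a minimal sketch of the same loading step, the column names can also be supplied while reading, so the default V1 to V4 headers never appear. The `sample_text` string below is a hypothetical stand-in for the remote file, holding only the six rows shown in the head() output above, and `sample_df` is an illustrative name rather than part of the original code.

```r
# Hypothetical stand-in for the remote file: the six rows printed by head() above.
sample_text <- "
1 1 1995 41.1
1 2 1995 22.2
1 3 1995 22.8
1 4 1995 14.9
1 5 1995 9.5
1 6 1995 23.8
"

# col.names assigns the headers during the read instead of afterwards.
sample_df <- read.table(text = sample_text,
                        col.names = c("Month", "Day", "Year", "Temperature"))

# Inspect the column names and types.
str(sample_df)
```

With the real file, the `text` argument would presumably be replaced by the URL stored in urlink.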

The column names are assigned, and the first few rows of the data set datacincin are displayed again. The unique values of each column are then listed.

  head(datacincin)
##   Month Day Year Temperature
## 1     1   1 1995        41.1
## 2     1   2 1995        22.2
## 3     1   3 1995        22.8
## 4     1   4 1995        14.9
## 5     1   5 1995         9.5
## 6     1   6 1995        23.8
  unique(datacincin$Month)
##  [1]  1  2  3  4  5  6  7  8  9 10 11 12
  unique(datacincin$Day)
##  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
## [24] 24 25 26 27 28 29 30 31
  unique(datacincin$Year)
##  [1] 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008
## [15] 2009 2010 2011 2012 2013 2014 2015 2016
  unique(datacincin$Temperature)
##   [1]  41.1  22.2  22.8  14.9   9.5  23.8  31.1  26.9  31.3  31.5  44.4
##  [12]  58.0  60.2  45.6  33.7  35.5  41.4  46.0  34.6  25.1  24.0  23.0
##  [23]  21.4  20.7  22.1  26.5  22.9  23.2  20.4  35.0  37.2  30.8  26.4
##  [34]  17.6   9.0  13.3  13.8  15.3  31.7   6.3  25.9  38.9  35.8  33.1
##  [45]  37.6  44.7  42.3  36.5  34.2  50.6  37.5  52.8  54.1  43.6  33.0
##  [56]  24.5  27.9  33.3  47.7  59.1  26.6  30.4  48.7  57.7  61.4  64.0
##  [67]  60.3  58.5  49.4  55.0  58.4  50.4  51.6  42.1  45.2  49.1  40.2
##  [78]  43.4  38.4  39.9  44.0  49.3  50.8  57.5  63.8  59.5  57.3  57.4
##  [89]  46.9  48.3  50.0  61.2  66.5  69.6  65.4  52.5  51.9  45.8  49.6
## [100]  52.6  48.4  54.8  55.7  46.1  51.0  54.2  55.4  54.3  56.8  61.1
## [111]  66.2  67.1  58.2  70.4  66.9  61.8  66.7  66.4  58.1  62.4  64.2
## [122]  63.2  67.0  74.7  67.5  63.0  65.1  70.8  66.0  61.0  65.8  67.6
## [133]  69.0  69.7  68.1  70.5  73.7  75.6  75.9  76.1  69.9  63.1  62.7
## [144]  66.1  71.3  71.6  73.2  77.6  74.5  73.3  72.1  71.8  74.0  75.7
## [155]  72.6  71.9  73.9  64.5  74.3  72.9  72.3  68.9  75.0  76.5  79.7
## [166]  82.9  84.6  82.4  77.3  80.9  76.4  75.4  74.1  72.2  77.8  77.5
## [177]  81.7  80.3  82.7  79.3  79.1  83.3  81.6  73.5  75.5  76.6  76.8
## [188]  80.1  83.5  81.5  82.1  82.3  81.2  76.9  78.9  70.6  76.2  77.0
## [199]  74.9  75.3  69.5  73.6  74.6  76.0  72.0  73.8  71.4  65.3  58.3
## [210]  62.3  47.9  51.7  53.6  55.2  61.5  64.1  66.3  63.7  64.7  59.7
## [221]  61.3  55.9  62.8  64.9  59.3  48.0  44.1  60.7  55.3  41.7  59.6
## [232]  55.8  42.2  47.8  49.5  47.6  46.8  58.9  64.8  41.3  29.1  33.5
## [243]  47.4  35.1  26.3  23.7  35.2  33.2  39.1  43.5  38.6  30.5  39.4
## [254]  30.1  30.9  45.4  39.2  49.7  43.3  39.8  29.7  25.4  19.0   8.1
## [265]  18.0  25.7  29.8  43.9  33.6  35.7  21.3  25.6  19.3  21.8  21.0
## [276]  29.0  39.3  40.6  19.2  21.9  17.3  16.8  13.7  22.5  27.3  28.7
## [287]  36.4  52.0  56.0  13.0  25.2  32.6  21.5  23.1  36.8  24.7  14.0
## [298]   9.9   9.6   2.5  -2.2   5.4  18.7  32.2  43.7  41.0  44.6  26.8
## [309]  37.1  30.6  22.7  23.3  46.4  49.2  44.8  50.7  61.6  20.2  23.5
## [320]  33.4  21.1  13.9  14.3  56.4  56.9  42.6  32.0  27.0  27.6  49.8
## [331]  54.9  29.5  38.8  48.9  40.5  53.9  34.3  38.1  31.8  36.1  38.3
## [342]  62.2  40.8  60.0  66.8  65.5  65.9  44.3  54.6  44.9  52.2  56.7
## [353]  51.8  56.1  64.3  70.0  57.6  46.2  45.5  51.1  67.8  78.0  63.9
## [364]  60.6  67.2  67.4  63.5  65.6  70.2  76.7  72.4  68.6  75.2  80.8
## [375]  81.4  69.8  70.9  77.1  78.4  80.6  70.3  68.3  70.7  69.2  78.1
## [386]  79.9  80.2  71.5  73.0  78.7  79.5  79.2  79.0  75.8  73.4  60.9
## [397]  60.4  62.0  57.1  56.6  69.1  58.6  52.4  48.5  47.0  51.4  59.8
## [408]  63.6  62.1  53.3  65.2  40.0  32.8  39.7  43.1  29.4  37.4  31.6
## [419]  46.6  44.5  28.8  35.3  40.1  59.4  52.7  15.7  12.1  21.2  28.9
## [430]  38.0  59.0  60.8  22.3  28.6   3.2   5.1   9.3   5.6   7.8  31.4
## [441]  18.4  26.2  17.8  27.1  42.0  41.5  34.9  27.4  30.0  29.6  49.9
## [452]  54.7  40.3  42.7  36.3  52.3  47.5  43.8  46.5  39.5  53.2  51.2
## [463]  51.3  39.0  54.0  29.2  34.1  50.9  48.2  52.1  48.8  57.9  56.5
## [474]  58.7  53.8  53.1  54.4  67.9  57.8  64.6  63.4  68.7  72.7  80.0
## [485]  73.1  77.9  69.4  65.0  74.4  78.6  85.5  72.5  78.8  68.8  68.4
## [496]  62.6  62.5  71.0  71.2  71.1  69.3  67.3  63.3  74.8  59.9  57.2
## [507]  60.1  52.9  47.3  46.7  38.7  45.1  36.7  45.3  45.0  42.8  35.4
## [518]  25.8  57.0  41.9  24.2  31.2  36.2  31.9  42.9  42.5  39.6  34.7
## [529]  28.5  21.6  37.7  28.4  37.0  37.3  32.4  32.7  34.0  34.8  35.9
## [540]  40.7  40.4  42.4  43.0  47.1  34.5  19.8  18.1  55.1  48.6  35.6
## [551]  74.2  50.3  44.2  59.2  53.5  65.7  64.4  71.7  80.4  83.0  82.8
## [562]  81.8  77.4  76.3  72.8  77.2  70.1  54.5  53.4  56.3  50.1  53.7
## [573]  45.7  33.9  46.3  49.0  36.0  24.3  14.2 -99.0  13.4  24.4  11.5
## [584]   4.0  19.5  16.7  19.9  50.2  38.5  37.8  27.5  36.9  32.9  41.2
## [595]  30.3  41.6  55.6  79.8  78.2  67.7  80.5  82.5  78.5  77.7  84.0
## [606]  83.9  87.7  87.0  61.9  66.6  53.0  45.9  43.2  29.9  37.9  20.0
## [617]  12.8  28.2  24.1  19.6  40.9  30.7  25.5   7.1  26.7  14.7   8.2
## [628]   8.9  10.7  22.4  23.6  32.1  34.4  36.6  61.7  48.1  50.5  60.5
## [639]  62.9  68.0  68.2  75.1  58.8  20.5  24.8  26.1  12.4  13.2  22.6
## [650]   6.6  10.9   7.9  18.3  16.3  22.0  15.5  20.9  29.3  68.5  79.6
## [661]  80.7  47.2  55.5  25.3  17.1  19.1  18.8  15.0  23.9  56.2  78.3
## [672]  82.2  81.9  82.0  83.8  85.3  81.1  38.2  27.8  24.9  27.2  32.3
## [683]  18.9  16.1  17.9  15.8   5.3  33.8  27.7  14.5  23.4  51.5  30.2
## [694]  28.1  20.6  16.6  17.0  16.0  18.6  10.8   3.4  25.0  11.3  20.3
## [705]   8.0   8.3  11.9  11.2  11.8  32.5  28.0  26.0  83.2  81.0  79.4
## [716]  85.6  85.8  84.1  83.6  20.1  41.8  83.4  83.7  16.2  12.6  12.2
## [727]  10.0   9.1  84.2  86.2  84.8  85.9  84.3  87.3  86.4  21.7  15.2
## [738]  24.6  31.0  81.3   7.6  11.7   1.8  14.6  17.2  13.1  15.1  15.4
## [749]  16.9  19.4  83.1  86.1  85.0  17.4  13.5   7.4  85.1  82.6  86.5
## [760]  84.7  87.6  89.2  87.8  16.5  10.5  10.1  -1.6   3.9   3.6   2.0
## [771]   6.1  17.5  20.8  28.3   4.8  12.0  16.4   1.5  17.7  10.4  12.7

The unique values identify the columns. The first column takes values from 1 to 12, so it is the month of the year. The second column takes values from 1 to 31, so it is the day of the month. The third column is the year in which the temperature was recorded. The fourth column holds the temperature values.
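
The same identification can be sketched more compactly by looking at the number of distinct values and the range of each column, rather than printing every unique value. The data frame below is a small hypothetical sample with the same column layout, not the real data set.

```r
# Hypothetical sample with the same four columns as datacincin.
sample_df <- data.frame(
  Month       = c(1L, 1L, 6L, 12L),
  Day         = c(1L, 2L, 15L, 31L),
  Year        = c(1995L, 1995L, 2005L, 2016L),
  Temperature = c(41.1, 22.2, 75.6, 30.9)
)

# Number of distinct values in each column.
n_unique <- sapply(sample_df, function(col) length(unique(col)))

# Minimum (row 1) and maximum (row 2) of each column.
# A column spanning 1..12 suggests months; 1..31 suggests days of the month.
col_ranges <- sapply(sample_df, range)

print(n_unique)
print(col_ranges)
```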

The summary command summarizes the data set, giving the minimum, quartiles, median, mean, and maximum of each column. nrow gives the number of rows in the data.

  summary(datacincin)
##      Month             Day             Year       Temperature    
##  Min.   : 1.000   Min.   : 1.00   Min.   :1995   Min.   :-99.00  
##  1st Qu.: 4.000   1st Qu.: 8.00   1st Qu.:2000   1st Qu.: 40.10  
##  Median : 6.000   Median :16.00   Median :2005   Median : 57.00  
##  Mean   : 6.479   Mean   :15.72   Mean   :2005   Mean   : 54.46  
##  3rd Qu.: 9.000   3rd Qu.:23.00   3rd Qu.:2011   3rd Qu.: 70.70  
##  Max.   :12.000   Max.   :31.00   Max.   :2016   Max.   : 89.20
  nrow(datacincin)
## [1] 7963

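Next, the missing (NA) values, if any, in each column are counted. This check only finds values coded as NA; a placeholder such as the -99.0 minimum seen in the summary above would not be counted.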

  sum(is.na(datacincin$Month))
## [1] 0
  sum(is.na(datacincin$Day))
## [1] 0
  sum(is.na(datacincin$Year))
## [1] 0
  sum(is.na(datacincin$Temperature))
## [1] 0
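
The summary above reports a minimum temperature of -99.0, which looks like a placeholder for a missing reading rather than a real measurement. That interpretation is an assumption, not something the data file states here. Under that assumption, a minimal sketch of recoding the placeholder to NA on a hypothetical sample vector:

```r
# Hypothetical sample of temperatures containing the -99 placeholder.
temps <- c(41.1, 22.2, -99, 14.9, -99, 23.8)

# is.na() does not detect the placeholder, so this count is zero.
n_na_before <- sum(is.na(temps))

# Assumption: -99 marks a missing reading. Recode it to NA.
temps_clean <- replace(temps, temps == -99, NA)

# Count the recoded values and compute the mean without them.
n_missing  <- sum(is.na(temps_clean))
mean_clean <- mean(temps_clean, na.rm = TRUE)

print(n_na_before)
print(n_missing)
print(mean_clean)
```

If the assumption holds for the full data set, the same replace() call applied to datacincin$Temperature would keep the placeholder from pulling down the mean and stretching the histogram axis.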

The following visualizations are plotted to understand the data better.

  hist(datacincin$Temperature)

The above histogram shows the distribution of the temperature values and the range they cover.

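The following graph is a bar plot built from a table of month against temperature. The table counts the records for each combination of month and temperature value, and the bar plot draws one bar per temperature value with the counts for the months stacked on top of each other.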

  counts <- table(datacincin$Month,datacincin$Temperature)
  barplot(counts,col="darkgreen",border="red")
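
To compare temperatures between months more directly, one option is a box plot of temperature grouped by month, together with the monthly means. This is a sketch on a small hypothetical sample, not the plot produced by the code above.

```r
# Hypothetical sample: three readings each for month 1 and month 7.
sample_df <- data.frame(
  Month       = c(1, 1, 1, 7, 7, 7),
  Temperature = c(20.5, 31.0, 25.5, 78.0, 82.5, 75.0)
)

# Mean temperature per month.
monthly_mean <- tapply(sample_df$Temperature, sample_df$Month, mean)
print(monthly_mean)

# One box per month; boxplot() also returns the statistics it drew.
bp <- boxplot(Temperature ~ Month, data = sample_df,
              xlab = "Month", ylab = "Temperature", col = "darkgreen")
```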

The following graph shows a plot of the year against the temperature. The year is plotted on the x axis and the temperature on the y axis.

plot(datacincin$Year,datacincin$Temperature,col="darkgreen")
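
Because every year contributes several hundred points, the plot above is dense. A possible alternative, sketched here on a small hypothetical sample, is to average the temperature per year first and plot the yearly means.

```r
# Hypothetical sample: two readings each for 1995 and 1996.
sample_df <- data.frame(
  Year        = c(1995, 1995, 1996, 1996),
  Temperature = c(40, 60, 50, 54)
)

# Mean temperature for each year.
yearly <- aggregate(Temperature ~ Year, data = sample_df, FUN = mean)
print(yearly)

# Year on the x axis, mean temperature on the y axis.
plot(yearly$Year, yearly$Temperature, type = "b", col = "darkgreen",
     xlab = "Year", ylab = "Mean temperature")
```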