This is a simple data visualization exercise from the UIUC Data Visualization course on Coursera.

The data used in this assignment contains world temperatures and comes from NASA (Table Data: Global and Hemispheric Monthly Means and Zonal Annual Means).

In this assignment I’ll use the data visualization skills learned in the course to explore the data set. The visualization tool I’m using is R, with the ggplot2 package.

World Temperature Data

Read in the data set and take a look at the head as shown below:

  Year Glob NHem SHem X24N.90N X24S.24N X90S.24S X64N.90N X44N.64N
1 1880  -19  -33   -5      -38      -16       -5      -89      -54
2 1881  -10  -18   -2      -27       -2       -5      -54      -40
3 1882   -9  -17   -1      -21      -10        4     -125      -20
4 1883  -19  -30   -8      -34      -22       -2      -28      -57
5 1884  -27  -42  -12      -56      -17      -11     -127      -58
6 1885  -31  -41  -21      -61      -17      -20     -119      -70
  X24N.44N EQU.24N X24S.EQU X44S.24S X64S.44S X90S.64S
1      -22     -26       -5       -2       -8       39
2      -14      -5        2       -6       -3       37
3       -3     -12       -8        3        8       42
4      -20     -25      -19       -1        0       37
5      -41     -21      -14      -15       -5       40
6      -43     -11      -23      -27       -7       38
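
A minimal sketch of the read-in step is below. To keep it self-contained, it reads a few of the rows and columns shown above from an inline string rather than from a file. The file name in the comment is hypothetical, and the raw header labels such as 24N-90N are an assumption about how the downloaded table names its columns.

```r
# Inline stand-in for the downloaded table: some rows and columns from the output above.
# Assumption: the raw header labels the zonal columns like "24N-90N".
csv_text <- "Year,Glob,NHem,SHem,24N-90N,24S-24N,90S-24S
1880,-19,-33,-5,-38,-16,-5
1881,-10,-18,-2,-27,-2,-5
1882,-9,-17,-1,-21,-10,4
1883,-19,-30,-8,-34,-22,-2
1884,-27,-42,-12,-56,-17,-11
1885,-31,-41,-21,-61,-17,-20"

# With a real file this would be read.csv("temperatures.csv") (hypothetical name).
temps <- read.csv(text = csv_text)

# read.csv makes column names syntactic by default, so "24N-90N" becomes "X24N.90N".
head(temps)
```

If the raw header really does look like that, it would explain the X prefix and the dots in the column names printed above.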

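The data set contains the time variable Year, the global average temperature Glob, the northern hemisphere average NHem and the southern hemisphere average SHem. Judging by their names and the "Zonal Annual Means" part of the table title, the remaining columns are averages for latitude bands (for example, X24N.90N would cover 24°N to 90°N).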

Visualization

First, I would like to take a look at the distributions of Glob, NHem and SHem with histograms.
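
A histogram like this can be sketched with ggplot2 roughly as follows. This is not necessarily the exact code behind the plots: the bin width and axis labels are assumptions, and the data frame holds only the six rows printed above so that the example runs on its own.

```r
library(ggplot2)

# Six-row sample copied from the head() output above; the full table has many more years.
temps <- data.frame(
  Year = 1880:1885,
  Glob = c(-19, -10, -9, -19, -27, -31),
  NHem = c(-33, -18, -17, -30, -42, -41),
  SHem = c(-5, -2, -1, -8, -12, -21)
)

# Histogram of the global values; binwidth = 10 is an assumed choice.
p_hist <- ggplot(temps, aes(x = Glob)) +
  geom_histogram(binwidth = 10, colour = "white") +
  labs(x = "Glob (units as given in the table)", y = "Number of years")

# Draw the plot; swap Glob for NHem or SHem to get the other two histograms.
print(p_hist)
```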

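The first plot shows that the global average temperature has a bimodal distribution. The values range roughly from -50 to 80. Note that the NASA table reports temperature anomalies (deviations from a baseline average) in units of 0.01 °C rather than absolute temperatures in °F, so this range corresponds to about -0.5 °C to 0.8 °C. The most common value from 1880 to 2014 is somewhere between -20 and -10, and there is another peak between 60 and 65, although the second peak is far lower than the first.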

The next two plots show the distributions for the northern and southern hemispheres respectively. As we can see, both are nearly bimodal. However, the range for the northern hemisphere (-55 to 95, in the table's units of 0.01 °C anomaly) is much larger than for the southern hemisphere (-50 to 60), especially at the high end. Since these values are deviations from a baseline rather than absolute temperatures, this means the northern hemisphere has departed further from its baseline, particularly on the warm side, than the southern hemisphere.
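
The ranges read off the histograms can also be checked numerically. The sketch below uses only the six rows printed earlier, so its numbers will not match the full-data ranges quoted above. The division by 100 assumes the table's units are hundredths of a degree Celsius.

```r
# Six-row sample copied from the head() output above.
temps <- data.frame(
  Year = 1880:1885,
  Glob = c(-19, -10, -9, -19, -27, -31),
  NHem = c(-33, -18, -17, -30, -42, -41),
  SHem = c(-5, -2, -1, -8, -12, -21)
)
cols <- c("Glob", "NHem", "SHem")

# Row 1 holds each column's minimum, row 2 its maximum.
ranges <- sapply(temps[cols], range)

# Width of each range (maximum minus minimum).
spread <- ranges[2, ] - ranges[1, ]

# Assumption: values are in units of 0.01 degrees Celsius.
ranges_celsius <- ranges / 100

print(ranges)
print(spread)
```

Even in this small sample the northern hemisphere has a wider spread than the southern one.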

Next, I want to explore how temperature changes over time.

The plot indicates that the overall trend of the global temperature is upward, although there is a slight decrease during the late 19th century. Since then, the global average temperature appears to have risen fairly steadily.
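
A time-series plot with a trend line can be sketched as below. The straight-line fit is an assumption chosen for simplicity (the original plot may well have used a different smoother), and the data are again just the six rows printed earlier.

```r
library(ggplot2)

# Six-row sample copied from the head() output above (1880 to 1885 only).
temps <- data.frame(
  Year = 1880:1885,
  Glob = c(-19, -10, -9, -19, -27, -31)
)

# Yearly values as a line, plus a linear trend (assumed smoother).
p_trend <- ggplot(temps, aes(x = Year, y = Glob)) +
  geom_line() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(x = "Year", y = "Glob (units as given in the table)")

# Slope of the same linear fit, in table units per year.
slope <- coef(lm(Glob ~ Year, data = temps))[["Year"]]

print(p_trend)
print(slope)
```

On these six early years the slope comes out negative, which agrees with the slight decrease in the late 19th century noted above.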

The average temperatures of the northern and southern hemispheres follow essentially the same trend as the global average. The southern hemisphere has a much smoother trend line than the northern hemisphere.
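
To compare the two hemispheres in a single plot, one option is to stack the columns into a long format and map the hemisphere to colour. This is a sketch under the same assumptions as before: six sample rows, a linear trend as a stand-in smoother, and the column names Region and Value, which are made up for illustration.

```r
library(ggplot2)

# Six-row sample copied from the head() output above.
temps <- data.frame(
  Year = 1880:1885,
  NHem = c(-33, -18, -17, -30, -42, -41),
  SHem = c(-5, -2, -1, -8, -12, -21)
)

# Long format: one row per year and hemisphere (Region and Value are illustrative names).
long <- data.frame(
  Year = rep(temps$Year, times = 2),
  Region = rep(c("NHem", "SHem"), each = nrow(temps)),
  Value = c(temps$NHem, temps$SHem)
)

# One line and one dashed linear trend per hemisphere.
p_hemi <- ggplot(long, aes(x = Year, y = Value, colour = Region)) +
  geom_line() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, linetype = "dashed") +
  labs(x = "Year", y = "Value (units as given in the table)")

print(p_hemi)
```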

From the visualizations above, we get a general idea of how the average temperature is distributed and how it changes over time. As a next step, with more related data, we could dig deeper into the data set and try to find out why the average temperature has been increasing in recent decades (is global warming a truth or a lie?).