Warming Temperatures Pt.1

The scientific community is now nearly unanimous that global temperatures are rising and that the increase is likely the result of human activity. NASA has been kind enough to put together a great web resource collecting the literature on this, which anyone can review here: NASA

I didn't want to repeat research that has already been done, but there are a few things about global warming that I've always wanted to understand in more detail:

  • What does warming actually look like in a given city? How large is it, and can we use statistical techniques to measure how much the temperature at that location has changed?

  • On average, what does the warming look like across cities? Implicit in this question is another: are there any cities with no warming at all, or even significant declines in temperature?

There is a lot to cover here, so I'm going to address each of these one project at a time. To begin, I'm going to take a look at Ho Chi Minh City in Vietnam. The data for this analysis comes from our friends at Kaggle and Berkeley Earth: Data

Overview

There are two charts below. The first shows the average annual temperature in Ho Chi Minh City from 1862 to 2012. The second shows the average uncertainty in the measurement over time. The temperature chart has a strong upward trend, while the uncertainty of the measurement decreases over time.
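As a rough sketch of how monthly readings can be rolled up into the annual averages described above, the snippet below groups records by year and averages them. The function name `annual_means`, the (date, temperature) row layout, and the sample values are all assumptions made for illustration. They are not taken from the original analysis or the real dataset.

```python
from collections import defaultdict
from statistics import mean


def annual_means(rows):
    """Average monthly readings into one value per calendar year.

    rows: iterable of (date_string, temperature) pairs. The date string is
    assumed to start with a four-digit year (e.g. "1863-01-01"), and the
    temperature may be None for a missing month.
    """
    by_year = defaultdict(list)
    for date_string, temperature in rows:
        if temperature is None:  # skip months with no reading
            continue
        by_year[int(date_string[:4])].append(temperature)
    return {year: mean(values) for year, values in sorted(by_year.items())}


# Made-up monthly readings, for illustration only (not the real data).
sample_rows = [
    ("1863-01-01", 26.0), ("1863-02-01", 27.0), ("1863-03-01", None),
    ("2012-01-01", 27.5), ("2012-02-01", 28.5),
]
yearly = annual_means(sample_rows)
```

The same helper could be pointed at the uncertainty column instead of the temperature column to produce the second series.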


Splitting Time

From here, my next thought was: if we cut the temperature data in half by date and compare the two periods side by side, what do the mean and median temperatures of each half look like, and how much higher are they in the later period than in the earlier one?


The mean and median temperatures for the first and second half of the dataset are shown in Figure 1 below. As the charts above already suggested, the second-half median is higher than the first-half median, and the same holds for the mean. Next, let's test the difference between the first-half and second-half means to see whether it is statistically significant.

Figure 1: Median and Mean Temperature (degrees Celsius), 1H & 2H
          1863 - 1938   1939 - 2012
Mean      26.934        27.586
Median    27.124        27.815
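A minimal sketch of the split-and-compare step, assuming the records are (date, temperature) pairs with sortable ISO-style date strings. The helper names `split_halves` and `summarize` and the toy readings are hypothetical and are not the code or data behind Figure 1.

```python
from statistics import mean, median


def split_halves(records):
    """Sort (date, temperature) records by date and cut them at the midpoint."""
    ordered = sorted(records, key=lambda record: record[0])
    midpoint = len(ordered) // 2
    return ordered[:midpoint], ordered[midpoint:]


def summarize(records):
    """Return the mean and median temperature of a list of records."""
    temps = [temperature for _, temperature in records]
    return {"mean": mean(temps), "median": median(temps)}


# Made-up readings, for illustration only (not the real data).
records = [
    ("1863-01", 26.0), ("1863-02", 27.0), ("1863-03", 28.0), ("1863-04", 27.0),
    ("2012-01", 27.0), ("2012-02", 28.0), ("2012-03", 29.0), ("2012-04", 28.0),
]
first_half, second_half = split_halves(records)
h1, h2 = summarize(first_half), summarize(second_half)
mean_shift = h2["mean"] - h1["mean"]  # positive means the later half is warmer
```

Splitting by position after sorting gives two equal-sized halves, which matches the idea of cutting the series in half by date.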

Significance Testing

First, we're going to look at Q-Q plots and histograms to see whether our data are approximately normally distributed. The Q-Q plot and histogram below are for the H1 (first half) dataset. The data do not appear to be normally distributed. We see a fair amount of skew, which makes sense to me if I think of it this way: Vietnam is simply hot most of the time, so many more months and years fall on the warmer side.


Looking at the Q-Q plot and histogram for H2, the second half, I notice a few interesting things. The number of observations between 28 and 30 degrees Celsius is much higher than in the first half, which again fits our general theme of warming over time. However, the data are again not approximately normal and show a fair amount of skew towards higher temperatures.
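For readers who want to reproduce this kind of normality check numerically, here is a small sketch that computes a sample skewness and the point pairs behind a normal Q-Q plot. The function names `sample_skewness` and `qq_points` and the toy temperatures are assumptions for illustration. The original plots were not necessarily built this way.

```python
from statistics import NormalDist, mean, pstdev


def sample_skewness(values):
    """Moment-based skewness: roughly 0 for symmetric data."""
    center, spread = mean(values), pstdev(values)
    return sum(((v - center) / spread) ** 3 for v in values) / len(values)


def qq_points(values):
    """Pair each sorted observation with the matching normal quantile.

    The reference normal uses the sample's own mean and standard deviation,
    so points from normal-looking data fall close to the line y = x.
    """
    reference = NormalDist(mean(values), pstdev(values))
    n = len(values)
    ordered = sorted(values)
    return [(reference.inv_cdf((i + 0.5) / n), observed)
            for i, observed in enumerate(ordered)]


# Made-up, lopsided temperatures for illustration only (not the real data).
temps = [24.0, 26.5, 27.0, 27.5, 28.0, 28.0, 28.5, 28.5, 29.0, 29.0]
skew = sample_skewness(temps)
points = qq_points(temps)
```

A skewness far from zero, or Q-Q points that bend away from a straight line, both point to a non-normal sample.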



Because the sample size for each of H1 and H2 is large (n = 900 each), the central limit theorem suggests that the sampling distribution of each mean is approximately normal, even though the underlying data are not. On that basis, we can argue that it is reasonable to relax the normality requirement. In addition, I'm going to use Welch's t-test rather than Student's t-test, because Welch's version does not assume that the two groups have equal variances. The results of the call are below.

Figure 2: T Test Results, 1H & 2H
Method          Welch Two Sample t-test
T Statistic     -12.139
P.Value         1.23252089959293e-32
Conf.Int(LB)    -0.75713
Conf.Int(UB)    -0.5465
Mean of H1      26.93386
Mean of H2      27.58567
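To make the mechanics of Welch's test concrete, here is a standard-library sketch of the calculation. It is not the call that produced Figure 2. The function name `welch_t_test` and the toy samples are hypothetical. As a labeled simplification, the p-value and confidence interval use a normal approximation to the t-distribution, which is only reasonable for large samples such as 900 observations per half.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_t_test(sample_a, sample_b, confidence=0.95):
    """Welch's two-sample t-test for a difference in means (a minus b)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each mean; variances are NOT pooled.
    se_sq_a = variance(sample_a) / n_a
    se_sq_b = variance(sample_b) / n_b
    standard_error = sqrt(se_sq_a + se_sq_b)
    difference = mean(sample_a) - mean(sample_b)
    t_statistic = difference / standard_error
    # Welch-Satterthwaite approximation to the degrees of freedom.
    degrees_of_freedom = (se_sq_a + se_sq_b) ** 2 / (
        se_sq_a ** 2 / (n_a - 1) + se_sq_b ** 2 / (n_b - 1)
    )
    # Assumption: normal approximation instead of the exact t-distribution.
    normal = NormalDist()
    p_value = 2 * normal.cdf(-abs(t_statistic))
    margin = normal.inv_cdf(0.5 + confidence / 2) * standard_error
    return {
        "t": t_statistic,
        "df": degrees_of_freedom,
        "p_approx": p_value,
        "ci_approx": (difference - margin, difference + margin),
    }


# Tiny made-up samples, only to exercise the function (not the real data).
result = welch_t_test([26.0, 27.0, 28.0], [27.0, 28.0, 29.0])
```

With samples this small the normal approximation is poor, so the toy p-value should not be read as meaningful. The point is the structure: separate variances, a combined standard error, and the adjusted degrees of freedom.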


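Looking at the results above, I interpret them as follows: the p-value is far below any conventional threshold (such as 0.05), so the difference in mean temperature in Ho Chi Minh City between the first and second half of the dataset is statistically significant. The confidence interval for the difference (H1 minus H2) runs from about -0.76 to -0.55, meaning the second half is warmer on average by roughly 0.55 to 0.76 degrees Celsius, with an observed difference in means of about 0.65. This supports the observation that Ho Chi Minh City is in fact getting warmer, just as we have heard in pronouncements from major scientists.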

Warming Temperatures Pt.2

For the second part of this analysis, I wanted a more interactive experience for the user, so I've created a standalone web application that answers the second question I posed above: what does warming look like across all major cities? My answer can be found here: WarmingPT2