Abstract
One of the most prominent graphs in climate activism — the correlation between CO2 and global temperature over the last 400,000 years, based on the Vostok Ice Cores — is misleading when it is shown in order to bolster the case that CO2 increases have caused global warming. The graph intermingles two effects. Temperature can predict CO2. CO2 can predict temperature. Both effects are well known, and one does not preclude the other. When presented in a CO2-causes-temperature context, this intermixing renders the graph deceptive. In truth, the prime problem of the analysis is the high auto-correlation of changes in CO2, which makes it difficult to pin down the role of CO2 shocks in temperature innovations. Put into econometric terms, the x variable is poorly identified: it is itself endogenous most of the time. Perhaps most importantly, this joint 400,000-year history is not relevant today. Although the driving forces of CO2 were not well known for most of this long period, scientists know that humanity has injected about 130 ppm of CO2 into the atmosphere, unrelated to other factors, during the last 200 years.
The source of the original data is the Vostok ice core data on temperature and CO2.1 The data itself is stored locally in a file called temp-co2-solar-422k.csv. First, we need to read it.
options(digits=2, width=200)
library(formatR)
library(data.table)  ## provides the shift() function used below
dlvl <- read.csv("temp-co2-solar-422k.csv")
dlvl <- dlvl[complete.cases(dlvl),]  ## keep only complete observations
summary(dlvl)
## year temp co2 solar
## Min. :-414000 Min. :-9.2 Min. :182 Min. :388
## 1st Qu.:-311100 1st Qu.:-7.0 1st Qu.:204 1st Qu.:424
## Median :-208200 Median :-5.5 Median :225 Median :441
## Mean :-208200 Mean :-4.9 Mean :227 Mean :441
## 3rd Qu.:-105300 3rd Qu.:-3.5 3rd Qu.:248 3rd Qu.:456
## Max. : -2400 Max. : 3.2 Max. :298 Max. :497
cat("N= ",nrow(dlvl), "\n")
## N= 4117
The data are regularly spaced at 100-year intervals. With 4,117 observations, and many ups and downs, we have a fairly large dataset. Statistical tools are well-suited to uncovering reasonably reliable relationships in such large data sets.
The following is the “canonical” plot of concern. It is commonly shown to suggest to naive audiences that CO2 has caused global warming in the past.
with(dlvl, {
plot(year/1000, scale(temp), col="blue", xlab="Relative Year, in Thousands", ylab="Z-Score",
type="l", lwd=3, main="Misleading Association Graph of 400,000 Years")
lines(year/1000, scale(co2), col="brown", lwd=3)
lines(year/1000, scale(solar), col="orange")
legend("topleft", col=c("brown","blue"), lty=1, box.col="white", lwd=3, legend=c("CO2","Temp") )
})
Clearly, CO2 and temperature move together. The associated verbal suggestion — sometimes explicit, sometimes left to the audience’s imagination — is often that CO2 determines temperature.2 The visually implied X-Y association can also be graphed:
with(dlvl, {
plot(co2, temp, xlab="CO2", ylab="Temperature", main="Misleading Association Graph of 400,000 Years, XY", cex=0.4)
lines(co2,temp, col="gray")
abline( lm(temp ~ co2), col="blue", lwd=2)
})
The blue line is the best-fit OLS line, \(\text{temp}_t = -24 + 0.084*\text{CO2}_t\). The suggestion in both plots is that an increase of 100 ppm in CO2 (about 44% of the mean CO2 of 227 ppm in the data) predicts warming of 8.4°C. This is of course absurd.
The naive interpretation, namely that this graph by itself even suggests that CO2 drove global warming, is misleading and therefore wrong.
Anyone who understands basic data analysis should understand that this plot is misleading as far as establishing a (causal) link from CO2 to temperature is concerned. Indeed, the only point of my analysis is to plead not to use this graph any longer. The plotted relationship is a classic example of a spurious relation. A well-known analogy is the association between ice-cream sales and murders: both are higher in summer, and the two plots of ice-cream sales against murders would look just like the two plots of CO2 and temperature above.
There are better ways to analyze the CO2, temperature, and solar data, shown below. These better ways address the two problems that make the graph misleading:
Could a third variable — such as trends, volcanos, solar radiation, or anything else — have caused (co-)variation in both CO2 and temperature?
Is CO2 causing warming or is warming causing CO2, or are both causing one another?
The remedy to the first problem is to work in changes of variables, not in levels of variables. The remedy to the second problem is to work with lead-lag associations. I am not the first to have noticed that temperature changes can also anticipate CO2 changes. However, some climate-change critics have jumped to the equally incorrect conclusion that such feedback effects reject the hypothesis that CO2 drives temperature. Feedbacks are not mutually exclusive with the hypothesis of interest, which is whether CO2 changes anticipate temperature changes. Section 3 below analyzes the two data series to disentangle both directions.
If the true model is \(y_t = a + b*x_t\), then it follows algebraically that \(\Delta y_t = 0 + b*\Delta x_t\), where \(\Delta\) is the change (also called the first difference). A good test for uncovering many spurious correlations is to estimate the regression both in levels and in differences. If the coefficient \(b\) is not the same (or at least similar) in both regression models, it suggests that the level correlation is spurious. In such a case, we would have learned that \(y_t = a + b*x_t\) was not the correct model to begin with.
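To see why this test works, here is a minimal simulated illustration (made-up data, not the ice-core series): two independent random walks typically show a large and apparently significant slope in levels, but essentially none in differences.
set.seed(1)
n <- 4000
x <- cumsum(rnorm(n))                               ## random walk 1
y <- cumsum(rnorm(n))                               ## random walk 2, independent of x
coef(summary(lm(y ~ x)))[2, c(1,3)]                 ## levels: spurious, seemingly significant slope
coef(summary(lm(diff(y) ~ diff(x))))[2, c(1,3)]     ## differences: slope indistinguishable from zero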
To work with changes, it is useful to define and work with “chg” and “lag” R functions.
chg <- function(x,...) { o <- x - shift(x,...); names(o) <- paste0("d",names(x)); o }  ## first difference
lag <- function(x,...) { o <- shift(x,...); names(o) <- paste0("l",names(x)); o }      ## one-period lag (masks stats::lag)
## both use data.table's shift function
ds <- cbind(dlvl, chg(dlvl)); rownames(ds) <- NULL ## combine levels and changes into one data set
print(head(ds)) ## show the output to make it easier to understand how this works
## year temp co2 solar dyear dtemp dco2 dsolar
## 1 -414000 0.84 285 443 NA NA NA NA
## 2 -413900 0.83 285 443 100 -0.010 -0.28 0.29
## 3 -413800 0.82 285 444 100 -0.009 -0.28 0.30
## 4 -413700 0.81 284 444 100 -0.009 -0.28 0.30
## 5 -413600 0.80 284 444 100 -0.008 -0.28 0.30
## 6 -413500 0.85 284 445 100 0.051 -0.28 0.30
Here are some basic background statistics on our data, both levels and changes:
print(summary( ds ))
## year temp co2 solar dyear dtemp dco2 dsolar
## Min. :-414000 Min. :-9.2 Min. :182 Min. :388 Min. :100 Min. :-1.67 Min. :-13.0 Min. :-1.28
## 1st Qu.:-311100 1st Qu.:-7.0 1st Qu.:204 1st Qu.:424 1st Qu.:100 1st Qu.:-0.11 1st Qu.: -0.4 1st Qu.:-0.40
## Median :-208200 Median :-5.5 Median :225 Median :441 Median :100 Median :-0.01 Median : -0.1 Median :-0.01
## Mean :-208200 Mean :-4.9 Mean :227 Mean :441 Mean :100 Mean : 0.00 Mean : 0.0 Mean : 0.00
## 3rd Qu.:-105300 3rd Qu.:-3.5 3rd Qu.:248 3rd Qu.:456 3rd Qu.:100 3rd Qu.: 0.12 3rd Qu.: 0.3 3rd Qu.: 0.41
## Max. : -2400 Max. : 3.2 Max. :298 Max. :497 Max. :100 Max. : 1.92 Max. : 6.0 Max. : 1.47
## NA's :1 NA's :1 NA's :1 NA's :1
Theoretically, it would be better not to work with plain changes, but with (log one plus) percent changes in CO2 (and with temperature converted to Kelvin). Trust me that it matters little, except that the exposition is easier if I just use plain differences in CO2, as I do here, because the units are more familiar.
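For reference only, here is a sketch of what that alternative specification could look like. It is not used anywhere below, and the 288 K baseline is merely an assumed reference level for converting the temperature anomaly into an approximate Kelvin temperature.
dalt <- within(dlvl, {
  dlogco2  <- log(co2) - log(shift(co2))                ## log (one plus percent) change in CO2
  dlogtemp <- log(temp + 288) - log(shift(temp) + 288)  ## log change in approximate Kelvin temperature
})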
We first define a function that returns only the regression coefficients that we want to see, thereby removing a lot of R output clutter.
showcoef <- function( formula, controls= (~.), data=ds )  ## report only estimates and t-values
  coef(summary(lm( update(formula,controls), data=data)))[,c(1,3)]
Here is the basic level regression, \(\text{temp}_t = a + b*\text{CO2}_t\):
showcoef( temp ~ co2 )
## Estimate t value
## (Intercept) -23.910 -139
## co2 0.084 111
Now, the regression in differences of \(\Delta\text{temp}_t = a + b*(\Delta\text{CO2}_t)\):
showcoef( dtemp ~ dco2 )
## Estimate t value
## (Intercept) -0.00044 -0.1
## dco2 0.03310 6.3
The coefficient estimate of 0.03 is much smaller than 0.08, suggesting that the earlier level regression picked up spurious trend correlation.3
By the same logic that first differencing should yield the same coefficient if the model is reasonably correct, so should second differencing:
showcoef( chg(dtemp) ~ chg(dco2) )
## Estimate t value
## (Intercept) -0.00032 -0.055
## chg(dco2) 0.01767 1.415
Again, even the first-difference regression seems to contain spurious trends. Only this last changes-in-changes regression is stable with respect to further differencing.
Note that the above regression inputs were still contemporaneous. They only solve the spurious correlation issue with respect to trends. They do not address the question of whether CO2 drives warming or vice-versa.
A better test is based on the idea that if CO2 really changes temperature, then (unexpected) changes in CO2 should anticipate (unexpected) changes in temperature. Econometricians call this Granger-Sims causality (GSC).4 GSC is a necessary but not a sufficient condition for reading potential causality into data, even if there are no important omitted variables. If there is no GSC, then the data cannot show that CO2 causally influences temperature. (This will not be a problem in this data, however. The problems are elsewhere.) GSC tests may be too weak in actual data to find an association that exists. (The obvious example is a case in which one has only two data points.) In that situation, without GSC, the data are only decisive in stating that they are inconclusive, not that there is no association.
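For readers who prefer a packaged test to the regressions that follow, the lmtest package (an addition here, not used elsewhere in this note) provides a ready-made Granger test. A minimal sketch, run in both directions:
library(lmtest)
grangertest(dtemp ~ dco2, order=1, data=na.omit(ds[,c("dtemp","dco2")]))  ## do lagged CO2 changes help predict temperature changes?
grangertest(dco2 ~ dtemp, order=1, data=na.omit(ds[,c("dtemp","dco2")]))  ## and the reverse direction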
Although the correct way to estimate the structure is Sims' vector autoregression (VAR), which we shall do in section 3.2, we start with simple OLS regressions in section 3.1. The estimates are similar.
First, consider changes in CO2. The strongest association in the data, by far, is that they are highly autocorrelated:5
showcoef( dco2 ~ lag(dco2) )
## Estimate t value
## (Intercept) 0.00021 0.031
## lag(dco2) 0.83836 98.586
# showcoef( dco2 ~ lag(dco2) + I(lag(dco2)*(abs(dco2)>1)) ) # large shocks are more autocorrelated than small ones.
This is stable: the same autocorrelation appears when the entire regression is shifted back by one more century:
showcoef( lag(dco2) ~ lag(lag(dco2)) )
## Estimate t value
## (Intercept) 0.00018 0.025
## lag(lag(dco2)) 0.83831 98.555
The data inform us that once CO2 changes got going in a particular direction, they tended to continue in that direction. CO2 changes had strong internal dynamics, almost random-walk like: whenever CO2 increased over 100 years, it strongly tended to increase over the next 100 years again, and by almost as much. Ergo, when a shock to CO2 occurs, it has long-term effects, far beyond a century. The half-life of shocks to changes in CO2 is about \(-\log(2)/\log(0.83)\approx3.7\) centuries.
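For reference, the half-life calculation is a one-liner, using the (rounded) AR(1) coefficient estimated above:
rho <- 0.83                 ## autocorrelation of dco2, rounded from the 0.838 estimate above
log(0.5) / log(rho)         ## half-life of a shock to dco2, in centuries (about 3.7)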
I am not a climate scientist. My subject expertise (unlike my data analysis expertise) is very limited. I do not understand why CO2 changes were so highly autocorrelated. Not shown, this seems not to be driven by temperature or solar forcing. Buffers likely play a role. Presumably, when a shock to CO2 occurs for whatever reason, it takes the earth centuries to undo it.
Empirically, the autocorrelation of CO2 is what will make it difficult to determine how CO2 changes influenced temperature changes — CO2 changes over the last 100 years tend to look very similar to CO2 changes from, say, 500 to 600 years ago.
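One quick way to see this similarity directly (a sketch; the output is not reproduced here) is to correlate this century's CO2 change with the change five centuries earlier, or to inspect the full autocorrelation function of dco2:
with(ds, cor(dco2, shift(dco2, 5), use="complete.obs"))  ## similarity with the CO2 change 5 centuries back
acf(na.omit(ds$dco2), lag.max=10, plot=FALSE)            ## autocorrelations of dco2 at lags 1 through 10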
The true variable of interest is the change in temperature, not the change in CO2. Rather than start with the simplest regression, the following regression already includes a host of controls, to be explained in a moment.
showcoef( dtemp ~ lag(dco2) , ~ . + lag(dsolar) + lag(dtemp) +
lag(temp) + lag(dtemp)*lag(temp) + lag(co2) + lag(solar) + dsolar )
## Estimate t value
## (Intercept) -0.77886 -5.20
## lag(dco2) 0.03046 5.73
## lag(dsolar) 0.03540 0.37
## lag(dtemp) -0.09593 -3.36
## lag(temp) -0.02149 -6.22
## lag(co2) 0.00150 4.61
## lag(solar) 0.00076 3.39
## dsolar -0.02708 -0.28
## lag(dtemp):lag(temp) -0.04711 -8.65
This regression suggests the following:
When solar radiation was high, temperature tended to increase (0.00076). Changes in solar radiation were fairly unimportant (-0.02708), suggesting a very slow response of temperature to solar radiation.
When lagged CO2 was high (0.0015) and when lagged CO2 increased recently (0.03046), temperature tended to increase. This is good evidence that changes in CO2 influenced future warming, both short-term and long-term. (The relation is robust to inclusion or exclusion of many control variables, such as the state variables included here; a quick check without controls appears after this list.)
Earth has a strong thermostat: (a) when temperature was high, it tended to go down (-0.021); (b) when temperature recently has gone up, it tended to go down again (-0.096); and (c) temperature really wanted to go down again if both temperature was high and temperature had recently gone up (-0.047).
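As a quick check of the robustness claim above (a sketch; its output is not reproduced here), the same helper can be run with no controls at all:
showcoef( dtemp ~ lag(dco2) )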
There are two interesting earth-science questions beyond my expertise related to this regression output.
The first concerns earth’s thermostat. It seems to work, even controlling for solar forcing and CO2. What is its cause? Will it continue?
The second concerns the magnitude of the coefficient estimate on lag(dco2). It is still way too big. It suggests that on the margin, an increase of 250 ppm (about doubling) in CO2 predicts global warming of nearly \(0.03046\times250\approx7.6\)°C. Standard climate change models would suggest increases of less than half this much, about \(3\)°C. The 400,000-year association between lagged CO2 changes and temperature changes was far too strong. (This is even more worrisome because Archer suggests that about 3/4 of the CO2 disappears in the carbon cycle rapidly, before it has much opportunity to drive the greenhouse effect.)
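For concreteness, the back-of-the-envelope conversion behind these numbers, assuming the point estimate applies linearly over a 250 ppm increase:
b.dco2 <- 0.03046   ## coefficient on lag(dco2) from the regression above
b.dco2 * 250        ## roughly 7.6 degrees C per approximate doubling of CO2, versus about 3 degrees C in standard models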
The improved statistical estimation method is Sims' vector autoregression (VAR). It estimates the \(\Delta \text{CO2}\) dynamics and the \(\Delta \text{Temp}\) dynamics together, explicitly allowing the two variables (and their innovations) to influence one another.6
We first need to do some basic setup and create variables for the package.
library(vars)
## Loading required package: strucchange
## Loading required package: urca
# define some variables used later as controls
ds <- within(ds, {
lagsolar <- lag(solar)
lagco2 <- lag(co2)
lagtemp <- lag(temp)
})
dvar <- ds[complete.cases(ds),]
dvar.mainseries <- subset( dvar, T, select=c(dco2,dtemp) ); ## the two key var variables
We begin with a one-lag VAR analysis. The format of the coefficient-test output is the dependent variable first, then the independent variable, then the lag indicator.
var1 <- VAR( dvar.mainseries, type="none", lag.max=1 )
print(var1)
##
## VAR Estimation Results:
## =======================
##
## Estimated coefficients for equation dco2:
## =========================================
## Call:
## dco2 = dco2.l1 + dtemp.l1
##
## dco2.l1 dtemp.l1
## 0.837 0.046
##
##
## Estimated coefficients for equation dtemp:
## ==========================================
## Call:
## dtemp = dco2.l1 + dtemp.l1
##
## dco2.l1 dtemp.l1
## 0.024 0.100
coeftest(var1)[,c(1,3)]
## Estimate t value
## dco2:dco2.l1 0.837 98.0
## dco2:dtemp.l1 0.046 1.9
## dtemp:dco2.l1 0.024 4.5
## dtemp:dtemp.l1 0.100 6.4
This suggests (as before) that
Carbon-dioxide changes, \(\Delta\text{CO2}\), are highly autocorrelated (0.837). When CO2 has increased, it wants to continue to increase. When CO2 has decreased, it wants to continue to decrease. Of all the association in this 400,000 year data, it is by far the strongest one. It is evidence of strong buffers and/or a strong CO2 feedback effect on earth.
When temperature has recently gone up, \(\Delta\text{CO2}\) wants to go up just a little more (0.046).
When temperature has recently gone up, then the temperature wants to go up just a little more (0.1). (This disappears with better controls for the level of temperature interacted with recent temperature changes; the reason can be inferred from the accelerating shape of the XY graph above.)
The association of most interest to us: when CO2 has recently gone up, then the temperature wants to go up just a little more. This is what we found before, and the magnitude of the coefficient remains troubling. The coefficient is still far too large, suggesting a warming effect of about \(0.024\times250\approx6\)°C for a doubling of CO2. And this is even more disconcerting, because it is not even for a recent 1-50 year increase but for a 100-year lagged increase in CO2.
The theory further predicts that the coefficient on lagged CO2 should decrease with the lag. A change in CO2 this century should have more power to predict the temperature over the next 100 years than, say, the temperature five centuries from now. This is a quasi-placebo test. There should be very little association beyond the first one or two lags.
The following includes 10 lags of CO2 changes, i.e., the last ten centuries. The printouts are only for the coefficients on the \(\Delta\text{CO2}\) predictors in the \(\Delta\text{temp}\) prediction, although the analysis itself remains based on the full VAR.
var10 <- VAR( dvar.mainseries, type="none", lag.max=10)
## with controls: use VAR( dvar.mainseries, lag.max=10, exog= subset( dvar, T, select=c(lagsolar, lagco2, lagtemp) ) )
var10
##
## VAR Estimation Results:
## =======================
##
## Estimated coefficients for equation dco2:
## =========================================
## Call:
## dco2 = dco2.l1 + dtemp.l1 + dco2.l2 + dtemp.l2 + dco2.l3 + dtemp.l3 + dco2.l4 + dtemp.l4 + dco2.l5 + dtemp.l5 + dco2.l6 + dtemp.l6 + dco2.l7 + dtemp.l7 + dco2.l8 + dtemp.l8 + dco2.l9 + dtemp.l9 + dco2.l10 + dtemp.l10
##
## dco2.l1 dtemp.l1 dco2.l2 dtemp.l2 dco2.l3 dtemp.l3 dco2.l4 dtemp.l4 dco2.l5 dtemp.l5 dco2.l6 dtemp.l6 dco2.l7 dtemp.l7 dco2.l8 dtemp.l8 dco2.l9 dtemp.l9 dco2.l10 dtemp.l10
## 0.84864 0.05839 0.02070 0.01303 -0.03842 0.05003 0.00097 0.08456 -0.01525 0.00036 0.00038 0.09773 -0.11836 0.06611 0.06887 0.06901 0.03370 0.07269 0.01032 0.00906
##
##
## Estimated coefficients for equation dtemp:
## ==========================================
## Call:
## dtemp = dco2.l1 + dtemp.l1 + dco2.l2 + dtemp.l2 + dco2.l3 + dtemp.l3 + dco2.l4 + dtemp.l4 + dco2.l5 + dtemp.l5 + dco2.l6 + dtemp.l6 + dco2.l7 + dtemp.l7 + dco2.l8 + dtemp.l8 + dco2.l9 + dtemp.l9 + dco2.l10 + dtemp.l10
##
## dco2.l1 dtemp.l1 dco2.l2 dtemp.l2 dco2.l3 dtemp.l3 dco2.l4 dtemp.l4 dco2.l5 dtemp.l5 dco2.l6 dtemp.l6 dco2.l7 dtemp.l7 dco2.l8 dtemp.l8 dco2.l9 dtemp.l9 dco2.l10 dtemp.l10
## 0.0219 0.1116 -0.0060 -0.2423 0.0157 -0.0257 0.0016 -0.0368 -0.0125 0.0140 0.0229 0.0134 -0.0163 -0.0053 0.0371 0.0278 -0.0116 0.0392 0.0081 -0.0460
The important coefficients (past changes in CO2 predicting the change in temperature) are now graphed:
plotvarcoef <- function( ctbl , vnm="CO2", yl=0.05 ) {
  ## ctbl: coefficient table with estimates in column 1 and standard errors in column 2
  plot( 1:nrow(ctbl), ctbl[,1], xlab=paste("100-Year Lag of",vnm,"Change"), ylab=paste("Coefficient on",vnm,"Change"), type="b", ylim=c(-yl,yl), main="Explaining 100-Year Ahead Temperature Changes")
  lines( 1:nrow(ctbl), ctbl[,1] + 1*ctbl[,2], col="gray", lty=2 )  ## +/- 1 standard error bands
  lines( 1:nrow(ctbl), ctbl[,1] - 1*ctbl[,2], col="gray", lty=2 )
  lines( 1:nrow(ctbl), ctbl[,1] + 2*ctbl[,2], col="gray", lty=3 )  ## +/- 2 standard error bands
  lines( 1:nrow(ctbl), ctbl[,1] - 2*ctbl[,2], col="gray", lty=3 )
  lines( c(0,20), c(0,0), lty=2, col="gray")  ## zero line
  points( 1, ctbl[1,1], col="blue", cex=2)    ## highlight the most recent (first) lag
}
coef.dtemp <- coef(var10)$dtemp
coef.dtemp.dco2 <- coef.dtemp[grepl("dco2", rownames(coef.dtemp)),]
# print(coef.dtemp.dco2[,1])
plotvarcoef( coef.dtemp.dco2 )
The data analysis fails the placebo test.
Archer explains that theory predicts decaying coefficients. More specifically, theory predicts a long-run coefficient of about 0.01 (in this sample, a 250 ppm increase should induce roughly a 3°C increase). It should be split into about 0.0075 on lag 0, 0.002 on lag 1, and lower coefficients on further lags. The most recent CO2 change should be more powerful than more lagged CO2 changes.
If I were to claim that this data suggests that changes in CO2 have driven changes in temperature, then
why are CO2 changes from many centuries ago similarly powerful as the most recent CO2 changes?
why are the coefficient estimates so large?
The statistical reason for the first part of this mess is the high correlation among CO2 changes. When CO2 increased in the last 100 years, it likely also increased before. As far as the regression is concerned, many past CO2 changes look alike, so any of them could be credited with predicting subsequent temperature changes. 4,000 centuries should have been enough to uncover the relationship, but just weren't. The reason for the second part of this mess, the terribly high coefficient estimates, is a mystery to me.
The data absolutely do not reject the hypothesis that changes in CO2 drive changes in global warming. (I would go further and characterize them as hinting at an association.) But the data are also unable to identify this relationship cleanly. Thus, the use of the visual graph at the outset is not only misleading (for ignoring the reverse association), it is badly misleading.
Absolutely not!
This is not evidence against the role of CO2 on global temperatures. It is only evidence that the theoretically predicted relation in this data set is difficult to uncover, because we have such strong autocorrelation of changes in CO2. We have absence of evidence in this graph, not evidence of absence in this data.
There are good reasons for this. In particular:
CO2 or global temperatures could be measured with too much noise. It's not like we had satellites measuring CO2 and temperature for hundreds of thousands of years. The measurements come from proxies, and from only one place.
There are state variables (buffers) in the system that can obscure the relationship to the point where the graph is highly misleading.
As in all statistical analysis, theory and empirical identification of more variables can improve the estimated associations. Advances in knowledge could point either way. The inclusion of omitted control variables could bolster the case for a causal association of CO2 on future warming or it could undermine it. The answer to whether CO2 causes warming is beyond the analysis here — indeed shown to be beyond the analysis feasible merely with CO2, temperature, and solar data, even using state information — and not the expertise of the author.
The question examined in this writeup was not whether CO2 causes warming, but whether the canonical graph is reasonably representative of the association in the data and the predicted association from the theory. The answer is a clear no. The canonical graph is highly misleading. It is not solid empirical evidence in favor of a role of CO2 in warming. There is evidence of unaccounted trends, omitted variables, and feedback effects. The graph is not even mildly representative of what can be gathered from a better analysis of the data, either. It is best not shown to unsuspecting audiences.
It is not very relevant at all.
Scientists have known for a long time that the graph reflects feedback effects and omitted variables. They are not evidence for or against a causal effect of CO2 on temperature. This data set — likely the best public dataset available at the moment covering this 400,000 year span — suggests that the empirical evidence of how CO2 influenced temperature in prehistoric times is insufficient in itself. It does not suggest that the relationship was not there, only that the inference must be based on other evidence. The canonical graph is misleading and should not be shown in order to bolster the case for the relationship.
My note has clarified how it is the auto-correlation in CO2 changes that makes reliable and proper inference from the data so difficult, even ignoring the misleading aspects of the presentation of the canonical graph. Thus, the attention on the canonical graph seems misplaced. The problem in the prehistorical data is that the causes of the CO2 increases were not known and are difficult to disentangle. In contrast, scientists are sure that it was humanity that injected about 130 ppm of CO2 into the atmosphere over the last two centuries. Thus, the situation today is entirely different.
Sea-level does not seem to make much difference.
Further use of solar forcing controls does not make much difference.
If there are no omitted obscuring variables, then:
Causation implies correlation.
Correlation does not imply causation.
**Absence of correlation implies absence of causation.**
Archer, David, The Long Thaw, 2016.
Leamer, Edward, Vector Autoregressions for Causal Inference?, Carnegie-Rochester Conference Series on Public Policy.
Stock, James H., and Mark W. Watson, Vector Autoregressions, Working Paper.
Lorius, Claude, et al., The ice-core record: climate sensitivity and future greenhouse warming, Nature, 1990.
Rahmstorf, Stefan, Cosmic Rays, Carbon Dioxide, and Climate, EOS, 2004.
Note that the references are dated, because the graph is. It continues to be prominently displayed, though.
http://www.climatedata.info/proxies/data-downloads/ , itself originally from http://www.ncdc.noaa.gov/paleo/indexice.html and http://www1.ncdc.noaa.gov/pub/data/paleo/climate_forcing/orbital_variations/berger_insolation/ (solar forcing data, in W/m^2, at 65 degrees north, mid-July).↩︎
The orange line is solar heat hitting the planet, caused by astronomical variations. For the most part, the data show that it had relatively little influence.↩︎
The T-statistic is smaller in differences than in levels, but this is always the case and does not suggest a problem. Note that for 4,000 or so observations, a T-statistic of 5 is not that uncommon. It just means there is a good statistical relationship. It does not tell you whether the association itself is strong. For that judgment, you want to assess the magnitude of the coefficient multiplied by the spread in the variable itself.↩︎
GSC is still not conclusive proof of a causal role of CO2 on temperature (in the same sense that the weather forecast comes first but it does not cause the weather).↩︎
Oddly, large changes in CO2 are more autocorrelated than small changes. Adding I(lag(dco2)*(abs(dco2)>1)) yields a coefficient of 0.58 for the ordinary dco2 autocoef and 0.58+0.40 for the large-change dco2 autocoef. This contradicts the idea that exogenous shocks push CO2 away and the autocorrelation then primarily comes from buffers that slow down mean-reversion.↩︎
However, in this data set, it turns out that OLS regressions and VAR regressions yield almost the same results.↩︎