1. Introduction

This project analysed variables possibly associated with the number of houses sold by the Stirling Ackroyd real estate company between the \(1^{st}\) of January 2007 and the \(30^{th}\) of June 2016.

This report explores the variable Gold Prices Against Sterling.

2. Data Manipulation

The data used in this report were taken from Quandl Financial and Economic Data. The file has two columns and 116 observations (from August 2006 to August 2016). To match the period of interest described in section 1, the original data were reduced to 114 observations, and a column containing only the year of each observation was added.

3. Analysis

The dataset was summarized, and the results can be seen below:

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   324.5   610.2   769.1   751.7   900.2  1111.0

The dataset has a standard deviation of 219.89 and a range (maximum minus minimum) of 786.674. It is of interest to analyse how the variable behaves over time, as can be seen in the plot below.

In time series analysis it is common for observations to be correlated over time; this characteristic is called autocorrelation. The Durbin-Watson test is a popular way to test for autocorrelation. At a 95% confidence level (5% significance level), the p-value of the test, a value between 0 and 1, is read as follows: if the p-value \(>\) 0.05, the null hypothesis of no autocorrelation is not rejected; otherwise (p-value \(\leq\) 0.05) we conclude that the observations of the series are correlated over time. In addition, the autocorrelation function (ACF) shows the autocorrelation coefficient at each lag and whether it is significant. Here the p-value is smaller than 0.05, so we conclude that the autocorrelation is significant.
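As a rough illustration of the quantities behind the two tools mentioned above, the sketch below computes the Durbin-Watson statistic and a sample autocorrelation at a given lag using only the Python standard library. It is not the code used for this report: the function names and the toy series are hypothetical, and the sketch returns the statistic only, not the p-value, which requires the distribution of the statistic under the null hypothesis. Values of the statistic near 2 suggest no first-order autocorrelation, values towards 0 suggest positive autocorrelation, and values towards 4 suggest negative autocorrelation.

```python
from statistics import fmean


def durbin_watson(residuals):
    """Durbin-Watson statistic d for a sequence of regression residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag (one ACF value)."""
    mean = fmean(series)
    dev = [x - mean for x in series]
    num = sum(dev[t] * dev[t - lag] for t in range(lag, len(dev)))
    den = sum(d * d for d in dev)
    return num / den


# Hypothetical toy series, NOT the gold price data.
trending = [1.0, 2.0, 3.0, 4.0, 5.0]
# Residuals of a mean-only model fitted to the toy series.
centred = [x - fmean(trending) for x in trending]

print(durbin_watson(centred))        # well below 2: positive autocorrelation
print(autocorrelation(trending, 1))  # positive lag-1 autocorrelation
```

In practice the residuals would come from whatever model was fitted to the gold price series, and the p-value would be taken from a statistical package rather than computed by hand.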

To compare how the Gold Prices Against Sterling vary from year to year, the plot below shows a box plot of the index for each year separately.

From 2007 to 2011 the gold prices increased considerably, and their variance also grew over time. From 2012 to 2015 the prices declined, and 2012 showed the largest variation of all the years.

As explained in section 1, the aim of the report is to verify the relationship between the Gold Price and the Number of Sold Houses by the Stirling Ackroyd real estate company. The first exploratory analysis checks the Pearson correlation between the two variables, as shown below.

## 
##  Pearson's product-moment correlation
## 
## data:  gold$Value and house.sold$numSoldHouses
## t = -8.0201, df = 112, p-value = 1.127e-12
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.7091294 -0.4725647
## sample estimates:
##        cor 
## -0.6039861

The output contains two main pieces of information: the correlation coefficient (-0.6039861) and the p-value (1.127e-12). The correlation coefficient (CC) measures the degree of linear relationship between two variables, and the p-value indicates whether the CC is significantly different from 0: if the p-value \(>\) 0.05, the null hypothesis that the correlation equals 0 is not rejected; otherwise (p-value \(\leq\) 0.05) the correlation is significant. In this case, the linear relationship between the two variables is negative and moderate. As a complementary analysis, the plot shown above is a scatter plot of the two variables with a regression line. It is clear that a linear model does not fit this dataset well.
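To make the link between the reported numbers explicit, the sketch below rebuilds the t statistic, the degrees of freedom and the 95% confidence interval from the correlation coefficient and the sample size alone. It uses the standard relations \(t = r\sqrt{n-2}/\sqrt{1-r^{2}}\) and a confidence interval based on the Fisher z-transformation. The function name is hypothetical, and the sample size of 114 is inferred from the 112 degrees of freedom in the output (df \(= n - 2\)).

```python
import math
from statistics import NormalDist


def pearson_t_and_ci(r, n, level=0.95):
    """Return (t, df, (low, high)) for a Pearson correlation r on n pairs."""
    df = n - 2
    # t statistic for the null hypothesis that the true correlation is 0.
    t = r * math.sqrt(df) / math.sqrt(1.0 - r * r)
    # Confidence interval via the Fisher z-transformation.
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    q = NormalDist().inv_cdf(0.5 + level / 2.0)
    return t, df, (math.tanh(z - q * se), math.tanh(z + q * se))


# Values taken from the full-period output above.
t, df, (low, high) = pearson_t_and_ci(-0.6039861, 114)
print(round(t, 4), df, round(low, 4), round(high, 4))
```

The printed values should agree with the test output above up to rounding, which is a quick check that the coefficient, the t statistic and the interval are mutually consistent.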

One point to consider is that the data for the two variables cover quite a long period (9 years), and economic changes can be noticed over short periods of time. Taking this into account, the correlation coefficient will be analysed over two shorter periods: i) the last 5 years and ii) the last 3 years. The output of the correlation test can be seen below, followed by a scatter plot used to check for a linear relationship.

i) Last 5 years

## 
##  Pearson's product-moment correlation
## 
## data:  gold.5$Value and house.sold.5$numSoldHouses
## t = -1.1571, df = 58, p-value = 0.252
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.3892925  0.1078205
## sample estimates:
##        cor 
## -0.1502169

Contrary to expectations, the correlation coefficient is not significant (p-value \(>\) 0.05) when only the data from the last 5 years are used.

ii) Last 3 years

## 
##  Pearson's product-moment correlation
## 
## data:  gold.3$Value and house.sold.3$numSoldHouses
## t = 0.35237, df = 34, p-value = 0.7267
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.2736369  0.3813009
## sample estimates:
##       cor 
## 0.0603217

Once again, the correlation coefficient is not significant. This suggests that the Gold Prices Against Sterling are not a good explanatory variable for the Number of Sold Houses.
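The sub-period comparison used in this section can be sketched as follows: keep only the most recent observations of each series and recompute the Pearson correlation on that window. This is a minimal illustration, not the code behind the outputs above. The function names and both series are hypothetical placeholders, and the sketch assumes monthly data (12 observations per year), which is consistent with the 58 and 34 degrees of freedom reported for the 5-year and 3-year windows.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def last_years(series, years, per_year=12):
    """Keep the most recent `years` of data, assuming `per_year` observations a year."""
    return series[-years * per_year:]


# Hypothetical placeholder series, NOT the real gold or house sales data.
months = 114
gold = [400.0 + 5.0 * i for i in range(months)]
houses = [300.0 - 2.0 * i + 25.0 * math.sin(i) for i in range(months)]

for years in (5, 3):
    g, h = last_years(gold, years), last_years(houses, years)
    print(years, len(g), round(pearson_r(g, h), 3))
```

Slicing from the end of both series keeps the two windows aligned, provided the series cover the same months in the same order.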