This document presents the findings of an empirical model of the determinants of house prices. It is written to be readable by a general audience without omitting the key statistical operations. Any meaningful analysis will need a suite of models, and this document is the baseline. It is intended as a worked econometric example, not a crash course in econometrics.

Unreliability: Econometric approaches have been unable to reliably detect market disequilibrium; some examples can be seen in Section 2. For that reason it would be a stretch to expect the first iterations of such a model, developed as an introduction for discussion, to be particularly insightful or reliable.

While a simpler model specification is generally preferred over a complicated one, the models in this document almost certainly lack the complexity needed to predict disequilibrium accurately. The preferred specification here puts Irish house prices 13% above the model prediction. See Figure 4.1.

Going forward, an improved understanding is required of why models are useful on the one hand but subject to both clear and less obvious limitations on the other. The next step is to move closer to the specification developed by McQuinn at the ESRI, as discussed in their Autumn Economic Outlook, where the model suggests Irish house prices were over-valued by approximately 7% at that time. Applying time series methods to improve predictions and forecasting should also be prioritised.

1 Prerequisites

In Model 1¹ the ratio of completions to planning was insignificant. The main shortcoming of that model was that it was yearly and did not have enough observations. It does not seem plausible that a supply ratio is insignificant. Other specification issues aside, and prior to running another model, I’ve dug deeper into the relationship between house prices and the completions-to-planning ratio, hereafter the supply ratio.

Figure 1.1: Supply Ratio - Clear disparity between houses & apartments

Figure 1.2: The resulting balance of units from the supply ratio

Viability issues suppress the supply of apartments. A consequence is that the supply ratio does not reflect the true tightness in the market: the relatively high quantity of apartment stock in the pipeline seen in Figure 1.2 is illusory.

1.1 Supply Ratio & Prices

What is the relationship between the tightness of supply for houses and the price of residential property?

The 12 month moving average (mean) was used to answer this question. The dataset, accessed through PxStat on the CSO website, has the code HPM08.

Figure 1.3: The correlation between prices and the 12 month moving average supply ratio for houses

Correlation is measured on a scale from -1 to +1. The correlation (R value) between property prices and the supply ratio can be seen in Figure 1.3. A rule of thumb applied regularly in economics and the behavioural sciences is that anything above 0.7 is regarded as a high correlation. Given that the correlation is positive, tighter supply can be associated with higher prices, exactly as we would expect.
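As a minimal sketch of how an R value like the one in Figure 1.3 is calculated, the snippet below computes a Pearson correlation by hand and applies the 0.7 rule of thumb. The function name and the two short series are made up for illustration; they are not the CSO data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Co-movement of the two series around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up values for illustration only (not the supply ratio or price series).
supply_ratio = [1.0, 2.0, 3.0, 4.0, 5.0]
price_index = [2.0, 4.0, 5.0, 4.0, 5.0]

r = pearson_r(supply_ratio, price_index)
label = "high" if abs(r) > 0.7 else "not high"
print(f"r = {r:.3f} ({label} under the 0.7 rule of thumb)")
```

The sign of r gives the direction of the association and its absolute value the strength; it says nothing about causation.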

The supply ratio used here is the 12 month moving average in a given quarter. I’ve labelled the quarter of the first and last three observations; recall Figure 1.1 for the yearly change in the ratio.

1.2 Other Considerations

Measuring affordability in an empirical model is more difficult than might be assumed. An appropriate explanatory variable should not in itself be a function of the dependent variable.² Mortgage repayments (a possible explanatory variable) are a function of house prices (the dependent variable), which makes them problematic as a regressor.

2 Literature

This is by no means an exhaustive view of the literature.

“In general, the choice of a house price model and its empirical estimation is very much influenced by the quality and availability of data” - Bragoudakis, Bank of Greece

Bragoudakis et al (2016)

When housing is treated as a consumption good, its demand is generally a function of a number of variables such as:

Household income
Interest rates
Financial wealth or demographic and labour market factors

As an investment good:

Risk-free rate or alternative investment returns
Housing stock premium
Housing services - generally proxied by the rental yield of a property

Studies of credit in house price forecasting point to mixed results:

Generally, credit availability is found to have a strong positive effect on house prices
- Not straightforward: credit is statistically insignificant in the short and medium run, but remains a key determinant of house prices in the long run (Annett, 2005).
- Multi-directional causality between house prices and credit (Goodhart and Hofmann): money growth has a significant effect on house prices and credit, credit influences money and house prices, and house prices influence both credit and money
- Some evidence that the impact of private credit on house prices is stronger in a boom
- Credit and money aggregates lead developments in house prices
- Asymmetric effects through the cycle:
  - Demographic variables
  - Unemployment rate
  - Disposable income
  - Debt-to-income ratio

The supply-side factors that are useful in forecasting house prices are primarily real construction costs and construction technology shocks.

Price momentum: price is affected by its lagged value. Price momentum is present in the short run, with reversal in the long run.
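To make the idea of momentum and reversal concrete, here is a small sketch that measures how strongly a series is related to its own lagged values. The function name and the growth figures are hypothetical, chosen only so that short lags are positively correlated and longer lags are negatively correlated.

```python
def autocorrelation(series, lag=1):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    # Pair each observation with the one `lag` periods earlier.
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom


# Made-up quarterly price growth rates (%), for illustration only.
growth = [1.0, 1.5, 2.0, 1.8, 1.2, 0.5, -0.2, -0.8, -0.5, 0.3]

short_run = autocorrelation(growth, lag=1)  # positive: momentum
long_run = autocorrelation(growth, lag=5)   # negative: reversal
print(f"lag 1: {short_run:.2f}, lag 5: {long_run:.2f}")
```

A positive value at short lags is what the literature calls momentum; a negative value at longer lags corresponds to reversal. Whether the Irish series behaves this way is an empirical question this sketch does not answer.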

Fundamental variables alone are typically not enough to explain house prices
Theoretical models may explain momentum:
- Irrational exuberance and unrealistic expectations.
- Risk-shifting behaviour by banks (Allen and Gale, 2000)
- Procyclical behaviour
- Down payment constraints in sellers' reservation prices (Stein, 1995)
The autocorrelation structure³ is typically found to be market specific and to differ across countries

Sunega et al (2014)

Econometric models have produced contradictory results and have failed to provide warning of housing market crashes. The standard econometric approaches have been unable to reliably detect market disequilibrium - Sunega et al (2014)

Studies published up to 2007 tended to conclude that house prices for the most part were not too far from their fundamental / equilibrium value. Contemporaneously, similar models were reaching the opposite conclusion. The authors provide two distinct but well specified models which produce different outcomes, even when using the same data. The authors attribute this variation largely to the interest rate variable.

Examples of such models in the literature using Irish data are McQuinn & O’Reilly (2006), Central Bank of Ireland; McQuinn (2017), ESRI; and Roche, ESRI.

Whittle (2014)

Some behavioural explanations which may affect house price bubbles.

Herd Behaviour


Amateurs vs Experts

Behavioural biases impact the negotiation of a selling price. Asking prices, however erroneous, will affect a buyer's judgement (Diaz & Black, 1996). Individuals do not use sound economic valuations, and overemphasize a potentially arbitrary reference point (the asking price).
Amateur investors are quicker to admit to biases; experts are more likely to continue to endorse a flawed estimation.

Anchoring, Loss Aversion and Endowment Bias

Disposition effect: “Investors are risk averse when in profit and risk loving when in a loss” - DeWeaver & Shannon, 2010. Consider the money illusion, where inflation erodes the real value of nominal prices. Home owners are unlikely to sell their property at a nominal loss but would sell at a nominal gain, even if that would be a loss in real terms.
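A short worked example may help here: a sale can show a nominal gain while being a loss once inflation is taken into account. The prices, the inflation figure and the function name below are all hypothetical.

```python
def real_return(purchase_price, sale_price, cumulative_inflation):
    """Return on a sale after deflating the sale price to purchase-date euros."""
    real_sale_price = sale_price / (1 + cumulative_inflation)
    return real_sale_price / purchase_price - 1


# Hypothetical figures for illustration only.
purchase = 300_000
sale = 315_000      # a 5% nominal gain
inflation = 0.10    # 10% cumulative inflation over the holding period

nominal = sale / purchase - 1
real = real_return(purchase, sale, inflation)
print(f"nominal {nominal:+.1%}, real {real:+.1%}")
```

Under these assumptions the owner sees a gain in euro terms, yet the proceeds buy less than the original purchase price did, which is the sense in which the sale is a loss in real terms.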

3 Linear Regression Model

Considering the findings in the literature discussed in the last section, it is necessary to take care when interpreting the results.

3.1 Baseline Specification

There will be various specifications of the following model:

\[ \operatorname{P} = \alpha + \beta_{1}(\operatorname{COMP}) + \beta_{2}(\operatorname{MPI}) + \beta_{3}(\operatorname{POP}) + \beta_{4}(\operatorname{SAVC}) + \beta_{5}(\operatorname{DISPC}) + \beta_{6}(\operatorname{UNEMP}) + \epsilon \]

Our dependent variable is P, which is house prices. It is explained to some degree by the variables on the right hand side of the equation, each weighted by a coefficient represented by the Greek letter beta. The alpha is known as the intercept, and the epsilon at the end of the equation is the error term. The variables will be tested to see if they are appropriate.

3.1.1 House Prices

Figure 3.1: All property types included, that is a blend of new and secondhand homes

Naturally HPM05, which is the mean value by month, is not as smooth as the 12 month moving average used previously (Figure 1.3). This will be the house price applied in the model.

The explanatory variables are detailed in the following subsections in order of appearance.

3.2 Variables

3.2.1 Completions

COMP

Dataset: NDQ01 which is the number of new dwelling completions by type of house.

Figure 3.2: New Dwelling Completions for Houses and Apartments - Seasonally Adjusted

3.2.2 Mortgage Interest

MPI

Dataset: CPM03 which is the consumer price index by sub indices. Mortgage Interest is the filtered sub index.

Figure 3.3: This represents the average monthly percentage change in mortgage interest paid in a given quarter

Figure 3.4: This represents the average yearly percentage change in mortgage interest paid in a given quarter

3.2.3 Population

POP

Dataset: PEA01, which is the estimated population in April of a given year. This was interpolated and can be seen below.
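The interpolation method is not stated here, so the following is only a sketch of one common choice: straight-line interpolation between consecutive annual estimates, producing four evenly spaced points per year. The function name and the population figures are hypothetical.

```python
def interpolate_quarterly(annual):
    """Linearly interpolate annual estimates to four points per year.

    `annual` is a list of (year, value) pairs for consecutive years.
    Returns (year, step, value) tuples, where step 0 is the annual
    estimate itself and steps 1 to 3 lie between it and the next one.
    """
    points = []
    for (year, start), (_, end) in zip(annual, annual[1:]):
        for step in range(4):
            points.append((year, step, start + (end - start) * step / 4))
    last_year, last_value = annual[-1]
    points.append((last_year, 0, last_value))
    return points


# Hypothetical April estimates, for illustration only.
april_estimates = [(2020, 1000.0), (2021, 1040.0)]
quarterly = interpolate_quarterly(april_estimates)
values = [value for _, _, value in quarterly]
print(values)
```

Linear interpolation assumes population changes at a constant rate between April estimates; a spline or another method would give a smoother series.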

Figure 3.5: Demographic variable consists of the age group typically in a position to purchase a home

Figure 3.6: Where necessary this series will be used to generate per capita values

3.2.4 Savings

SAVC

Dataset: ISQ03 which is the Quarterly Accounts at Current Market Prices Seasonally Adjusted, Institutional Sector filtered to “Households including NPISH (S.14+S.15)”. The Current Account variable was filtered to “Gross saving (B.8g)”.

Figure 3.7: Series must be adjusted to per capita basis

Figure 3.8: Value expressed in euro terms, denominator (population) 25 to 64 years
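The per capita adjustment can be sketched as a single division. The snippet assumes the ISQ03 series is reported in € million, which is not stated in this document. The 2,166 total below is back-solved from the 2011 Q1 row of Table 3.1 under that assumption rather than quoted from the CSO.

```python
EUR_PER_MILLION = 1_000_000


def per_capita(total_eur_million, population):
    """Convert an aggregate in EUR million to euro per person."""
    return total_eur_million * EUR_PER_MILLION / population


# 2011 Q1: population aged 25 to 64 taken from Table 3.1 (POP column).
population_2011_q1 = 2_473_920
# Assumed aggregate gross saving in EUR million (back-solved, not a quoted figure).
gross_saving_2011_q1 = 2166.0

savc = per_capita(gross_saving_2011_q1, population_2011_q1)
print(f"SAVC 2011 Q1 is roughly EUR {savc:.2f} per person aged 25 to 64")
```

The same function would apply to disposable income (DISPC), with the same denominator.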

3.2.5 Disposable Income

DISPC

Dataset: ISQ03. The Current Account variable was filtered to “Gross disposable income (B.6g)”. The population aged 25 to 64 was used, as per the POP variable previously.

National Bank of Belgium - Methodological note on financial indicators

Figure 3.9: Series must be adjusted to per capita basis

Figure 3.10: Value expressed in euro terms, denominator (population) 25 to 64 years

3.2.6 Unemployment

UNEMP

Dataset: LRM11, which is the number of people on the live register. The population aged 25 to 64 was used, as per the POP variable previously.

Figure 3.11: Series must be adjusted to per capita basis

3.3 Correlation

3.3.1 Dependent & Independent

High correlation between completions and house prices, with higher prices being associated with higher completions. On the face of it this may be counterintuitive, given that supply constraints are usually associated with higher prices. Given how far prices and completions fell from their peak, which was out of sample, this relationship is less surprising. Completions alone may not be the best metric of housing supply.

Figure 3.12: Correlation between completions and house prices

Low to moderate correlation between the change in mortgage interest payments and house prices. Consider that the largest increases took place in 2011, while house prices were in decline post GFC. These increases immediately preceded the ECB cutting rates toward zero and later going negative. Throughout the sample the changes clustered around zero and below. The low rate environment of the last decade drove this trend.

Figure 3.13: Correlation between the change in mortgage interest payments and house prices

Figure 3.14: Strong correlation between house prices and estimated population

The correlation is slightly short of the 0.7 threshold for a strong correlation, which may (again) seem surprising on first inspection. However, the 10 highest observations for savings per capita all came post pandemic which, as we are well aware, was ‘unprecedented’.

Figure 3.15: Correlation between house prices and savings per capita

Figure 3.16: Strong positive correlation as would be expected; as previously noted, the top 10 observations came post pandemic.

Figure 3.17: Strong negative correlation between house prices and persons on the live register

Recall Figure 3.8

3.3.2 Independent & Independent

The matrix in Figure 3.18 is separated into three portions. The diagonal density plots illustrate the density of observations for each of the explanatory (independent) variables. The diagonal separates the correlation value for each pair, in the top right of the plot, from the scatter plots in the bottom left.

Figure 3.18: Correlation matrix of the explanatory variables

3.4 Results

In keeping with the objective of having this document readable to a general audience, the statistical explanations will not be exhaustive. Here is a useful quote from the book Statistics Done Wrong by Alex Reinhart: “If you want to prove that your drug works, you do so by showing the data is inconsistent with the drug not working… Remember, p is a measure of surprise, with a smaller value suggesting that you should be more surprised”.

3.4.1 Base Specification

Table 3.1: Model Dataframe
Year_Q	price	COMP	MPI	POP	SAVC	DISPC	UNEMP
2011 Q1	246445.3	2205	2.2333333	2473920	875.5337	8713.299	360836.3
2011 Q2	227540.7	1900	1.2666667	2479066	733.7441	8577.424	363691.0
2011 Q3	228576.0	1682	2.0333333	2484212	828.0290	8519.803	372824.0
2011 Q4	212705.3	1368	-0.9666667	2487238	1006.3371	8667.043	355724.0
2012 Q1	201048.7	1336	-2.9333333	2487758	1152.0414	8798.686	362753.0
2012 Q2	196781.7	1180	-0.4333333	2488277	924.3343	8650.965	363538.3
2012 Q3	213900.0	1198	-1.8666667	2488797	978.7862	8830.773	369714.3
2012 Q4	199807.0	1221	0.0333333	2489018	1042.9818	8822.357	351946.7
2013 Q1	190612.0	1067	-0.1666667	2488820	813.2369	8700.510	359439.7
2013 Q2	198411.0	1176	-0.7000000	2488622	840.2241	8702.809	356688.7
2013 Q3	210688.3	1023	-0.8333333	2488424	759.9188	8712.343	358086.7
2013 Q4	216347.7	1272	-1.2333333	2488569	945.1214	8898.286	333687.3
2014 Q1	197728.0	1260	-1.0333333	2489287	844.4186	8792.077	336137.0
2014 Q2	210135.0	1399	-0.4666667	2490004	626.9066	8769.463	332753.7
2014 Q3	229335.3	1389	-0.8333333	2490722	609.8634	8829.168	331164.7
2014 Q4	217916.7	1451	-1.3666667	2492272	682.1085	9074.049	306164.0
2015 Q1	213758.0	1597	-0.1000000	2495489	804.6520	9145.303	305634.0
2015 Q2	218073.3	1642	-1.0666667	2498705	915.2741	9307.621	300837.0
2015 Q3	240377.0	2003	-0.3666667	2501922	826.1650	9334.025	303140.3
2015 Q4	231720.3	1920	-0.8000000	2506026	636.4659	9260.080	279733.0
2016 Q1	236915.7	2350	-0.4000000	2512286	599.0561	9481.008	279483.7
2016 Q2	242098.3	2406	-0.6000000	2518545	698.8161	9559.883	271309.7
2016 Q3	257509.0	2436	-0.4333333	2524805	992.5518	9782.933	269092.3
2016 Q4	257258.0	2584	0.0666667	2530818	698.5885	9679.086	244388.7
2017 Q1	261512.7	3091	-0.0666667	2536088	1022.8353	10014.638	241822.0
2017 Q2	265434.3	3520	0.0000000	2541358	1065.1787	10048.172	234378.3
2017 Q3	281928.3	3668	0.0333333	2546628	1196.8769	10243.350	230189.7
2017 Q4	281432.7	3909	-0.2666667	2551764	1036.5379	10304.637	209970.0
2018 Q1	285160.3	4202	-0.1000000	2556391	899.7060	10267.210	209595.7
2018 Q2	286976.3	4363	0.2000000	2561018	1028.1070	10485.676	200321.7
2018 Q3	301115.0	4566	0.1000000	2565644	1137.3361	10621.114	197014.0
2018 Q4	290483.0	4621	0.2333333	2570684	1111.3776	10739.556	178063.0
2019 Q1	289525.0	4799	0.3666667	2577785	1165.3418	11009.842	175591.0
2019 Q2	289774.0	5192	0.1000000	2584886	1194.6370	11049.232	172749.7
2019 Q3	303818.0	5633	0.4000000	2591987	1197.9229	11047.511	174656.3
2019 Q4	298901.0	5402	0.2333333	2598952	1265.5103	11252.999	161420.0
2020 Q1	291812.3	5618	0.2666667	2604964	2693.3195	12109.573	169620.0
2020 Q2	283920.7	3488	0.1000000	2610976	4091.5730	11777.204	192160.0
2020 Q3	295987.3	5079	0.3666667	2616989	2481.8602	11437.573	196362.0
2020 Q4	310155.3	6092	-0.0333333	2622822	2894.5925	11694.276	172666.3
2021 Q1	311370.0	4753	0.2333333	2626681	4006.5764	12253.865	165420.0
2021 Q2	315762.0	5133	0.2666667	2630541	3040.8191	12344.608	155450.0
2021 Q3	336212.0	4729	0.3000000	2634401	2830.6245	12622.983	155903.7
2021 Q4	339565.0	5723	0.3666667	2638523	2369.8868	12585.831	147805.7
2022 Q1	340572.7	6447	0.3666667	2648667	2678.7059	12696.575	151299.7
2022 Q2	347849.0	8245	0.3666667	2658811	2799.3712	12905.391	160827.7
2022 Q3	370965.3	7681	1.3666667	2668956	2653.8470	12979.609	170403.7

The table below gives the results of an OLS (ordinary least squares) regression, a simple/multiple linear regression. This is the method used to estimate the coefficients (the Greek letter beta from the equation in Section 3.1), which describe the relationship between the explanatory variables and the dependent variable.

**Base Specification - Results**

Dependent variable: price

Variable	Coefficient	(Std. Error)
COMP	6.039*	(3.542)
MPI	6,720.093***	(2,042.728)
POP	0.096	(0.258)
SAVC	-8.919*	(4.714)
DISPC	21.301**	(9.959)
UNEMP	-0.040	(0.070)
Constant	-196,485.000	(587,797.800)

Observations	47
R²	0.960
Adjusted R²	0.954
Residual Std. Error	10,225.460 (df = 40)
F Statistic	161.551*** (df = 6; 40)

Note:	*p<0.1; **p<0.05; ***p<0.01
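As a rough cross-check on the summary rows of the Base Specification table, the adjusted R² and F statistic can be recomputed from the reported R², the number of observations and the number of regressors. This is a sketch using the standard textbook formulas and the rounded R² of 0.960, so the F statistic will only approximately match the reported 161.551.

```python
def adjusted_r2(r2, n_obs, n_regressors):
    """Adjusted R-squared, penalising extra explanatory variables."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_regressors - 1)


def f_statistic(r2, n_obs, n_regressors):
    """Overall F statistic for the joint significance of all regressors."""
    residual_df = n_obs - n_regressors - 1
    return (r2 / n_regressors) / ((1 - r2) / residual_df)


# Values read from the Base Specification table.
n_obs, n_regressors, r2 = 47, 6, 0.960

adj = adjusted_r2(r2, n_obs, n_regressors)
f_stat = f_statistic(r2, n_obs, n_regressors)
print(f"adjusted R2 = {adj:.3f}, F = {f_stat:.1f} on ({n_regressors}; {n_obs - n_regressors - 1}) df")
```

The degrees of freedom (6; 40) agree with the table, and the small gap in the F statistic comes from using R² rounded to three decimals.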

When interpreting regression coefficients, the results show the change in the dependent variable given a unit change in the explanatory variable, all else equal. The asterisks indicate statistical significance. p<0.05 is the level typically considered significant (95% confidence) in the social sciences, with lower p-values indicating significance at a higher confidence level.
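To illustrate the "unit change, all else equal" reading, the sketch below plugs the 2011 Q1 row of Table 3.1 into the Base Specification coefficients and then adds one completion. The coefficients are the rounded values from the results table, so the fitted value is only approximate (the POP coefficient in particular is rounded to three decimals but multiplied by a figure in the millions). The function and variable names are my own.

```python
# Rounded coefficients as printed in the Base Specification table.
BASE_INTERCEPT = -196_485.0
BASE_COEFFICIENTS = {
    "COMP": 6.039,
    "MPI": 6720.093,
    "POP": 0.096,
    "SAVC": -8.919,
    "DISPC": 21.301,
    "UNEMP": -0.040,
}


def predict(row, coefficients=BASE_COEFFICIENTS, intercept=BASE_INTERCEPT):
    """Fitted price: intercept plus each coefficient times its variable."""
    return intercept + sum(beta * row[name] for name, beta in coefficients.items())


# 2011 Q1 row from Table 3.1.
row_2011_q1 = {
    "COMP": 2205, "MPI": 2.2333333, "POP": 2473920,
    "SAVC": 875.5337, "DISPC": 8713.299, "UNEMP": 360836.3,
}

fitted = predict(row_2011_q1)

# One extra completion, everything else held fixed.
one_more_completion = dict(row_2011_q1, COMP=row_2011_q1["COMP"] + 1)
effect = predict(one_more_completion) - fitted

print(f"fitted price: {fitted:,.0f}; effect of one more completion: {effect:.3f}")
```

The difference between the two predictions is simply the COMP coefficient, which is exactly what "a unit change, all else equal" means.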

3.4.1.1 Tests

studentized Breusch-Pagan test

data: BASE
BP = 7.0743, df = 6, p-value = 0.314

From the above there is a clear multicollinearity problem, meaning that several of the independent variables in the model are correlated with each other. As a result disposable income (DISPC) and population (POP), which are above the threshold, will be dropped. See Figure 3.18: disposable income and population are highly correlated with each other and with unemployment.
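The threshold and the statistic behind it are not stated here. A common diagnostic is the variance inflation factor (VIF), and the sketch below assumes that is what was used. The VIF for a variable is derived from the R² of an auxiliary regression of that variable on the other explanatory variables; cut-offs of 5 or 10 are common rules of thumb, not values taken from this document.

```python
def vif_from_r2(auxiliary_r2):
    """Variance inflation factor given the auxiliary-regression R-squared."""
    if not 0 <= auxiliary_r2 < 1:
        raise ValueError("auxiliary R-squared must be in [0, 1)")
    return 1 / (1 - auxiliary_r2)


# Illustrative auxiliary R-squared values (hypothetical, not model output).
for aux_r2 in (0.50, 0.80, 0.90, 0.95):
    print(f"auxiliary R2 = {aux_r2:.2f} -> VIF = {vif_from_r2(aux_r2):.1f}")
```

The point of the formula is that the VIF grows quickly as a regressor becomes better explained by the others, which is why highly inter-correlated variables such as DISPC and POP stand out.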

Inspecting the plot of residuals, it is not obvious whether there is a heteroscedasticity issue. Heteroscedasticity occurs when the variance of the residuals is not constant across observations, for example when it changes over time.

The interpretation of the Breusch-Pagan test for heteroscedasticity is that due to the p-value being insignificant (i.e., >0.05), we do not reject the null hypothesis of homoscedasticity. This double negative can be confusing, but in simple terms the test could be said to have passed.
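For readers who want to see where the p-value comes from, the sketch below recomputes it from the test statistic. Under the null of homoscedasticity the BP statistic is compared against a chi-square distribution with the stated degrees of freedom. The closed form used here only works for even degrees of freedom, which covers both tests reported in this section (df = 6 for BASE and df = 4 for SPEC2). The function name is my own.

```python
import math


def chi2_upper_tail_even_df(statistic, df):
    """P(X > statistic) for a chi-square variable with an even number of df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = statistic / 2
    # Poisson-sum identity: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
    partial_sum = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * partial_sum


p_base = chi2_upper_tail_even_df(7.0743, 6)
p_spec2 = chi2_upper_tail_even_df(4.3213, 4)

for name, p in (("BASE", p_base), ("SPEC2", p_spec2)):
    verdict = "do not reject homoscedasticity" if p > 0.05 else "reject homoscedasticity"
    print(f"{name}: p = {p:.4f} -> {verdict}")
```

Both values land on the reported p-values to the precision shown in the test output, and both are above 0.05.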

**Specification 2 - Results**

Dependent variable: price

Variable	Coefficient	(Std. Error)
COMP	13.455***	(2.387)
MPI	5,674.707**	(2,202.194)
SAVC	4.172*	(2.445)
UNEMP	-0.200***	(0.054)
Constant	263,042.700***	(21,501.430)

Observations	47
R²	0.950
Adjusted R²	0.945
Residual Std. Error	11,237.460 (df = 42)
F Statistic	198.425*** (df = 4; 42)

Note:	*p<0.1; **p<0.05; ***p<0.01

studentized Breusch-Pagan test

data: SPEC2
BP = 4.3213, df = 4, p-value = 0.3643

As we can see Specification 2 is suitable.

3.4.2 Log Specification

In general, log specifications are used to interpret price elasticities. When both the dependent variable and an explanatory variable are in logs, the coefficient can be interpreted as the approximate percentage change in the dependent variable for a 1% change in the explanatory variable.⁴

**Log Specification 2 - Results**

Dependent variable: log(price)

Variable	Coefficient	(Std. Error)
log(COMP)	0.209***	(0.039)
MPI	0.012	(0.009)
log(SAVC)	0.027	(0.017)
log(UNEMP)	-0.095	(0.076)
Constant	11.788***	(1.268)

Observations	47
R²	0.956
Adjusted R²	0.952
Residual Std. Error	0.040 (df = 42)
F Statistic	227.508*** (df = 4; 42)

Note:	*p<0.1; **p<0.05; ***p<0.01
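As a sketch of how the log coefficients translate into percentages, the snippet below takes two point estimates from the Log Specification 2 table: log(COMP), where both sides are in logs, and MPI, which enters in levels. The function names are my own, and the figures ignore the standard errors (the MPI coefficient is not statistically significant in this table).

```python
import math


def pct_effect_log_log(beta, pct_change_in_x):
    """Exact % change in y for a given % change in x when both are logged."""
    return ((1 + pct_change_in_x / 100) ** beta - 1) * 100


def pct_effect_log_level(beta, unit_change_in_x=1.0):
    """Exact % change in y for a unit change in x when only y is logged."""
    return (math.exp(beta * unit_change_in_x) - 1) * 100


# Point estimates from the Log Specification 2 table.
comp_effect = pct_effect_log_log(0.209, 10)   # 10% more completions
mpi_effect = pct_effect_log_level(0.012)      # one unit increase in MPI

print(f"10% more completions: about {comp_effect:.2f}% higher price")
print(f"one unit increase in MPI: about {mpi_effect:.2f}% higher price")
```

For small changes the exact figures are close to the usual shortcuts (coefficient times the percentage change for log-log, and 100 times the coefficient for log-level).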

3.4.2.1 Tests


As we can see there is a multicollinearity problem. From our earlier assertion (Figure 3.12), completions alone may not be representative of supply. The supply ratio for houses (SRH) may be a better measure, as discussed in Section 1.1. So before removing the log of unemployment from the model I will include SRH.

3.4.3 Specification B

The SRH is the ratio for a given quarter, not the moving average. This preserves observations, given that the completions data begin in 2011. SRH.6 and SRH.12 are the 6 and 12 month moving averages, which necessarily drop some observations.
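The loss of observations can be illustrated with a trailing moving average. This sketch assumes the 6 and 12 month averages correspond to windows of 2 and 4 quarters on a quarterly series; the placeholder ratio values are made up and only the series length (47 quarters) matters.

```python
def trailing_moving_average(values, window):
    """Mean of each observation and the (window - 1) observations before it."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[end - window + 1 : end + 1]) / window
        for end in range(window - 1, len(values))
    ]


# Placeholder quarterly supply ratio with 47 observations (made-up values).
srh = [0.50 + 0.01 * quarter for quarter in range(47)]

srh_6 = trailing_moving_average(srh, window=2)    # assumed: 6 months = 2 quarters
srh_12 = trailing_moving_average(srh, window=4)   # assumed: 12 months = 4 quarters

print(len(srh), len(srh_6), len(srh_12))
```

Each average needs a full window of history, so a window of w quarters loses the first w - 1 observations.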

**Specification B - Results**

Dependent variable: log(price). Standard errors in parentheses.

Variable	(1)	(2)	(3)
SRH	0.041 (0.051)		
SRH.6		0.035 (0.078)	
SRH.12			-0.082 (0.128)
MPI	0.040*** (0.009)	0.032*** (0.011)	0.014 (0.016)
log(SAVC)	0.015 (0.022)	0.013 (0.025)	0.027 (0.029)
log(UNEMP)	-0.464*** (0.038)	-0.483*** (0.040)	-0.533*** (0.049)
Constant	18.079*** (0.583)	18.333*** (0.602)	18.939*** (0.705)

Observations	47	46	44
R²	0.927	0.929	0.931
Adjusted R²	0.920	0.922	0.924
Residual Std. Error	0.052 (df = 42)	0.052 (df = 41)	0.051 (df = 39)
F Statistic	132.820*** (df = 4; 42)	133.748*** (df = 4; 41)	132.432*** (df = 4; 39)

Note:	*p<0.1; **p<0.05; ***p<0.01

4 Findings

4.1 Actual vs Fitted

“All models are wrong but some are useful.” - George Box, Statistician.

Using the tslm() function to fit a linear model with time series components produces the fitted values.

The variation from fitted value can be inferred as prices being above or below “equilibrium”, as defined by the model specification. It does not mean that prices will return to this equilibrium or mean that this fitted value is optimal.

Figure 4.1: The fitted model closely tracks the actual house price value. There has been a divergence in the latest period.

Clear shortfalls of the model:

Data availability
Lack of complexity of credit variable
Not enough consideration for lagged effects

4.2 Variation

In the latest period, on the preferred specification, actual house prices are over-valued by 13%.

Expressing the difference between the actual and fitted values as a percentage of the fitted values shows that the actual price varies between being ‘under’ and ‘over’ valued across the sample period. This is what we would expect; however, whether this is representative of the market dynamic remains to be seen.

Figure 4.2: Difference between actual and fitted values as a percentage of the fitted values
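The percentage calculation behind this section can be sketched as follows. The actual price is the 2022 Q3 value from Table 3.1. The fitted value is not reported in this document, so the one used below is back-solved from the rounded 13% figure and should be read as an approximation, not as model output.

```python
def pct_deviation(actual, fitted):
    """Gap between actual and fitted values as a percentage of the fitted value."""
    return (actual - fitted) / fitted * 100


actual_2022_q3 = 370_965.3                # from Table 3.1
implied_fitted = actual_2022_q3 / 1.13    # assumption: back-solved from the 13% figure

gap = pct_deviation(actual_2022_q3, implied_fitted)
print(f"implied fitted value of about {implied_fitted:,.0f}; actual is {gap:.1f}% above it")
```

A positive result reads as 'over' valued relative to the model and a negative result as 'under' valued, in the sense defined in Section 4.1.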

5 Going Forward

The very next step is to develop various other specifications and analyse the results over the same time horizon. Using proxies for some of the data series, dating back further than 2012, should give some insight into the responsiveness of house prices to credit conditions in different economic cycles.

¹ A model I was working on which I did not circulate.
² The dependent variable is the key variable of interest in an econometric model. We want to explain the dependent variable given a series of well specified independent (explanatory) variables.
³ The relationship between a series and its lags.
⁴ A log transformation is also commonly applied when dealing with heteroscedasticity.

House Price Fundamentals - Model 2

Cormac Harten
cormac.harten@glenveagh.ie

2023-03-03
