Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original

Figure 1 shows the original graph from a Bloomberg news article published on 22 July 2021 by Rita Nazareth and Claire Ballentine.

In the graph, the title says:

“S&P 500 regains record relative to non-U.S. stock index after five-month slump.”


Figure 1 Source: Bloomberg Article “Tech Stocks Lead S&P (2021)”


The graph covers a period of 26 years, from 1995 to 2021. The five months mentioned in the title are hardly visible: they occupy less than 2% of the graph's time axis.
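The "less than 2%" figure can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the chart spans 26 full years and that the slump lasted exactly five months; the variable names are mine, not Bloomberg's.

```r
# Share of the chart's time axis taken up by the five-month slump.
# Assumption: the chart spans 26 full years (1995 to 2021), as stated above.
years_shown   <- 2021 - 1995            # 26 years
months_shown  <- years_shown * 12       # 312 months
slump_months  <- 5                      # length of the slump named in the title

share_percent <- 100 * slump_months / months_shown
round(share_percent, 1)                 # about 1.6 (percent of the time axis)
```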

What were the authors thinking when they wrote the title? Were they trying to cover something up by not clearly showing the most relevant part of the graph?

Objective

  • The original data visualisation and the accompanying news article are intended to inform readers that the US stock market has regained its record level relative to non-US stock markets

  • The target audience is people who are interested in, and may be exposed to, share markets anywhere in the world

The visualisation chosen had the following three main issues:

  • The most important part of the graph is missing

This is a failure to answer a practical question. When a reputable media outlet publishes an important piece of news, it should provide the relevant material to support the story.

  • The missing part of the graph and the story mislead global readers

This is an ethical issue and involves a deceptive method. The graph and the story should convey the correct message. Problems arise when a ratio is used as a tool for comparison, because a ratio hides differences in scale. For example, Investor A invested $1 and received $2 after a year, earning a 100% annual return. Investor B invested $100 and received $150 after a year, earning a 50% annual return. Who performed better? The popular answer is A. However, finance textbooks tell us that the correct answer is B, because this investor earned more money ($50 vs $1). The popular answer is not always the right one.
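The two-investor example can be reproduced in a few lines. This is only an illustrative sketch: the data frame and its column names are hypothetical, and the figures are the ones quoted in the paragraph above.

```r
# Compare the two investors by percentage return and by dollars gained.
# All names here (investors, gain_dollars, return_pct) are illustrative.
investors <- data.frame(
  investor = c("A", "B"),
  invested = c(1, 100),    # dollars put in
  received = c(2, 150)     # dollars received after one year
)

investors$gain_dollars <- investors$received - investors$invested
investors$return_pct   <- 100 * investors$gain_dollars / investors$invested

# Ranking by ratio (percentage return) favours A ...
best_by_return <- investors$investor[which.max(investors$return_pct)]
# ... while ranking by wealth created (dollar gain) favours B.
best_by_gain   <- investors$investor[which.max(investors$gain_dollars)]

print(investors)
```

The two rankings disagree, which is the scale problem described above: the ratio says nothing about how much money was actually at stake.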

  • The story has not been reported correctly and responsibly

This is a perceptual issue. A ratio can look exaggerated when the two quantities differ greatly in scale. For the story to be reported correctly, a non-ratio method should be used. Finance professionals are taught to use NPV, EVA and similar methods to determine whether a project will increase shareholder wealth before it is accepted for investment. However, for the purpose of this exercise, I have used linear regression (LR) to investigate the relationship between the stock markets in the US and the rest of the world over one year, rather than focusing on only five months. Although LR does not address the wealth issue, it conveys a more accurate message to readers than the ratio method.

Reference

Code

The following code was used to fix the issues identified in the original.

library(ggplot2)

library(readxl)

Assessment2 <- read_excel("SPXUS.xlsx", sheet = 1)

str(Assessment2)
## tibble [253 x 5] (S3: tbl_df/tbl/data.frame)
##  $ Date  : POSIXct[1:253], format: "2020-07-23" "2020-07-24" ...
##  $ SP500 : num [1:253] 3236 3216 3239 3218 3258 ...
##  $ BMI   : num [1:253] 271 268 271 270 272 ...
##  $ BMIxUS: num [1:253] 200 198 200 200 201 ...
##  $ RatioX: num [1:253] 16.2 16.2 16.2 16.1 16.2 ...
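# Figure 2: line chart of the S&P 500 / Ex-U.S. BMI ratio over the year
# (ggplot() is used in place of qplot(), which is deprecated in ggplot2 3.4.0 and later)
ggplot(data = Assessment2, aes(x = Date, y = RatioX)) +
  geom_line()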

# Figure 3: scatter plot of the two indices with a fitted least-squares line
p <- ggplot(data = Assessment2,
            aes(x = SP500,
                y = BMIxUS))

p + geom_point() + geom_smooth(method = "lm")

# Re-read the same workbook for the regression (readxl is already loaded above)
SPXUS <- read_excel("SPXUS.xlsx", sheet = 1)

# Pearson correlation between the two indices
cor(SPXUS$BMIxUS, SPXUS$SP500)
## [1] 0.9540858
# Simple linear regression of the Ex-U.S. BMI on the S&P 500
SP500.BMIxUS.lm <- lm(BMIxUS ~ SP500, data = SPXUS)
summary(SP500.BMIxUS.lm)
## 
## Call:
## lm(formula = BMIxUS ~ SP500, data = SPXUS)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -16.0076  -4.3592  -0.6145   4.6776  13.7259 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 13.544440   4.338422   3.122  0.00201 ** 
## SP500        0.057401   0.001137  50.464  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.204 on 251 degrees of freedom
## Multiple R-squared:  0.9103, Adjusted R-squared:  0.9099 
## F-statistic:  2547 on 1 and 251 DF,  p-value: < 2.2e-16

Data Reference

Reconstruction

1. Ratio Graph

Figure 2 shows one year of price data starting on 23-7-2020. I have defined RatioX as the S&P 500 divided by the S&P Global Ex-U.S. BMI Index. This is the same ratio plotted by Bloomberg, but zoomed in on the last 12 months.
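As a quick check on this definition, the ratio can be recomputed from the first five rows shown in the str() output on the Code tab. This sketch uses those rounded display values rather than the full spreadsheet, so it only approximates the stored RatioX column; the vector names are mine.

```r
# First five observations as displayed (rounded) by str() on the Code tab.
sp500  <- c(3236, 3216, 3239, 3218, 3258)   # S&P 500 index level
bmixus <- c(200, 198, 200, 200, 201)        # S&P Global Ex-U.S. BMI level

# RatioX as defined above: S&P 500 divided by the Ex-U.S. BMI.
ratio_x <- sp500 / bmixus
round(ratio_x, 1)   # 16.2 16.2 16.2 16.1 16.2, in line with the RatioX column
```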


Figure 2 S&P500/S&P Global Ex-U.S. BMI Index ratio


To the untrained eye, the graph seems to agree with the Bloomberg headline: ‘S&P 500 regains record relative to non-U.S. stock index’. However, this is misleading because it is a perceptual illusion. From trough to peak, the ratio moves from about 15.5 to 17.5, and a number of that size makes it look as if the US market is many times ahead of the rest of the world.

However, the SP500 started in 1957 (Investopedia) and reached 4,411.79 points on 23-7-2021 (WSJ), with a market cap of US$38 trillion on 30-6-2021 (Wikipedia). The S&P Global Ex-U.S. BMI started in 1994 and stood at only 250.78 points on 23-7-2021, and its largest market cap was US$1.87 trillion (S&P Global and Factsheet). Clearly, a comparison based on a ratio suffers from timing and scale problems: there is a 37-year gap between the starting dates and a roughly 20x difference in market cap. In this case, the ratio should not be used as a performance indicator.
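The timing and scale gaps can be worked out directly from the figures quoted above. This is a minimal sketch; the variable names are mine, and the inputs are simply the numbers cited in the paragraph (taken as given, not re-verified).

```r
# Figures quoted in the text above.
sp500_start <- 1957       # launch year of the S&P 500
exus_start  <- 1994       # launch year of the S&P Global Ex-U.S. BMI
sp500_cap   <- 38         # US$ trillion, 30-6-2021
exus_cap    <- 1.87       # US$ trillion, as quoted above
sp500_level <- 4411.79    # index points, 23-7-2021
exus_level  <- 250.78     # index points, 23-7-2021

start_gap_years <- exus_start - sp500_start   # 37 years
cap_ratio       <- sp500_cap / exus_cap       # about 20.3
level_ratio     <- sp500_level / exus_level   # about 17.6

round(c(gap = start_gap_years, cap = cap_ratio, level = level_ratio), 1)
```

The level ratio of about 17.6 is just the quotient of two index levels with different base dates and sizes, which is why its size says little about relative performance.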

2. Linear Regression

Performance can be measured in many ways. I have chosen a linear regression between the two indices over one year; see Figure 3, where a trend line is also shown. A more comprehensive analysis is outside the scope of this web report.


Figure 3 Regression of S&P Global Ex-U.S. BMI vs SP500


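The regression results show that, over the last 12 months, each 1-point change in the SP500 was associated with a change of about 0.057 points in the Ex-U.S. BMI, that is, 5.7% of the SP500 move in index points. With an adjusted R-squared of 0.91 and a p-value close to 0, this relationship is very strong and statistically significant.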
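The interpretation above can be checked against the numbers printed on the Code tab. This sketch hard-codes the slope and the correlation from that output rather than refitting the model, and the 100-point move is a hypothetical chosen only for illustration.

```r
# Values copied from the regression and correlation output on the Code tab.
slope <- 0.057401     # SP500 coefficient
r     <- 0.9540858    # Pearson correlation between BMIxUS and SP500

# Expected change in the Ex-U.S. BMI for a hypothetical 100-point SP500 move.
change_per_100 <- slope * 100
change_per_100        # about 5.74 index points

# In a one-predictor regression, R-squared equals the squared correlation.
r_squared <- r^2
round(r_squared, 4)   # 0.9103, matching the Multiple R-squared reported
```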

The LR method provides a more acceptable measure of the relationship between the two indices, and it does not mislead in the way the ratio graph in Figure 2 does. Note that the LR method is not affected by the large difference in size (index points and market cap) between the two indices, and a more sophisticated statistical analysis can be conducted if necessary.
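The claim that the regression is unaffected by the size of the indices can be illustrated with simulated data. This is a sketch under stated assumptions: the two series below are synthetic, not the SPXUS data, and the rescaling factors are arbitrary. It shows that R-squared is unchanged when either series is rescaled, while the slope simply absorbs the change of units.

```r
# Synthetic data (an assumption for illustration, not the real indices).
set.seed(1)
x <- 100 + cumsum(rnorm(250))             # hypothetical "large" index
y <- 10 + 0.05 * x + rnorm(250, sd = 0.5) # hypothetical "small" index

# Arbitrary changes of scale applied to both series.
x_scaled <- x / 3
y_scaled <- y * 20

fit_raw    <- lm(y ~ x)
fit_scaled <- lm(y_scaled ~ x_scaled)

r2_raw    <- summary(fit_raw)$r.squared
r2_scaled <- summary(fit_scaled)$r.squared
all.equal(r2_raw, r2_scaled)              # TRUE: strength of fit is unchanged

# The slope changes only by the combined unit conversion (20 * 3 = 60).
slope_raw    <- unname(coef(fit_raw)[2])
slope_scaled <- unname(coef(fit_scaled)[2])
all.equal(slope_scaled, 60 * slope_raw)   # TRUE
```

By contrast, the plain ratio x / y changes its level whenever either series is rescaled, which is the weakness of the ratio graph discussed earlier.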