1 Introduction

1.1 The R software

The R software can be downloaded for free from the R Project website (r-project.org). One of the most convenient and efficient interfaces for R, RStudio, is also free to download. RStudio can also be used in the cloud at posit.cloud, which lets you work online even if R is not installed on the computer you are using. R is available for Linux, macOS and Windows.

2 International Monetary Fund data

2.1 The IMF website

The website to refer to for IMF data is https://www.imf.org/external/datamapper/datasets.

The “World Economic Outlook” (WEO) section contains the main economic indicators (GDP, inflation, population, trade balance and public finances).

To access these data it is necessary to install the imfapi package.

The command to install a package is install.packages('NAME'). Alternatively, the Packages tab of the RStudio interface (bottom right) lets you search for a package and install it.
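
As a minimal sketch, the packages mentioned in this document can be checked in one step before installing anything. The variable names pkgs and missing_pkgs are illustrative, and the installation line is left commented out because it needs an internet connection.

```r
# Packages used in this document (names taken from the text)
pkgs <- c("imfapi", "zoo", "quantmod", "fredr")

# requireNamespace() returns TRUE when a package is already installed
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

# Install only what is missing; uncomment to actually run the installation
# if (length(missing_pkgs) > 0) install.packages(missing_pkgs)
print(missing_pkgs)
```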

2.2 Downloading the data

Once the imfapi package has been loaded into memory, you can download the data related to the chosen countries.

Countries are represented by the three-letter code that can be found at this address.

In the case we are interested in we have: Germany DEU, France FRA, Italy ITA and Spain ESP.

library(imfapi)
dati=imf_get(
  dataflow_id = "WEO",
  dimensions = list(FREQUENCY = c("A"),
                    COUNTRY = c('DEU','FRA','ITA','ESP'))
)

To inspect the variable dati you can use the command View(dati).

The downloaded data contain all the historical series available on the IMF website. For us, in particular, the one related to GDP at current prices in (billions of) US dollars is of interest, whose code is NGDPD (the code can be found within the URL address related to this variable).

With the command View(dati) we observe which columns are inside the variable dati. The “INDICATOR” column reports the codes of the historical series (we are interested in NGDPD), while the “COUNTRY” column reports the country codes (we are interested in the four codes seen previously).

Within the variable dati we can select some values using square brackets, inside which we indicate the desired codes.

DEU=dati[dati$INDICATOR=='NGDPD' & dati$COUNTRY=='DEU',]$OBS_VALUE
FRA=dati[dati$INDICATOR=='NGDPD' & dati$COUNTRY=='FRA',]$OBS_VALUE
ITA=dati[dati$INDICATOR=='NGDPD' & dati$COUNTRY=='ITA',]$OBS_VALUE
ESP=dati[dati$INDICATOR=='NGDPD' & dati$COUNTRY=='ESP',]$OBS_VALUE

In the previous code, the symbol “&” indicates that both conditions must be satisfied: that is, we will take only the data that contain, at the same time, the code NGDPD and the code of the country of interest.

The dollar symbol “$” extracts a single column, by name, from a variable made up of multiple columns.

The comma inside the square brackets separates the selection of rows (before the comma) from that of columns (after it). Leaving the position after the comma empty means that all columns of the selected rows are kept.

Among everything that has been selected, only the content of the “OBS_VALUE” column is taken.
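
The selection logic can be tried on a small table without downloading anything. The sketch below uses invented values and a made-up second indicator code ("OTHER"); only the column names follow the code above.

```r
# Toy table with the same column names used above (values are invented)
toy <- data.frame(
  COUNTRY   = c("DEU", "DEU", "ITA", "ITA"),
  INDICATOR = c("NGDPD", "OTHER", "NGDPD", "OTHER"),
  OBS_VALUE = c(100, 80, 50, 60),
  stringsAsFactors = FALSE
)

# Logical vector: TRUE only where both conditions hold
keep <- toy$INDICATOR == "NGDPD" & toy$COUNTRY == "DEU"

# Rows before the comma, columns after it (empty = all columns)
rows <- toy[keep, ]

# Extract one column from the selected rows
deu <- rows$OBS_VALUE
print(deu)
```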

2.3 Creating a time series

To transform data and dates into a time series we use the zoo package (which must be installed beforehand).

Within the package, the zoo command allows you to associate numeric values (as first input) with dates (second input).

library(zoo)

## 
## Attaching package: 'zoo'

## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric

PIL=zoo(cbind(DEU,FRA,ITA,ESP),seq(1980,2025))

The cbind command creates a matrix by binding, side by side, each variable indicated in the arguments of the command.
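
A small sketch with invented numbers shows the two steps separately: cbind builds the matrix and zoo attaches the time index. It assumes the zoo package is installed; the names a, b, m and z are illustrative.

```r
library(zoo)

# Two invented series observed in the same three years
a <- c(10, 12, 15)
b <- c(20, 19, 23)

# cbind() places the vectors side by side: a 3 x 2 matrix
m <- cbind(a, b)

# zoo() attaches the time index (second argument) to the values (first)
z <- zoo(m, seq(2001, 2003))
print(z)
```

Note that the number of dates must match the number of rows of the matrix, which is why the year range has to agree with the length of the downloaded series.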

2.4 Graphical representation

At this point we can graphically represent the time series. The simple plot command returns the following.

plot(PIL)

We thus see that the time series are drawn in separate panels, stacked one above the other.

To represent them all on the same graph you must specify that all the traces must be shown on the same panel with the option screens=1.

plot(PIL,screens=1)

At this point, however, the traces must be differentiated using different colors. The option is col, in which colors can be indicated by their English names.

plot(PIL,screens=1,col=c('black','red','blue','orange'))

To make the traces more visible you can specify a particular line thickness using the option lwd: line width.

plot(PIL,screens=1,col=c('black','red','blue','orange'),lwd=2)

We can also add a grid in the background of the Cartesian plane and insert a legend in the most appropriate position.

plot(PIL,screens=1,col=c('black','red','blue','orange'),lwd=2)
grid()
legend('topleft',legend=c('Germany','France','Italy','Spain'),
col=c('black','red','blue','orange'),lwd=2,bty='n')

The option bty of the legend command indicates the box type. Without this option, the legend appears inside a rectangle. If we want the legend not to be framed, the value n indicates that no border is drawn.

2.5 Normalizing the graphs

To make GDP values comparable with each other, we can make them all start from the same value (by convention equal to 1) and on the same date.

The command we need, in this case, is sweep which allows applying a transformation to the data of a table. We can, for example, define the following command.

PIL1=sweep(PIL,2,PIL[1],'/')

The syntax means: we take the variable PIL and, for each column (dimension 2 of the matrix), we divide ‘/’ the entire column by its first value. The first values of all columns are given by PIL[1], which is the first row of the time series.
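
The effect of sweep is easier to see on a plain matrix with invented values. In this sketch, which is base R only, the first row is written m[1, ] because m is an ordinary matrix rather than a zoo object.

```r
# Invented values: rows are years, columns are countries
m <- cbind(A = c(2, 4, 6), B = c(10, 15, 5))

# Divide every column (margin 2) by its own first value
m1 <- sweep(m, 2, m[1, ], "/")
print(m1)
```

After the transformation the first row contains only ones, and every other row is expressed as a multiple of the starting value of its column.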

We can thus represent the new variable on a graph.

plot(PIL1,screens=1,col=c('black','red','blue','orange'),lwd=2)
grid()
legend('topleft',legend=c('Germany','France','Italy','Spain'),
col=c('black','red','blue','orange'),lwd=2,bty='n')

To appreciate how the relative position of the countries changes by changing the start date of the analysis, we can represent the same type of graph, but starting from 2008.

The window command allows us to select a time window within a time series (indicating a start date and an end date).

PIL2=window(PIL,start=2008)
PIL2=sweep(PIL2,2,PIL2[1],'/')
plot(PIL2,screens=1,col=c('black','red','blue','orange'),lwd=2)
grid()
legend('topleft',legend=c('Germany','France','Italy','Spain'),
col=c('black','red','blue','orange'),lwd=2,bty='n')
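
The behavior of window can be checked on a short invented series. This sketch assumes the zoo package is installed; the names z, z_from and z_mid are illustrative.

```r
library(zoo)

# Invented annual series indexed by year
z <- zoo(c(5, 6, 8, 7, 9), 2006:2010)

# Keep only the observations from 2008 onward
z_from <- window(z, start = 2008)

# Both limits can be given to select a closed interval
z_mid <- window(z, start = 2007, end = 2009)

print(z_from)
print(z_mid)
```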

Finally, we observe how the ranking changes starting from 2020.

PIL3=window(PIL,start=2020)
PIL3=sweep(PIL3,2,PIL3[1],'/')
plot(PIL3,screens=1,col=c('black','red','blue','orange'),lwd=2)
grid()
legend('topleft',legend=c('Germany','France','Italy','Spain'),
col=c('black','red','blue','orange'),lwd=2,bty='n')

3 Two stock market indices compared

3.1 Downloading data from Yahoo

The Yahoo website contains numerous historical series of financial data. For us, two indices in particular are of interest:

S&P 500 whose label in the Yahoo database is ^GSPC
NASDAQ 100 whose label is ^NDX

To download data from Yahoo you need to install the quantmod package. The getSymbols command is then used to connect R with Yahoo’s database and download the data.

library(quantmod)

## Loading required package: xts

## Loading required package: TTR

## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo

getSymbols(c('^GSPC','^NDX'),
           src='yahoo',return.class='zoo',
           from='1985-01-01')

## [1] "GSPC" "NDX"

The option src indicates the data source; this option is necessary because quantmod also allows downloading data from other online databases.

The option return.class allows choosing what type of output we want from the getSymbols command: we want a zoo time series, as already seen in the previous paragraph.

To check what the downloaded variables contain, we use the command View(GSPC), thus observing that we downloaded the opening prices, the highs, the lows, the closing prices, the volumes and the adjusted closes. We are interested only in the adjusted closes, which we select with the following command.

S=merge(GSPC$GSPC.Adjusted,NDX$NDX.Adjusted,
        all=FALSE)

The merge command merges the two variables based on dates. The all option specifies whether we want, as output, all observations, or not. If the option is indicated as FALSE, only the dates for which values are available for all variables will be returned.
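
The difference between the two settings of all can be seen on two invented series whose dates overlap only partly. The sketch assumes the zoo package is installed; x, y, m_all and m_common are illustrative names.

```r
library(zoo)

# Two invented daily series with only partly overlapping dates
x <- zoo(c(1, 2, 3), as.Date(c("2024-01-01", "2024-01-02", "2024-01-03")))
y <- zoo(c(10, 20), as.Date(c("2024-01-02", "2024-01-04")))

# all = TRUE: union of the dates, gaps filled with NA
m_all <- merge(x, y, all = TRUE)

# all = FALSE: only the dates present in both series
m_common <- merge(x, y, all = FALSE)

print(m_all)
print(m_common)
```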

Once the time series S has been created we can plot it as already seen previously. In this case we also specify the label to give to the x-axis (with the option xlab – which we keep empty in the example) and the label for the y-axis (with the option ylab).

plot(S,screens=1,col=c('blue','red'),lwd=2,
     xlab='',ylab='Stock market indices')
grid()
legend('topleft',legend=c('S&P 500','NASDAQ 100'),
       col=c('blue','red'),lwd=2,bty='n')

To highlight important dates within the graph, we define the relevant dates using the following command.

date=as.Date(c('2000-03-10',
               '2008-09-15',
               '2020-03-20'))

The as.Date command transforms a string (or a series of strings) into a date: the string must be defined in the form year-month-day where the year must have 4 digits, while month and day must have 2 digits.
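
As a brief sketch in base R, strings written in a different layout can still be converted by stating the layout explicitly with the format option. The names d, d2 and gap are illustrative.

```r
# ISO layout (year-month-day) is parsed without further options
d <- as.Date("2008-09-15")

# Other layouts need an explicit format string
d2 <- as.Date("15/09/2008", format = "%d/%m/%Y")

# Dates support arithmetic: number of days between two events
gap <- as.numeric(as.Date("2020-03-20") - as.Date("2008-09-15"))

print(d == d2)
print(gap)
```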

We can also define texts to add to the graph.

eventi=c('Dot-com\nbubble',
         'Lehman\nBrothers',
         'Pandemic')

The strings contain the escape sequence \n, which inserts a line break and thus splits the label over two lines.
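
The line break can be inspected directly in the console. In this base R sketch, print shows the string as typed, while cat interprets the break; the names label, n and parts are illustrative.

```r
label <- "Lehman\nBrothers"

# print() shows the escape sequence as typed
print(label)

# cat() interprets it and breaks the line
cat(label, "\n")

# The line break counts as a single character
n <- nchar(label)

# Splitting on the line break recovers the two lines
parts <- strsplit(label, "\n", fixed = TRUE)[[1]]
print(parts)
```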

To add vertical lines to the previous graph, the abline command allows defining straight lines to overlay on the graph. The text command, instead, allows adding text to a graph.

plot(S,screens=1,col=c('blue','red'),lwd=2,
     xlab='',ylab='Stock market indices')
grid()
legend('topleft',legend=c('S&P 500','NASDAQ 100'),
       col=c('blue','red'),lwd=2,bty='n')

abline(v=date,lty=2)
text(date,max(S),labels=eventi,pos=1)

The option v in the abline command indicates “vertical”: therefore, vertical lines are created at the x-values in the variable date and all the lines are dashed (option lty=2).

The text command inserts text at the coordinates given by the first two arguments. We take the x-values from the variable date and, as y-value, the maximum of the prices. The labels are taken from the variable eventi. Finally, the option pos indicates where the text must appear relative to the point indicated by the coordinates: 1 indicates the position below the point, 2 to the left, 3 above and 4 to the right.
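
The four values of pos can be compared on a single point. This is a minimal base R sketch; it draws to a null PDF device so that it also runs without a screen, and the two device lines can be dropped to see the figure interactively.

```r
# Null device: nothing is written to disk
pdf(NULL)

# One reference point in the middle of an empty plot
plot(0, 0, pch = 19, xlim = c(-1, 1), ylim = c(-1, 1), xlab = "", ylab = "")

# The same point labelled in the four positions
positions <- c(below = 1, left = 2, above = 3, right = 4)
for (i in seq_along(positions)) {
  text(0, 0, labels = names(positions)[i], pos = positions[i])
}

invisible(dev.off())
```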

4 10-year interest rates

The quantmod package also allows downloading data from the FRED database managed by the US Federal Reserve. The website is https://fred.stlouisfed.org/.

By searching for the 10-year interest rate for Italy, for example, you obtain the time series with label: IRLTLT01ITM156N. The code IT within the label indicates Italy. We can therefore download different countries by changing only the two central letters.

library(quantmod)
getSymbols(c('IRLTLT01DEM156N',
             'IRLTLT01FRM156N',
             'IRLTLT01ITM156N',
             'IRLTLT01ESM156N',
             'IRLTLT01GRM156N'),
           src='FRED',
           return.class='zoo')

## [1] "IRLTLT01DEM156N" "IRLTLT01FRM156N" "IRLTLT01ITM156N" "IRLTLT01ESM156N"
## [5] "IRLTLT01GRM156N"

This time we downloaded the data from “FRED”, indicating it as the source (src) of the data.
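
Since the labels differ only in the two country letters, the vector of codes can also be built programmatically. This is a sketch in base R; the names countries and codes are illustrative, and the download line is left as a comment because it needs an internet connection.

```r
# Two-letter country codes used in the series labels
countries <- c("DE", "FR", "IT", "ES", "GR")

# Fixed prefix and suffix around the country code
codes <- paste0("IRLTLT01", countries, "M156N")
print(codes)

# The vector can then be passed to getSymbols(), for example:
# getSymbols(codes, src = "FRED", return.class = "zoo")
```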

Now we merge all interest rates into a single matrix.

tassi=merge(IRLTLT01DEM156N,
            IRLTLT01FRM156N,
            IRLTLT01ITM156N,
            IRLTLT01ESM156N,
            IRLTLT01GRM156N)

By plotting the data, we obtain the following result.

plot(tassi,screens=1,
     col=c('black','red','blue','orange','darkgreen'),
     lwd=2,xlab='')
grid()
legend('topleft',legend=c('DE','FR','IT','ES','GR'),
       col=c('black','red','blue','orange','darkgreen'),
       lwd=2,bty='n')

This graph shows the time series even for dates on which not all interest rates are available (in fact, we did not specify the option all=FALSE).

To omit missing data (NA, “not available”) you need to use the na.omit command, which drops every date on which at least one rate is missing.

plot(na.omit(tassi),screens=1,
     col=c('black','red','blue','orange','darkgreen'),
     lwd=2,xlab='')
grid()
legend('topleft',legend=c('DE','FR','IT','ES','GR'),
       col=c('black','red','blue','orange','darkgreen'),
       lwd=2,bty='n')
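
The row-wise behavior of na.omit can be verified on a small matrix with invented rates. The sketch uses base R only; m and m_complete are illustrative names.

```r
# Invented rates for two countries; NA marks a missing observation
m <- cbind(DE = c(NA, 2.0, 2.5, 3.0),
           IT = c(4.0, 4.5, NA, 5.0))

# na.omit() drops every row that contains at least one NA
m_complete <- na.omit(m)
print(m_complete)
```

Only the second and fourth rows survive, which is why the plot of na.omit(tassi) starts at the first date on which all five rates are available.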

4.1 The spread between rates

To calculate the spread between two rates (Italian and German) it is sufficient to extract, from the variable tassi, the columns related to the two interest rates and compute their difference.

spread=tassi$IRLTLT01ITM156N-tassi$IRLTLT01DEM156N
spread=na.omit(spread)
plot(spread,col='blue',lwd=2,
     xlab='',ylab='IT DE spread')
grid()

On the graph you can overlay colored rectangles with the rect command, which takes as input the coordinates of the lower-left corner and the upper-right corner of the rectangle. The value of the dates (x-axis) is extracted from the time series with the time command.

To make colors transparent it is necessary to define them as a combination of red, green and blue with the command rgb(x,y,z,a), where x is the intensity of red, y that of green and z that of blue, each expressed as a number between 0 and 1; finally, a is the opacity (between 0, fully transparent, and 1, fully opaque).

Recalling that yellow is a combination of green and red, we can give the following commands.

plot(spread,col='blue',lwd=2,
     xlab='',ylab='IT DE spread')
grid()

#green rectangle: spread between 0 and 1
rect(min(time(spread)),0,
     max(time(spread)),1,
     col=rgb(0,1,0,0.2),
     border=NA)

#yellow rectangle: spread between 1 and 3
rect(min(time(spread)),1,
     max(time(spread)),3,
     col=rgb(1,1,0,0.2),
     border=NA)

#red rectangle: spread above 3
rect(min(time(spread)),3,
     max(time(spread)),max(spread),
     col=rgb(1,0,0,0.2),
     border=NA)
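
The colors passed to rect can be inspected on their own: rgb returns a hexadecimal string whose last two digits encode the opacity. The following base R sketch uses illustrative variable names.

```r
# rgb() returns a hexadecimal color string; the last two digits are the alpha
green_t  <- rgb(0, 1, 0, 0.2)
yellow_t <- rgb(1, 1, 0, 0.2)
red_t    <- rgb(1, 0, 0, 0.2)
print(c(green_t, yellow_t, red_t))

# With maxColorValue = 255 the same color is written on a 0-255 scale
yellow_255 <- rgb(255, 255, 0, 51, maxColorValue = 255)
print(yellow_255)
```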

5 Inflation and wages in Italy

On the FRED website we can transform the data through the interface of the website itself; for example, we can calculate the percentage changes of each variable compared to its value in the previous period or compared to its value in the previous year. These transformations can also be carried out in R, but using a specific package: fredr.

To use this package you need to register on the FRED website and obtain an access key (all at no cost). In the command that follows, the placeholder must be replaced with your own access key, which should not be shared publicly.

library(fredr)
fredr_set_key('YOUR_FRED_API_KEY')

The command to download the data is fredr which takes as input the variable code, the units of measurement, the frequency and the aggregation method. For wages in Italy the variable is LCWRIN01ITM661S which reports the levels of the wage index. We want to calculate the percentage change compared to the previous year (option units='pc1'), with annual frequency (frequency='a') and calculating the average value for the year (aggregation_method='avg').

X=fredr('LCWRIN01ITM661S',
        units='pc1',
        frequency='a',
        aggregation_method='avg')
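
The transformation requested with units='pc1' is computed by FRED on its own servers. As a sketch of the arithmetic only, and assuming annual data so that "a year ago" means the previous observation, the same percentage change can be reproduced in base R on invented index levels.

```r
# Invented annual index levels (not FRED data)
level <- c(100, 104, 106.08)

# Percent change on the previous year: 100 * (x_t / x_{t-1} - 1)
pct <- 100 * (level[-1] / level[-length(level)] - 1)
print(round(pct, 2))
```

The result has one observation fewer than the input, because the first year has no previous value to compare with.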

To transform this variable into a time series of type zoo we can use the zoo command already seen previously (the zoo package is loaded automatically together with quantmod).

library(quantmod)
w=zoo(X$value,X$date)

We also download from the FRED website the value of the consumer price index for Italy: FPCPITOTLZGITA. This is already in the form of the change compared to the same period of the previous year and, therefore, does not require further processing.

X=fredr('FPCPITOTLZGITA')
inf=zoo(X$value,X$date)

Now we can plot the two variables w and inf by merging them into a single time series with the merge command.

plot(merge(w,inf,all=FALSE),screens=1,
     col=c('blue','red'),lwd=2,
     xlab='',ylab='Pct. change')
grid()
legend('topright',legend=c('Wages','Consumer prices'),
       col=c('blue','red'),lwd=2,bty='n')

To mark Italy’s exit from the European exchange rate mechanism (whose reference unit was the ECU) in 1992 and its re-entry at the end of 1996, ahead of the Euro, we give the following commands.

plot(merge(w,inf,all=FALSE),screens=1,
     col=c('blue','red'),lwd=2,
     xlab='',ylab='Pct. change')
grid()
legend('topright',legend=c('Wages','Consumer prices'),
       col=c('blue','red'),lwd=2,bty='n')

date=as.Date(c('1992-01-01',
               '1996-12-31'))
eventi=c('Exit \nECU','Re-entry \nEuro')

abline(v=date,lty=2)
text(date,25,labels=eventi,pos=1,cex=0.8)

6 Stock market, exchange rate and US rate

From the Yahoo website we download three time series related to the following variables:

S&P 500 index: code ^SPX
exchange rate between Euro and Dollar: code EURUSD=X
interest rate on 10-year US bonds: code ^TNX

We download the data starting from the day before Trump’s announcement on US tariffs.

library(quantmod)
getSymbols(c('^SPX','EURUSD=X','^TNX'),
           src='yahoo',return.class='zoo',
           from='2025-04-01')

## Warning: EURUSD=X contains missing values. Some functions will not work if
## objects contain missing values in the middle of the series. Consider using
## na.omit(), na.approx(), na.fill(), etc to remove or replace them.

## Warning: ^TNX contains missing values. Some functions will not work if objects
## contain missing values in the middle of the series. Consider using na.omit(),
## na.approx(), na.fill(), etc to remove or replace them.

## [1] "SPX"      "EURUSD=X" "TNX"

We create the variables of interest by taking, from the downloaded data, only the adjusted closes.

borsa=SPX$SPX.Adjusted
cambio=`EURUSD=X`$`EURUSD=X.Adjusted`
tasso=TNX$TNX.Adjusted

In the case of the exchange rate we inserted backticks because the name of the variable contains the ‘=’ symbol, which R would otherwise interpret as the assignment operator.

To plot the three variables separately we give the following commands.
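
The role of the backticks can be reproduced without downloading anything. The sketch below, in base R, creates a small table with invented exchange rates under the same non-standard names; rate and clean are illustrative names.

```r
# A name containing "=" is not syntactic, so it must be quoted with backticks
`EURUSD=X` <- data.frame(`EURUSD=X.Adjusted` = c(1.08, 1.10),
                         check.names = FALSE)

# The same quoting is needed to read the object and its column
rate <- `EURUSD=X`$`EURUSD=X.Adjusted`
print(rate)

# make.names() shows what a syntactic version of the name would look like
clean <- make.names("EURUSD=X")
print(clean)
```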

plot(borsa,col='red',lwd=2)

plot(cambio,col='orange',lwd=2)

plot(tasso,col='blue',lwd=2)

To plot them all together on the same graph it is convenient to normalize them to 1 at the start of the period. We therefore proceed to merge the variables into a single one and then calculate its normalization as already seen previously.

dati=merge(borsa,cambio,tasso,all=FALSE)

dati_norm=sweep(dati,2,dati[1],'/')

The graphical representation of the data, at this point, is obtained with the following commands.

plot(dati_norm,screens=1,col=c('red','orange','blue'),
     lwd=2,xlab='',ylab='1/4/2025 = 1')
grid()
legend('topleft',legend=c('S&P 500','x USD = 1 EUR','10-year rate'),
       col=c('red','orange','blue'),lwd=2,bty='n')
abline(h=1,lty=2)

Querying economic data

Francesco Menoncin

2025-10-19