Introduction

This report investigates the indicators that the Central Bank keeps to monitor the economy. These include inflation, the balance of payments, GDP and others. The aim is a solid understanding of how policy makers view the economy from high up in their ivory tower.

Note on Reproducibility

It was not possible to automate downloading the data from the site, for a few reasons. Instead, the data was downloaded manually and placed in the working folder. The site is https://www.centralbank.go.ke/

library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5     v purrr   0.3.4
## v tibble  3.1.6     v dplyr   1.0.7
## v tidyr   1.2.0     v stringr 1.4.0
## v readr   2.1.2     v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

Indicator 1: Inflation

inflation_df = read.csv('Inflation Rates.csv')
# Changing variable names
names(inflation_df) = c('year', 'month', 'annualaverageinflation', 'monthlyinflation')

# Date to character
inflation_df = mutate(inflation_df, year = as.character(year))
# Annual Average
ggplot(inflation_df, aes(year, annualaverageinflation)) +
  geom_boxplot() +
  theme(axis.text.x = element_text(angle = 90))

# Monthly
ggplot(inflation_df, aes(year, monthlyinflation)) +
  geom_boxplot() +
  theme(axis.text.x = element_text(angle = 90))

The analysis suggests that inflation has been trending downward since the early 2000s. Writing in 2022, this sounds counterintuitive, but for now the observation will do.
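
One way to put a number on "trending downward" is to fit a straight line through the annual figures and look at the sign of the slope. The sketch below does this on invented values, not the Central Bank series; the data frame `toy` and its columns are hypothetical.

```r
# Invented annual average inflation figures, for illustration only
toy = data.frame(
  year      = 2003:2010,
  inflation = c(12, 11, 13, 9, 10, 8, 7, 6)
)

# Ordinary least squares fit of inflation on year
fit = lm(inflation ~ year, data = toy)

# A negative slope means inflation falls, on average, from one year to the next
slope = coef(fit)[["year"]]
slope
```

A linear fit is a crude summary and is sensitive to outlying years, so it complements the boxplots rather than replacing them.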

Indicator 2: Interest Rates

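# fileEncoding = 'UTF-8-BOM' strips the byte-order mark that otherwise garbles
# the first column name (it shows up as ï..Year or ï..Date); this assumes the
# files were saved as UTF-8 with a BOM
interest_df = read.csv('Commercial Banks Weighted Average Rates.csv', fileEncoding = 'UTF-8-BOM')
cbk_rate = read.csv("Central Bank Rate .csv", fileEncoding = 'UTF-8-BOM')
# TIDYING DATA
# Year to character
interest_df = mutate(interest_df, Year = as.character(Year))
# Parsing the day-month-year strings into dates
cbk_rate = mutate(cbk_rate, date = lubridate::dmy(Date), Date = NULL)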

We are mostly interested in the banks' lending rate. Interest rates are a monetary policy tool, not a fiscal one: the central bank sets a policy rate, which influences commercial lending rates and, through them, the money supply. We will not go into detail here, but central banks lower rates to expand the money supply and raise them to contract it.

ggplot(interest_df, aes(Year, Lending)) +
  geom_boxplot(outlier.color = 'Blue') +
  theme(axis.text.x = element_text(angle = 90))

## CBK rates
ggplot(cbk_rate, aes(date, Rate)) +
  geom_line(size = 1.5) +
  geom_point(color = "Blue")

At a cursory glance, bank lending rates have generally been trending downward. They were at their highest in the early '90s, and their drop coincides with the introduction of economic liberalization policies. Whether the two are connected remains to be seen.

2012

Interestingly, rates in this year were the highest of the recent past. Let us look at it in more detail.

year_2012 = filter(interest_df, Year == '2012')

# Plotting (geom_col draws bars from the values themselves, so no stat override is needed)
ggplot(year_2012, aes(Month, Lending)) +
  geom_col() +
  ylab('Lending Rate') +
  theme(axis.text.x = element_text(angle = 90))

All we can infer for now is that rates stayed consistently high throughout the year.

Indicator 3: Balance of Payments

The balance of payments records a country's inflows and outflows. A country with a negative balance is running a deficit, and one with a positive balance is running a surplus. Let us investigate our data.
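
As a minimal sketch of the sign convention for the trade component, the balance is exports minus imports. The figures and the data frame `trade` below are made up for illustration and are not taken from the Central Bank data.

```r
# Hypothetical figures (Ksh Million), for illustration only
trade = data.frame(
  year    = c("2019", "2020"),
  exports = c(500, 650),
  imports = c(800, 600)
)

# Balance = exports - imports; negative means deficit, positive means surplus
trade$balance = trade$exports - trade$imports
trade$status  = ifelse(trade$balance < 0, "deficit", "surplus")

print(trade)
```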

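# fileEncoding = 'UTF-8-BOM' keeps the first column name as Year instead of ï..Year
# (assuming the file was saved as UTF-8 with a byte-order mark)
bop = read.csv("Foreign Trade Summary.csv", fileEncoding = 'UTF-8-BOM')

# Tidying up
bop = mutate(bop, Year = as.character(Year))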

# Removing commas manually since all loops have failed
bop =
mutate(bop, Commercial.Imports = str_replace_all(Commercial.Imports, ',', ''),
       Government.Imports = str_replace_all(Government.Imports, ',', ''),
       Total = str_replace_all(Total, ',', ''),
       Domestic.FOB = str_replace_all(Domestic.FOB, ',', ''),
       Re.Exports = str_replace_all(Re.Exports, ',', ''),
       Total.FOB = str_replace_all(Total.FOB, ',', ''),
       Trade.Balance = str_replace_all(Trade.Balance, ',', '')
       )

# Converting to numeric
bop = mutate(bop, Commercial.Imports = as.numeric(Commercial.Imports),
             Government.Imports = as.numeric(Government.Imports),
             Total = as.numeric(Total),
             Domestic.FOB = as.numeric(Domestic.FOB),
             Re.Exports = as.numeric(Re.Exports),
             Total.FOB = as.numeric(Total.FOB),
             Trade.Balance = as.numeric(Trade.Balance)
             )
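
The same clean-up can be written without repeating each column name. The sketch below uses `across()` from dplyr (available from version 1.0.0) on a small made-up data frame; the column names mirror some of those above, but `toy` and its values are invented.

```r
library(dplyr)
library(stringr)

# Toy data standing in for the trade summary (values are invented)
toy = data.frame(
  Year          = c("2020", "2021"),
  Total         = c("1,234", "2,345,678"),
  Trade.Balance = c("-1,000", "-2,500"),
  stringsAsFactors = FALSE
)

# Strip thousands separators and convert every column except Year to numeric
toy_clean = mutate(toy, across(-Year, ~ as.numeric(str_remove_all(.x, ","))))

str(toy_clean)
```

Selecting with `-Year` means any new figure column is cleaned automatically, at the cost of assuming that every column other than `Year` holds numbers.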

Trade Balances

This will reflect a deficit or a surplus. It is almost guaranteed to be in the negative given the position the country occupies globally.

ggplot(bop, aes(Year, Trade.Balance)) +
  geom_boxplot(outlier.colour = "Red") +
  ylab("Trade Balance (Ksh Million)") +
  theme(axis.text.x = element_text(angle = 90))

The trade deficit has ballooned since the early 1990s, with most of the increase coming in the last decade. It is interesting to consider what may have caused this drastic change, but that requires more information. How does this compare to other countries?

Imports

ggplot(bop, aes(Year, Total)) +
  geom_boxplot(outlier.colour = "Orange") +
  ylab("Total Imports (Ksh Million)") +
  theme(axis.text.x = element_text(angle = 90))

Imports have increased drastically over the past two decades. We will explore later what has accounted for this large increase and the composition, by goods, of said imports.

Tanzania: A Reference Country

bop_tz = read.csv("./Balance of Payments in USD.csv", skip = 2)

# Removing null columns
bop_tz = select(bop_tz, !(X.1: X.15) & !X)

# Cleaning up variable names
var_names = names(bop_tz) %>% str_replace_all("\\.", "/") %>%
            str_remove_all("X")
names(bop_tz) = var_names

# Cleaning up column values
bop_tz = apply(bop_tz, 2,function(x) {str_remove_all(x, ",")}) %>%
            as.data.frame

# Reshaping to long format: one row per item and year
bop_tz_long = pivot_longer(bop_tz, cols = names(bop_tz)[-1], names_to = "Year", values_to = "value") %>%
  mutate(value = as.numeric(value))

Trade Balance

The clean-up has not gone entirely according to plan, but the data is still usable for plotting.

# Keeping the overall balance and the financing gap
bop_tz_total = filter(bop_tz_long, Items %in% c("OVERALL BALANCE (Total Groups A through D)", "Financing gap"))
# Plotting, with the two items distinguished by fill colour
ggplot(bop_tz_total, aes(Year, value, fill = Items)) +
  geom_col() +
  ylab("Trade Balance") +
  theme(axis.text.x = element_text(angle = 90))
## Warning: Removed 21 rows containing missing values (position_stack).

While not as successful as hoped, our plot shows that Tanzania's balance of payments fluctuates from year to year. In comparison, Kenya's has been on an almost exponential increase since the '90s. That may point to differences in international policy between the two countries, though this is only an assumption.

Imports

We now compare imports and exports.

# Pattern matching for the import and export items
import = bop_tz$Items[str_detect(bop_tz$Items, "import")]
export = bop_tz$Items[str_detect(bop_tz$Items, "exports")]

# %in% keeps every matching row; == would recycle the vector and silently drop rows
tz_import = filter(bop_tz_long, Items %in% c(import, export))

# Plotting
ggplot(tz_import, aes(Year, value, fill = Items)) +
  geom_col() +
  ylab("Imports/ Exports") +
  theme(axis.text.x = element_text(angle = 90))

It is unclear whether the negative sign on imports is a convention or whether it implies actual negative imports, that is, exports. It is more likely conventional, since imports are recorded as negative entries in the accounts, so we shall take the absolute value as total imports. That said, imports are significantly larger than exports, which is not particularly surprising. Unlike Kenya's, however, they have not increased quite as rapidly. That is a crucial difference.
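
Taking the absolute value can be sketched as follows. The item names and figures are invented, and `toy_long` only imitates the shape of `bop_tz_long`; the point is to compare magnitudes once the sign is treated as a bookkeeping convention.

```r
library(dplyr)

# Toy long-format data; item names and values are invented for illustration
toy_long = data.frame(
  Items = c("Goods: imports", "Goods: exports", "Goods: imports", "Goods: exports"),
  Year  = c("2019", "2019", "2020", "2020"),
  value = c(-900, 500, -950, 520)
)

# Drop the sign so imports and exports can be compared as magnitudes
toy_abs = mutate(toy_long, magnitude = abs(value))

# Ratio of imports to exports per year
ratio = toy_abs %>%
  group_by(Year) %>%
  summarise(import_to_export = magnitude[Items == "Goods: imports"] /
                               magnitude[Items == "Goods: exports"])
print(ratio)
```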

Indicator 4: Exchange Rates

# Downloading Data
ex_url = "https://www.centralbank.go.ke/uploads/exchange_rates/1859585626_Monthly%20Exchange%20rate%20(period%20average).csv"
ex_filepath = "./Exchange Rates Period Average"

if (!file.exists(ex_filepath)) {
  download.file(ex_url, ex_filepath)
}

# Reading in
ex_rate = read.csv(ex_filepath, skip = 1)
# Plotting
ggplot(ex_rate, aes(as.character(Year), United.States.dollar)) +
  geom_boxplot(outlier.colour = "Purple") +
  xlab("Year") +
  theme(axis.text.x = element_text(angle = 90))

It is worth zooming into 2022 to see how the exchange rate has fared.

ex_2022 = filter(ex_rate, Year == 2022) %>% mutate(Month = as.character(Month))

# Plotting
ggplot(ex_2022, aes(Month, United.States.dollar)) +
  geom_col()

Somewhat surprisingly, the shilling has not weakened as much as the events of this year would lead one to expect. Since exchange rates are largely public figures, this subset of the data can be treated with a good deal of confidence, which may not be the case for the other indicators.

Indicator 5: Public Debt

url_debt = "https://www.centralbank.go.ke/uploads/government_finance_statistics/2071018851_Public%20Debt.csv"
filepath_debt = "./Public Debt.csv"

if(!file.exists(filepath_debt)) {
  download.file(url_debt, filepath_debt)
}

# Reading in
public_debt = read.csv(filepath_debt, skip = 3)
# Removing commas from figures
public_debt = apply(public_debt, 2, function(x){str_remove_all(x, ",")}) %>%
                  as.data.frame()


# Summarizing: yearly mean of each debt series, scaled down by 100000
debt_ave = group_by(public_debt, Year) %>%
  summarise(Domestic.Debt = mean(as.numeric(Domestic.Debt)) / 100000,
            External.Debt = mean(as.numeric(External.Debt)) / 100000,
            Total = mean(as.numeric(Total)) / 100000
            )
# Reshaping to long format so the debt type can be mapped to fill
debt_long = pivot_longer(debt_ave, cols = names(debt_ave)[-1], names_to = "Type", values_to = "Value")
ggplot(debt_long, aes(Year, Value, fill = Type)) +
  geom_col(position = "dodge") +
  ylab("Billion $") +
  theme(axis.text.x = element_text(angle = 90))

National debt has grown steeply in the past ten years. The figure is now more than four times higher than it was then. The government has come to rely increasingly on extensive borrowing, not only to fund development but also to meet the shortfall in the balance of payments. The deficit, as we have seen, has also been growing rapidly, and it is covered through borrowing.
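
To put "four times in ten years" in perspective, the implied average annual growth rate follows from the compound growth formula. This is back-of-the-envelope arithmetic on the round numbers in the text, not a figure computed from the debt data.

```r
# Growth factor and horizon taken from the statement above
growth_factor = 4
years = 10

# Compound annual growth rate: factor^(1/years) - 1
cagr = growth_factor^(1 / years) - 1

round(cagr * 100, 1)  # about 14.9 percent per year
```

A quantity that quadruples in a decade is therefore growing at a little under 15 percent a year on average.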

Indicator 6: Expenditure and Revenue

Now to the final measure we will consider in this report. This one is especially important in the face of rising taxes and calls for lowering government expenditure.

# Downloading
rev_url = "https://www.centralbank.go.ke/uploads/government_finance_statistics/2121218719_Revenue%20and%20Expenditure.csv"
rev_filepath = "./Government Revenue and Expenditure.csv"

if(!file.exists(rev_filepath)) {
  download.file(rev_url, rev_filepath)
}
# Reading in
rev_exp = read.csv(rev_filepath, skip = 5)

# Modifying Column names
## paste is for combining elements of vectors into one, sapply does this for all columns
nms = sapply(names(rev_exp), function(x){ paste(x, rev_exp[1,x])})

## Renaming some columns
nms[1] = "Year"
nms[2] = "Month"

## Cleaning up names
nms = str_to_lower(nms) %>% str_remove_all("\\.") %>% str_trim()

## Renaming dataframe variables
names(rev_exp) = nms

## Removing the rows where year is NA
rev_exp = filter(rev_exp, !is.na(year))

## Removing commas in figures
rev_exp2 = apply(rev_exp[-(1:2)], 2, function(x) {
    str_remove_all(x, ",") %>% as.numeric / 100000
}) %>% as.data.frame

rev_exp = cbind(rev_exp[1:2], rev_exp2)

Note

Some columns consist entirely of missing values. These columns include other tax income and county transfers. Perhaps the values are missing in the CSV file itself, or the gaps result from the way the data is read in. We continue regardless and may deal with this in future.
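
A quick way to list the affected columns is to count the non-missing values in each one. The sketch below runs on a made-up data frame; `toy` and its column names are hypothetical stand-ins for `rev_exp`.

```r
# Toy data frame with one column that is entirely NA (names are invented)
toy = data.frame(
  year     = c(2020, 2021, 2022),
  revenue  = c(1.2, NA, 1.5),
  othertax = c(NA, NA, NA)
)

# Columns in which no value is present at all
all_na = names(toy)[colSums(!is.na(toy)) == 0]
all_na

# Share of missing values per column, to spot partially empty columns too
round(colMeans(is.na(toy)), 2)
```

Running the same check on the raw strings, before the numeric conversion, would show whether the gaps come from the file or from the conversion step.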

Plotting

This dataset is quite rich in information. We need to choose what variables interest us. Let us begin with the total revenue as compared with total expenditure. We shall then dive deeper.

ggplot(rev_exp, aes(as.character(year), revenue1)) +
  geom_boxplot() +
  labs(title = "Total Revenue") + 
  ylab("Billion $") +
  xlab("Year") +
  theme(axis.text.x = element_text(angle = 90))

ggplot(rev_exp, aes(as.character(year), `total1 expenditure`)) +
  geom_boxplot() +
  labs(title = "Total Expenditure") +
  ylab("Billion $") +
  xlab("Year") +
  theme(axis.text.x = element_text(angle = 90))

Expenditure outstrips revenue, as would be expected, but the two seem to be growing in perfect lockstep. It is as though the government is always trying to increase revenue collection to meet ever-rising expenditure. Does rising expenditure drive rising revenue collection? That is only a hypothesis and will need to be tested.
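
A first, rough look at this hypothesis could compare year-on-year changes rather than levels, since two series that both trend upward are correlated almost by construction. The sketch below uses invented annual totals; `toy` is hypothetical, and a correlation in either form says nothing about which series drives the other.

```r
# Invented annual totals, for illustration only
toy = data.frame(
  year        = 2015:2021,
  revenue     = c(10, 11, 13, 14, 16, 17, 20),
  expenditure = c(12, 14, 15, 17, 19, 22, 24)
)

# Year-on-year changes remove the shared upward trend
d_rev = diff(toy$revenue)
d_exp = diff(toy$expenditure)

cor(toy$revenue, toy$expenditure)  # correlation of levels
cor(d_rev, d_exp)                  # correlation of changes
```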

Now we zoom in to see the revenue collection per month.

rev_exp_month = filter(rev_exp, year == 2021)

ggplot(rev_exp_month, aes(month, revenue1)) +
  geom_col() +
  labs(title = "Total Revenue in 2021") +
  ylab("Billion $") +
  xlab("Month") +
  theme(axis.text.x = element_text(angle = 90))

ggplot(rev_exp_month, aes(month, `total1 expenditure`)) +
  geom_col() +
  labs(title = "Total Expenditure in 2021") +
  ylab("Billion $") +
  xlab("Month") +
  theme(axis.text.x = element_text(angle = 90))

A very strange pattern has emerged. When we zoom into any year and look at the revenue collections for each month, they have exactly the same distribution. That is implausible: it is too regular to be a record of real revenue collection. Either that, or a mistake has been made in cleaning up the data.
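
The claim that every year has the same monthly shape can be checked numerically by comparing each month's share of its year's total across years. The sketch below does so on invented figures in which the second year is exactly 1.5 times the first; `toy` is hypothetical and only imitates the layout of `rev_exp`.

```r
library(dplyr)

# Invented monthly figures for two years; 2021 is exactly 1.5 times 2020
toy = data.frame(
  year    = rep(c(2020, 2021), each = 4),
  month   = rep(c("Jan", "Feb", "Mar", "Apr"), times = 2),
  revenue = c(10, 20, 30, 40, 15, 30, 45, 60)
)

# Each month's share of its year's total
shares = toy %>%
  group_by(year) %>%
  mutate(share = revenue / sum(revenue)) %>%
  ungroup()

# Spread of the shares across years for each month; zero means an identical shape
spread = shares %>%
  group_by(month) %>%
  summarise(range = max(share) - min(share))

all(spread$range < 1e-12)
```

On real collections the spread would be expected to be clearly above zero for at least some months.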

Final Remarks

The regularity of our results seems to cast doubt on whether these statistics represent the real economy. With actual collected data we expect noise, that is, deviations from a trend caused by the interference of other variables. The regularity observed, especially in the revenue dataset, calls into question the very validity of the data.

For now, this concludes our report. It will be necessary to go into the details of what we have investigated here. But for that we will require other sources of information to corroborate these results.