Laden der benötigten Libraries

Die folgenden Libaries werden geladen:

Plotly: Visualisierungslibrary für interaktive Plots und Tabellen
Tidyverse: Sammlung verschiedener Libraries zur Formatierung, Manipulation und Visualisierung von Daten
MC2D: Library zur Erstellung zweidimmensionaler Datasets durch Simulationen, ähnlich der Base R Funktionen rnorm und runif allerdings mit unterschiedlichsten Verteilungen

library(plotly)
library(tidyverse)
library(mc2d)

1. Eingabe der verwendeten Szeneriowerte

Zunächst gibt der Nutzer Schätzungen verschiedener Szenariowerte seiner Kosten und seines Kostenverhaltens in das Tool ein.

1.1 Eingabe von Inflation und Kostenwachstum

Als erstes muss der Nutzer eine Einschätzung der Inflation und des durchschnittlichen Kostenwachstums über die nächsten fünf Jahre treffen.

yearly_inflation = 1
yearly_cost_increase = 2

1.2 Eingabe der Schätzwerte für die verschiedenen Kostengruppen

Eingabe eines minimal, eines maximal und eines real erwarteten Schätzwertes für die jeweiligen Kostengruppen Materialkosten, Personalkosten, andere operative Kosten, Abschreibungen und Zinskosten durch den Nutzer. Desweitern muss der Nutzer für jede Kostengruppe eine der drei Verteilungen PERT Beta, Normal oder Triangular Verteilung wählen, welche das Risiko der jeweiligen Kostengruppen am besten beschreibt.

material_cost_min = 200000 
material_cost_base = 245000 
material_cost_max = 340000
material_cost_distribution = "PERT Beta" #Eingabe "Pert Beta", "Triangular" oder "Normal"

personnel_cost_min = 300000
personnel_cost_base = 340000
personnel_cost_max = 395000
personnel_cost_distribution = "PERT Beta" #Eingabe "Pert Beta", "Triangular" oder "Normal"

otheroperating_cost_min = 150000
otheroperating_cost_base = 160000
otheroperating_cost_max = 185000
otheroperating_cost_distribution = "PERT Beta" #Eingabe "Pert Beta", "Triangular" oder "Normal"

depreciation_amortization_min = 60000
depreciation_amortization_base = 70000
depreciation_amortization_max = 80000
depreciation_amortization_distribution = "PERT Beta" #Eingabe "Pert Beta", "Triangular" oder "Normal"

interests_cost_min = 30000
interests_cost_base = 35000
interests_cost_max = 42000
interests_cost_distribution = "PERT Beta" #Eingabe "Pert Beta", "Triangular" oder "Normal"

2. Simulation der Kostenverteilungen

Im zweiten Schritt werden anhand der vorher durch den Nutzer eingegeben Kostenparameter für die jeweiligen Kostengruppen Datensets simuliert. Pro Kostengruppe werden somit 10.000 Werte mit der festgelegten Verteilungsfunktion simuliert. Zudem wird die Volatilität des Datensets in Prozent errechnet.

if (material_cost_distribution == "PERT Beta") {
  material_cost_data = rpert(10000, min=material_cost_min, mode=material_cost_base, max=material_cost_max, shape=1)
} else if (material_cost_distribution == "Normal") {
  material_cost_data = rnorm(1000, mean=mean(c(material_cost_min, material_cost_base, material_cost_max)), sd=sd(c(material_cost_min, material_cost_base, material_cost_max)))
} else {
  material_cost_data = rtriang(1000, min=material_cost_min, mode=material_cost_base, max=material_cost_max)
  
}

material_cost_vol = sd(material_cost_data) / mean(material_cost_data)

material_cost_hist = plot_ly(x=~material_cost_data, type="histogram",  marker = list(color = 'rgba(39, 43, 48, 0.65)', line = list(color = 'rgb(39, 43, 48)', width = 1))) %>%  layout(title = "Material Costs Distribution",
            yaxis = list(title = "Frequency",
                         zeroline = FALSE),
            xaxis = list(title = "Material Cost [€]",
                         zeroline = FALSE))
material_cost_hist

3. Simulation der Zeitreihen

Im dritten Schritt wird zunächst für jede Kostengruppe ein Startwert der Zeitreihe aus der Verteilung im Bereich zwischen dem \(35^{ten}\) und dem \(65^{ten}\) Perzentil gezogen um Extremwerte als Aufsatzpunkt zu vermeiden (\(T_{1}\)). Darauffolgend werden pro Kostengruppe 400 Simulationen für die restlichen 19 Quartale errechnet. Zur Simulation wird ein Wienerprozess verwendet der mit der folgenden Formel beschrieben werden kann: \[Kosten_{t} = Kosten_{t-1} * (1 + \frac{Drift}{4} + {Random * Volatilität})\]

Die errechnten Werte der Pfade werden im gleichen Zug in einer 400 x 20 Matrix abgelegt. Mit Hilfe der Matrix werden nun verschiedene Szenarien als Vektoren extrahiert. Bei den Szenarien handelt es sich um Best Case Szenario (minimale Zeitreihe), Base Case Szenario (durchschnittliche Zeitreihe), Worst Case Szenario (maximale Zeitreihe) als auch die Zeitreihen des \(10^{ten}\), \(25^{ten}\), \(75^{ten}\) und \(75^{ten}\) Perzentil.

Notation:

\(Drift = jährliche Inflation + jährliches Kostenwachstum\)

\(Random = Zufällige\;Gleikommazahl\;zwischen-0.5\;und\;0.5\)

\(Volatilität = Volatilität\;des\;jweiligen\;Kostendatensets\)

drift = (yearly_cost_increase + yearly_inflation ) / 100

material_cost_t1 = sample((material_cost_data[material_cost_data >= quantile(material_cost_data, 0.35) & material_cost_data <= quantile(material_cost_data, 0.65)]), 1)

quartersCount = 20
numberIterations = 400

material_cost_paths = matrix(material_cost_t1, numberIterations, quartersCount)
material_cost_av = rep(0,quartersCount)
material_cost_avg = rep(0,quartersCount)
material_cost_minv = rep(0,quartersCount)
material_cost_maxv = rep(0,quartersCount)
material_cost_q10 = rep(0,quartersCount)
material_cost_q25 = rep(0,quartersCount)
material_cost_q75 = rep(0,quartersCount)
material_cost_q90 = rep(0,quartersCount)

for (d in 1:quartersCount ) {
  for (i in 1:numberIterations ) {
    if (d>1) 
      material_cost_paths[i,d] = material_cost_paths[i,d-1] * (1 + drift/4 + (runif(1, -0.5, 0.5) * material_cost_vol))
  }
  material_cost_av[d]=median(material_cost_paths[,d])
  material_cost_avg[d]=mean(material_cost_paths[,d])
  material_cost_minv[d]=min(material_cost_paths[,d])
  material_cost_maxv[d]=max(material_cost_paths[,d])
  material_cost_q10[d]=quantile(material_cost_paths[,d], probs=0.1)
  material_cost_q25[d]=quantile(material_cost_paths[,d], probs=0.25)
  material_cost_q75[d]=quantile(material_cost_paths[,d], probs=0.75)
  material_cost_q90[d]=quantile(material_cost_paths[,d], probs=0.90)
}

4. Aggregation der Kostengruppen

Im vierten Schritt werden die Matrizen der einzelnen Kostengruppen zu den Gesamtkosten aggregiert. Außerdem werden für die Gesamtkosten ebenfalls die selben Szenarien wie für die Kostengruppen ermittelt.

total_cost_t1 = material_cost_t1 + personnel_cost_t1 + otheroperating_cost_t1 + depreciation_amortization_t1 + interests_cost_t1

total_cost_paths = material_cost_paths + personnel_cost_paths + otheroperating_cost_paths + depreciation_amortization_paths + interests_cost_paths

total_cost_av = rep(0,quartersCount)
total_cost_avg = rep(0,quartersCount)
total_cost_minv = rep(0,quartersCount)
total_cost_maxv = rep(0,quartersCount)
total_cost_q10 = rep(0,quartersCount)
total_cost_q25 = rep(0,quartersCount)
total_cost_q75 = rep(0,quartersCount)
total_cost_q90 = rep(0,quartersCount)

for (d in 1:quartersCount) {
  total_cost_av[d]=median(total_cost_paths[,d])
  total_cost_avg[d]=mean(total_cost_paths[,d])
  total_cost_minv[d]=min(total_cost_paths[,d])
  total_cost_maxv[d]=max(total_cost_paths[,d])
  total_cost_q10[d]=quantile(total_cost_paths[,d], probs=0.1)
  total_cost_q25[d]=quantile(total_cost_paths[,d], probs=0.25)
  total_cost_q75[d]=quantile(total_cost_paths[,d], probs=0.75)
  total_cost_q90[d]=quantile(total_cost_paths[,d], probs=0.90)
}

5. Visualsierung der Ergebnisse mit Plotly

Im letzen Schritt werden die Simulationen der einzelnen Kostengruppen in einem Plotly-Liniendiagram visualisiert und die verschiedenen Szenarien in Plotly-Tabellen dargestellt.

5.1 Erstellen der Linien-Diagramme

x_data = factor(c("Q1.20XX", "Q2.20XX", "Q3.20XX", "Q4.20XX", "Q1.20XX+1", "Q2.20XX+1", "Q3.20XX+1", "Q4.20XX+1", "Q1.20XX+2", "Q2.20XX+2", "Q3.20XX+2", "Q4.20XX+2", "Q1.20XX+3", "Q2.20XX+3", "Q3.20XX+3", "Q4.20XX+3",  "Q1.20XX+4", "Q2.20XX+4", "Q3.20XX+4", "Q4.20XX+4"))
x_data = factor(x_data, levels =c("Q1.20XX", "Q2.20XX", "Q3.20XX", "Q4.20XX", "Q1.20XX+1", "Q2.20XX+1", "Q3.20XX+1", "Q4.20XX+1", "Q1.20XX+2", "Q2.20XX+2", "Q3.20XX+2", "Q4.20XX+2", "Q1.20XX+3", "Q2.20XX+3", "Q3.20XX+3", "Q4.20XX+3",  "Q1.20XX+4", "Q2.20XX+4", "Q3.20XX+4", "Q4.20XX+4"))
x_axis = list(title="Quarter", fond=list(size=12))
y_axis = list(title="Costs per quarter [€]", fond=list(size=12))


line_plot_material_cost = plot_ly()
line_plot_material_cost = plot_ly(x = ~x_data, y = ~material_cost_avg, type="scatter", mode="lines", name="Base Case Scenario", line=list(color='rgb(0,0,255)', width=2))
line_plot_material_cost = line_plot_material_cost %>% add_trace(y= ~material_cost_minv, mode="lines", name="Best Case Scenario", line=list(color='rgb(0,255,0)', width=2))
line_plot_material_cost = line_plot_material_cost %>% add_trace(y= ~material_cost_maxv, mode="lines", name="Worst Case Scenario", line=list(color='rgb(255,0,0)', width=2))
line_plot_material_cost = line_plot_material_cost %>% layout(title="5 Year Cost Simulation \n of Material Costs", legend = list(orientation = 'h', font=list(size=6)), xaxis=x_axis, yaxis=y_axis)

for (i in 1:numberIterations) {
  line_plot_material_cost = line_plot_material_cost %>% add_trace(y= material_cost_paths[i,], mode="lines", name= i, line=list(color='rgb(140,140,140)', width=0.5), showlegend=FALSE)
} 
line_plot_material_cost

5.2 Erstellen der Szenariotabellen

table_material_cost = data.frame(material_cost_minv,material_cost_avg,material_cost_maxv,material_cost_q10, material_cost_q25, material_cost_q75, material_cost_q90, row.names=x_data)
names(table_material_cost) = c("Best Case Scenario", "Base Case Scenario", "Worst Case Scenario", "10th Quantile", "25th Quantile", "75th Quantile", "90th Quantile")
table_material_cost = plot_ly(type='table', header = list(values = c("<b>Quarter</b>", names(table_material_cost)), align = c('left', rep('center', ncol(table_material_cost))), line = list(width = 1, color = 'black'),
                      fill = list(color = 'rgb(39, 43, 48)'), font = list(family = "Arial", size = 14, color = "white")),
                      cells = list(values = rbind(rownames(table_material_cost), t(as.matrix(unname(round(table_material_cost, digits=0))))),
                      align = c('left', rep('center', ncol(table_material_cost))), line = list(color = "black", width = 1), fill = list(color = c('rgb(39, 43, 48)', 'rgba(39, 43, 48, 0.65)')), font = list(family = "Arial", size = 12, color = c("white","black"))))
table_material_cost

Exkurs: Bootstrapping

Die folgende Datensetgenerierung mit der Bootstrapping Methode zeigt, dass ein solcher Ansatz mit der Eingabe von nur drei Parametern pro Kostengruppe nicht zielführend ist, um auf Basis der Distribution einen Forecast zu erstellen.

material_cost_data_bootstrap = sample(c(material_cost_min, material_cost_base, material_cost_max),10000, replace = TRUE)

material_cost_bootstrap_hist = plot_ly(x=~material_cost_data_bootstrap, type="histogram",  marker = list(color = 'rgba(39, 43, 48, 0.65)', line = list(color = 'rgb(39, 43, 48)', width = 1))) %>%  layout(title = "Material Costs Distribution Bootstrapping",
            yaxis = list(title = "Frequency",
                         zeroline = FALSE),
            xaxis = list(title = "Material Cost [€]",
                         zeroline = FALSE))
material_cost_bootstrap_hist

Monte Carlo Cost Simulation

Aleksei Dubroskii, Nils Gimpl, David Weking

6/4/2020