This tutorial requires the following packages (DecisionAnalysis will load the dependent packages data.tree and gridExtra):
tidyverse: Contains functions to clean and tidy data, along with the common visualization package ggplot2.
data.tree: A general-purpose package for building and displaying hierarchical data.
gridExtra: Allows the user to arrange multiple grid-based plots into a single output.
knitr: Produces tables using the kable() function.
DecisionAnalysis: Aids in the multi-objective decision analysis process.
library(tidyverse)
library(data.tree)
library(gridExtra)
library(knitr)
library(DecisionAnalysis)
The tutorial uses a dataset that came from NFLSavant.com. This dataset consists of NFL Combine data from 1999 to 2015 and contains almost 5000 players (observations) and 26 demographic or event scores (variables). Below is a sample of the dataset:
data(NFLcombine, package = "DecisionAnalysis")
kable(tail(NFLcombine))
| | year | name | firstname | lastname | position | heightfeet | heightinches | heightinchestotal | weight | arms | hands | fortyyd | twentyyd | tenyd | twentyss | threecone | vertical | broad | bench | round | college | pick | pickround | picktotal | wonderlic | nflgrade |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4942 | 1999 | Mark Word | Mark | Word | OLB | 6 | 5 | 77 | 258 | 0 | 0 | 4.80 | 0 | 0 | 4.33 | 0 | 31.5 | 115 | 16 | 0 | NA | NA | 0 | 0 | 0 | 0 |
| 4943 | 1999 | Daren Yancey | Daren | Yancey | DT | 6 | 6 | 78 | 303 | 0 | 0 | 5.13 | 0 | 0 | 0.00 | 0 | 26.5 | 97 | 0 | 6 | Brigham Young | 19(188) | 19 | 188 | 0 | 0 |
| 4944 | 1999 | Craig Yeast | Craig | Yeast | WR | 5 | 8 | 68 | 164 | 0 | 0 | 4.47 | 0 | 0 | 4.14 | 0 | 32.5 | 112 | 0 | 4 | Kentucky | 3(98) | 3 | 98 | 0 | 0 |
| 4945 | 1999 | Ryan Young | Ryan | Young | OT | 6 | 6 | 78 | 335 | 0 | 0 | 5.57 | 0 | 0 | 0.00 | 0 | 0.0 | 0 | 20 | 7 | Kansas State | 17(223) | 17 | 223 | 0 | 0 |
| 4946 | 1999 | Peppi Zellner | Peppi | Zellner | DE | 6 | 5 | 77 | 246 | 0 | 0 | 4.72 | 0 | 0 | 4.49 | 0 | 35.5 | 122 | 20 | 4 | Fort Valley State | 37(132) | 37 | 132 | 0 | 0 |
| 4947 | 1999 | Amos Zereoue | Amos | Zereoue | RB | 5 | 8 | 68 | 203 | 0 | 0 | 4.53 | 0 | 0 | 4.09 | 0 | 0.0 | 0 | 0 | 3 | West Virginia | 34(95) | 34 | 95 | 0 | 0 |
Multi-Objective Decision Analysis (MODA) is a process for making decisions when there are very complex issues involving multiple criteria and multiple parties who may be deeply affected by the outcomes of the decisions.
MODA allows for the selection of the best solution from a pool of available alternatives through value trade-offs and factor weighting. Individuals are then able to evaluate each alternative to help decide on a recommendation.
MODA is often applied through a ten-step process.
The dataset was reduced to a usable subset (in this case, quarterbacks from 2011 who had a Wonderlic score) and only certain variables were retained (name, height, weight, forty-yard dash, shuttle sprint, vertical jump, broad jump, Wonderlic, and draft round).
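A minimal sketch of this filtering step is shown below; the exact pipeline, the sort order, and the treatment of zero-coded missing events as NA are assumptions rather than the tutorial's own code:
# Hypothetical reduction of the raw combine data to the quarterback test set
qb.data <- NFLcombine %>%
  filter(position == "QB", year == 2011, wonderlic > 0) %>%
  transmute(Name      = name,
            Height    = heightinchestotal,
            Weight    = weight,
            Forty     = fortyyd,
            Shuttle   = na_if(twentyss, 0),   # 0 denotes a missing event
            Vertical  = na_if(vertical, 0),
            Broad     = na_if(broad, 0),
            Wonderlic = wonderlic,
            Round     = round) %>%
  arrange(desc(Wonderlic))
The resulting test data set: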
| Name | Height | Weight | Forty | Shuttle | Vertical | Broad | Wonderlic | Round |
|---|---|---|---|---|---|---|---|---|
| Greg McElroy | 74 | 220 | 4.84 | 4.45 | 33.0 | 107 | 43 | 7 |
| Blaine Gabbert | 76 | 234 | 4.61 | 4.26 | 33.5 | 120 | 42 | 1 |
| Christian Ponder | 74 | 229 | 4.63 | 4.09 | 34.0 | 116 | 35 | 1 |
| Ricky Stanzi | 76 | 223 | 4.87 | 4.43 | 32.5 | 110 | 30 | 5 |
| Andy Dalton | 74 | 215 | 4.83 | 4.27 | 29.5 | 106 | 29 | 2 |
| Ryan Mallett | 79 | 253 | 5.24 | NA | 24.0 | NA | 26 | 3 |
| Cam Newton | 77 | 248 | 4.56 | 4.18 | 35.0 | 126 | 21 | 1 |
| Jake Locker | 75 | 231 | 4.51 | 4.12 | 35.0 | 115 | 20 | 1 |
A value hierarchy is a way to depict what is important to the decision maker(s) when choosing from the list of alternatives. Objectives are the evaluation considerations that are deemed to be important. Each objective is broken down until it can be measured by a single evaluation measure.
branches <- as.data.frame(matrix(ncol = 4, nrow = 7))
names(branches) <- c("Level1", "Level2", "Level3", "leaves")
branches[1, ] <- c("QB", "Elusiveness", "Speed", "Forty")
branches[2, ] <- c("QB", "Elusiveness", "Agility", "Shuttle")
branches[3, ] <- c("QB", "Size", "Height", "Height")
branches[4, ] <- c("QB", "Size", "Weight", "Weight")
branches[5, ] <- c("QB", "Intelligence", "", "Wonderlic")
branches[6, ] <- c("QB", "Strength", "Explosiveness", "Vertical")
branches[7, ] <- c("QB", "Strength", "Power", "Broad")
DecisionAnalysis::value_hierarchy_tree(branches$Level1, branches$Level2, branches$Level3,
                                       leaves = branches$leaves, nodefillcolor = "LightBlue",
                                       leavesfillcolor = "Blue", leavesfontcolor = "White")
Taking the evaluation measures determined from the value hierarchy, high and low bounds are set for each criterion. End points are limited to those that fall within the “acceptable” region. This allows the raw data to be converted into a criterion score in the next step.
Below is the table of value measures for our test data set:
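The bounds can be collected in a small data frame and rendered with kable(); the construction below is a sketch (assumed, not the tutorial's own code), with the values taken directly from the table it produces:
# Hypothetical assembly of the value-measure bounds for the test data
value_measures <- data.frame(
  Value       = c("Height", "Weight", "Forty", "Shuttle", "Vertical", "Broad", "Wonderlic"),
  Low         = c(68, 185, 4.3, 3.8, 21, 90, 0),
  High        = c(82, 275, 5.4, 4.9, 40, 130, 50),
  Measurement = c("Height (in)", "Weight (lbs)", "Time (sec)", "Time (sec)",
                  "Height (in)", "Distance (in)", "Score")
)
kable(value_measures)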
| Value | Low | High | Measurement |
|---|---|---|---|
| Height | 68.0 | 82.0 | Height (in) |
| Weight | 185.0 | 275.0 | Weight (lbs) |
| Forty | 4.3 | 5.4 | Time (sec) |
| Shuttle | 3.8 | 4.9 | Time (sec) |
| Vertical | 21.0 | 40.0 | Height (in) |
| Broad | 90.0 | 130.0 | Distance (in) |
| Wonderlic | 0.0 | 50.0 | Score |
Single Value Attribute Functions (SAVFs) are used to calculate an individual criterion score from the raw data. The three types of SAVFs are exponential, linear, and categorical, and each can be either increasing or decreasing.
The bisection technique is used for the linear and exponential SAVFs: to find the bisection, or mid-value point, the decision maker is asked to identify the halfway mark for each value measure. Below is an example of each of the three plot types.
a1 <- DecisionAnalysis::SAVF_exp_plot(90, 0, 120, 150)
a2 <- DecisionAnalysis::SAVF_linear_plot(10, 0, 20, 100, FALSE)
a3 <- DecisionAnalysis::SAVF_cat_plot(c("Tom", "Bill", "Jerry"), c(0.1, 0.25, 0.65))
gridExtra::grid.arrange(a1, a2, a3, ncol = 2)
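Conceptually, the mid-value point pins down the curvature of the exponential SAVF. The sketch below shows a Kirkwood-style exponential value function whose exponential constant rho is solved numerically so that the mid-value point scores 0.5; it is an illustration of the technique and only assumed to behave like SAVF_exp_score:
# Sketch of a Kirkwood-style exponential SAVF (illustration only; assumed to
# mirror SAVF_exp_score, not taken from the package source)
exp_savf <- function(x, low, mid, high, increasing = TRUE) {
  d   <- if (increasing) x - low   else high - x    # progress along the range
  dm  <- if (increasing) mid - low else high - mid  # progress at the mid-value point
  rng <- high - low
  v   <- function(d, rho) (1 - exp(-d / rho)) / (1 - exp(-rng / rho))
  if (abs(dm / rng - 0.5) < 1e-8) return(d / rng)   # mid-value at the centre: linear
  # rho is positive (concave) when the mid-value sits below the centre of the
  # range and negative (convex) when it sits above it
  interval <- if (dm / rng < 0.5) c(0.05, 1e4) * rng else -c(1e4, 0.05) * rng
  rho <- uniroot(function(r) v(dm, r) - 0.5, interval)$root
  v(d, rho)
}
# Example with the Height bounds and mid-value used below (68, 75.21, 82):
round(exp_savf(74, 68, 75.21, 82), 3)  # ~0.414, in line with the SAVF table below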
For our test data set, the exponential SAVF was used, with the midpoint of each criterion set to the mean value of all drafted quarterbacks. The exponential SAVFs were calculated for each criterion using the SAVF_exp_score function and then combined into a matrix with cbind.
Below is the SAVF matrix for the test data set:
Height    <- round(SAVF_exp_score(qb.data$Height, 68, 75.21, 82, TRUE), 3)
Weight    <- round(SAVF_exp_score(qb.data$Weight, 185, 224.34, 275, TRUE), 3)
Forty     <- round(SAVF_exp_score(qb.data$Forty, 4.3, 4.81, 5.4, FALSE), 3)
Shuttle   <- round(SAVF_exp_score(qb.data$Shuttle, 3.8, 4.3, 4.9, FALSE), 3)
Vertical  <- round(SAVF_exp_score(qb.data$Vertical, 21, 32.04, 40, TRUE), 3)
Broad     <- round(SAVF_exp_score(qb.data$Broad, 90, 111.24, 130, TRUE), 3)
Wonderlic <- round(SAVF_exp_score(qb.data$Wonderlic, 0, 27.08, 50, TRUE), 3)
# Keep the matrix numeric and label the rows with the player names
SAVF_matrix <- cbind(Height, Weight, Forty, Shuttle, Vertical, Broad, Wonderlic)
rownames(SAVF_matrix) <- qb.data$Name
# SAVF_matrix[is.na(SAVF_matrix)] <- 0  # optionally replace missing scores with 0
kable(SAVF_matrix, caption = "SAVF Scores")
| | Height | Weight | Forty | Shuttle | Vertical | Broad | Wonderlic |
|---|---|---|---|---|---|---|---|
| Greg McElroy | 0.414 | 0.45 | 0.473 | 0.366 | 0.553 | 0.395 | 0.839 |
| Blaine Gabbert | 0.557 | 0.607 | 0.688 | 0.537 | 0.582 | 0.726 | 0.817 |
| Christian Ponder | 0.414 | 0.552 | 0.669 | 0.7 | 0.611 | 0.621 | 0.664 |
| Ricky Stanzi | 0.557 | 0.485 | 0.446 | 0.383 | 0.525 | 0.469 | 0.56 |
| Andy Dalton | 0.414 | 0.391 | 0.482 | 0.528 | 0.367 | 0.37 | 0.539 |
| Ryan Mallett | 0.775 | 0.8 | 0.128 | NA | 0.117 | NA | 0.478 |
| Cam Newton | 0.629 | 0.751 | 0.737 | 0.613 | 0.67 | 0.888 | 0.38 |
| Jake Locker | 0.485 | 0.574 | 0.786 | 0.671 | 0.67 | 0.596 | 0.36 |
The final step in determining an alternative’s score is to calculate the Multi Attribute Value Function (MAVF) score. This can be done using a variety of methods, the simplest being a weight vector that multiplies each attribute’s SAVF score by a relative measure of importance. The weights vector is normalized so that the weights sum to one. The weights for the test set are below:
branches <- as.data.frame(matrix(ncol = 5, nrow = 7))
names(branches) <- c("Level1", "Level2", "Level3", "leaves", "weights")
branches[1, ] <- c("QB", "Elusiveness", "Speed", "Forty", "0.092")
branches[2, ] <- c("QB", "Elusiveness", "Agility", "Shuttle", "0.138")
branches[3, ] <- c("QB", "Size", "Height", "Height", "0.096")
branches[4, ] <- c("QB", "Size", "Weight", "Weight", "0.224")
branches[5, ] <- c("QB", "Intelligence", "", "Wonderlic", "0.07")
branches[6, ] <- c("QB", "Strength", "Explosiveness", "Vertical", "0.152")
branches[7, ] <- c("QB", "Strength", "Power", "Broad", "0.228")
DecisionAnalysis::value_hierarchy_tree(branches$Level1, branches$Level2, branches$Level3,
                                       leaves = branches$leaves, weights = branches$weights,
                                       nodefillcolor = "LightBlue", leavesfillcolor = "Blue",
                                       leavesfontcolor = "White")
The MAVF scores were calculated using the MAVF_Scores function, which takes the SAVF matrix, multiplies each SAVF score by its associated weight, and sums the weighted scores for each alternative, returning a single score per alternative.
For example, taking Cam Newton from the test data set:
(0.096 × 0.629) + (0.224 × 0.751) + (0.092 × 0.737) + (0.138 × 0.613) + (0.152 × 0.670) + (0.228 × 0.888) + (0.070 × 0.380) ≈ 0.712
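The same calculation can be checked directly in R, using the weights and Cam Newton’s SAVF scores from the tables above:
# Cam Newton's SAVF scores and the normalized weights, in the same order
cam_savf <- c(Height = 0.629, Weight = 0.751, Forty = 0.737, Shuttle = 0.613,
              Vertical = 0.670, Broad = 0.888, Wonderlic = 0.380)
weights  <- c(Height = 0.096, Weight = 0.224, Forty = 0.092, Shuttle = 0.138,
              Vertical = 0.152, Broad = 0.228, Wonderlic = 0.070)
sum(weights)             # the weights are normalized to sum to 1
sum(weights * cam_savf)  # about 0.712, matching Cam Newton's MAVF score below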
Below are the MAVF scores for all alternatives in the test data set, computed with the MAVF_Scores function:
Height <- round(SAVF_exp_score(qb.data$Height , 68, 75.21, 82, TRUE), 3)
Weight <- round(SAVF_exp_score(qb.data$Weight, 185, 224.34, 275, TRUE), 3)
Forty <- round(SAVF_exp_score(qb.data$Forty, 4.3, 4.81, 5.4, FALSE), 3)
Shuttle <- round(SAVF_exp_score(qb.data$Shuttle, 3.8, 4.3, 4.9, FALSE), 3)
Vertical <- round(SAVF_exp_score(qb.data$Vertical, 21, 32.04, 40, TRUE), 3)
Broad <- round(SAVF_exp_score(qb.data$Broad, 90, 111.24, 130, TRUE), 3)
Wonderlic <- round(SAVF_exp_score(qb.data$Wonderlic, 0, 27.08, 50, TRUE), 3)
SAVF_matrix = cbind(Height, Weight, Forty, Shuttle,
Vertical, Broad, Wonderlic)
weights = c(0.096, 0.224, 0.092, 0.138, 0.152, 0.228, 0.07)
names = qb.data$Name
MAVF <- DecisionAnalysis::MAVF_Scores(SAVF_matrix, weights, names)
knitr::kable(MAVF, digits = 4, row.names = FALSE, caption = "MAVF Scores")
| Name | Score |
|---|---|
| Cam Newton | 0.7119 |
| Blaine Gabbert | 0.6380 |
| Jake Locker | 0.6030 |
| Christian Ponder | 0.6025 |
| Ricky Stanzi | 0.4819 |
| Greg McElroy | 0.4674 |
| Andy Dalton | 0.4224 |
| Ryan Mallett | 0.3166 |
After the alternatives are scored, an initial analysis is conducted to ensure the rankings are easily understandable and to identify any insights or improvements. This is done by looking at the deterministic sensitivity of each alternative.
The value breakout graph allows for a quick and easy comparison of how each attribute affected each alternative’s score. Using the MAVF_breakout function, the breakout graph below was created from the test data:
DecisionAnalysis::MAVF_breakout(SAVF_matrix, weights, names)
Once the model is judged to be valid, sensitivity analysis is conducted to determine how the rankings of the alternatives respond to changes in the model’s assumptions, specifically the weights. The weights represent the relative importance attached to each evaluation measure. The sensitivity analysis for the test set is below:
DecisionAnalysis::sensitivity_plot(SAVF_matrix, weights, qb.data$Name, 4)
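As a rough manual illustration of the same idea (my own sketch; not how sensitivity_plot is implemented), one weight can be swept to a new value while the remaining weights are rescaled so the total stays at one, and the MAVF scores recomputed:
# One-at-a-time weight sensitivity, sketched by hand
sweep_weight <- function(SAVF_matrix, weights, index, new_w) {
  w <- weights
  w[-index] <- w[-index] * (1 - new_w) / sum(w[-index])  # keep the weights summing to 1
  w[index]  <- new_w
  as.vector(SAVF_matrix %*% w)                           # recompute the MAVF scores
}
# Example: raise the Shuttle weight (column 4) to 0.5; alternatives with
# missing SAVF entries return NA here
round(setNames(sweep_weight(SAVF_matrix, weights, 4, 0.5), qb.data$Name), 3)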