What’s my potential space in this market?
An example with real market data
The table below contains data on various companies, including their unique ID, revenue, and market segment. This data provides the foundation for building a competitive graph, which can help visualize and analyze the relationships between these companies and their respective markets.
id | revenues | market_segment1 | market_segment2 | market_segment3 | market_segment4 | market_segment5 | market_segment6 | market_segment7 | market_segment8 | market_segment9 | market_segment10 | market_segment11 | market_segment12 | market_segment13 | market_segment14 | market_segment15 | market_segment16 | market_segment17 | market_segment18 | market_segment19 | market_segment20 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
371 | 9.42000 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1 |
324 | 13.71000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
372 | 5.60000 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
267 | 27.58300 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 |
343 | 14.64000 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
10 | 109.19000 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
11 | 51.65000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
15 | 32.83834 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | NA | 1 |
361 | 30.33000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 |
18 | 106.27800 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
Get the competitive graph
To design the competitive graph, we start from the market segments served by each company. After replacing missing segment flags with zero, we compute a pairwise distance matrix using the Dice metric. We then invert this measure by subtracting it from 1, so that higher values correspond to stronger competitive relationships (more shared segments), and set any undefined value left in the result to zero. Strictly speaking, the inverted matrix holds similarities, although the code keeps the name dist_mat. It serves as the basis for constructing the adjacency matrix of our competitive graph.
areas <- data.matrix(example[, 3:22])
areas[is.na(areas)] <- 0 ###MISSING IMPUTATION
dist_mat <- philentropy::distance(as.matrix(areas), method = "dice")
dist_mat <- 1 - dist_mat
dist_mat[is.na(dist_mat)] <- 0
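To make the inverted Dice measure concrete, here is a minimal base-R sketch on three rows of the table (ids 371, 324 and 11). It assumes the usual Sørensen-Dice definition for binary vectors, where similarity is twice the number of shared segments divided by the total number of segments served by the two companies; `dice_similarity` is a hypothetical helper written for illustration, not a function from philentropy.

```r
# Segment flags for three companies, copied from the table (ids 371, 324, 11).
segments <- rbind(
  "371" = c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1),
  "324" = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1),
  "11"  = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1)
)

# Hypothetical helper: Dice similarity = 2 * shared / (size_a + size_b).
dice_similarity <- function(a, b) {
  denom <- sum(a) + sum(b)
  if (denom == 0) return(0)  # two empty profiles: treat as no overlap
  2 * sum(a * b) / denom
}

# Fill the pairwise similarity matrix.
n <- nrow(segments)
sim <- matrix(0, n, n, dimnames = list(rownames(segments), rownames(segments)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    sim[i, j] <- dice_similarity(segments[i, ], segments[j, ])
  }
}
print(round(sim, 3))
```

Company 371 serves eight segments and shares only segment 20 with company 324, so their similarity is 2 / 9; companies 324 and 11 serve exactly the same single segment, so their similarity is 1.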
After calculating the matrix, we must choose a threshold for converting its values into the edges of our competitive graph. We set the threshold to the median of all values in the matrix. We then replace every value above the threshold with one (competition) and every value at or below it with zero (no competition). The result is our adjacency matrix, which forms the foundation for our competitive graph.
threshold <- quantile(dist_mat, 0.5)
adj_mat <- dist_mat
adj_mat[dist_mat > threshold] <- 1 ###COMPETITION
adj_mat[dist_mat <= threshold] <- 0 ###NON-COMPETITION
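The thresholding rule can be checked by hand on a small matrix. The sketch below uses a made-up 4 x 4 similarity matrix (the values are illustrative, not taken from the data) and applies the same median rule in base R. Note that `quantile()` is applied to every entry, so the diagonal of ones takes part in the median, as it does in the code above.

```r
# Illustrative symmetric similarity matrix (hypothetical values).
sim <- matrix(c(
  1.0, 0.9, 0.1, 0.4,
  0.9, 1.0, 0.6, 0.2,
  0.1, 0.6, 1.0, 0.7,
  0.4, 0.2, 0.7, 1.0
), nrow = 4, byrow = TRUE)

# Median over all 16 entries, diagonal included.
threshold <- quantile(sim, 0.5)

# Above the threshold: competition (1); at or below: no competition (0).
adj <- (sim > threshold) * 1

# Drop self-loops, as the graph construction step does with diag = FALSE.
diag(adj) <- 0
print(threshold)
print(adj)
```

Only the pairs (1, 2) and (3, 4) exceed the median here, so the toy graph ends up with two edges.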
Next, we will use the igraph library to construct the graph from the adjacency matrix. We make the graph undirected and ignore the diagonal entries to eliminate any self-loops. We also drop row and column names, so that vertices are identified by their numeric index only. Finally, we add standardized revenues as a vertex attribute.
competitive_graph <- igraph::graph_from_adjacency_matrix(adj_mat, mode = "undirected", diag = FALSE, add.colnames = NA, add.rownames = NA)
igraph::V(competitive_graph)$standardized_revenues <- scale(example$revenues)
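As a quick sanity check of this step, the following sketch builds a four-vertex graph from a hand-written adjacency matrix and attaches standardized revenues. The adjacency matrix is hypothetical; the revenues are the first four values from the table. It wraps `scale()` in `as.numeric()` so the attribute is stored as a plain vector, which is a choice made for this sketch and not necessarily what the original pipeline requires.

```r
library(igraph)

# Hypothetical adjacency matrix: edges 1-2 and 3-4.
adj <- matrix(c(
  0, 1, 0, 0,
  1, 0, 0, 0,
  0, 0, 0, 1,
  0, 0, 1, 0
), nrow = 4, byrow = TRUE)

# Revenues of the first four companies in the table.
revenues <- c(9.42, 13.71, 5.60, 27.583)

# Undirected graph without self-loops.
g <- graph_from_adjacency_matrix(adj, mode = "undirected", diag = FALSE)

# scale() returns a one-column matrix; store it as a plain numeric vector.
V(g)$standardized_revenues <- as.numeric(scale(revenues))

print(vcount(g))
print(ecount(g))
print(vertex_attr(g, "standardized_revenues"))
```

The standardized attribute has mean zero and unit standard deviation by construction, which is the scale on which the model's predictions are expressed.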
Here you can see a small sample of our competitive graph in its glorious fabric:
plot <- fabric_plotter(competitive_graph, n_samp = 16)
plot
What if we position a business here and not there?
We will use the spinner package to train a Graph Neural Network (GNN) on the competitive graph to predict the standardized revenues. To optimize model performance, we need to find the hyper-parameters that minimize the loss function. After several trials, we found that the following configuration works well: 3 graph net layers with 2 feed-forward layers and tanh activation, message passing from context to edge to node, aggregation by max value, and embeddings of 20 dimensions for edges and 10 for the context. With this configuration, we achieved a satisfactory (not too bad) loss.
library(spinner)
model <- spinner(competitive_graph, target = "node", node_labels = "standardized_revenues", method = "null", direction = "undirected", edge_embedding_size = 20, context_embedding_size = 10, update_order = "cen", dnn_form = c(1024, 1024), dnn_drop = c(0.5, 0.5), mode = "max", dnn_activ = "tanh", n_layers = 3, holdout = 0.7)
epoch: 10 Train loss: 0.7082435 Val loss: 0.8235077
epoch: 20 Train loss: 0.7316628 Val loss: 0.7713029
epoch: 30 Train loss: 0.7171735 Val loss: 0.732102
early stop at epoch: 31 Train loss: 0.7159846 Val loss: 0.7981409
epoch: 10 Train loss: 0.6425929 Val loss: 0.6087576
epoch: 20 Train loss: 0.641589 Val loss: 0.6828942
epoch: 30 Train loss: 0.6570954 Val loss: 0.7459118
early stop at epoch: 35 Train loss: 0.5542789 Val loss: 0.821987
epoch: 10 Train loss: 0.7955155 Val loss: 0.9537982
epoch: 20 Train loss: 0.8206618 Val loss: 0.8259481
epoch: 30 Train loss: 0.7490247 Val loss: 0.7744198
early stop at epoch: 32 Train loss: 0.7895444 Val loss: 0.9027098
epoch: 10 Train loss: 0.6783597 Val loss: 0.8640835
epoch: 20 Train loss: 0.6761863 Val loss: 0.8759136
epoch: 30 Train loss: 0.6406475 Val loss: 0.8711847
early stop at epoch: 31 Train loss: 0.6440219 Val loss: 0.899788
time: 217.06 sec elapsed
Now we can explore different what-if scenarios by considering the introduction of a new venture in specific positions of our competitive graph. The following example shows three different scenarios and the resulting predictions from our model in standardized revenues.
new_graph <- add.vertices(competitive_graph, 1)
whatif_1 <- add.edges(new_graph, c(88, 1, 88, 10, 88, 40, 88, 20, 88, 12))
model$pred_fun(whatif_1)
$standardized_revenues
nodes standardized_revenues
[1,] 88 0.8741768
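The mechanics of a what-if scenario can be tried on a toy graph. The sketch below assumes only igraph: it appends one vertex, which receives the next free index, and connects it to two chosen competitors. The same logic would explain why the new venture appears as node 88 here, on the assumption that the competitive graph holds 87 companies. It uses `add_vertices()` and `add_edges()`, the current igraph names for `add.vertices()` and `add.edges()`.

```r
library(igraph)

# Toy graph with vertices 1..4 (a ring, purely for illustration).
g <- make_ring(4)

# The appended vertex gets the next numeric index.
new_id <- vcount(g) + 1
g_new <- add_vertices(g, 1)

# Edges are given as a flat vector of (from, to) pairs.
g_whatif <- add_edges(g_new, c(new_id, 1, new_id, 3))

print(as.integer(neighbors(g_whatif, new_id)))
print(degree(g_whatif, new_id))
```

Each scenario in the text differs only in the list of (from, to) pairs, that is, in which incumbents the new venture competes with.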
whatif_2 <- add.edges(new_graph, c(88, 30, 88, 45, 88, 61))
model$pred_fun(whatif_2)
$standardized_revenues
nodes standardized_revenues
[1,] 88 -2.164715
whatif_3 <- add.edges(new_graph, c(88, 45, 88, 72))
model$pred_fun(whatif_3)
$standardized_revenues
nodes standardized_revenues
[1,] 88 0.7093902
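Because the model predicts standardized revenues, a prediction has to be mapped back to the original revenue scale before it can be read as a market size. A minimal sketch of that mapping follows, using the centering and scaling constants that `scale()` stores as attributes. It uses only the ten revenues shown in the table, so its mean and standard deviation are illustrative; a real back-transformation would need the constants from the full `example$revenues` column. The `to_revenue` helper and the z-scores passed to it are hypothetical.

```r
# Revenues from the ten rows displayed in the table (illustrative subset).
revenues <- c(9.42, 13.71, 5.60, 27.583, 14.64,
              109.19, 51.65, 32.83834, 30.33, 106.278)

# scale() keeps the mean and standard deviation it used as attributes.
z <- scale(revenues)
center <- attr(z, "scaled:center")
spread <- attr(z, "scaled:scale")

# Hypothetical helper: invert the standardization.
to_revenue <- function(z_value) z_value * spread + center

# Round trip: standardized values map back to the original revenues.
stopifnot(isTRUE(all.equal(as.numeric(to_revenue(z)), revenues)))

# Illustrative z-scores: one sd below the mean, the mean, one sd above.
print(to_revenue(c(-1, 0, 1)))
```

A standardized prediction of zero corresponds to the average revenue of the sample, and each unit corresponds to one standard deviation of revenue.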
Our what-if scenarios show that the competitive positioning represented in the graph plays a crucial role in shaping the potential market space available, as the predicted standardized revenues make clear: the same new venture is predicted to land well above or well below the average depending on which competitors it faces. The position of each competitor, their market share, and the intensity of competition (the edges among nodes) are all factors that influence the size and shape of the market. By understanding how these variables interact, we can gain valuable insights into the dynamics of the market and develop effective strategies to improve our competitive position.
Enzoi.