What’s my potential space in this market?

Author

Giancarlo Vercellino

Published

March 5, 2023

An example with real market data

The table below contains data on various companies: each company's unique ID, its revenues, and a set of binary flags marking the market segments it serves. This data provides the foundation for building a competitive graph, which helps visualize and analyze the relationships between these companies and their respective markets.

id	revenues	market_segment1	market_segment2	market_segment3	market_segment4	market_segment5	market_segment6	market_segment7	market_segment8	market_segment9	market_segment10	market_segment11	market_segment12	market_segment14	market_segment15	market_segment16	market_segment17	market_segment18	market_segment19	market_segment20
371	9.42000	0	0	0	0	0	0	1	0	0	0	0	1	1	1	1	1	1	0	1
324	13.71000	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1
372	5.60000	0	0	0	0	0	1	0	1	0	0	1	0	0	0	0	0	0	1	1
267	27.58300	0	0	0	0	0	1	1	1	0	0	1	0	0	1	1	0	0	0	1
343	14.64000	1	1	1	0	0	0	1	1	1	0	0	0	0	0	0	0	0	0	1
10	109.19000	0	1	1	1	1	1	1	1	0	1	1	1	0	0	0	0	0	1	1
11	51.65000	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1
15	32.83834	0	1	1	0	0	0	0	1	1	0	0	1	1	1	1	1	1	NA	1
361	30.33000	0	0	0	0	0	0	0	1	0	0	0	0	0	0	1	0	0	0	1
18	106.27800	0	1	0	1	0	0	0	1	1	0	0	1	0	0	0	0	0	1	1

Get the competitive graph

To design the competitive graph, we start from the market segments served by each company. After filling missing values with zero, we compute a pairwise distance matrix using the Dice metric. We then subtract each distance from 1, turning it into a similarity, so that higher values correspond to stronger competitive relationships. Any undefined entry left in the result is also set to zero. This inverted matrix serves as the basis for constructing the adjacency matrix of our competitive graph.

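# Binary segment flags (columns 3 to 22) as a numeric matrix
areas <- data.matrix(example[, 3:22])
areas[is.na(areas)] <- 0 # missing imputation: NA is treated as "segment not served"
# Pairwise Dice distance between companies (rows)
dist_mat <- philentropy::distance(areas, method = "dice")
# Invert the distance into a similarity (the name dist_mat is kept for the code that follows)
dist_mat <- 1 - dist_mat
dist_mat[is.na(dist_mat)] <- 0 # undefined entries count as no overlap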
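
To make the inverted Dice measure concrete, here is a minimal base-R sketch using the first two rows of the table above (companies 371 and 324). It assumes the Dice distance takes the form sum((p - q)^2) / (sum(p^2) + sum(q^2)); that is our reading of the metric, not something stated in this post, and the names p, q and dice_dist are purely illustrative.

```r
# Segment flags for companies 371 and 324, copied from the table above
p <- c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1)
q <- c(rep(0, 18), 1)

# Assumed form of the Dice distance (hypothetical helper, not a package function)
dice_dist <- function(p, q) sum((p - q)^2) / (sum(p^2) + sum(q^2))

# Inverting the distance gives a similarity
dice_sim <- 1 - dice_dist(p, q)

# For 0/1 vectors this equals 2 * shared segments / total segments served
shared <- sum(p * q)
overlap <- 2 * shared / (sum(p) + sum(q))

print(c(dice_sim = dice_sim, overlap = overlap))
```

The two companies share one segment, while serving eight and one segments respectively, so under this assumption their similarity is 2/9, or about 0.22.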

After computing the inverted distances (which now behave as similarities), we need a threshold for converting them into the edges of our competitive graph. We set the threshold to the median of all values in the matrix. Every value strictly above the threshold becomes one (the two companies compete), and every value at or below it becomes zero (they do not). The result is the adjacency matrix of our competitive graph.

threshold <- quantile(dist_mat, 0.5)
adj_mat <- dist_mat
adj_mat[dist_mat > threshold] <- 1  ###COMPETITION
adj_mat[dist_mat <= threshold] <- 0 ###NON-COMPETITION
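
The thresholding step can be illustrated on a toy example. The sketch below uses a made-up 4 x 4 similarity matrix (the values are illustrative, not taken from the dataset) and applies the same median rule in base R.

```r
# Toy symmetric similarity matrix for four companies (illustrative values)
sim <- matrix(c(1.0, 0.9, 0.1, 0.6,
                0.9, 1.0, 0.3, 0.7,
                0.1, 0.3, 1.0, 0.2,
                0.6, 0.7, 0.2, 1.0), nrow = 4, byrow = TRUE)

# Median of all entries, diagonal included
threshold <- unname(quantile(sim, 0.5))

# Entries strictly above the median become edges (1), the rest become 0
adj <- (sim > threshold) * 1

print(threshold)
print(adj)
```

Here the median falls between 0.6 and 0.7, so only the pairs (1, 2) and (2, 4) are linked. The diagonal also ends up as ones, because every company is perfectly similar to itself; those entries only matter if self-loops are kept when the graph is built.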

Next, we use the igraph library to build the graph from the adjacency matrix. The graph is undirected, and we ignore the diagonal entries (diag = FALSE) to eliminate any self-loops. Setting add.colnames and add.rownames to NA drops any character indexing, so the vertices are identified by numeric index only. Finally, we add standardized revenues as a vertex attribute.

competitive_graph <- igraph::graph_from_adjacency_matrix(
  adj_mat, mode = "undirected", diag = FALSE,
  add.colnames = NA, add.rownames = NA
)

igraph::V(competitive_graph)$standardized_revenues <- scale(example$revenues)
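
For readers unfamiliar with scale(), the following base-R sketch shows what standardization does, using only the ten revenue figures visible in the table above. The post standardizes the full dataset, so the z-scores printed here are illustrative and will not match the vertex attribute exactly.

```r
# Revenues of the ten companies shown in the table above
revenues <- c(9.42, 13.71, 5.60, 27.583, 14.64,
              109.19, 51.65, 32.83834, 30.33, 106.278)

# scale() centers on the mean and divides by the sample standard deviation;
# as.numeric() drops the one-column matrix structure it returns
z <- as.numeric(scale(revenues))

# The same transformation written out by hand
z_manual <- (revenues - mean(revenues)) / sd(revenues)

print(round(z, 3))
```

After this transformation a value of 0 means average revenues, and a value of 1 means one standard deviation above the average, which is how the predictions later in the post should be read.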

Here you can see a small sample of our competitive graph in its glorious fabric¹:

plot <- fabric_plotter(competitive_graph, n_samp = 16)
plot

What if we position a business here and not there?

We will use the spinner² package to train a Graph Neural Network (GNN) on the competitive graph to predict the standardized revenues. To get good performance, we need hyper-parameters that minimize the loss function. After several trials, we settled on the following configuration: 3 graph net layers; 2 feed-forward layers of 1,024 units each, with dropout 0.5 and tanh activation; message passing from context to edge to node; aggregation by max value; and embeddings of 20 dimensions for edges and 10 for the context. With this configuration the validation loss stays roughly between 0.6 and 0.95 in the training log below: not outstanding, but good enough for our purposes.

library(spinner)

model <- spinner(
  competitive_graph, target = "node", node_labels = "standardized_revenues",
  method = "null", direction = "undirected",
  edge_embedding_size = 20, context_embedding_size = 10, update_order = "cen",
  dnn_form = c(1024, 1024), dnn_drop = c(0.5, 0.5), dnn_activ = "tanh",
  mode = "max", n_layers = 3, holdout = 0.7
)

epoch:  10    Train loss:  0.7082435    Val loss:  0.8235077 
epoch:  20    Train loss:  0.7316628    Val loss:  0.7713029 
epoch:  30    Train loss:  0.7171735    Val loss:  0.732102 
early stop at epoch:  31    Train loss:  0.7159846    Val loss:  0.7981409 
epoch:  10    Train loss:  0.6425929    Val loss:  0.6087576 
epoch:  20    Train loss:  0.641589    Val loss:  0.6828942 
epoch:  30    Train loss:  0.6570954    Val loss:  0.7459118 
early stop at epoch:  35    Train loss:  0.5542789    Val loss:  0.821987 
epoch:  10    Train loss:  0.7955155    Val loss:  0.9537982 
epoch:  20    Train loss:  0.8206618    Val loss:  0.8259481 
epoch:  30    Train loss:  0.7490247    Val loss:  0.7744198 
early stop at epoch:  32    Train loss:  0.7895444    Val loss:  0.9027098 
epoch:  10    Train loss:  0.6783597    Val loss:  0.8640835 
epoch:  20    Train loss:  0.6761863    Val loss:  0.8759136 
epoch:  30    Train loss:  0.6406475    Val loss:  0.8711847 
early stop at epoch:  31    Train loss:  0.6440219    Val loss:  0.899788 
time: 217.06 sec elapsed
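
To put a number on the log above, the short sketch below summarizes the validation loss at the four early stops. The values are copied from the log; what each of the four runs represents is not stated in the output, so the summary treats them simply as four runs.

```r
# Validation loss at each of the four early stops in the log above
val_loss <- c(0.7981409, 0.821987, 0.9027098, 0.899788)

# Epoch at which each run stopped
stop_epoch <- c(31, 35, 32, 31)

# Simple summary across the four runs
summary_stats <- c(mean = mean(val_loss), min = min(val_loss), max = max(val_loss))

print(round(summary_stats, 4))
print(range(stop_epoch))
```

The average validation loss at stopping time is about 0.86, and every run stops between epochs 31 and 35.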

Now we can explore different what-if scenarios by introducing a new venture at specific positions in our competitive graph. The new venture is added as vertex 88, and each scenario connects it to a different set of existing competitors. The following example shows three scenarios and the resulting model predictions, expressed in standardized revenues.

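# Add the new venture as one extra vertex (vertex 88)
new_graph <- igraph::add_vertices(competitive_graph, 1)

# Scenario 1: the newcomer competes with vertices 1, 10, 40, 20 and 12
whatif_1 <- igraph::add_edges(new_graph, c(88, 1, 88, 10, 88, 40, 88, 20, 88, 12))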
model$pred_fun(whatif_1)

$standardized_revenues
     nodes standardized_revenues
[1,]    88             0.8741768

# Scenario 2: the newcomer competes with vertices 30, 45 and 61
whatif_2 <- igraph::add_edges(new_graph, c(88, 30, 88, 45, 88, 61))
model$pred_fun(whatif_2)

$standardized_revenues
     nodes standardized_revenues
[1,]    88             -2.164715

# Scenario 3: the newcomer competes with vertices 45 and 72
whatif_3 <- igraph::add_edges(new_graph, c(88, 45, 88, 72))
model$pred_fun(whatif_3)

$standardized_revenues
     nodes standardized_revenues
[1,]    88             0.7093902
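
The edge vectors used in the scenarios above are flat: igraph reads them two entries at a time, each pair being one edge. The base-R sketch below unpacks the vector from the first scenario to make that pairing explicit; the names edge_vec and edge_pairs are illustrative.

```r
# Flat edge vector from the first scenario above
edge_vec <- c(88, 1, 88, 10, 88, 40, 88, 20, 88, 12)

# Consecutive entries form one edge each: (from, to)
edge_pairs <- matrix(edge_vec, ncol = 2, byrow = TRUE,
                     dimnames = list(NULL, c("from", "to")))
print(edge_pairs)

# Number of competitors linked to the new venture (vertex 88)
n_competitors <- sum(edge_pairs[, "from"] == 88)
print(n_competitors)
```

Writing a scenario this way makes it easy to check which competitors the new venture is connected to before passing the flat vector on to the graph.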

Our what-if scenarios show that competitive positioning in the graph strongly shapes the potential market space: the predicted standardized revenues go from about 0.87 in the first scenario to about -2.16 in the second and 0.71 in the third. The position of each competitor, its market share, and the intensity of competition (the edges among nodes) all influence the size and shape of the market. By understanding how these variables interact, we can gain valuable insight into the dynamics of the market and develop effective strategies to improve our competitive position.

Enjoy.

Footnotes

¹ For the fabric layout, you should take a look here.
² Spinner is an implementation of Graph Nets based on torch. For more info, see here.