This notebook shows how to estimate the Bass model from the observed number of adoptions, and how to make predictions with the Bass parameters \(\{p,q,M\}\). Some equations and mathematical details are included for reference only; you can skip them if you want.
We first load the MSR package where you can find the
data and functions used in this notebook.
library(MSR)
There are three data frames we can use for the estimation of the Bass model.
Note that the three data frames have the same structure. They all
contain the number of “adoptions” in each time period, or, in Bass
model terms, \(n(t)\). In each data frame, the first
variable “time” gives the time periods in chronological order.
The length of a period depends on the product: for the
iphone data, the periods are quarters of the year; for the
red_bull_ads and sticky_notes data, the
periods are days. The second variable no_of_adoptions
is the number of sales, downloads, or views in each time
period.
# load data from the package
data(list = c("iphone","red_bull_ads","sticky_notes"))
# see the head of each data frame
head(iphone)
## time no_of_adoptions
## 1 Q2/07 270000
## 2 Q3/07 1119000
## 3 Q4/07 2315000
## 4 Q1/08 1703000
## 5 Q2/08 717000
## 6 Q3/08 6892000
head(red_bull_ads)
## time no_of_adoptions
## 1 8/29/2007 328040
## 2 8/30/2007 342780
## 3 8/31/2007 322680
## 4 9/1/2007 341925
## 5 9/2/2007 279350
## 6 9/3/2007 332700
head(sticky_notes)
## time no_of_adoptions
## 1 8/29/2007 2340934
## 2 8/30/2007 2404545
## 3 8/31/2007 2438134
## 4 9/1/2007 2373025
## 5 9/2/2007 2642917
## 6 9/3/2007 2312700
We will now use the iphone data as an example and
analyze the diffusion of iPhones with the Bass model in 3 steps. You may
try the same procedure on the red_bull_ads and
sticky_notes data.
A brief recap of how the Bass model is estimated. Starting from the
Bass equation, a bit of algebra yields a regression of the number of
adoptions on the cumulative number of adoptions and its square.
The derivation is shown below for reference.
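\[
\begin{aligned}
\frac{n\left( t \right)}{M-N\left( t \right)} &= p+q\frac{N\left( t \right)}{M}
\\
\Rightarrow n\left( t \right) &= \left[ p+\frac{qN\left( t \right)}{M} \right] \left[ M-N\left( t \right) \right]
\\
&= pM+\left( q-p \right) N\left( t \right) -\frac{q}{M}\left[ N\left( t \right) \right] ^2
\\
\Rightarrow n\left( t \right) &= a+bN\left( t \right) +c\left[ N\left( t \right) \right] ^2
\end{aligned}
\]
Matching coefficients gives the mapping between \((a,b,c)\) and \((p,q,M)\). Substituting \(p=a/M\) and \(q=-Mc\) into \(b=q-p\) gives the quadratic \(cM^2+bM+a=0\), whose roots are the expression for \(M\) below:
\[
\begin{cases}
a=pM\\
b=q-p\\
c=-\frac{q}{M}\\
\end{cases}\quad \text{and}\quad \begin{cases}
M=\frac{-b\pm \sqrt{b^2-4ac}}{2c}\\
p=\frac{a}{M}\\
q=-Mc\\
\end{cases}
\]
With the derivation in hand, we can now start our analysis.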
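As a quick check of the mapping between \((p,q,M)\) and \((a,b,c)\), the sketch below starts from made-up parameter values (they are assumptions chosen only for illustration, not estimates from any data set), computes \((a,b,c)\), and then recovers \((p,q,M)\) from the quadratic. When \(p\), \(q\) and \(M\) are all positive, \(a>0\) and \(c<0\), so the two roots have opposite signs and the positive root is the market size.

```r
# Hypothetical Bass parameters, for illustration only
p <- 0.002
q <- 0.13
M <- 1e9

# Forward mapping: (p, q, M) -> (a, b, c)
# (named c_coef to avoid masking R's c() function)
a <- p * M
b <- q - p
c_coef <- -q / M

# Inverse mapping: M solves c*M^2 + b*M + a = 0
disc  <- b^2 - 4 * a * c_coef
roots <- (-b + c(-1, 1) * sqrt(disc)) / (2 * c_coef)
M_hat <- max(roots)          # the positive root
p_hat <- a / M_hat
q_hat <- -c_coef * M_hat

# TRUE if the round trip recovers the starting values
all.equal(c(p_hat, q_hat, M_hat), c(p, q, M))
```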
estimate_bass function in MSR

You can use a function in the MSR package to estimate
the model. The function takes one input: the number of adoptions in each
time period, ordered chronologically. It returns the estimated Bass parameters
\(\{p,q,M\}\) as a vector. Use
?estimate_bass or help(estimate_bass) to see
more details about the function.
# estimate the Bass parameters of iphone
bass.iphone <- estimate_bass(iphone$no_of_adoptions)
bass.iphone
## p q M
## 1.736705e-03 1.304279e-01 1.623974e+09
Note: for the interpretation of the results, you need to know:
Coincidance) higher than those of an iPad as the video is
more contagious.

The first step is to run a regression based on the Bass equation. We use
\(n(t)\) as the dependent variable, and
\(N(t)\) and \(N(t)^2\) as independent variables. However,
the data only contain \(n(t)\), the number of
adoptions in each time period, so we first need to create a new variable of
cumulative adoptions, \(N(t)\). For
this, we use the cumsum() function from base R.
Use ?cumsum to see how this function works.
# creating a variable of cumulative adoptions
iphone$cum_adoptions <- cumsum(iphone$no_of_adoptions)
iphone$cum_adoptions
## [1] 270000 1389000 3704000 5407000 6124000 13016000 17379000
## [8] 21172000 26380000 33747000 42484000 51236000 59634000 73736000
## [15] 89971000 108618000 128956000 146029000 183073000 218137000 244165000
## [22] 271075000 318864000 356294000 387535000 421332000 472357000 516076000
## [29] 551279000 590551000 665019000 726189000 773723000 821769000 896548000
## [36] 947741000 988140000
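As a small illustration of what cumsum() does, here is a toy example with made-up numbers (not taken from any of the data sets). Each element of the result is the running total up to that period, and taking differences recovers the per-period values.

```r
# made-up adoptions for three periods (illustrative values only)
n_toy <- c(2, 3, 5)
# running total: N(t) = n(1) + ... + n(t)
N_toy <- cumsum(n_toy)
N_toy
## [1]  2  5 10
# differencing the cumulative series gives back the per-period series
diff(c(0, N_toy))
## [1] 2 3 5
```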
With the new variable, we can run a regression to get the coefficients \((a,b,c)\), using the estimating equation described above. The model for the iphone data is specified as follows:
# run a regression with the no. of adoptions as DV and the cumulative adoptions and its squared term as IVs
# the I(cum_adoptions^2) means we are adding the squared term of cum_adoptions to the regression.
mdl_iphone <- lm(no_of_adoptions ~ 1 + cum_adoptions + I(cum_adoptions^2),iphone)
summary(mdl_iphone)
##
## Call:
## lm(formula = no_of_adoptions ~ 1 + cum_adoptions + I(cum_adoptions^2),
## data = iphone)
##
## Residuals:
## Min 1Q Median 3Q Max
## -14154003 -3426209 -1046772 2410199 21584443
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.820e+06 2.222e+06 1.269 0.213
## cum_adoptions 1.287e-01 1.504e-02 8.559 5.35e-10 ***
## I(cum_adoptions^2) -8.031e-11 1.679e-11 -4.785 3.26e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8080000 on 34 degrees of freedom
## Multiple R-squared: 0.864, Adjusted R-squared: 0.856
## F-statistic: 108 on 2 and 34 DF, p-value: 1.858e-15
From the regression, we can obtain the \((a,b,c)\) as in the empirical Bass equation, where \(a\) is the intercept, \(b\) is the coefficient of the cumulative adoptions, and \(c\) is the coefficient of the squared cumulative adoptions.
a_iphone <- mdl_iphone$coefficients[1]
b_iphone <- mdl_iphone$coefficients[2]
c_iphone <- mdl_iphone$coefficients[3]
a_iphone
## (Intercept)
## 2820363
b_iphone
## cum_adoptions
## 0.1286912
c_iphone
## I(cum_adoptions^2)
## -8.031406e-11
Given the \((a,b,c)\) as in the empirical Bass equation, we can recover the Bass parameters \((p,q,M)\) with the equation shown above.
# obtaining M for iphones
# the quadratic equation has two roots; we take the larger (positive) one as the market size.
# For the max() function, see ?max for more details.
M_iphone <- max((-b_iphone-sqrt(b_iphone^2-4*a_iphone*c_iphone))/(2*c_iphone),
(-b_iphone+sqrt(b_iphone^2-4*a_iphone*c_iphone))/(2*c_iphone))
# given M_iphone, we calculate p and q
p_iphone <- a_iphone/M_iphone
q_iphone <- -c_iphone*M_iphone
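The three steps above can be collected into one helper. This is a minimal sketch, not the MSR implementation of estimate_bass: the function name bass_from_adoptions is hypothetical. So that it runs without the package, the iPhone series is rebuilt here from the cumulative values printed above. If the values are copied correctly, the result should agree with the estimate_bass output shown earlier.

```r
# Cumulative iPhone adoptions, copied from the cum_adoptions output above
cum_adoptions <- c(
  270000, 1389000, 3704000, 5407000, 6124000, 13016000, 17379000,
  21172000, 26380000, 33747000, 42484000, 51236000, 59634000, 73736000,
  89971000, 108618000, 128956000, 146029000, 183073000, 218137000, 244165000,
  271075000, 318864000, 356294000, 387535000, 421332000, 472357000, 516076000,
  551279000, 590551000, 665019000, 726189000, 773723000, 821769000, 896548000,
  947741000, 988140000
)
# per-period adoptions n(t) are the differences of the cumulative series
no_of_adoptions <- diff(c(0, cum_adoptions))

# Hypothetical helper: regression, then recovery of (p, q, M)
bass_from_adoptions <- function(n) {
  N <- cumsum(n)                       # step 1: cumulative adoptions
  fit <- lm(n ~ N + I(N^2))            # step 2: n(t) = a + b*N + c*N^2
  cf <- unname(coef(fit))
  a <- cf[1]; b <- cf[2]; c_coef <- cf[3]
  disc <- b^2 - 4 * a * c_coef
  stopifnot(disc >= 0)                 # real roots are needed for M
  # step 3: larger root of c*M^2 + b*M + a = 0, then p and q
  M <- max((-b + c(-1, 1) * sqrt(disc)) / (2 * c_coef))
  c(p = a / M, q = -c_coef * M, M = M)
}

est_iphone <- bass_from_adoptions(no_of_adoptions)
est_iphone
```

The stopifnot() guard is a design choice of this sketch: with noisy data the discriminant can be negative, in which case no real market size can be recovered from the regression.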
predict_bass function in the package MSR

With the estimated Bass parameters of iphone,
bass.iphone, you can predict the cumulative number
of adoptions \(N(t)\) of iPhones. For
this, you can use the function predict_bass in the package.
The function takes two inputs: T gives the time
periods of your predictions, and bass.par is the vector of
Bass parameters in the order \((p,q,M)\). Use
?predict_bass or help(predict_bass) for more
details.
In the following example, we set T = 38:60 as the data
ends in quarter 37, and we predict the cumulative sales from quarter 38
onwards (until quarter 60). The Bass parameters are from the previous
estimation results bass.iphone.
# set the value of T
T <- 38:60
# do the prediction
N_iphone <- predict_bass(T,bass.iphone)
N_iphone
## [1] 1079193147 1126238861 1170894001 1212977303 1252369828 1289011098
## [7] 1322893267 1354054012 1382568778 1408542931 1432104245 1453396042
## [13] 1472571161 1489786845 1505200575 1518966777 1531234344 1542144862
## [19] 1551831422 1560417931 1568018803 1574738948 1580673993
Here, N_iphone gives you the cumulative no. of adoptions
of iphones at quarter 38, 39, …, 60. You can then use the results in
your report, e.g., creating a growth curve.
With the estimated values of \((p,q,M)\), we can make predictions with the Bass model. For the predictions, you need another input \(T\), the time periods you want to predict. Once \(T\) is specified, you can use the functions that we discussed in the lecture to make predictions.
The equation is shown below, where \(t\)
is the index of a time period, e.g., \(1,2,3,\cdots\). Using the equation, we can
code the prediction of the cumulative number of adoptions from quarter 38
to 60. The Bass parameters are from the above estimation, i.e.,
p_iphone, q_iphone and M_iphone.
\[N(t) =
M\left( 1-\dfrac{p+q}{pe^{(p+q)t}+q} \right)\]
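Two limiting cases give a quick sanity check on this formula, assuming \(p>0\) and \(q>0\): cumulative adoptions start at zero and approach the market size \(M\) as \(t\) grows.

\[
N(0) = M\left( 1-\dfrac{p+q}{p+q} \right) = 0,
\qquad
\lim_{t\rightarrow \infty} N(t) = M\left( 1-0 \right) = M
\]

The limit holds because \(pe^{(p+q)t}\) grows without bound, so the fraction goes to zero.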
# assume we are predicting from the quarter 38 to the quarter 60 for iphones.
# note that the data end at quarter 37.
# a vector of the no. of quarters
T <- 38:60
# getting cumulative no. of adoptions at each time period
N_iphone <- M_iphone*(1 -
(p_iphone+q_iphone)/
(p_iphone*exp((p_iphone+q_iphone)*T)+q_iphone))
N_iphone
## [1] 1079193147 1126238861 1170894001 1212977303 1252369828 1289011098
## [7] 1322893267 1354054012 1382568778 1408542931 1432104245 1453396042
## [13] 1472571161 1489786845 1505200575 1518966777 1531234344 1542144862
## [19] 1551831422 1560417931 1568018803 1574738948 1580673993
With the predicted values in N_iphone, you can include them in your report and present the results to your clients, as in real practice (e.g., by plotting a growth curve from quarter 38 to 60).
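One possible way to draw such a growth curve is sketched below with base R graphics. It is only an illustration. The parameter values are the rounded estimates printed in the estimation output above, and the names bass_cum, quarters, N_pred and n_pred are hypothetical. The period vector is called quarters rather than T because T is also R's shorthand for TRUE.

```r
# Bass parameters as printed in the estimation output above (rounded)
p <- 1.736705e-03
q <- 1.304279e-01
M <- 1.623974e+09

# cumulative adoptions N(t) from the closed-form Bass equation
bass_cum <- function(t, p, q, M) {
  M * (1 - (p + q) / (p * exp((p + q) * t) + q))
}

quarters <- 38:60
N_pred <- bass_cum(quarters, p, q, M)

# predicted adoptions per quarter (quarters 39 to 60) as differences
n_pred <- diff(N_pred)

# growth curve of cumulative adoptions, in millions
plot(quarters, N_pred / 1e6, type = "b",
     xlab = "Quarter", ylab = "Cumulative adoptions (millions)",
     main = "Predicted iPhone adoptions, quarters 38 to 60")
# dashed line at the estimated market size M
# (may lie above the default y-axis range; widen with ylim to show it)
abline(h = M / 1e6, lty = 2)
```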