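Data analysis is a core competency for modern managers: we need to be able not only to build models from data, but also to use those models to formulate strategy. In this workshop, we show you how to:
Build a conjoint analysis model in R
Use the conjoint model to run market simulations
Work hands-on through market segmentation, product design, and product line planning
Practice setting pricing strategies and competitive strategies under a variety of market scenarios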
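Before class we will provide the code and data, so that you can run the market simulations and strategy planning on your own computer during class. If you have not used R and RStudio before, please download and install these two free programs from the following sites before class:
This is not an introductory R course. You do not need to know how to program, but you do need to be able to perform basic operations in the R environment. If you have never used R, please take a look at Introduction to R (R 語言導論), a short, free online course that will quickly teach you the basics of R.
In addition, we will use several R packages in class. Please install them beforehand with the following commands: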
install.packages("grDevices")
install.packages("conjoint")
install.packages("fpc")
install.packages("latex2exp")
install.packages("manipulate")
install.packages("rAmCharts")
install.packages("highcharter")
install.packages("doParallel")
install.packages("foreach")
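Please bring your own computer to class, with the software above installed if at all possible.
Conjoint analysis is an important market research tool. It basically consists of several steps, each of which can be carried out in several different ways:
Each approach has a corresponding R package, and the recent book by Chapman & Feit (2015) introduces all of these models with simple worked cases. Because the overall procedure is quite involved, practitioners usually rely on dedicated, integrated software such as Sawtooth Software to run a conjoint analysis. Here we focus on the application of conjoint analysis to marketing strategy; for an introduction to conjoint analysis itself, please refer to the following tutorial videos:
Alternatively, see Applied Conjoint Analysis (Rao 2014), which you can download from the library website.
Below, we first practice conjoint analysis using R and a simple linear model: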
The conjoint package
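Estimate attribute importance
Estimate part worths (the average value of each level)
Individual’s utility coefficients (the value of each level to each respondent)
Then we use the results of the conjoint analysis to formulate various marketing strategies, including: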
logit model)
IIA principle
library(conjoint)
library(fpc)
library(grDevices)
library(latex2exp)
library(manipulate)
library(rAmCharts)
library(highcharter)
rm(list=ls(all=TRUE))
options(digits=4); par0 = par(cex=0.8)
col1 = c('magenta','steelblue','orange','green3','brown')
This is a product design case for tea. The product has 4 main attributes:
bitter: low, medium, high
variety: black, green, red
kind: bags, granu, leafy
aroma: yes, no
with 11 levels in total:
levels = read.csv("./tea/levels.csv",stringsAsFactors=F)
levels
## levels
## 1 low
## 2 medium
## 3 high
## 4 black
## 5 green
## 6 red
## 7 bags
## 8 granu
## 9 leafy
## 10 yes
## 11 no
From these levels the researchers designed 13 cards (profiles):
profiles = read.csv("./tea/profiles.csv")
profiles
## bitter variety kind aroma
## 1 3 1 1 1
## 2 1 2 1 1
## 3 2 2 2 1
## 4 2 1 3 1
## 5 3 3 3 1
## 6 2 1 1 2
## 7 3 2 1 2
## 8 2 3 1 2
## 9 3 1 2 2
## 10 1 3 2 2
## 11 1 1 3 2
## 12 2 2 3 2
## 13 3 2 3 2
and invited 100 respondents, asking each of them to rate every card (full-profile rating), with the cards presented in a different order to each respondent.
ratings = read.csv("./tea/ratings.csv")
dim(ratings); head(ratings)
## [1] 100 13
## prof1 prof2 prof3 prof4 prof5 prof6 prof7 prof8 prof9 prof10 prof11 prof12 prof13
## 1 8 1 1 3 9 2 7 2 2 2 2 3 4
## 2 0 10 3 5 1 4 8 6 2 9 7 5 2
## 3 4 10 3 5 4 1 2 0 0 1 8 9 7
## 4 6 7 4 9 6 3 7 4 8 5 2 10 9
## 5 5 1 7 8 6 10 7 10 6 6 6 10 7
## 6 10 1 1 5 1 0 0 0 0 0 0 1 1
Extract the attribute names and the number of levels of each attribute
(att = apply(profiles,2,max) )
## bitter variety kind aroma
## 3 3 3 2
Build a mapping between levels and attributes (for later use)
(attl = unlist(sapply(1:length(att), function(x) rep(x , att[x]))) )
## [1] 1 1 1 2 2 2 3 3 3 4 4
Compute the relative importance of each product attribute
im = caImportance(y=ratings, x=profiles)
names(im) = names(att)
im
## bitter variety kind aroma
## 24.76 32.22 27.15 15.88
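A common definition of attribute importance is the range of an attribute's part-worths divided by the sum of the ranges over all attributes. The sketch below illustrates only that definition, on made-up part-worths for a single hypothetical respondent; whether caImportance aggregates over respondents in exactly this way is an assumption that is not confirmed here.

```r
# Hypothetical part-worths for ONE respondent (illustrative values only)
pw <- c(low = -1.5, medium = -1.1, high = 2.6,
        black = -0.5, green = -0.7, red = 1.2,
        bags = 0.7, granu = -1.5, leafy = 0.8,
        yes = 0.6, no = -0.6)

# Map each level to its attribute, keeping the attribute order of the case
attr_names <- c("bitter", "variety", "kind", "aroma")
attr_of <- factor(rep(attr_names, times = c(3, 3, 3, 2)), levels = attr_names)

# Range of part-worths within each attribute
rng <- sapply(split(pw, attr_of), function(x) max(x) - min(x))

# Importance (%) = share of each attribute's range in the total range
importance <- 100 * rng / sum(rng)
round(importance, 1)
```

Under this definition an attribute matters more when switching between its best and worst level moves utility more.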
Compute the average part-worth of each level
apw = caUtilities(y=ratings, x=profiles, z=levels)
##
## Call:
## lm(formula = frml)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5,189 -2,376 -0,751 2,213 7,513
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3,5534 0,0907 39,18 < 2e-16 ***
## factor(x$bitter)1 0,2402 0,1325 1,81 0,07 .
## factor(x$bitter)2 -0,1431 0,1149 -1,25 0,21
## factor(x$variety)1 0,6149 0,1149 5,35 1,0e-07 ***
## factor(x$variety)2 0,0349 0,1149 0,30 0,76
## factor(x$kind)1 0,1369 0,1149 1,19 0,23
## factor(x$kind)2 -0,8898 0,1325 -6,72 2,8e-11 ***
## factor(x$aroma)1 0,4108 0,0849 4,84 1,5e-06 ***
## ---
## Signif. codes: 0 '***' 0,001 '**' 0,01 '*' 0,05 '.' 0,1 ' ' 1
##
## Residual standard error: 2,97 on 1292 degrees of freedom
## Multiple R-squared: 0,09, Adjusted R-squared: 0,0851
## F-statistic: 18,3 on 7 and 1292 DF, p-value: <2e-16
names(apw) = c('intercept', levels[,1])
apw
## intercept low medium high black green red bags granu
## 3.55336 0.24023 -0.14311 -0.09711 0.61489 0.03489 -0.64977 0.13689 -0.88977
## leafy yes no
## 0.75289 0.41078 -0.41078
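This looks a lot like linear regression. But can you see how its estimates for the attribute levels differ from those of an ordinary linear regression?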
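One observation that may help: in the printed part-worths, the levels of each attribute sum to zero (for example low + medium + high), which is what sum-to-zero (effects) coding produces. The sketch below contrasts R's default treatment coding with contr.sum on simulated ratings; the data and variable names are hypothetical, and it is an assumption that the conjoint package fits its model in this way.

```r
set.seed(42)
# Simulated ratings for one attribute with three levels (hypothetical data)
bitter <- factor(rep(c("low", "medium", "high"), each = 20),
                 levels = c("low", "medium", "high"))
true_mean <- c(low = 5, medium = 3, high = 4)
rating <- unname(true_mean[as.character(bitter)]) + rnorm(60, sd = 0.1)

# Default treatment coding: coefficients are differences from the first level
fit_trt <- lm(rating ~ bitter)

# Sum-to-zero coding: coefficients are deviations from the grand mean
fit_sum <- lm(rating ~ bitter, contrasts = list(bitter = "contr.sum"))
b <- unname(coef(fit_sum))

# The last level is not estimated directly; it is minus the sum of the others
part_worths <- c(low = b[2], medium = b[3], high = -(b[2] + b[3]))
round(part_worths, 2)
sum(part_worths)   # zero up to rounding error
```

With treatment coding one level per attribute is fixed at zero; with sum-to-zero coding every level gets a part-worth measured against the average.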
Plot the average attribute importance and the average part worths side by side
par(mfcol=c(1,2),cex=.8)
barplot(im,las=2,col=1:length(im),main="Attribute Importance")
barplot(apw[2:length(apw)],las=2,col=attl,main="Average Part Worths (utils)")
However, the point of conjoint analysis is not the population average but individual differences!
Regardless of the design and model, the respondents’ part worths are the most important output of any conjoint analysis. Given a specific product, these coefficients predict the utility of each respondent. Each respondent has a unique utility function, and the coefficients are usually organized as a matrix. In this session, we name the matrix W.
W = caPartUtilities(y=ratings, x=profiles, z=levels)
dim(W); head(W)
## [1] 100 12
## intercept low medium high black green red bags granu leafy yes no
## [1,] 3.394 -1.517 -1.141 2.659 -0.475 -0.675 1.149 0.659 -1.517 0.859 0.629 -0.629
## [2,] 5.049 3.391 -0.695 -2.695 -1.029 0.971 0.057 1.105 -0.609 -0.495 -0.681 0.681
## [3,] 4.029 2.563 -1.182 -1.382 -0.248 2.352 -2.103 -0.382 -2.437 2.818 0.776 -0.776
## [4,] 5.856 -1.149 -0.025 1.175 -0.492 1.308 -0.816 -0.825 -0.149 0.975 0.121 -0.121
## [5,] 6.250 -2.333 2.567 -0.233 -0.033 -0.633 0.667 -0.233 -0.333 0.567 -1.250 1.250
## [6,] 1.578 -0.713 -0.144 0.856 1.456 -0.744 -0.713 0.656 -0.713 0.056 1.595 -1.595
Respondent \(i\)’s utility for product \(j\) defined by \((a_{1j},a_{2j},a_{3j},a_{4j})\) is:
\(u_{i,j} = intercept\left [i \right ] + \sum_{k=1}^4 W\left [i, a_{kj} \right ]\)
For example, respondent 6’s utility for product (low,black,leafy,no) is:
\(u_{6,(low,black,leafy,no)}=1.578-0.713+1.456+0.056-1.595=0.782\)
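Given a product c(1,2,1,2), how would you compute its value to every respondent?
We can define a utility function UT(pd) that computes the value of product pd to each of the 100 respondents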
UT = function(v) rowSums(W[, c(1, v + c(1,4,7,10))])
UT(c(1,2,1,2))
## [1] 1.232 11.197 7.786 5.069 4.301 -0.818 0.754 4.065 3.338 3.980 4.379 4.548 1.107
## [14] -1.124 0.229 1.239 6.317 0.525 4.307 0.007 4.607 6.221 0.418 3.741 8.318 1.232
## [27] 11.197 7.266 5.069 4.301 -0.818 0.754 4.065 3.338 4.113 4.332 4.294 1.265 -1.124
## [40] 0.663 1.446 6.317 3.165 4.307 1.086 3.394 6.221 2.300 -1.841 1.480 1.511 0.345
## [53] 5.772 1.514 3.011 6.103 2.194 6.700 1.706 8.255 7.428 3.659 0.938 4.203 5.400
## [66] 2.510 0.449 5.003 1.514 3.011 6.735 -0.324 1.232 11.197 7.786 5.069 4.301 -0.818
## [79] 1.554 1.232 11.197 7.617 5.069 4.113 1.265 -1.124 2.535 1.446 6.317 4.182 7.111
## [92] 0.354 4.307 1.903 4.607 5.672 2.211 4.978 4.451 4.504
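As a check on the indexing inside UT, the sketch below recomputes respondent 6's utility for c(1,2,1,2) from the sixth row of W as printed above. The names W6 and UT1 are hypothetical; UT1 is the same expression as UT with W passed in as an argument and drop = FALSE added so that it also works on a one-row matrix.

```r
# Sixth row of W, copied from the printed head(W)
W6 <- matrix(c(1.578, -0.713, -0.144, 0.856, 1.456, -0.744,
               -0.713, 0.656, -0.713, 0.056, 1.595, -1.595),
             nrow = 1,
             dimnames = list(NULL, c("intercept", "low", "medium", "high",
                                     "black", "green", "red",
                                     "bags", "granu", "leafy", "yes", "no")))

# Column offsets 1, 4, 7, 10 skip the intercept and the preceding attributes
UT1 <- function(v, W) rowSums(W[, c(1, v + c(1, 4, 7, 10)), drop = FALSE])

u6 <- UT1(c(1, 2, 1, 2), W6)   # intercept + low + green + bags + no
u6
```

The result, -0.818, matches the sixth entry of the UT(c(1,2,1,2)) output.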
Given a list of products, list(c(1,1,1,1), c(2,2,2,2), c(1,2,1,2), c(2,1,2,1)), how would you compare their values?
pds = list(c(1,1,1,1), c(2,2,2,2), c(1,2,1,2), c(2,1,2,1))
uts = sapply(pds, UT )
dim(uts); head(uts)
## [1] 100 4
## [,1] [,2] [,3] [,4]
## [1,] 2.690 -0.568 1.232 0.890
## [2,] 7.835 5.397 11.197 2.035
## [3,] 6.738 1.986 7.786 0.938
## [4,] 3.511 6.869 5.069 5.311
## [5,] 2.401 9.101 4.301 7.201
## [6,] 4.572 -1.618 -0.818 3.772
If every consumer buys the product that is most valuable to her, what is the market share of each product?
(tb = table(apply(uts, 1, which.max)))
##
## 1 2 3 4
## 59 15 19 7
amPie(data.frame(label=names(tb), value=as.vector(tb)),
inner_radius=50, depth=10, show_values=TRUE, legend=TRUE)
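The first-choice rule used above can be traced by hand on the six rows of uts printed earlier. This is only a sketch on that six-row excerpt (the object name uts6 is hypothetical), so the shares differ from the full 100-respondent result.

```r
# First six rows of uts, copied from the printed head(uts)
uts6 <- rbind(c(2.690, -0.568,  1.232, 0.890),
              c(7.835,  5.397, 11.197, 2.035),
              c(6.738,  1.986,  7.786, 0.938),
              c(3.511,  6.869,  5.069, 5.311),
              c(2.401,  9.101,  4.301, 7.201),
              c(4.572, -1.618, -0.818, 3.772))

# Each respondent picks the product (column) with the highest utility
choice <- apply(uts6, 1, which.max)

# tabulate() keeps a zero for products nobody chooses; table() would drop them
counts <- tabulate(choice, nbins = ncol(uts6))
share  <- 100 * counts / sum(counts)
choice
counts
```

Note that this rule ignores price and assumes every respondent buys exactly one product.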
Set the cost of each level in costs, then define a cost function PC(pd)
costs = list(bitter = c(0.3, 0.2, 0.1),
variety = c(0.9, 0.5, 0.2),
kind = c(0.5, 0.3, 0.6),
aroma = c(0.7, 0.4) )
PC = function(pd) { 2.5 + sum(sapply(1:length(att), function(i) costs[[i]][pd[i]] )) }
PC(c(1,1,3,1))
## [1] 5
Given a product and a price, we can compute the sales quantity, revenue, profit, and penetration:
Given price \(p\), marginal cost \(c\), and respondent \(i\)’s utility (\(u_i\)):
\(\left\{\begin{matrix} q(p) = \sum_i I(u_i > p) & quantity\\ r(p) = q(p) * p & revenue\\ \pi(p) = q(p) * (p - c) & profit \end{matrix}\right.\)
pqr( seq(p0,p1,0.1), PC(pd1), UT(pd1)): given product pd1, plot the following over the price range seq(p0,p1,0.1):
Demand curve: \(q(p)\)
Revenue curve: \(r(p)\)
Profit curve: \(\pi(p)\)
pqr = function(p, m, u) {
q = sapply(p, function(x) sum(u > x) )
pf = q * (p-m); ip = which.max(pf)
r = q * p; ir = which.max(r)
par(cex=0.7,mar=c(5,4,4,3))
plot(p,q,type='l',main="Demand Curve",xlab="price",ylab="qty",ylim=c(0,80))
abline(v=p[ip],col='pink'); abline(v=p[ir],col='cyan')
abline(h=seq(20,80,20),lty=3,col='grey')
plot(p, r, type='l',main="Revenue",ylab="",ylim=c(0,300),font.lab=2,
xlab=sprintf("P = %.1f, Q = %d, R = %.1f, Pf = %.1f",
p[ir], q[ir], r[ir], pf[ir] ))
abline(v=p[ir],col='cyan');
abline(h=seq(50,250,50),lty=3,col='grey')
plot(p, pf, type='l',main="Profit",ylab="",ylim=c(0,110),font.lab=2,
xlab=sprintf("P = %.1f, Q = %d, R = %.1f, Pf =%.1f",
p[ip], q[ip], r[ip], pf[ip] ) )
abline(v=p[ip],col='pink');
abline(h=seq(20,100,20),lty=3,col='grey')
text(3.5,100,sprintf("MC = %.1f",m),pos=4,font=2)
}
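The quantities that pqr plots can also be computed without any graphics. The sketch below applies the definitions of \(q(p)\), \(r(p)\) and \(\pi(p)\) to six hypothetical utilities and a hypothetical marginal cost, to show that the revenue-maximizing and profit-maximizing prices need not coincide.

```r
# Hypothetical utilities of six respondents and a hypothetical marginal cost
u    <- c(3.5, 3.8, 4.2, 4.5, 7, 9)
cost <- 2.5
p    <- c(3, 4, 6, 8)                    # candidate prices

q  <- sapply(p, function(x) sum(u > x))  # demand: respondents with u > price
r  <- q * p                              # revenue
pf <- q * (p - cost)                     # profit

data.frame(price = p, qty = q, revenue = r, profit = pf)
p[which.max(r)]    # price that maximizes revenue
p[which.max(pf)]   # price that maximizes profit
```

In this toy case revenue peaks at the lowest price while profit peaks at a higher one, because each extra unit sold at a low price earns little margin over cost.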
We can plot the histogram of the utilities of product pd1 together with the demand, revenue, and profit curves
pd1 = c(1,1,3,1)
ut1 = UT(pd1)
par(mfcol=c(2,2),cex=0.7,mar=c(5,3,4,3))
hist(ut1,-5:13,main="Distribution of Utility",xaxt='n',col='gray',
border='white',xlab="Utility",ylab="",xlim=c(-4,14))
axis(1,at=seq(-4,14,2))
pqr(seq(3.5,11,0.1),PC(pd1),ut1)
For convenience, we can also turn this code into a market simulator.
source("sim1.R")
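Using the single-product market simulator sim1.R: according to the APW (see 1.5 above), what product specification and price give the best profit and revenue?
Are these really the best profit and revenue? Use the simulator to search for the best profit and revenue.
Are the results you find the same as those found with the APW?
What are the strategic implications of this exercise?
We can also write a program that scans all possible products (3x3x3x2=54) and finds the best revenue and profit for each of them.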
pds = as.matrix(expand.grid(1:3,1:3,1:3,1:2))
X = t(apply(pds, 1, function (v) {
c = PC(v)
u = UT(v)
X = t( sapply(seq(3,10,0.1), function (p) {
q = sum(u > p)
c(p, q, q * p, q * (p - c)) }) )
c(mean(u), c, X[which.max(X[,3]),], X[which.max(X[,4]),])
}))
X = data.frame(cbind(pds,X))
colnames(X) = c('v1','v2','v3','v4', # product spec
'ut', # average utility of the product
'cost', # cost
'p1','q1','r1','pf1', # price, quantity, revenue, profit at max. revenue
'p2','q2','r2','pf2' # price, quantity, revenue, profit at max. profit
)
head(X[order(- X$r1),],10) # Max Revenue
## v1 v2 v3 v4 ut cost p1 q1 r1 pf1 p2 q2 r2 pf2
## 21 3 1 3 1 5.235 4.8 5.0 62 310.0 12.4 7.0 27 189.0 59.4
## 19 1 1 3 1 5.572 5.0 3.9 74 288.6 -81.4 9.9 13 128.7 63.7
## 3 3 1 1 1 4.619 4.7 4.5 64 288.0 -12.8 5.8 41 237.8 45.1
## 1 1 1 1 1 4.956 4.9 4.2 66 277.2 -46.2 7.2 23 165.6 52.9
## 2 2 1 1 1 4.573 4.8 3.7 72 266.4 -79.2 5.7 32 182.4 28.8
## 20 2 1 3 1 5.189 4.9 4.3 61 262.3 -36.6 8.1 16 129.6 51.2
## 23 2 2 3 1 4.609 4.5 5.6 46 257.6 50.6 6.6 33 217.8 69.3
## 22 1 2 3 1 4.992 4.6 4.7 54 253.8 5.4 8.2 24 196.8 86.4
## 24 3 2 3 1 4.655 4.4 4.7 54 253.8 16.2 8.5 19 161.5 77.9
## 46 1 1 3 2 4.751 4.7 4.3 59 253.7 -23.6 7.5 20 150.0 56.0
head(X[order(- X$pf2),],10) # Max Profit
## v1 v2 v3 v4 ut cost p1 q1 r1 pf1 p2 q2 r2 pf2
## 22 1 2 3 1 4.992 4.6 4.7 54 253.8 5.4 8.2 24 196.8 86.4
## 49 1 2 3 2 4.171 4.3 5.1 47 239.7 37.6 8.9 17 151.3 78.2
## 24 3 2 3 1 4.655 4.4 4.7 54 253.8 16.2 8.5 19 161.5 77.9
## 51 3 2 3 2 3.833 4.1 5.2 44 228.8 48.4 7.0 25 175.0 72.5
## 27 3 3 3 1 3.970 4.1 6.0 33 198.0 62.7 8.6 16 137.6 72.0
## 23 2 2 3 1 4.609 4.5 5.6 46 257.6 50.6 6.6 33 217.8 69.3
## 50 2 2 3 2 3.787 4.2 5.4 42 226.8 50.4 7.2 23 165.6 69.0
## 54 3 3 3 2 3.149 3.8 5.6 28 156.8 50.4 6.7 23 154.1 66.7
## 19 1 1 3 1 5.572 5.0 3.9 74 288.6 -81.4 9.9 13 128.7 63.7
## 4 1 2 1 1 4.376 4.5 5.3 40 212.0 32.0 7.0 25 175.0 62.5
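Using X, try to:
maximize revenue without making a loss
maximize profit while keeping penetration above 30%
In reality, you often have to plan a strategy without a clearly stated goal. Try to:
set a business objective yourself
find the best strategy for achieving that objective
then share your objective and strategy, and the logic behind them, with the class
Finally, think about this: why do bosses so often ask you to plan a strategy without telling you a clear objective?
Maximize revenue without making a loss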
subset(X[order(- X$r1),], pf1>0)[1:5,]
## v1 v2 v3 v4 ut cost p1 q1 r1 pf1 p2 q2 r2 pf2
## 21 3 1 3 1 5.235 4.8 5.0 62 310.0 12.4 7.0 27 189.0 59.4
## 23 2 2 3 1 4.609 4.5 5.6 46 257.6 50.6 6.6 33 217.8 69.3
## 22 1 2 3 1 4.992 4.6 4.7 54 253.8 5.4 8.2 24 196.8 86.4
## 24 3 2 3 1 4.655 4.4 4.7 54 253.8 16.2 8.5 19 161.5 77.9
## 49 1 2 3 2 4.171 4.3 5.1 47 239.7 37.6 8.9 17 151.3 78.2
Maximize profit while keeping penetration above 30%
subset(X[order(- X$pf2),], q2>30)[1:5,]
## v1 v2 v3 v4 ut cost p1 q1 r1 pf1 p2 q2 r2 pf2
## 23 2 2 3 1 4.609 4.5 5.6 46 257.6 50.6 6.6 33 217.8 69.3
## 53 2 3 3 2 3.103 3.9 5.5 31 170.5 49.6 5.5 31 170.5 49.6
## 3 3 1 1 1 4.619 4.7 4.5 64 288.0 -12.8 5.8 41 237.8 45.1
## 5 2 2 1 1 3.993 4.4 4.6 50 230.0 10.0 5.6 32 179.2 38.4
## 2 2 1 1 1 4.573 4.8 3.7 72 266.4 -79.2 5.7 32 182.4 28.8
Plot the optimal revenue and profit of every product
df = data.frame(revenue=c(X$r1,X$r2),profit=c(X$pf1,X$pf2),
p=c(X$p1,X$p2),q=c(X$q1,X$q2),
opt=c(rep('opt.revenue',nrow(X)),rep('opt.profit',nrow(X))),
lab=rep(apply(X[,1:4],1,paste0,collapse=''),2) )
hchart(df, "scatter", x=revenue, y=profit, group=opt, lab, p, q) %>%
hc_add_theme(hc_theme_flatdark()) %>%
hc_tooltip(headerFormat = "",valueDecimals=1,borderWidth=2,
hideDelay=100,useHTML=T,padding=3,
pointFormat="<center><b>({point.lab})</b></center> price: {point.p}<br>
qty: {point.q}<br> RV: {point.x}<br> PF: {point.y}") %>%
hc_colors(hex_to_rgba(c('green','orange'), alpha = 0.65)) %>%
hc_legend(floating=T,align='left',verticalAlign='bottom')
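Based on this chart:
Review your strategy. Would you want to adjust it? How?
When there are multiple objectives, what makes a strategy reasonable? What conditions must a reasonable strategy satisfy?
Can you identify in the chart which products are "reasonable"?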
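One possible reading of "reasonable" under two objectives is Pareto efficiency: a product is kept only if no other product offers at least as much revenue and at least as much profit, with one of the two strictly higher. This interpretation is an assumption, not something the workshop states. The sketch applies it to five rows taken from the table above (revenue r1 and profit pf1 at the revenue-maximizing price); the names cand and dominated are hypothetical.

```r
# (product row, r1, pf1) copied from the "maximize revenue without a loss" table
cand <- data.frame(id      = c(21, 23, 22, 24, 49),
                   revenue = c(310.0, 257.6, 253.8, 253.8, 239.7),
                   profit  = c(12.4, 50.6, 5.4, 16.2, 37.6))

# Row i is dominated if some row is no worse on both measures and better on one
dominated <- sapply(seq_len(nrow(cand)), function(i) {
  any(cand$revenue >= cand$revenue[i] & cand$profit >= cand$profit[i] &
      (cand$revenue > cand$revenue[i] | cand$profit > cand$profit[i]))
})

cand[!dominated, ]   # the non-dominated (Pareto-efficient) candidates
```

Among these five, only products 21 and 23 survive; choosing between them still requires deciding how much revenue to trade for profit.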
Within R, segmentation is simply a function call
set.seed(123)
S = kmeans(W,3)$cluster
table(S)
## S
## 1 2 3
## 34 28 38
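What are the segmentation variables here?
Besides these, what other segmentation variables are commonly used?
Are they the ideal segmentation variables? Why?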
sapply(1:max(S), function(i) colMeans(W[S==i,]))
## [,1] [,2] [,3]
## intercept 4.46106 4.0602 2.36787
## low 0.44326 0.2668 0.03905
## medium 0.29612 -0.7620 -0.08003
## high -0.73918 0.4952 0.04103
## black -0.26476 -0.1239 1.94629
## green 1.10582 0.4047 -1.19582
## red -0.84115 -0.2808 -0.75058
## bags -0.03909 -0.3239 0.63403
## granu -0.40962 -2.0666 -0.45229
## leafy 0.44915 2.3904 -0.18176
## yes -0.08138 0.8737 0.51003
## no 0.08138 -0.8737 -0.51003
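The table above is the mean of each part-worth within each k-means segment. A minimal sketch of the same two steps on simulated data follows; the matrix Wsim and its two well-separated groups are hypothetical, and nstart is added only to make the clustering less sensitive to the random start.

```r
set.seed(1)
# Simulated part-worths: 10 respondents who dislike both levels, 10 who like both
Wsim <- rbind(matrix(rnorm(10 * 2, mean = -3, sd = 0.3), ncol = 2),
              matrix(rnorm(10 * 2, mean =  3, sd = 0.3), ncol = 2))
colnames(Wsim) <- c("black", "green")

# Step 1: cluster respondents on their part-worths
seg <- kmeans(Wsim, centers = 2, nstart = 10)$cluster
table(seg)

# Step 2: average the part-worths within each segment (one column per segment)
seg_means <- sapply(sort(unique(seg)),
                    function(i) colMeans(Wsim[seg == i, , drop = FALSE]))
round(seg_means, 2)
```

Cluster labels are arbitrary, so segment 1 in one run may correspond to segment 2 in another; only the grouping itself is meaningful.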
What is the definition of APWPS? What are its strategic implications?
Let’s build a function Seg(k, seed)
Seg = function(k, seed, seeding=F) {
P = matrix(rep(0, k*4 ), ncol=4)
U = matrix(rep(0, k*nrow(W)), ncol=k )
lx = c(1,2,2)
for(i in 2:k) lx = c(lx, (i-1)*2 + c(1,2,2) )
h = rep.int(1,k)
if(k==3) h[1]=2 else if(k>=4) h[1]=3
layout(matrix(c(1,2,2,2+lx),k+1,3,byrow=T),heights=h )
sd = ifelse(seeding, sample.int(1000,1), seed)
set.seed(sd)
S = kmeans(W,k)$cluster
n = as.vector(table(S))
# cat(k, 'segments, seed =', sd, ', N =', n, '\n')
m = apply(W[,2:ncol(W)], 2, function(x) tapply(x ,S, mean))
par(mar=c(2,1,2,1))
pie(table(S),radius=1,col=col1[1:k],font=2)
dc = discrcoord(W, S)$proj[,1:2]
par(mar=c(1,2,1,2))
plot(dc[,1],dc[,2],type='p',col=col1[S],pch=19,cex=1.5)
for(i in 1:k) {
par(mar=c(2,2,2,1))
barplot(m[i,],las=2,axes=F,axisnames=F,col=col1[i],
width=.5,space=1,border=NA)
abline(h=0, v=c(0,3,6,9,11)+0.25 )
P[i,1] = which.max(m[i,1:3])
P[i,2] = which.max(m[i,4:6])
P[i,3] = which.max(m[i,7:9])
P[i,4] = which.max(m[i,10:11])
U[,i] = rowSums(W[, c(1, P[i,]+c(1,4,7,10)) ])
mtx = sapply(1:k, function (k) sapply(-4:13, function(x)
sum(U[S==k,i] >= x & U[S==k,i] < x+1 )) )
mtx = t(as.matrix(mtx))
par(cex=0.6,mar=c(2,3,2,2))
barplot(mtx,las=2,ylim=c(0,30),col=col1)
abline(h=seq(5,25,5),col='lightgrey',lty=3)
z = paste(P[i,],collapse=', ')
text(0,28,sprintf("Product %d: {%s}, %.1f%%",
i,z,100*n[i]/nrow(W)),cex=1.2,pos=4,col=col1[i])
}}
As shown below, Seg(k, seed) makes k segments with the given seed, and plots:
a pie chart that shows the proportions of the segments
a scatter chart that marks every respondent in the reduced product attribute space
Seg(3,779)
In RStudio, building a simulator is actually quite easy:
manipulate( Seg(k, 123, seeding),
k = slider(2,5,2,step=1),
seeding = button("Reset Seed") )
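Use this simulator to answer the following questions:
Do consumers in the same segment make similar value judgments? Why? Can you observe this on the dashboard?
For \(k=2,3\), find the segmentation you think is best. Note down the seed so that we can compare.
From the dashboard, how do you judge the quality of a segmentation?
Is a market segmentation that suits product design also suitable for pricing, channel, and media strategies?
Beyond the value coefficients, how can respondents' demographic or lifestyle variables help our marketing strategy? How should we use this information?
If there is more than one product on the market, consumers need to consider two conditions at once when making a decision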
Given products \(j \in \left \{ 1,2 \right \}\), prices \(p_j\), costs \(c_j\), and respondent \(i\)’s utilities \(u_{i,j}\) :
\(\left\{\begin{matrix} q_j(p_j,p_{j'}) = \sum_i I(u_{i,j} > p_j \wedge u_{i,j} - p_j > u_{i,j'} - p_{j'} ) & quantity\\ r_j(p_j,p_{j'}) = q_j(p_j,p_{j'}) * p_j & revenue\\ \pi_j(p_j,p_{j'}) = q_j(p_j,p_{j'}) * (p_j - c_j) & profit \end{matrix}\right.\)
p1 = seq(4,12,0.1); p2 = seq(4,12,0.1)
pd1 = c(1,1,3,1); pd2 = c(3,2,3,1)
mc1 = PC(pd1); mc2 = PC(pd2)
u1 = UT(pd1); u2 = UT(pd2)
x1 = 7.9; x2 = 6.5
s1 = u1 - x1; s2 = u2 -x2
dcs = rep(4, nrow(W))
dcs[s1 > 0 & s1 > s2] = 1
dcs[s2 > 0 & s2 > s1] = 2
dcs[s1 > 0 & s1 == s2] = 3
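The classification into dcs can be followed on a handful of respondents. The sketch below uses four hypothetical utility pairs and hypothetical prices chosen so that each of the four outcomes (buy product 1, buy product 2, tie, buy nothing) occurs exactly once.

```r
# Hypothetical utilities of four respondents for two products
u1 <- c(9.0, 6.0, 8.0, 5.0)
u2 <- c(6.0, 8.0, 7.0, 5.5)
x1 <- 7.5; x2 <- 6.5             # hypothetical prices

s1 <- u1 - x1                    # surplus from product 1
s2 <- u2 - x2                    # surplus from product 2

choice <- rep("none", length(u1))          # default: buy nothing
choice[s1 > 0 & s1 > s2]  <- "product 1"   # positive surplus and the larger one
choice[s2 > 0 & s2 > s1]  <- "product 2"
choice[s1 > 0 & s1 == s2] <- "tie"         # equal positive surplus
data.frame(s1, s2, choice)
```

The exact comparison s1 == s2 works here because the numbers are exactly representable; with arbitrary decimals a tolerance-based comparison may be safer.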
Helper function for histograms
clr = c('magenta','blue','gold','gray')
Hist = function(id, p, u, pd1, pd2, dcs) {
mtx = sapply(1:max(dcs), function (d) sapply(-4:13, function(x)
sum(u[dcs==d] >= x & u[dcs==d] < x+1 )) )
mtx = t(as.matrix(mtx))
colnames(mtx) = -4:13
par(cex=0.7,mar=c(3,3,2,2))
barplot(mtx,ylim=c(0,30),col=c(clr[1:2],'pink','gray'),cex.axis=0.8,
width=0.8,space=0.25,border=NA)
abline(v = p + 4.1, col=clr[id])
abline(h=seq(5,25,5),col='lightgrey',lty=3)
if(id==1) pdi=pd1 else pdi=pd2
tz = sprintf("Dist of Utility (%d): %s", id, paste(pdi,collapse=',') )
text(0,28,tz,col=clr[id],pos=4,font=2)
}
Helper function for profit charts
Profit = function (id, p, px, m, u, s2, flag=F) {
q = sapply(p, function(x) sum(u > x & u - x > s2) )
pf = (p-m)*q
pst = p[which.max(pf)]
qx = sum(u > px & u - px > s2) + 0.5*sum(u > px & u - px == s2)
rx = qx * px
pfx = qx * (px-m)
par(cex=0.7,mar=c(3,3,3,2))
plot(p, pf, type='l',ylab="",ylim=c(0,100),xlab="",main=
sprintf("mc = %.1f\nProfit = %.1f @ %.1f",m,pfx,px),
col.main=clr[id], cex.axis=0.8, cex.main=1)
abline(v=px,col=clr[id])
abline(v=pst,lty=2)
abline(h=c(20,40,60,80),col='lightgray',lty=2)
text(max(p)-0.5, 103,TeX(sprintf("%.1f^*",pst)),pos=1)
if(flag) points(max(p)-0.6, 100, pch=19, col='green')
return( c(pfx,rx,pst) )
}
Plot the two-product simulation dashboard
layout(matrix(c(1,2,3,4,5,7,6,7),4,2,byrow=T),
widths=c(1,1.5),heights=c(1,1,0.8,1.2))
f1 = Profit(1, p1, x1, mc1, u1, s2)
Hist(1,x1,u1,pd1,pd2,dcs)
f2 = Profit(2, p2, x2, mc2, u2, s1)
Hist(2,x2,u2,pd1,pd2,dcs)
pene = 100 * sum(dcs!=4)/nrow(W)
tpro = sum(dcs==1)*(x1 - mc1) + sum(dcs==2)*(x2 - mc2)
trev = sum(dcs==1)*x1 + sum(dcs==2)*x2
tb = table(dcs)
par(mar=c(1,1,2,1))
pie(tb,paste0(as.character(tb),'%'),col=clr[as.integer(names(tb))],
main=sprintf("Penetration = %.1f%%",pene),
cex.main=1.1,font=2,cex=1,radius=0.9)
par(mar=c(3,4,3,2),cex=0.7)
bp = cbind(c(f1[1],f1[2]-f1[1],0,0),c(0,0,f2[1],f2[2]-f2[1]))
barplot(bp,names=c('prod-1','prod-2'),main="Revenue & Profit (Margin)",
ylim=c(0,280),col=c(clr[1],'pink',clr[2],'cyan'),
cex.main=1.1,border=NA )
abline(h=seq(50,250,50),lty=3,col='gray')
for(t in list(c(0.7,f1[1],f1[1]),c(0.7,f1[2],f1[2]),
c(1.9,f2[1],f2[1]),c(1.9,f2[2],f2[2]) ) )
text(t[1],t[2]-10,t[3],font=2,pos=3)
text(0.7,250,sprintf("(%.1f%%)",100*f1[1]/f1[2]),font=2,pos=3)
text(1.9,250,sprintf("(%.1f%%)",100*f2[1]/f2[2]),font=2,pos=3)
par(mar=c(6,3,6,2))
plot(u1,u2,pch=20,col=clr[dcs],cex=2,ylab="",xlab="",main=
sprintf("\n\nUtility Space\nTotal Revenue (%.1f) & Profit (%.1f)",
trev,tpro),cex.main=1.1)
abline(v=x1,col=clr[1]);abline(h=x2,col=clr[2])
lines(x=c(x1,x1+10),y=c(x2,x2+10),col='green')
source("sim3.R")
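Given PD1=c(1,1,3,1), priced at $7.5:
If we launch the same product priced at $7.0, what happens?
If we raise our price to $7.4, is the result better?
If we sell at $7.4, how will the competitor respond?
How will this game end?
What happens if the two firms have different marginal costs?
What are the strategic implications of this exercise?
Suppose there are two competing products on the market, PD1=c(3,3,3,2) & PD2=c(1,3,3,2):
What are the two sides' revenues and profits at equilibrium?
If you were PD1, would you want to adjust your price at equilibrium? To what price?
If you were PD2, how would you respond to PD1's price adjustment?
If both sides aim to maximize profit, what is the outcome?
If PD1 aims to maximize market share, will the outcome be the same?
Think about it: what are the strategic implications of this exercise?
Finally, an open-ended exercise. Suppose a competing product PD1=c(3,3,3,2) @ $7.5 is already on the market: what product and price would you enter the market with? Why?
What is your strategic objective?
What are your product specification and price? Your revenue, profit, and penetration?
Are you sure this is the best strategy for achieving your objective? How can you be sure?
After seeing the result, would you want to adjust your strategic objective? How?
What have we learned from this exercise?
As with new product design, we can also use an automated simulation program to help plan a product line. Given a pair of products: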
p1 = seq(4,12,0.1); p2 = seq(4,12,0.1)
pd1 = c(1,2,3,1); pd2 = c(3,2,3,1)
mc1 = PC(pd1); mc2 = PC(pd2)
u1 = UT(pd1); u2 = UT(pd2)
q1 = q2 = f0 = f1 = f2 = matrix(rep(0, length(p1)*length(p2)), nrow=length(p1))
for(i in 1:length(p1)) for(j in 1:length(p2)) {
s1= u1 - p1[i]; s2 = u2 - p2[j]
q1[i,j] = sum( s1 > 0 & s1 >= s2 )
q2[i,j] = sum( s2 > 0 & s1 < s2 )
f1[i,j] = (p1[i]-mc1) * q1[i,j]
f2[i,j] = (p2[j]-mc2) * q2[i,j]
f0 = f1 + f2 }
m0 = max(f0); w0 = which(f0 == max(f0), arr.ind = T)
p1s = p1[w0[1]]; p2s = p2[w0[2]]
par(mfcol=c(1,1),mar=c(3,3,3,4))
filled.contour(p1, p2, f0, color=terrain.colors, nlevels=8,
plot.title=title(main="\nPrice Space",
xlab='Price 1',ylab='Price 2',cex.main=1),
key.title=title(main="\nTotal Profit",cex.main=0.8),
key.axes=axis(4, cex.axis=0.8),
plot.axes={
axis(1, cex.axis=0.8); axis(2, cex.axis=0.8)
lines(p1[apply(f1,2,which.max)],p2,col=clr[1])
lines(p1,p2[apply(f2,1,which.max)],col=clr[2])
text(8.5,12,TeX(sprintf("$\\pi^*(%.1f,%.1f)=%.1f$",
p1s,p2s,max(f0))),pos=1)
points(p1s,p2s,pch=19)
})
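If these two products compete with each other, what will the outcome be?
What if the two products belong to the same company?
In this example, how many possible two-product combinations are there in total?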
(3*3*3*2)*(3*3*3*2-1)/2
## [1] 1431
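The same count follows from the number of unordered pairs that can be drawn from the 54 possible products. A short sketch using base R's choose and combn (the variable names are hypothetical):

```r
n_products <- prod(c(3, 3, 3, 2))   # 54 possible products
n_pairs    <- choose(n_products, 2) # unordered pairs of distinct products

# combn() enumerates the pairs explicitly: one column per pair of product indices
pairs <- combn(n_products, 2)
n_pairs
dim(pairs)
```

Each column of pairs holds two row indices into the product grid, which is one way a scan over all product pairs could be organized.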
Scan all product combinations, then plot the best revenue and profit:
#source('scan.R')
load('df.Rdata')
hchart(df,"scatter",x=revenue,y=profit,group=opt,lab,p1,p2,q,radius=3) %>%
hc_add_theme(hc_theme_flatdark()) %>%
hc_tooltip(headerFormat = "",valueDecimals=1,borderWidth=2,
hideDelay=100,useHTML=T,padding=3,
pointFormat="<center>{point.lab}</center> Price: {point.p1},{point.p2}<br>
Q'ty: {point.q}<br> Rev.: {point.x}<br> Profit: {point.y}") %>%
hc_colors(hex_to_rgba(c('green','orange'), alpha = 0.5)) %>%
hc_legend(floating=T,align='left',verticalAlign='bottom')
The product pair for maximum aggregated profit is:
p1 = seq(4,12,0.1); p2 = seq(4,12,0.1)
pd1 = c(1,2,3,1); pd2 = c(3,3,3,2)
mc1 = PC(pd1); mc2 = PC(pd2)
u1 = UT(pd1); u2 = UT(pd2)
q1 = q2 = f0 = f1 = f2 = matrix(rep(0, length(p1)*length(p2)), nrow=length(p1))
for(i in 1:length(p1)) for(j in 1:length(p2)) {
s1= u1 - p1[i]; s2 = u2 - p2[j]
q1[i,j] = sum( s1 > 0 & s1 >= s2 )
q2[i,j] = sum( s2 > 0 & s1 < s2 )
f1[i,j] = (p1[i]-mc1) * q1[i,j]
f2[i,j] = (p2[j]-mc2) * q2[i,j]
f0 = f1 + f2 }
m0 = max(f0); w0 = which(f0 == max(f0), arr.ind = T)
p1s = p1[w0[1]]; p2s = p2[w0[2]]
par(mfcol=c(1,1),mar=c(3,3,3,4))
filled.contour(p1, p2, f0, color=terrain.colors, nlevels=8,
plot.title=title(main="\nPrice Space",
xlab='Price 1',ylab='Price 2',cex.main=1),
key.title=title(main="\nTotal Profit",cex.main=0.8),
key.axes=axis(4, cex.axis=0.8),
plot.axes={
axis(1, cex.axis=0.8); axis(2, cex.axis=0.8)
lines(p1[apply(f1,2,which.max)],p2,col=clr[1])
lines(p1,p2[apply(f2,1,which.max)],col=clr[2])
text(8.5,12,TeX(sprintf("$\\pi^*(%.1f,%.1f)=%.1f$",
p1s,p2s,max(f0))),pos=1)
points(p1s,p2s,pch=19)
})
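What does this phenomenon mean? What are its strategic implications?
The TV dataset is the outcome (a part-worths matrix for 352 respondents) of an mLogit CBC shared by Sawtooth Software. There are six attributes (Brand, Screen, Sound, CB, PIP, and Price, as set up in att below); five of them are described here:
Brand: JVC, RCA, Sony. It helps to evaluate brand equities
Sound: Mono, Stereo, Surround.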
CB (Channel Block): nCB, CB
PIP (Picture in Picture): nPIP, PIP
Price: 300, 350, 400, 450. It helps to evaluate part worths in monetary unit.
pw = read.csv("./tv/tv.csv")
levels = names(pw)
att = c(3,3,3,2,2,4)
names(att) = c("Brand","Screen","Sound","CB","PIP","Price")
levcol = c(1,1,1,2,2,2,3,3,3,4,4,5,5,6,6,6,6)
apw = apply(pw,2,mean)
ppw = 200 * apw[1:13]/(apw[14]-apw[17])
im = c(apw[3]-apw[1], apw[6]-apw[4], apw[9]-apw[7],
apw[11]-apw[10], apw[13]-apw[12], apw[14]-apw[17])
im = 100 * im/sum(im)
names(im) = names(att)
par(mar=c(5,3,4,2),mfrow=c(1,3))
barplot(im,las=2,col=1:length(im),main="Attribute Importance (%)")
abline(h=seq(5,25,5),lty=3,col='grey')
barplot(apw,las=2,col=levcol,main="Part Worths (utils)")
abline(h=seq(-1.5,1.0,0.5),lty=3,col='grey')
barplot(ppw,las=2,col=levcol,main="Part Worths ($)",ylim=c(-120,100))
abline(h=seq(-100,100,50),lty=3,col='grey')
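If price is also treated as an attribute, we can convert utility into monetary units.
Compared with JVC, how much more are consumers willing to pay, on average, for the SONY brand?
Besides functional attributes, can we also use conjoint analysis to design products around emotional or social attributes?
Besides product design, can you name other applications of conjoint analysis?
Can you describe the full range of applications of conjoint analysis?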
In this CBC, the respondents’ part worths are estimated by a multinomial logit (mLogit) model. mLogit is similar to logistic regression, but it handles target variables with more than two levels. As explained in 1.6, given a specific product, a respondent’s utility is still the sum of part worths.
As before, we use a linear model to estimate the utility of each option for every respondent:
pds = list(c(1,2,1,2,1,1),c(2,1,1,2,2,2),c(3,2,2,1,2,3),c(3,3,3,2,1,4))
ut = sapply(pds, function(pd) rowSums(pw[ pd + c(0,3,6,9,11,13) ]))
head(ut)
## [,1] [,2] [,3] [,4]
## [1,] -0.135 -1.841 2.931 -0.427
## [2,] -2.998 2.629 1.934 0.745
## [3,] 0.473 0.928 -0.411 -1.518
## [4,] 0.534 -0.563 1.148 -1.535
## [5,] -5.063 -2.452 2.753 3.242
## [6,] -3.593 -4.841 2.637 3.054
But instead of making a discrete purchase decision by comparing the utilities, the mLogit model estimates the probability that respondent \(i\) buys product \(j\) as:
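\(P_{i,j} = \exp(u_{i,j}) / \sum_{j'} \exp(u_{i,j'})\)
In other words, we no longer assume that consumers always choose the product with the highest utility. Instead, based on the relative value of the products, we estimate the probability that each consumer buys each product (or buys nothing):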
pb = t(apply(ut, 1, function (u) exp(u)/sum(exp(u)) ))
head(pb)
## [,1] [,2] [,3] [,4]
## [1,] 0.0427638 0.0077655 0.9175 0.03193
## [2,] 0.0021753 0.6043557 0.3016 0.09185
## [3,] 0.3199120 0.5042368 0.1322 0.04369
## [4,] 0.3022983 0.1009287 0.5586 0.03818
## [5,] 0.0001529 0.0020819 0.3793 0.61849
## [6,] 0.0007815 0.0002244 0.3968 0.60216
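The first row of pb can be reproduced by hand from the first row of ut printed above. The sketch below does this with a hypothetical helper, softmax, which subtracts the maximum before exponentiating; that step does not change the result and only guards against overflow for large utilities.

```r
# Logit choice probabilities for one respondent
softmax <- function(u) {
  z <- exp(u - max(u))   # shift by the maximum for numerical stability
  z / sum(z)
}

u_row1 <- c(-0.135, -1.841, 2.931, -0.427)   # first row of ut, as printed
p_row1 <- softmax(u_row1)
round(p_row1, 4)
sum(p_row1)              # probabilities add up to 1
```

The third product gets about 0.92 of respondent 1's choice probability, in line with the first row of pb.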
The probability that respondent \(i\) buys product \(j\) is pb[i,j]. Summing these probabilities over all respondents therefore gives the expected demand for each product.
(qty = colSums(pb) )
## [1] 40.88 81.04 112.25 117.83
Then convert these quantities into the market share of each product
100 * qty/sum(qty)
## [1] 11.61 23.02 31.89 33.48
To make pricing continuous, we often need to fit a regression to the price coefficients (or interpolate and extrapolate them):
#utP = function(p) {
# if (p >= 300 & p < 350) ((350-p)*pw$p300 + (p-300)*pw$p350)/50
# else if(p >= 350 & p < 400) ((400-p)*pw$p350 + (p-350)*pw$p400)/50
# else if(p >= 400 & p <= 450) ((450-p)*pw$p400 + (p-400)*pw$p450)/50 }
p1 = c(300,350,400,450)
coef = apply(pw[,14:17], 1, function(x) lm(ut~.,data.frame(ut=x,p1,p1^2,p1^3))$coef )
utP = function(p) t(coef) %*% c(1, p, p^2, p^3)
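With four price levels, a cubic polynomial passes exactly through the four price part-worths, so the fitted curve acts as an interpolator. The sketch below shows this for one hypothetical respondent and compares it with piecewise-linear interpolation via approx, which corresponds to the commented-out version of utP. The part-worth values and the names u_hat and u_lin are made up for illustration.

```r
price <- c(300, 350, 400, 450)
u     <- c(1.2, 0.5, -0.3, -1.4)   # hypothetical price part-worths, one respondent

# Cubic fit: 4 points and 4 coefficients, so the curve hits every point
fit   <- lm(u ~ price + I(price^2) + I(price^3))
u_hat <- function(p) as.numeric(predict(fit, newdata = data.frame(price = p)))

# Piecewise-linear alternative between neighbouring price levels
u_lin <- function(p) approx(price, u, xout = p)$y

u_hat(price)            # reproduces u at the four price levels
c(cubic = u_hat(325), linear = u_lin(325))
```

The two methods agree at the price levels and differ slightly in between; outside 300 to 450 the cubic extrapolates freely, while approx returns NA by default.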
So we can estimate the utilities, probabilities, and market shares by price, rather than by price level.
pdp = list(c(1,2,1,2,1,325),c(2,1,1,2,2,399),c(3,2,2,1,2,410),c(3,3,3,2,1,450))
ut = sapply(pdp, function(x) rowSums(pw[ x[1:5] + c(0,3,6,9,11) ]) + utP(x[6]))
pb = t(apply(ut, 1, function (u) exp(u)/sum(exp(u)) ))
(qty = colSums(pb) )
## [1] 47.87 55.39 117.98 130.76
100 * qty/sum(qty)
## [1] 13.60 15.74 33.52 37.15
Define the marginal costs of parts
costs = rbind(c(60,70,70),c(50,70,120),c(0,20,80),c(10,50,0),c(0,50,0))
MC = function(pd) sum(sapply(1:5, function(i) costs[i, pd[i]] ))
Define the market as a list of products
pdp = list(c(1,2,2,1,2,0),
c(2,1,1,2,2,300),
c(3,2,2,1,2,350),c(3,3,3,2,1,400))
plst = seq(300,450)
Then we can simulate how the outcome varies with the price of the first product
mx = sapply(plst, function(x) {
pdp[[1]][6] = x
ut = sapply(pdp, function(z) {rowSums(pw[ z[1:5] + c(0,3,6,9,11) ]) + utP(z[6])})
pb = t(apply(ut, 1, function (u) exp(u)/sum(exp(u)) ))
qty = colSums(pb)
100 * qty / sum(qty) })
par(mfrow=c(1,2),cex=0.8)
profit = mx[1,] * (plst - MC(pdp[[1]]))
w = which.max(profit); pst=plst[w]
plot(plst,mx[1,],type='l',col=1,ylim=c(0,45),main="Market Share",
ylab="",xlab="Price")
abline(v=pst,col='pink' )
for(i in 2:nrow(mx)) lines(plst,mx[i,],col=i)
plot(plst,profit,type='l',ylim=c(0,3000),ylab="",xlab="Price",
main=sprintf("Profit (%.1f @ %d)",profit[w],pst))
abline(v=pst,col='pink' )
When the black product cuts its price, which product does it mainly take market share from? Why?
source("sim4.R")
end of script